mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-07-22 12:37:53 +00:00
test: Add one-step integration testing with server auto-start (#2580)
Some checks failed
Integration Tests / test-matrix (http, 3.12, inference) (push) Failing after 4s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 8s
Integration Tests / test-matrix (http, 3.12, post_training) (push) Failing after 10s
Integration Tests / test-matrix (http, 3.13, providers) (push) Failing after 8s
Integration Tests / test-matrix (http, 3.13, inference) (push) Failing after 10s
Integration Tests / test-matrix (http, 3.13, scoring) (push) Failing after 10s
Integration Tests / test-matrix (http, 3.13, datasets) (push) Failing after 12s
Integration Tests / test-matrix (http, 3.12, providers) (push) Failing after 13s
Integration Tests / test-matrix (http, 3.13, vector_io) (push) Failing after 9s
Integration Tests / test-matrix (http, 3.13, agents) (push) Failing after 12s
Integration Tests / test-matrix (http, 3.13, post_training) (push) Failing after 22s
Integration Tests / test-matrix (http, 3.12, tool_runtime) (push) Failing after 20s
Integration Tests / test-matrix (http, 3.12, inspect) (push) Failing after 14s
Integration Tests / test-matrix (library, 3.12, datasets) (push) Failing after 20s
Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 19s
Integration Tests / test-matrix (library, 3.12, vector_io) (push) Failing after 9s
Integration Tests / test-matrix (library, 3.12, post_training) (push) Failing after 11s
Integration Tests / test-matrix (http, 3.12, vector_io) (push) Failing after 9s
Integration Tests / test-matrix (library, 3.12, providers) (push) Failing after 9s
Integration Tests / test-matrix (http, 3.13, inspect) (push) Failing after 16s
Integration Tests / test-matrix (library, 3.12, inference) (push) Failing after 8s
Integration Tests / test-matrix (http, 3.12, datasets) (push) Failing after 22s
Integration Tests / test-matrix (http, 3.12, agents) (push) Failing after 15s
Integration Tests / test-matrix (library, 3.12, agents) (push) Failing after 14s
Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 9s
Integration Tests / test-matrix (library, 3.13, agents) (push) Failing after 13s
Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 9s
Integration Tests / test-matrix (http, 3.12, scoring) (push) Failing after 18s
Integration Tests / test-matrix (http, 3.13, tool_runtime) (push) Failing after 13s
Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.13, datasets) (push) Failing after 9s
Integration Tests / test-matrix (library, 3.12, scoring) (push) Failing after 13s
Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 9s
Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 21s
Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 12s
Python Package Build Test / build (3.12) (push) Failing after 1m3s
Python Package Build Test / build (3.13) (push) Failing after 1m3s
Test External Providers / test-external-providers (venv) (push) Failing after 1m7s
Unit Tests / unit-tests (3.12) (push) Failing after 1m15s
Unit Tests / unit-tests (3.13) (push) Failing after 19s
Pre-commit / pre-commit (push) Successful in 2m42s
## Summary

Add support for `server:<config>` format in `--stack-config` option to enable seamless one-step integration testing. This eliminates the need to manually start servers in separate terminals before running tests.

## Key Features

- **Auto-start server**: Automatically launches `llama stack run <config>` if target port is available
- **Smart reuse**: Reuses existing server if port is already occupied
- **Health check polling**: Waits up to 2 minutes for server readiness via `/v1/health` endpoint
- **Custom port support**: Use `server:<config>:<port>` for non-default ports
- **Clean output**: Server runs quietly in background without cluttering test output
- **Backward compatibility**: All existing `--stack-config` formats continue to work

## Usage Examples

```bash
# Auto-start server with default port 8321
pytest tests/integration/inference/ --stack-config=server:fireworks

# Use custom port
pytest tests/integration/safety/ --stack-config=server:together:8322

# Run multiple test suites seamlessly
pytest tests/integration/inference/ tests/integration/agents/ --stack-config=server:starter
```

## Implementation Details

- Enhanced `llama_stack_client` fixture with server management
- Updated documentation with cleaner organization and comprehensive examples
- Added utility functions for port checking, server startup, and health verification

## Test Plan

- Verified server auto-start when port 8321 is available
- Verified server reuse when port 8321 is occupied
- Tested health check polling via `/v1/health` endpoint
- Confirmed custom port configuration works correctly
- Verified backward compatibility with existing config formats

## Before/After Comparison

**Before (2 steps):**

```bash
# Terminal 1: Start server manually
llama stack run fireworks --port 8321

# Terminal 2: Wait for startup, then run tests
pytest tests/integration/inference/ --stack-config=http://localhost:8321
```

**After (1 step):**

```bash
# Single command handles everything
pytest tests/integration/inference/ --stack-config=server:fireworks
```
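The port check, server startup, and health polling described in the commit message could look roughly like the sketch below. This is an illustration under stated assumptions, not the code from this commit: the helper names `is_port_in_use`, `wait_for_health`, and `ensure_server` are hypothetical, and only the `llama stack run <config> --port <port>` command, the `/v1/health` endpoint, the default port 8321, and the two-minute timeout come from the message above.

```python
import socket
import subprocess
import time
import urllib.request
from typing import Optional


def is_port_in_use(port: int, host: str = "localhost") -> bool:
    """Return True if something already accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        return sock.connect_ex((host, port)) == 0


def wait_for_health(base_url: str, timeout: float = 120.0, interval: float = 0.5) -> bool:
    """Poll GET {base_url}/v1/health until it answers 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/v1/health", timeout=5) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            pass  # connection refused or HTTP error: server is not ready yet
        time.sleep(interval)
    return False


def ensure_server(config: str, port: int = 8321, host: str = "localhost") -> Optional[subprocess.Popen]:
    """Reuse a server on the port if one is listening, otherwise start one.

    Returns None when an existing server is reused, or the Popen handle of the
    newly started process (hypothetical helper, not the fixture's actual API).
    """
    if is_port_in_use(port, host):
        return None  # "smart reuse": something is already listening
    proc = subprocess.Popen(
        ["llama", "stack", "run", config, "--port", str(port)],
        stdout=subprocess.DEVNULL,  # keep test output clean
        stderr=subprocess.DEVNULL,
    )
    if not wait_for_health(f"http://{host}:{port}"):
        proc.terminate()
        raise RuntimeError(f"server for config {config!r} did not become healthy")
    return proc
```

A fixture built on such helpers would call `ensure_server("fireworks")` once per session and terminate the returned process, if any, during teardown. Whether the real fixture stops the server it started is not stated in the commit message.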
parent 958600a5c1
commit fc735a414e
2 changed files with 110 additions and 5 deletions
@@ -9,7 +9,9 @@ pytest --help
 ```

 Here are the most important options:
-- `--stack-config`: specify the stack config to use. You have three ways to point to a stack:
+- `--stack-config`: specify the stack config to use. You have four ways to point to a stack:
+  - **`server:<config>`** - automatically start a server with the given config (e.g., `server:fireworks`). This provides one-step testing by auto-starting the server if the port is available, or reusing an existing server if already running.
+  - **`server:<config>:<port>`** - same as above but with a custom port (e.g., `server:together:8322`)
   - a URL which points to a Llama Stack distribution server
   - a template (e.g., `fireworks`, `together`) or a path to a `run.yaml` file
   - a comma-separated list of api=provider pairs, e.g. `inference=fireworks,safety=llama-guard,agents=meta-reference`. This is most useful for testing a single API surface.

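To make the `--stack-config` forms listed above concrete, here is a minimal sketch of how such a value might be classified. It is an assumption about one possible approach, not the parsing code from this commit: `StackTarget`, `parse_stack_config`, and `DEFAULT_PORT` are hypothetical names, and the sketch treats everything that is neither `server:...` nor an HTTP URL as one opaque "other" form (template, `run.yaml` path, or api=provider list).

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_PORT = 8321  # default port named in the commit message


@dataclass
class StackTarget:
    kind: str                     # "server", "url", or "other"
    config: Optional[str] = None  # config name, path, or api=provider list
    port: Optional[int] = None    # only set for kind == "server"
    url: Optional[str] = None     # only set for kind == "url"


def parse_stack_config(value: str) -> StackTarget:
    """Classify a --stack-config value (hypothetical helper for illustration)."""
    if value.startswith("server:"):
        parts = value.split(":")
        if len(parts) not in (2, 3) or not parts[1]:
            raise ValueError(f"expected server:<config>[:<port>], got {value!r}")
        port = int(parts[2]) if len(parts) == 3 else DEFAULT_PORT
        return StackTarget("server", config=parts[1], port=port)
    if value.startswith(("http://", "https://")):
        return StackTarget("url", url=value)
    # Template name, run.yaml path, or api=provider pairs: left uninterpreted here.
    return StackTarget("other", config=value)
```

With this sketch, `parse_stack_config("server:together:8322")` yields a server target on port 8322, while `parse_stack_config("server:fireworks")` falls back to the default port.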
@@ -26,12 +28,39 @@ Model parameters can be influenced by the following options:
Each of these are comma-separated lists and can be used to generate multiple parameter combinations. Note that tests will be skipped
if no model is specified.

Experimental, under development, options:
- `--record-responses`: record new API responses instead of using cached ones


## Examples

### Testing against a Server

Run all text inference tests by auto-starting a server with the `fireworks` config:

```bash
pytest -s -v tests/integration/inference/test_text_inference.py \
    --stack-config=server:fireworks \
    --text-model=meta-llama/Llama-3.1-8B-Instruct
```

Run tests with auto-server startup on a custom port:

```bash
pytest -s -v tests/integration/inference/ \
    --stack-config=server:together:8322 \
    --text-model=meta-llama/Llama-3.1-8B-Instruct
```

Run multiple test suites with auto-server (eliminates manual server management):

```bash
# Auto-start server and run all integration tests
export FIREWORKS_API_KEY=<your_key>

pytest -s -v tests/integration/inference/ tests/integration/safety/ tests/integration/agents/ \
    --stack-config=server:fireworks \
    --text-model=meta-llama/Llama-3.1-8B-Instruct
```

### Testing with Library Client

Run all text inference tests with the `together` distribution:

```bash