# Llama Stack Integration Tests

You can run the Llama Stack integration tests against either a Llama Stack library (in process) or a Llama Stack endpoint (a running server).
To test against a Llama Stack library with a specific configuration, run:
```bash
LLAMA_STACK_CONFIG=./llama_stack/templates/cerebras/run.yaml \
pytest -s -v tests/client-sdk/inference/
```
or use just the template name:
```bash
LLAMA_STACK_CONFIG=together \
pytest -s -v tests/client-sdk/inference/
```
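If you plan to run several suites against the same configuration, you can equivalently export the variable once (a small convenience sketch; the suite directories shown are the ones under `tests/client-sdk/`):
```bash
# Export once, then run multiple suites against the same config.
export LLAMA_STACK_CONFIG=together
pytest -s -v tests/client-sdk/inference/
pytest -s -v tests/client-sdk/agents/
pytest -s -v tests/client-sdk/vector_io/
```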
To test against a Llama Stack endpoint, run:
```bash
LLAMA_STACK_BASE_URL=http://localhost:8089 \
pytest -s -v tests/client-sdk/inference
```
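This assumes a server is already listening at that URL. One way to start one is with the `llama stack run` command (a sketch; the `together` template and the port are illustrative, and the `--port` flag is assumed here, so adjust to your setup):
```bash
# Sketch: start a Llama Stack server from a template, then run the
# tests against it.
llama stack run together --port 8089 &
sleep 10  # give the server a moment to come up
LLAMA_STACK_BASE_URL=http://localhost:8089 \
pytest -s -v tests/client-sdk/inference
```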
## Report Generation

To generate a report, run with the `--report` option:
```bash
LLAMA_STACK_CONFIG=together pytest -s -v tests/client-sdk/ --report report.md
```
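For URL endpoints, where the output path cannot be inferred from a config or template name, pass the path explicitly (see Common options below; the path shown is illustrative):
```bash
LLAMA_STACK_BASE_URL=http://localhost:8089 \
pytest -s -v tests/client-sdk/ --report /tmp/client-sdk-report.md
```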
## Common options

Depending on the API, there are custom options enabled (see the sketch after this list for example invocations):
- For tests in `inference/` and `agents/`, we support the `--inference-model` (used in text inference tests) and `--vision-inference-model` (used only in image inference tests) overrides.
- For tests in `vector_io/`, we support the `--embedding-model` override.
- For tests in `safety/`, we support the `--safety-shield` override.
- The report param can be `--report` or `--report <path>`. If a path is not provided, we make a best effort to infer one based on the config / template name. For URL endpoints, a path is required.
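For example (a sketch; the model and shield identifiers are placeholders, so substitute whatever your provider actually serves):
```bash
# Override the text inference model used by the inference tests.
LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/inference/ \
  --inference-model meta-llama/Llama-3.1-8B-Instruct

# Override the shield used by the safety tests.
LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/safety/ \
  --safety-shield meta-llama/Llama-Guard-3-8B
```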