llama-stack-mirror/tests/client-sdk
Latest commit: 28a0fe57cc by Hardik Shah
fix: Update rag examples to use fresh faiss index every time (#998)
# What does this PR do?
Several examples reuse the same faiss index, so running them multiple
times fills the index with duplicates. This eventually degrades RAG
performance, since multiple copies of the same irrelevant chunks can be
retrieved again and again.

The fix is to create a fresh index on each run.

Resolves issue in this discussion -
https://github.com/meta-llama/llama-stack/discussions/995
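
For illustration, here is a minimal sketch of the pattern the fix moves the examples to: register the vector DB under a fresh, unique id on every run instead of reusing a fixed name. The calls mirror the llama-stack-client API of this period, but treat the exact signature, model name, and port as assumptions:

```python
import uuid

from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8089")

# A unique id per run means repeated runs never append duplicate
# chunks to a previously populated faiss index.
vector_db_id = f"test-vector-db-{uuid.uuid4().hex}"
client.vector_dbs.register(
    vector_db_id=vector_db_id,
    embedding_model="all-MiniLM-L6-v2",
    embedding_dimension=384,
)
```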

## Test Plan
Re-ran the getting started guide multiple times and verified the output stays the same.

Co-authored-by: Hardik Shah <hjshah@fb.com>
2025-02-06 16:12:29 -08:00
| File | Last commit | Date |
| --- | --- | --- |
| `agents` | fix: Update rag examples to use fresh faiss index every time (#998) | 2025-02-06 16:12:29 -08:00 |
| `inference` | Fix precommit check after moving to ruff (#927) | 2025-02-02 06:46:45 -08:00 |
| `safety` | Fix precommit check after moving to ruff (#927) | 2025-02-02 06:46:45 -08:00 |
| `tool_runtime` | Fix precommit check after moving to ruff (#927) | 2025-02-02 06:46:45 -08:00 |
| `vector_io` | Fix precommit check after moving to ruff (#927) | 2025-02-02 06:46:45 -08:00 |
| `__init__.py` | [tests] add client-sdk pytests & delete client.py (#638) | 2024-12-16 12:04:56 -08:00 |
| `conftest.py` | Update client-sdk test config option handling | 2025-01-31 15:30:07 -08:00 |
| `metadata.py` | Report generation minor fixes (#884) | 2025-01-28 04:58:12 -08:00 |
| `README.md` | Fix report generation for url endpoints (#876) | 2025-01-24 13:15:44 -08:00 |
| `report.py` | Fix precommit check after moving to ruff (#927) | 2025-02-02 06:46:45 -08:00 |

# Llama Stack Integration Tests

You can run the Llama Stack integration tests against either a Llama Stack library (in-process) or a Llama Stack endpoint.

To test against a Llama Stack library with a specific configuration, run:

```bash
LLAMA_STACK_CONFIG=./llama_stack/templates/cerebras/run.yaml \
pytest -s -v tests/client-sdk/inference/test_inference.py
```

or with just a template name:

```bash
LLAMA_STACK_CONFIG=together \
pytest -s -v tests/client-sdk/inference/test_inference.py
```

To test against a running Llama Stack endpoint, run:

```bash
LLAMA_STACK_BASE_URL=http://localhost:8089 \
pytest -s -v tests/client-sdk/inference/test_inference.py
```
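
For context, the two modes above map to two client types. A minimal sketch of how a test session might construct one or the other, assuming the library-client import path llama-stack used at this time:

```python
import os

from llama_stack_client import LlamaStackClient


def make_client():
    base_url = os.environ.get("LLAMA_STACK_BASE_URL")
    if base_url:
        # Endpoint mode: talk to an already running Llama Stack server.
        return LlamaStackClient(base_url=base_url)

    # Library mode: run the stack in-process from a template name or a
    # run.yaml path (import path is an assumption for this era).
    from llama_stack.distribution.library_client import LlamaStackAsLibraryClient

    client = LlamaStackAsLibraryClient(os.environ["LLAMA_STACK_CONFIG"])
    client.initialize()
    return client
```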

## Report Generation

To generate a report, run with the `--report` option:

```bash
LLAMA_STACK_CONFIG=together pytest -s -v tests/client-sdk/ --report report.md
```

## Common options

Depending on the API under test, additional custom options are enabled (a sketch of how such options can be registered follows this list):

- For tests in `inference/` and `agents/`, we support the `--inference-model` (used in text inference tests) and `--vision-inference-model` (used only in image inference tests) overrides.
- For tests in `vector_io/`, we support the `--embedding-model` override.
- For tests in `safety/`, we support the `--safety-shield` override.
- The report param can be `--report` or `--report <path>`. If a path is not provided, we make a best effort to infer it from the config / template name. For URL endpoints, a path is required.
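
For context, here is a minimal, hypothetical sketch of how such per-API options can be registered in a `conftest.py` using pytest's standard `pytest_addoption` hook; the defaults and fixture below are illustrative, not copied from this repo's conftest:

```python
# conftest.py (illustrative sketch, not the repo's actual conftest)
import pytest


def pytest_addoption(parser):
    parser.addoption("--inference-model", default=None, help="Text inference model override")
    parser.addoption("--vision-inference-model", default=None, help="Image inference model override")
    parser.addoption("--embedding-model", default=None, help="Embedding model override")
    parser.addoption("--safety-shield", default=None, help="Safety shield override")


@pytest.fixture
def inference_model(request):
    # Tests read the override through pytest's config object.
    return request.config.getoption("--inference-model")
```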