Mirror of https://github.com/meta-llama/llama-stack.git, synced 2025-06-28 02:53:30 +00:00
top_k support was added in https://github.com/meta-llama/llama-stack/pull/1074. The tests should be enabled as well.

Verified that tests pass for remote::vllm:

```
LLAMA_STACK_BASE_URL=http://localhost:5003 pytest -v tests/client-sdk/inference/test_text_inference.py -k " test_completion_log_probs_non_streaming or test_completion_log_probs_streaming"
================================================================ test session starts ================================================================
platform linux -- Python 3.10.16, pytest-8.3.4, pluggy-1.5.0 -- /home/yutang/.conda/envs/distribution-myenv/bin/python3.10
cachedir: .pytest_cache
rootdir: /home/yutang/repos/llama-stack
configfile: pyproject.toml
plugins: anyio-4.8.0
collected 14 items / 12 deselected / 2 selected

tests/client-sdk/inference/test_text_inference.py::test_completion_log_probs_non_streaming[meta-llama/Llama-3.1-8B-Instruct] PASSED [ 50%]
tests/client-sdk/inference/test_text_inference.py::test_completion_log_probs_streaming[meta-llama/Llama-3.1-8B-Instruct] PASSED [100%]

=================================================== 2 passed, 12 deselected, 1 warning in 10.03s ====================================================
```

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Llama Stack Integration Tests
You can run the Llama Stack integration tests against either a Llama Stack library or a Llama Stack endpoint.
To test against a Llama Stack library with a specific configuration, run
LLAMA_STACK_CONFIG=./llama_stack/templates/cerebras/run.yaml pytest -s -v tests/client-sdk/inference/
or just the template name
LLAMA_STACK_CONFIG=together pytest -s -v tests/client-sdk/inference/
To test on a Llama Stack endpoint, run
LLAMA_STACK_BASE_URL=http://localhost:8089 pytest -s -v tests/client-sdk/inference
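The three invocations above differ only in which environment variable is set and in whether LLAMA_STACK_CONFIG looks like a file path or a bare template name. The following is a minimal sketch of how a test fixture might tell these cases apart. The names Target and resolve_target are hypothetical and are not taken from the repository's conftest.py, and the precedence (base URL first) and the .yaml/.yml check are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Target:
    """Hypothetical description of what the tests will run against."""
    kind: str   # "endpoint", "config_file", or "template"
    value: str  # the URL, the run.yaml path, or the template name


def resolve_target(env: Mapping[str, str]) -> Target:
    """Pick a test target from environment variables.

    Assumption: LLAMA_STACK_BASE_URL wins if both variables are set.
    The real conftest.py may order these checks differently.
    """
    base_url = env.get("LLAMA_STACK_BASE_URL")
    config = env.get("LLAMA_STACK_CONFIG")
    if base_url:
        return Target("endpoint", base_url)
    if config:
        # Assumption: anything ending in .yaml/.yml is a run config path,
        # anything else is treated as a template name such as "together".
        is_file = config.endswith((".yaml", ".yml"))
        return Target("config_file" if is_file else "template", config)
    raise ValueError("Set LLAMA_STACK_CONFIG or LLAMA_STACK_BASE_URL")
```

In a shell, the variable has to be placed on the same line as pytest (or exported beforehand) for the test process to see it. Passing os.environ to resolve_target would then reflect whichever variable was set on the command line.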
Report Generation
To generate a report, run with the --report option
LLAMA_STACK_CONFIG=together pytest -s -v tests/client-sdk/ --report
Common options
Depending on the API, there are custom options enabled
- For tests in inference/ and agents/, we support --inference-model (to be used in text inference tests) and --vision-inference-model (only used in image inference tests) overrides
- For tests in vector_io/, we support --embedding-model override
- For tests in safety/, we support --safety-shield override
- The param can be --report or --report <path>
If the path is not provided, we make a best effort to infer it from the config / template name. For URL endpoints, the path is required.
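The options listed above, and the rule that --report may appear with or without a path, can be sketched with the standard library's argparse. This is an illustration only, not the repository's actual pytest option registration. The helper names build_parser and report_path, the INFERRED sentinel, and the "<name>_report.md" file name are all hypothetical.

```python
import argparse
from pathlib import Path
from typing import Optional

# Hypothetical sentinel meaning "--report was given without a path".
INFERRED = "<inferred>"


def build_parser() -> argparse.ArgumentParser:
    """Declare the override options named in this README."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--inference-model", default=None)
    parser.add_argument("--vision-inference-model", default=None)
    parser.add_argument("--embedding-model", default=None)
    parser.add_argument("--safety-shield", default=None)
    # nargs="?" makes the value optional: a bare --report stores `const`,
    # --report <path> stores the path, and omitting the flag stores None.
    parser.add_argument("--report", nargs="?", const=INFERRED, default=None)
    return parser


def report_path(report: Optional[str],
                config: Optional[str],
                base_url: Optional[str]) -> Optional[str]:
    """Return where the report should be written, or None for no report."""
    if report is None:
        return None
    if report != INFERRED:
        return report  # an explicit path always wins
    if base_url or not config:
        # There is no config or template name to infer a file name from.
        raise ValueError("--report requires an explicit path for URL endpoints")
    p = Path(config)
    # Assumption: for .../templates/<name>/run.yaml use the directory name,
    # otherwise treat the value itself as the template name.
    name = p.parent.name if p.suffix in (".yaml", ".yml") else config
    return f"{name}_report.md"  # hypothetical naming scheme
```

One consequence of an optional-value flag in argparse is that it consumes the next token if that token does not look like another option, so placing a bare --report at the end of the command line avoids it swallowing the test directory argument.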