llama-stack-mirror/tests/client-sdk
Yuan Tang efdd60014d
test: Enable logprobs top_k tests for remote::vllm (#1080)
Support for `top_k` was added in
https://github.com/meta-llama/llama-stack/pull/1074. The corresponding tests
should be enabled as well.

Verified that tests pass for remote::vllm:

```
LLAMA_STACK_BASE_URL=http://localhost:5003 pytest -v tests/client-sdk/inference/test_text_inference.py -k " test_completion_log_probs_non_streaming or test_completion_log_probs_streaming"
================================================================ test session starts ================================================================
platform linux -- Python 3.10.16, pytest-8.3.4, pluggy-1.5.0 -- /home/yutang/.conda/envs/distribution-myenv/bin/python3.10
cachedir: .pytest_cache
rootdir: /home/yutang/repos/llama-stack
configfile: pyproject.toml
plugins: anyio-4.8.0
collected 14 items / 12 deselected / 2 selected                                                                                                     

tests/client-sdk/inference/test_text_inference.py::test_completion_log_probs_non_streaming[meta-llama/Llama-3.1-8B-Instruct] PASSED           [ 50%]
tests/client-sdk/inference/test_text_inference.py::test_completion_log_probs_streaming[meta-llama/Llama-3.1-8B-Instruct] PASSED               [100%]

=================================================== 2 passed, 12 deselected, 1 warning in 10.03s ====================================================
```
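
For reference, the two enabled tests exercise a completion request of roughly this shape. This is a minimal sketch assuming the `llama_stack_client` SDK and a server at the URL used above; the exact request and response fields may differ between versions.

```
# Minimal sketch of a logprobs/top_k completion request; assumes a running
# Llama Stack server and the llama_stack_client SDK.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5003")

response = client.inference.completion(
    model_id="meta-llama/Llama-3.1-8B-Instruct",
    content="Complete the sentence using one word: Roses are red, violets are ",
    stream=False,
    sampling_params={"max_tokens": 5},
    logprobs={"top_k": 1},  # ask for top-1 log-probabilities per token
)

# One entry per generated token with its log-probabilities.
for token_logprobs in response.logprobs:
    print(token_logprobs)
```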

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-02-13 13:44:57 -05:00
| Name | Last commit | Date |
|------|-------------|------|
| `agents` | build: format codebase imports using ruff linter (#1028) | 2025-02-13 10:06:21 -08:00 |
| `inference` | test: Enable logprobs top_k tests for remote::vllm (#1080) | 2025-02-13 13:44:57 -05:00 |
| `safety` | Fix precommit check after moving to ruff (#927) | 2025-02-02 06:46:45 -08:00 |
| `tool_runtime` | build: format codebase imports using ruff linter (#1028) | 2025-02-13 10:06:21 -08:00 |
| `vector_io` | feat: Adding sqlite-vec as a vectordb (#1040) | 2025-02-12 10:50:03 -08:00 |
| `__init__.py` | [tests] add client-sdk pytests & delete client.py (#638) | 2024-12-16 12:04:56 -08:00 |
| `conftest.py` | build: format codebase imports using ruff linter (#1028) | 2025-02-13 10:06:21 -08:00 |
| `metadata.py` | Report generation minor fixes (#884) | 2025-01-28 04:58:12 -08:00 |
| `README.md` | test: Split inference tests to text and vision (#1008) | 2025-02-07 09:35:49 -08:00 |
| `report.py` | build: format codebase imports using ruff linter (#1028) | 2025-02-13 10:06:21 -08:00 |

Llama Stack Integration Tests

You can run the Llama Stack integration tests against either a Llama Stack library or a Llama Stack endpoint.

To test on a Llama Stack library with a certain configuration, run

```
LLAMA_STACK_CONFIG=./llama_stack/templates/cerebras/run.yaml \
pytest -s -v tests/client-sdk/inference/
```

or with just the template name:

```
LLAMA_STACK_CONFIG=together \
pytest -s -v tests/client-sdk/inference/
```

To test on a Llama Stack endpoint, run

```
LLAMA_STACK_BASE_URL=http://localhost:8089 \
pytest -s -v tests/client-sdk/inference
```

Report Generation

To generate a report, run with the `--report` option:

```
LLAMA_STACK_CONFIG=together pytest -s -v tests/client-sdk/ --report report.md
```

Common options

Depending on the API, certain custom options are enabled:

  • For tests in inference/ and agents/, we support `--inference-model` (used in text inference tests) and `--vision-inference-model` (used only in image inference tests) overrides (see the sketch after this list).
  • For tests in vector_io/, we support the `--embedding-model` override.
  • For tests in safety/, we support the `--safety-shield` override.
  • The report param can be `--report` or `--report <path>`. If a path is not provided, we make a best-effort attempt to infer it based on the config / template name. For URL endpoints, a path is required.
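
As a rough illustration, overrides like these are typically registered in `conftest.py` via pytest's `pytest_addoption` hook and consumed through a fixture. The sketch below is an assumption about the wiring, not a copy of this repo's `conftest.py`; the fixture name `text_model_id` and the default model ID are hypothetical.

```
# Hypothetical sketch of how an override such as --inference-model can be
# wired up in a pytest conftest.py; the actual conftest.py may differ.
import pytest


def pytest_addoption(parser):
    # Register the custom CLI option with an illustrative default.
    parser.addoption(
        "--inference-model",
        default="meta-llama/Llama-3.1-8B-Instruct",
        help="Model ID used by text inference tests",
    )


@pytest.fixture
def text_model_id(request):
    # Tests request this fixture instead of hard-coding a model ID, so
    # `pytest --inference-model <id>` overrides it for every test.
    return request.config.getoption("--inference-model")
```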