forked from phoenix-oss/llama-stack-mirror
# What does this PR do?

This PR introduces more non-llama model support to llama stack. Providers introduced: openai, anthropic and gemini. All of these providers use essentially the same piece of code -- the implementation works via the `litellm` library. We will expose only specific models for the providers we enable, making sure they all work well and pass tests. This setup (instead of automatically enabling _all_ providers and models allowed by LiteLLM) also lets us perform any needed prompt tuning on a per-model basis, just like we do for llama models.

## Test Plan

```bash
#!/bin/bash

# Extra arguments are forwarded to every pytest invocation.
args=("$@")
for model in openai/gpt-4o anthropic/claude-3-5-sonnet-latest gemini/gemini-1.5-flash; do
    LLAMA_STACK_CONFIG=dev pytest -s -v tests/client-sdk/inference/test_text_inference.py \
        --embedding-model=all-MiniLM-L6-v2 \
        --vision-inference-model="" \
        --inference-model="$model" "${args[@]}"
done
```
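The allowlisting idea in the PR description (expose only specific models per provider, rather than everything LiteLLM permits) can be pictured with a small sketch. This is illustrative only: `ENABLED_MODELS` and `is_enabled` are hypothetical names, and the table lists only the three models named in the test plan, not the actual set the PR registers.

```python
from typing import Dict, FrozenSet

# Hypothetical allowlist keyed by provider. Only the models that appear in
# the test plan are listed; the real set exposed by the PR is not stated here.
ENABLED_MODELS: Dict[str, FrozenSet[str]] = {
    "openai": frozenset({"gpt-4o"}),
    "anthropic": frozenset({"claude-3-5-sonnet-latest"}),
    "gemini": frozenset({"gemini-1.5-flash"}),
}


def is_enabled(model_id: str) -> bool:
    """Return True if a "provider/model" identifier is on the allowlist."""
    provider, sep, name = model_id.partition("/")
    if not sep:
        # No provider prefix, so the identifier cannot be matched.
        return False
    return name in ENABLED_MODELS.get(provider, frozenset())
```

With a table like this, an unknown provider or an unlisted model is rejected up front instead of being passed through to the underlying library.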
# Llama Stack Integration Tests
You can run llama stack integration tests on either a Llama Stack Library or a Llama Stack endpoint.
To test on a Llama Stack library with a specific configuration, run

```bash
LLAMA_STACK_CONFIG=./llama_stack/templates/cerebras/run.yaml pytest -s -v tests/client-sdk/inference/
```

or just pass the template name

```bash
LLAMA_STACK_CONFIG=together pytest -s -v tests/client-sdk/inference/
```

To test on a Llama Stack endpoint, run

```bash
LLAMA_STACK_BASE_URL=http://localhost:8089 pytest -s -v tests/client-sdk/inference
```
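The two modes above are selected purely through environment variables. A minimal sketch of how a test fixture might make that choice is shown below. The function name `select_target` is hypothetical, and the precedence (config wins when both variables are set) is an assumption, not something this README states.

```python
import os
from typing import Mapping, Tuple


def select_target(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Decide whether tests run against a library or an endpoint.

    Returns ("library", <config path or template name>) or
    ("endpoint", <base URL>).
    """
    config = env.get("LLAMA_STACK_CONFIG")
    base_url = env.get("LLAMA_STACK_BASE_URL")
    if config:
        # Assumption: the library configuration takes precedence if both are set.
        return ("library", config)
    if base_url:
        return ("endpoint", base_url)
    raise ValueError("Set LLAMA_STACK_CONFIG or LLAMA_STACK_BASE_URL")
```

A fixture could then construct the matching client for the returned mode; that construction is left out because it depends on the client library's own API.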
## Report Generation

To generate a report, run with the `--report` option

```bash
LLAMA_STACK_CONFIG=together pytest -s -v report.md tests/client-sdk/ --report
```
## Common options

Depending on the API, there are custom options enabled:

- For tests in `inference/` and `agents/`, we support the `--inference-model` (to be used in text inference tests) and `--vision-inference-model` (only used in image inference tests) overrides.
- For tests in `vector_io/`, we support the `--embedding-model` override.
- For tests in `safety/`, we support the `--safety-shield` override.
- The report param can be `--report` or `--report <path>`. If a path is not provided, we make a best effort to infer one from the config / template name. For URL endpoints, the path is required.
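The `--report` / `--report <path>` behaviour can be sketched with the standard library's `argparse`, whose `nargs="?"` option style makes a flag's value optional. This is a sketch under stated assumptions, not the suite's actual implementation: `INFER`, `build_parser`, `resolve_report_path`, and the `<name>_report.md` naming scheme are all hypothetical.

```python
import argparse
from pathlib import Path
from typing import Optional

# Hypothetical sentinel meaning "--report was given without a path".
INFER = "__infer__"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # nargs="?": flag absent -> default, bare flag -> const, flag + value -> value.
    parser.add_argument("--report", nargs="?", const=INFER, default=None)
    return parser


def resolve_report_path(report: Optional[str], config: Optional[str]) -> Optional[Path]:
    """Return the report path, or None when reporting is disabled."""
    if report is None:
        return None
    if report != INFER:
        return Path(report)  # an explicit path always wins
    if not config:
        # No config or template to infer from, e.g. when testing a URL endpoint.
        raise ValueError("--report requires an explicit path for URL endpoints")
    cfg = Path(config)
    # For ".../templates/cerebras/run.yaml" use "cerebras"; for "together" use it as-is.
    name = cfg.parent.name if cfg.suffix in {".yaml", ".yml"} else cfg.name
    # Assumed naming scheme, chosen for illustration only.
    return Path(f"{name}_report.md")
```

One consequence of this parsing style is that a bare `--report` followed by another positional argument would take that argument as its path, which is one reason to put a bare `--report` last on the command line.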