forked from phoenix-oss/llama-stack-mirror

Ashwin Bharambe 63e6acd0c3 feat: add (openai, anthropic, gemini) providers via litellm (#1267)	2025-02-25 22:07:33 -08:00

# What does this PR do?

This PR introduces more non-llama model support to llama stack. Providers introduced: openai, anthropic and gemini. All of these providers use essentially the same piece of code -- the implementation works via the `litellm` library.

We will expose only specific models for providers we enable making sure they all work well and pass tests. This setup (instead of automatically enabling _all_ providers and models allowed by LiteLLM) ensures we can also perform any needed prompt tuning on a per-model basis as needed (just like we do it for llama models.)

## Test Plan

```bash
#!/bin/bash

args=("$@")
for model in openai/gpt-4o anthropic/claude-3-5-sonnet-latest gemini/gemini-1.5-flash; do
    LLAMA_STACK_CONFIG=dev pytest -s -v tests/client-sdk/inference/test_text_inference.py \
        --embedding-model=all-MiniLM-L6-v2 \
        --vision-inference-model="" \
        --inference-model=$model "${args[@]}"
done
```
agents	test fix for sometimes tools get called more than once	2025-02-24 13:16:40 -08:00
inference	feat: add (openai, anthropic, gemini) providers via litellm (#1267)	2025-02-25 22:07:33 -08:00
safety	Fix precommit check after moving to ruff (#927)	2025-02-02 06:46:45 -08:00
tool_runtime	build: format codebase imports using ruff linter (#1028)	2025-02-13 10:06:21 -08:00
vector_io	Fix test infra, sentence embeddings mixin	2025-02-21 15:11:46 -08:00
__init__.py	[tests] add client-sdk pytests & delete client.py (#638)	2024-12-16 12:04:56 -08:00
conftest.py	feat: add (openai, anthropic, gemini) providers via litellm (#1267)	2025-02-25 22:07:33 -08:00
metadata.py	Report generation minor fixes (#884)	2025-01-28 04:58:12 -08:00
README.md	Update README.md	2025-02-14 15:45:08 -08:00
report.py	script for running client sdk tests (#895)	2025-02-19 22:38:06 -08:00

README.md

Llama Stack Integration Tests

You can run the Llama Stack integration tests against either a Llama Stack library or a Llama Stack endpoint.

To test against a Llama Stack library with a specific configuration, pass the path to its run.yaml:

LLAMA_STACK_CONFIG=./llama_stack/templates/cerebras/run.yaml pytest -s -v tests/client-sdk/inference/

or pass just the template name:

LLAMA_STACK_CONFIG=together pytest -s -v tests/client-sdk/inference/

To test against a Llama Stack endpoint, run

LLAMA_STACK_BASE_URL=http://localhost:8089 pytest -s -v tests/client-sdk/inference
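
The two modes above differ only in which environment variable is set. As a rough sketch, a small wrapper could pick the variable from the shape of the target. The function name `stack_test_cmd` and the rule "anything starting with http:// or https:// is an endpoint" are assumptions made for illustration, not part of the test suite. The function only prints the command, so it can be checked before running anything.

```shell
#!/bin/bash
# Hypothetical helper (not part of llama-stack): print the pytest command
# for a given target.
#   $1: config path, template name, or endpoint URL
#   $2: test path (defaults to the inference tests)
stack_test_cmd() {
    local target="$1"
    local test_path="${2:-tests/client-sdk/inference/}"
    case "$target" in
        http://*|https://*)
            # Assumption: a URL means we are testing a running endpoint.
            echo "LLAMA_STACK_BASE_URL=$target pytest -s -v $test_path"
            ;;
        *)
            # Anything else is treated as a run.yaml path or a template name.
            echo "LLAMA_STACK_CONFIG=$target pytest -s -v $test_path"
            ;;
    esac
}

stack_test_cmd together
stack_test_cmd http://localhost:8089
```

To actually run the printed command, it could be passed to `eval`, though copying it into the shell by hand keeps the environment variable visible.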

Report Generation

To generate a report, run with the --report option:

LLAMA_STACK_CONFIG=together pytest -s -v report.md tests/client-sdk/ --report
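
Since --report accepts an optional path, and this README notes under Common options that a path is required for URL endpoints, a wrapper script might validate the argument before invoking pytest. The following is a minimal sketch under that reading. The helper name `report_arg` is hypothetical, and detecting an endpoint run by checking whether LLAMA_STACK_BASE_URL is set is an assumption.

```shell
#!/bin/bash
# Hypothetical helper (not part of llama-stack): print the --report argument.
#   $1: optional report path
# Assumption: an endpoint run is detected via LLAMA_STACK_BASE_URL being set.
report_arg() {
    local path="${1:-}"
    if [ -n "$path" ]; then
        # An explicit path always works.
        echo "--report $path"
    elif [ -n "${LLAMA_STACK_BASE_URL:-}" ]; then
        # No config or template name to infer a path from.
        echo "error: --report needs an explicit path when testing an endpoint" >&2
        return 1
    else
        # Library mode: let the test suite infer the path.
        echo "--report"
    fi
}

report_arg my_report.md
```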

Common options

Depending on the API under test, the following custom options are available:

For tests in inference/ and agents/, we support the --inference-model (used in text inference tests) and --vision-inference-model (only used in image inference tests) overrides.
For tests in vector_io/, we support the --embedding-model override.
For tests in safety/, we support the --safety-shield override.
The report option can be given as --report or --report <path>. If no path is provided, we make a best-effort guess based on the config or template name. For URL endpoints, the path is required.
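
The mapping from test directory to override flags can be summarized in a short lookup. This is only an illustrative sketch. The function name `override_flags` is hypothetical, and the mapping simply restates the list above rather than anything read from the test suite's conftest.py.

```shell
#!/bin/bash
# Hypothetical helper (not part of llama-stack): list the override flags
# that apply to a test directory, following the mapping described above.
override_flags() {
    case "$1" in
        inference|agents) echo "--inference-model --vision-inference-model" ;;
        vector_io)        echo "--embedding-model" ;;
        safety)           echo "--safety-shield" ;;
        *)                echo "no overrides known for: $1" >&2; return 1 ;;
    esac
}

# Print the flags for each test directory.
for dir in inference agents vector_io safety; do
    echo "$dir: $(override_flags "$dir")"
done
```

A concrete invocation would then look like LLAMA_STACK_CONFIG=together pytest -s -v tests/client-sdk/vector_io/ --embedding-model=all-MiniLM-L6-v2, where the model name is borrowed from the test plan in the latest commit and may not suit every template.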