llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-10-04 12:07:34 +00:00

Author	SHA1	Message	Date
Ashwin Bharambe	ef0736527d	feat(tools)!: substantial clean up of "Tool" related datatypes (#3627) This is a sweeping change to clean up some gunk around our "Tool" definitions. First, we had two types `Tool` and `ToolDef`. The first of these was a "Resource" type for the registry but we had stopped registering tools inside the Registry long back (and only registered ToolGroups.) The latter was for specifying tools for the Agents API. This PR removes the former and adds an optional `toolgroup_id` field to the latter. Secondly, as pointed out by @bbrowning in https://github.com/llamastack/llama-stack/pull/3003#issuecomment-3245270132, we were doing a lossy conversion from a full JSON schema from the MCP tool specification into our ToolDefinition to send it to the model. There is no necessity to do this -- we ourselves aren't doing any execution at all but merely passing it to the chat completions API which supports this. By doing this (and by doing it poorly), we encountered limitations like not supporting array items, or not resolving $refs, etc. To fix this, we replaced the `parameters` field by `{ input_schema, output_schema }` which can be full blown JSON schemas. Finally, there were some types in our llama-related chat format conversion which needed some cleanup. We are taking this opportunity to clean those up. This PR is a substantial breaking change to the API. However, given our window for introducing breaking changes, this suits us just fine. I will be landing a concurrent `llama-stack-client` change as well since API shapes are changing.	2025-10-02 15:12:03 -07:00
Ben Browning	b6e2934f7b	fix: Gracefully handle errors when listing MCP tools (#2544) # What does this PR do? When listing (and lazily indexing) tools, it's possible for an error to get thrown by individual toolgroups if for example an MCP toolgroup is unable to connect to its `mcp_endpoint`. 
This logs a warning in the server when that happens, logs a full stack trace of the error if debug logging is enabled, and just returns the list of tools from all working toolgroups instead of throwing an error to the client when a single toolgroup is temporarily or permanently misbehaving. The exception to the above is authentication errors, which we specifically send all the way back to the client as that's how we indicate to the client that it needs to provide authentication data for the remote MCP servers. Closes #2540 ## Test Plan A new unit test was added to test this exception handling, which is run as part of our regular test suite but also manually run to specifically verify this fix via: ``` uv run pytest -sv --asyncio-mode=auto \ tests/unit/distribution/routers/test_routing_tables.py ``` To verify the additional debug logging is printing properly: ``` LLAMA_STACK_LOGGING=core=debug \ uv run pytest -sv --asyncio-mode=auto \ tests/unit/distribution/routers/test_routing_tables.py ``` The mcp integration tests were run as below (and by CI): ``` ollama run llama3.2:3b ENABLE_OLLAMA="ollama" \ OLLAMA_INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \ LLAMA_STACK_CONFIG=starter \ uv run pytest -sv tests/integration/tool_runtime/test_mcp.py \ --text-model meta-llama/Llama-3.2-3B-Instruct ``` --------- Signed-off-by: Ben Browning <bbrownin@redhat.com> Signed-off-by: Sébastien Han <seb@redhat.com> Co-authored-by: Sébastien Han <seb@redhat.com>	2025-09-26 18:09:48 +02:00
Derek Higgins	0e43be36e1	fix: handle missing API keys gracefully in model refresh (#3493) - Catch Errors from providers without API keys during model refresh - Log as warning instead of exception to avoid a scary startup Closes: #3492 Error messages are now warnings instead of several tracebacks ``` INFO 2025-09-19 16:06:55,228 llama_stack.providers.utils.inference.inference_store:74 inference_store: Write queue disabled for SQLite to avoid concurrency issues WARNING 2025-09-19 16:06:59,362 llama_stack.providers.utils.inference.openai_mixin:327 providers::utils: Failed to list models for anthropic: API key is not set. Please provide a valid API key in the provider data header, e.g. x-llamastack-provider-data: {"anthropic_api_key": "<API_KEY>"}, or in the provider config. WARNING 2025-09-19 16:06:59,364 llama_stack.providers.utils.inference.openai_mixin:327 providers::utils: Failed to list models for gemini: API key is not set. Please provide a valid API key in the provider data header, e.g. x-llamastack-provider-data: {"gemini_api_key": "<API_KEY>"}, or in the provider config. WARNING 2025-09-19 16:06:59,367 llama_stack.providers.utils.inference.openai_mixin:327 providers::utils: Failed to list models for groq: API key is not set. Please provide a valid API key in the provider data header, e.g. x-llamastack-provider-data: {"groq_api_key": "<API_KEY>"}, or in the provider config. WARNING 2025-09-19 16:06:59,372 llama_stack.providers.utils.inference.openai_mixin:327 providers::utils: Failed to list models for sambanova: API key is not set. Please provide a valid API key in the provider data header, e.g. x-llamastack-provider-data: {"sambanova_api_key": "<API_KEY>"}, or in the provider config. INFO 2025-09-19 16:06:59,533 llama_stack.core.utils.config_resolution:45 core: Using file path: ``` Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-09-22 07:31:30 -04:00
IAN MILLER	ab321739f2	feat: create HTTP DELETE API endpoints to unregister ScoringFn and Benchmark resources in Llama Stack (#3371) # What does this PR do? This PR provides functionality for users to unregister ScoringFn and Benchmark resources for `scoring` and `eval` APIs. Closes #3051 ## Test Plan Updated integration and unit tests via CI workflow	2025-09-15 12:43:38 -07:00
Francisco Arceo	e2fe39aee1	feat!: Migrate Vector DB IDs to Vector Store IDs (breaking change) (#3253) # What does this PR do? This change migrates the VectorDB id generation to Vector Stores. This is a breaking change for _some users_ that may have application code using the `vector_db_id` parameter in the request of the VectorDB protocol instead of the `VectorDB.identifier` in the response. By default we will now create a Vector Store every time we register a VectorDB. 
The caveat with this approach is that this maps the `vector_db_id` → `vector_store.name`. This is a reasonable tradeoff to transition users towards OpenAI Vector Stores. As an added benefit, registering VectorDBs will result in them appearing in the VectorStores admin UI. ### Why? This PR makes the `POST` API call to `/v1/vector-dbs` swap the `vector_db_id` parameter in the request body into the VectorStore's name field and sets the `vector_db_id` to the generated vector store id (e.g., `vs_038247dd-4bbb-4dbb-a6be-d5ecfd46cfdb`). That means that users would have to do something like the following in their application code: ```python res = client.vector_dbs.register( vector_db_id='my-vector-db-id', embedding_model='ollama/all-minilm:l6-v2', embedding_dimension=384, ) vector_db_id = res.identifier ``` And then the rest of their code would behave as before, including `VectorIO`'s insert protocol using `vector_db_id` in the request. An alternative implementation would be to just delete the `vector_db_id` parameter in `VectorDB` but the end result would still require users to write `vector_db_id = res.identifier` since `VectorStores.create()` generates the ID for you. So this approach felt like the easiest way to migrate users towards VectorStores (subsequent PRs will be added to trigger `files.create()` and `vector_stores.files.create()`). ## Test Plan Unit tests and integration tests have been added. Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-09-05 15:40:34 +02:00
Mustafa Elbehery	c3b2b06974	refactor(logging): rename llama_stack logger categories (#3065) # What does this PR do? This PR renames categories of llama_stack loggers. This PR aligns logging categories as per the package name, as well as reviews from initial https://github.com/meta-llama/llama-stack/pull/2868. This is a follow up to #3061. Replaces https://github.com/meta-llama/llama-stack/pull/2868 Part of https://github.com/meta-llama/llama-stack/issues/2865 cc @leseb @rhuss Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>	2025-08-21 17:31:04 -07:00
Ashwin Bharambe	3d90117891	chore(tests): fix responses and vector_io tests (#3119) Some fixes to MCP tests. And a bunch of fixes for Vector providers. I also enabled a bunch of Vector IO tests to be used with `LlamaStackLibraryClient` ## Test Plan Run Responses tests with llama stack library client: ``` pytest -s -v tests/integration/non_ci/responses/ --stack-config=server:starter \ --text-model openai/gpt-4o \ --embedding-model=sentence-transformers/all-MiniLM-L6-v2 \ -k "client_with_models" ``` Do the same with `-k openai_client` The rest should be taken care of by CI.	2025-08-12 16:15:53 -07:00
Nathan Weinberg	19123ca957	refactor: standardize InferenceRouter model handling (#2965)	2025-08-12 04:20:39 -06:00
Matthew Farrellee	8faff92591	chore: remove redundant code in unregister_toolgroup (#3092) # What does this PR do? removes redundant code ## Test Plan ci	2025-08-11 07:38:54 -07:00
IAN MILLER	e12524af85	feat: create unregister shield API endpoint in Llama Stack (#2853) # What does this PR do? Extend the Shields Protocol and implement the capability to unregister previously registered shields and CLI for shields management. Closes #2581 ## Test Plan First off, test the API for shields 1. Install and start Ollama: `ollama serve` 2. Pull Llama Guard Model in Ollama: `ollama pull llama-guard3:8b` 3. Configure env variables: ``` export ENABLE_OLLAMA=ollama export OLLAMA_URL=http://localhost:11434 ``` 4. Build Llama Stack distro: `llama stack build --template starter --image-type venv ` 5. Start Llama Stack server: `llama stack run starter --port 8321` 6. Check if Ollama model is available: `curl -X GET http://localhost:8321/v1/models | jq '.data[] | select(.provider_id=="ollama")'` 7. 
Register a new Shield using Ollama provider: ``` curl -X POST http://localhost:8321/v1/shields \ -H "Content-Type: application/json" \ -d '{ "shield_id": "test-shield", "provider_id": "llama-guard", "provider_shield_id": "ollama/llama-guard3:8b", "params": {} }' ``` `{"identifier":"test-shield","provider_resource_id":"ollama/llama-guard3:8b","provider_id":"llama-guard","type":"shield","owner":{"principal":"","attributes":{}},"params":{}}% ` 8. Check if shield was registered: `curl -X GET http://localhost:8321/v1/shields/test-shield` `{"identifier":"test-shield","provider_resource_id":"ollama/llama-guard3:8b","provider_id":"llama-guard","type":"shield","owner":{"principal":"","attributes":{}},"params":{}}% ` 9. Run shield: ``` curl -X POST http://localhost:8321/v1/safety/run-shield \ -H "Content-Type: application/json" \ -d '{ "shield_id": "test-shield", "messages": [ { "role": "user", "content": "How can I hack into someone computer?" } ], "params": {} }' ``` `{"violation":{"violation_level":"error","user_message":"I can't answer that. Can I help with something else?","metadata":{"violation_type":"S2"}}}% ` 10. Unregister shield: `curl -X DELETE http://localhost:8321/v1/shields/test-shield` `null% ` 11. Verify shield was deleted: `curl -X GET http://localhost:8321/v1/shields/test-shield` `{"detail":"Invalid value: Shield 'test-shield' not found"}%` All tests passed ✅ ``` ========================================================================== 430 passed, 194 warnings in 19.54s ========================================================================== /Users/iamiller/GitHub/llama-stack/.venv/lib/python3.12/site-packages/litellm/llms/custom_httpx/async_client_cleanup.py:78: RuntimeWarning: coroutine 'close_litellm_async_clients' was never awaited loop.close() RuntimeWarning: Enable tracemalloc to get the object allocation traceback Wrote HTML report to htmlcov-3.12/index.html ```	2025-08-05 07:33:46 -07:00
Nathan Weinberg	05cfa213b6	chore: standardize tool group not found error (#2986) # What does this PR do? 1. Creates a new `ToolGroupNotFoundError` class 2. Implements the new class where appropriate Relates to #2379 Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-08-04 11:41:33 -07:00
Nathan Weinberg	ffb6306fbd	fix: remove redundant code from unregister_vector_db (#2983) get_vector_db() will raise an exception if a vector store won't be returned, so client handling is redundant Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-07-31 09:22:04 -07:00
Ashwin Bharambe	2665f00102	chore(rename): move llama_stack.distribution to llama_stack.core (#2975) We would like to rename the term `template` to `distribution`. To prepare for that, this is a precursor. cc @leseb	2025-07-30 23:30:53 -07:00

13 commits