llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-10-03 19:57:35 +00:00

Author	SHA1	Message	Date
Sébastien Han	79ced0c85b	Merge `2a34226727` into `ea15f2a270`	2025-10-01 15:47:54 +02:00
Matthew Farrellee	ea15f2a270	chore: use openai_chat_completion for llm as a judge scoring (#3635 ) # What does this PR do? update llm as a judge to use openai_chat_completion, instead of deprecated chat_completion ## Test Plan ci	2025-10-01 09:44:31 -04:00
Jaideep Rao	ca47d90926	fix: Ensure that tool calls with no arguments get handled correctly (#3560 ) # What does this PR do? When a model decides to use an MCP tool call that requires no arguments, it sets the `arguments` field to `None`. This causes the user to see a `400 bad request` error due to validation errors down the stack, because this field gets removed when being parsed by an OpenAI-compatible inference provider like vLLM. This PR ensures that, as soon as the tool call args are accumulated while streaming, we check that no tool call function arguments are set to `None`; if they are, we replace them with "{}". Closes #3456 ## Test Plan Added new unit test to verify that any tool calls with function arguments set to `None` get handled correctly --------- Signed-off-by: Jaideep Rao <jrao@redhat.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-10-01 08:36:57 -04:00
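The normalization described in #3560 above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name `normalize_tool_call_arguments` and the plain-dict shape of the tool calls are assumptions made for the example.

```python
import json
from typing import Any


def normalize_tool_call_arguments(tool_calls: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Replace missing (None) function arguments with an empty JSON object string.

    Hypothetical helper: the name and the dict layout are assumptions for
    illustration, not llama-stack's actual types.
    """
    for call in tool_calls:
        function = call.get("function") or {}
        if function.get("arguments") is None:
            # A tool that takes no arguments still needs a valid JSON payload.
            function["arguments"] = "{}"
            call["function"] = function
    return tool_calls


calls = [
    {"id": "call_1", "function": {"name": "namespaces_list", "arguments": None}},
    {"id": "call_2", "function": {"name": "pods_get", "arguments": '{"name": "web"}'}},
]
normalized = normalize_tool_call_arguments(calls)
print(json.dumps(normalized, indent=2))
```

Calls that already carry an argument string are left untouched; only the `None` case is rewritten.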
Ashwin Bharambe	42414a1a1b	fix(logging): disable console telemetry sink by default (#3623 ) The current span processing dumps so much junk on the console that it makes actual understanding of what is going on in the server impossible. I am killing the console sink as a default. If you want, you are always free to change your run.yaml to add it.
Before (console screenshot): https://github.com/user-attachments/assets/3a7ad261-e2ba-4d40-9820-fcc282c8df37 After (console screenshot): https://github.com/user-attachments/assets/bc7cf763-fba9-4e95-a4b5-f65f6d1c5332	2025-09-30 14:58:05 -07:00
ehhuang	ac7c35fbe6	fix: don't pass default response format in Responses (#3614 ) # What does this PR do? Fireworks doesn't allow response_format with tool use. The default response format is 'text' anyway, so we can safely omit it. ## Test Plan The script below failed without the change and runs after it.
```
#!/usr/bin/env python3
"""
Script to test Responses API with kubernetes-mcp-server.

This script:
1. Connects to the llama stack server
2. Uses the Responses API with MCP tools
3. Asks for the list of Kubernetes namespaces using the kubernetes-mcp-server
"""

import json

from openai import OpenAI

# Connect to the llama stack server
base_url = "http://localhost:8321/v1"
client = OpenAI(base_url=base_url, api_key="fake")

# Define the MCP tool pointing to the kubernetes-mcp-server
# The kubernetes-mcp-server is running on port 3000 with SSE endpoint at /sse
mcp_server_url = "http://localhost:3000/sse"
tools = [
    {
        "type": "mcp",
        "server_label": "k8s",
        "server_url": mcp_server_url,
    }
]

# Create a response request asking for k8s namespaces
print("Sending request to list Kubernetes namespaces...")
print(f"Using MCP server at: {mcp_server_url}")
print("Available tools will be listed automatically by the MCP server.")
print()

response = client.responses.create(
    # model="meta-llama/Llama-3.2-3B-Instruct",  # Using the vllm model
    model="fireworks/accounts/fireworks/models/llama4-scout-instruct-basic",
    # model="openai/gpt-4o",
    input="what are all the Kubernetes namespaces? Use tool call to `namespaces_list`. make sure to adhere to the tool calling format UNDER ALL CIRCUMSTANCES.",
    tools=tools,
    stream=False,
)

print("\n" + "=" * 80)
print("RESPONSE OUTPUT:")
print("=" * 80)

# Print the output
for i, output in enumerate(response.output):
    print(f"\n[Output {i + 1}] Type: {output.type}")
    if output.type == "mcp_list_tools":
        print(f" Server: {output.server_label}")
        print(f" Tools available: {[t.name for t in output.tools]}")
    elif output.type == "mcp_call":
        print(f" Tool called: {output.name}")
        print(f" Arguments: {output.arguments}")
        print(f" Result: {output.output}")
        if output.error:
            print(f" Error: {output.error}")
    elif output.type == "message":
        print(f" Role: {output.role}")
        print(f" Content: {output.content}")

print("\n" + "=" * 80)
print("FINAL RESPONSE TEXT:")
print("=" * 80)
print(response.output_text)
```
	2025-09-30 14:52:24 -07:00
grs	d350e3662b	feat: add support for require_approval argument when creating response (#3608 ) # What does this PR do? This PR adds support for the require_approval on an mcp tool definition passed to create response in the Responses API. This allows the caller to indicate whether they want to approve calls to that server, or let them be called without approval. Closes #3443 ## Test Plan Tested both approval and denial. Added automated integration test for both cases. --------- Signed-off-by: Gordon Sim <gsim@redhat.com> Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>	2025-09-30 14:18:34 -07:00
Alexey Rybak	0837fa7bef	docs: update safety notebook (#3617 ) # What does this PR do? * Updates the safety guide in Zero to Hero series to use Moderations API and the latest safety models * Fixes an image link Closes #2557 ## Test Plan * Manual testing	2025-09-30 14:11:12 -07:00
Alexey Rybak	c4c980b056	docs: frontpage update (#3620 ) # What does this PR do? * Adds canonical project information and links to client SDK / k8s operator / app examples repos to the front page * Fixes some button rendering errors Closes #3618 ## Test Plan Local rebuild of the documentation server	2025-09-30 14:11:00 -07:00
Ashwin Bharambe	606f4cf281	fix(expires_after): make sure multipart/form-data is properly parsed (#3612 ) https://github.com/llamastack/llama-stack/pull/3604 broke multipart form data field parsing for the Files API since it changed its shape -- so as to match the API exactly to the OpenAI spec even in the generated client code. The underlying reason is that multipart/form-data cannot transport structured nested fields. Each field must be str-serialized. The client (specifically the OpenAI client whose behavior we must match), transports sub-fields as `expires_after[anchor]` and `expires_after[seconds]`, etc. We must be able to handle these fields somehow on the server without compromising the shape of the YAML spec. This PR "fixes" this by adding a dependency to convert the data. The main trade-off here is that we must add this `Depends()` annotation on every provider implementation for Files. This is a headache, but a much more reasonable one (in my opinion) given the alternatives. ## Test Plan Tests as shown in https://github.com/llamastack/llama-stack/pull/3604#issuecomment-3351090653 pass.	2025-09-30 16:14:03 -04:00
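To illustrate the problem described in #3612 above: multipart/form-data carries only flat string fields, so a nested `expires_after` object arrives as bracketed keys. The sketch below shows one way to fold such keys back into a nested dict. It is not the dependency added by the PR; the helper name `nest_bracketed_fields` and the single-level nesting are assumptions for the example.

```python
import re

# Matches keys of the form "parent[child]", e.g. "expires_after[anchor]".
_BRACKETED = re.compile(r"^(?P<parent>[^\[\]]+)\[(?P<child>[^\[\]]+)\]$")


def nest_bracketed_fields(form: dict[str, str]) -> dict[str, object]:
    """Fold flat "parent[child]" form fields into one level of nesting.

    Hypothetical helper for illustration only. Values stay strings, because
    multipart/form-data transports every field str-serialized.
    """
    nested: dict[str, object] = {}
    for key, value in form.items():
        match = _BRACKETED.match(key)
        if match is None:
            nested[key] = value
            continue
        parent = nested.setdefault(match["parent"], {})
        if isinstance(parent, dict):
            parent[match["child"]] = value
    return nested


form_fields = {
    "purpose": "assistants",
    "expires_after[anchor]": "created_at",
    "expires_after[seconds]": "3600",
}
parsed = nest_bracketed_fields(form_fields)
print(parsed)
```

A server-side dependency along these lines would still need to validate and convert the string values (for example `seconds` to an integer) before building the typed request object.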
Ashwin Bharambe	73de235ef1	fix(eval): use client.alpha for eval tests	2025-09-30 13:02:33 -07:00
slekkala1	cc64093ae4	feat(api): Add Vector Store File batches api stub (#3615 ) # What does this PR do? Adding api stubs for vector store file batches apis https://github.com/llamastack/llama-stack/issues/3533 API Ref: https://platform.openai.com/docs/api-reference/vector-stores-file-batches ## Test Plan CI	2025-09-30 12:07:33 -07:00
Charlie Doern	1e25a72ece	feat(api): level /agents as `v1alpha` (#3610 ) # What does this PR do? agents is likely to be deprecated in favor of responses. Let's level it as alpha to indicate the lack of long-term support, and keep the v1 route for backwards compat. Closes #3611 Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-09-30 11:15:04 -07:00
Matthew Farrellee	2de4e6c900	feat: use /v1/chat/completions for safety model inference (#3591 ) # What does this PR do? migrate safety api implementation from /inference/chat-completion to /v1/chat/completions ## Test Plan ci w/ recordings --------- Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-09-30 11:01:44 -07:00
Matthew Farrellee	cb33f45c11	chore: unpublish /inference/chat-completion (#3609 ) # What does this PR do? BREAKING CHANGE: removes /inference/chat-completion route and updates relevant documentation ## Test Plan 🤷	2025-09-30 11:00:42 -07:00
Kai Wu	62e302613f	feat: add llamastack + CrewAI integration example notebook (#3275 ) # What does this PR do? Add llamastack + CrewAI integration example notebook ## Test Plan Tested in a local Jupyter notebook and it works.	2025-09-30 10:23:57 -07:00
ehhuang	6cce553c93	fix: mcp tool with array type should include items (#3602 ) # What does this PR do? Fixes error:
```
[ERROR] Error executing endpoint route='/v1/openai/v1/responses' method='post': Error code: 400 - {'error': {'message': "Invalid schema for function 'pods_exec': In context=('properties', 'command'), array schema missing items.", 'type': 'invalid_request_error', 'param': 'tools[7].function.parameters', 'code': 'invalid_function_parameters'}}
```
From script:
```
#!/usr/bin/env python3
"""
Script to test Responses API with kubernetes-mcp-server.

This script:
1. Connects to the llama stack server
2. Uses the Responses API with MCP tools
3. Asks for the list of Kubernetes namespaces using the kubernetes-mcp-server
"""

import json

from openai import OpenAI

# Connect to the llama stack server
base_url = "http://localhost:8321/v1/openai/v1"
client = OpenAI(base_url=base_url, api_key="fake")

# Define the MCP tool pointing to the kubernetes-mcp-server
# The kubernetes-mcp-server is running on port 3000 with SSE endpoint at /sse
mcp_server_url = "http://localhost:3000/sse"
tools = [
    {
        "type": "mcp",
        "server_label": "k8s",
        "server_url": mcp_server_url,
    }
]

# Create a response request asking for k8s namespaces
print("Sending request to list Kubernetes namespaces...")
print(f"Using MCP server at: {mcp_server_url}")
print("Available tools will be listed automatically by the MCP server.")
print()

response = client.responses.create(
    # model="meta-llama/Llama-3.2-3B-Instruct",  # Using the vllm model
    model="openai/gpt-4o",
    input="what are all the Kubernetes namespaces? Use tool call to `namespaces_list`. make sure to adhere to the tool calling format.",
    tools=tools,
    stream=False,
)

print("\n" + "=" * 80)
print("RESPONSE OUTPUT:")
print("=" * 80)

# Print the output
for i, output in enumerate(response.output):
    print(f"\n[Output {i + 1}] Type: {output.type}")
    if output.type == "mcp_list_tools":
        print(f" Server: {output.server_label}")
        print(f" Tools available: {[t.name for t in output.tools]}")
    elif output.type == "mcp_call":
        print(f" Tool called: {output.name}")
        print(f" Arguments: {output.arguments}")
        print(f" Result: {output.output}")
        if output.error:
            print(f" Error: {output.error}")
    elif output.type == "message":
        print(f" Role: {output.role}")
        print(f" Content: {output.content}")

print("\n" + "=" * 80)
print("FINAL RESPONSE TEXT:")
print("=" * 80)
print(response.output_text)
```
## Test Plan New unit tests; the script now runs successfully.	2025-09-29 23:11:41 -07:00
Ashwin Bharambe	56b625d18a	feat(openai_movement)!: Change URL structures to kill /openai/v1 (part 2) (#3605 )	2025-09-29 22:57:37 -07:00
Ashwin Bharambe	3a09f00cdb	feat(files): fix expires_after API shape (#3604 ) This was just quite incorrect. See source here: https://platform.openai.com/docs/api-reference/files/create	2025-09-29 21:29:15 -07:00
Ashwin Bharambe	5e7fed8bbb	feat(openai_movement): Change URL structures to kill /openai/v1 (part 1) (#3587 ) The `/v1/openai/v1` prefix is annoying and now unnecessary given our clearer focus on how to think about the API surface. Let's kill it for the 0.3.0 update. To make client-side changes feasible, we will do this in two parts. This part adds a new route (sans `/openai/v1`) so the existing client continues to work since the server supports both. The next PR will be client-side (Stainless) changes which I will be making shortly. The final PR will remove the `/openai/v1` routes. Note that all these changes will happen rapidly within this release cycle. The entire set _will be backwards incompatible_.	2025-09-29 16:14:35 -07:00
Michael Dawson	ddf3f1735a	fix: ensure usage is requested if telemetry is enabled (#3571 ) # What does this PR do? Refs: https://github.com/llamastack/llama-stack/issues/3420 When telemetry is enabled, the router unconditionally expects the usage attribute to be available and fails if it is not present. Usage is not currently being requested by litellm_openai_mixin.py for streaming requests when using the responses API, which means that providers like vertexai fail if telemetry is enabled and streaming is used. This is part of the required fix. The other part is in LiteLLM; I plan to submit a PR for that soon. ## Test Plan I applied this change along with the change for litellm in a llama stack deployment and validated that I could make streaming requests through the responses API to a gemini model and they would succeed instead of failing due to the missing usage attribute when telemetry is enabled. Signed-off-by: Michael Dawson <midawson@redhat.com>	2025-09-29 14:09:08 -07:00
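The idea in #3571 above can be sketched as follows, assuming an OpenAI-style chat completions request where usage on streamed responses is requested through `stream_options={"include_usage": True}`. The function name `build_stream_params` and the `telemetry_enabled` flag are hypothetical; this is not the code in litellm_openai_mixin.py.

```python
from typing import Any


def build_stream_params(params: dict[str, Any], telemetry_enabled: bool) -> dict[str, Any]:
    """Return a copy of the request params that asks for usage when needed.

    Hypothetical helper: usage is only forced for streaming requests while
    telemetry is on, since that is the case where the router expects it.
    """
    updated = dict(params)
    if telemetry_enabled and updated.get("stream"):
        options = dict(updated.get("stream_options") or {})
        options["include_usage"] = True
        updated["stream_options"] = options
    return updated


streaming = build_stream_params({"model": "example-model", "stream": True}, telemetry_enabled=True)
non_streaming = build_stream_params({"model": "example-model", "stream": False}, telemetry_enabled=True)
print(streaming)
print(non_streaming)
```

Copying the params keeps the caller's dict untouched, and existing `stream_options` entries are preserved rather than overwritten.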
slekkala1	455579a88e	fix: Remove deprecated user param in OpenAIResponseObject (#3596 ) # What does this PR do? Just removing the deprecated User param in `OpenAIResponseObject` Closing https://github.com/llamastack/llama-stack/issues/3482 ## Test Plan CI	2025-09-29 13:55:59 -07:00
Matthew Farrellee	e9eb004bf8	fix: remove inference.completion from docs (#3589 ) # What does this PR do? now that /v1/inference/completion has been removed, no docs should refer to it. this cleans up remaining references. ## Test Plan ci Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-09-29 13:14:41 -07:00
Alexey Rybak	498be131a1	docs: update image paths (#3599 ) # What does this PR do? * Updates image paths for images in docs/resources/ to proper static image locations ## Test Plan * `npm run build` builds documentation properly	2025-09-29 13:14:05 -07:00
Matthew Farrellee	7c888fc0da	feat: update eval runner to use openai endpoints (#3588 ) # What does this PR do? move the eval=inline::meta-reference implementation to use openai_completion/openai_chat_completion note: this breaks backward compatibility if eval setup used sampling params' repetition_penalty or strategy ## Test Plan ci w/ new recordings Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-09-29 13:13:53 -07:00
Matthew Farrellee	45f438c027	chore: skip safety tests when shield not available (#3592 ) # What does this PR do? we skip embedding tests when the embedding_model_id isn't provided. same for completion / chat tests when text_model_id isn't given. instead of failing safety tests when a shield_id isn't provided, we'll skip them too. ## Test Plan ci Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-09-29 13:11:37 -07:00
Charlie Doern	aac42ddcc2	feat(api): level inference/rerank and remove experimental (#3565 ) # What does this PR do? inference/rerank is the one route in the API intended to not be deprecated. Level it as v1alpha. Additionally, remove `experimental` and opt to instead use `v1alpha` which itself implies an experimental state based on the original proposal Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-09-29 12:42:09 -07:00
Matthew Farrellee	975ead1d6a	chore(api): remove deprecated embeddings impls (#3301 ) # What does this PR do? remove deprecated embeddings implementations	2025-09-29 14:45:09 -04:00
Kai Wu	aab22dc759	fix: adding mime type of application/json support (#3452 ) # What does this PR do? This PR fixes #3300 by adding support for the application/json MIME type in [agent_instance.py](`4a59961a6c/llama_stack/providers/inline/agents/meta_reference/agent_instance.py (L923)`) ## Test Plan All related pytest tests passed, see log:
```
./scripts/unit-tests.sh tests/unit/providers/agent/test_get_raw_document_text.py -vvv
/Users/kaiwu/work/kaiwu/llama-stack/.venv/bin/python3
Uninstalled 22 packages in 5.65s
Installed 47 packages in 1.24s
================= test session starts =================
platform darwin -- Python 3.12.9, pytest-8.4.2, pluggy-1.6.0 -- /Users/kaiwu/work/kaiwu/llama-stack/.venv/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.12.9', 'Platform': 'macOS-15.6.1-arm64-arm-64bit', 'Packages': {'pytest': '8.4.2', 'pluggy': '1.6.0'}, 'Plugins': {'anyio': '4.9.0', 'html': '4.1.1', 'socket': '0.7.0', 'asyncio': '1.1.0', 'json-report': '1.5.0', 'timeout': '2.4.0', 'metadata': '3.1.1', 'cov': '6.2.1', 'nbval': '0.11.0'}}
rootdir: /Users/kaiwu/work/kaiwu/llama-stack
configfile: pyproject.toml
plugins: anyio-4.9.0, html-4.1.1, socket-0.7.0, asyncio-1.1.0, json-report-1.5.0, timeout-2.4.0, metadata-3.1.1, cov-6.2.1, nbval-0.11.0
asyncio: mode=Mode.AUTO, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 14 items

tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_supports_text_mime_types PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_supports_yaml_mime_type PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_supports_deprecated_text_yaml_with_warning PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_deprecated_text_yaml_with_url PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_deprecated_text_yaml_with_text_content_item PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_supports_json_mime_type PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_with_json_url PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_with_json_text_content_item PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_rejects_unsupported_mime_types PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_with_url_content PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_with_yaml_url PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_with_text_content_item PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_with_yaml_text_content_item PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_rejects_unexpected_content_type PASSED

================ slowest 10 durations =================
0.00s call tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_deprecated_text_yaml_with_url
0.00s call tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_rejects_unsupported_mime_types
0.00s call tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_rejects_unexpected_content_type
0.00s setup tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_supports_text_mime_types
0.00s teardown tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_supports_text_mime_types
0.00s call tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_with_yaml_url
0.00s call tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_with_url_content
0.00s teardown tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_rejects_unsupported_mime_types
0.00s call tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_with_json_url
0.00s call tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_supports_text_mime_types
================= 14 passed in 0.14s ==================
Generating coverage report...
Wrote HTML report to htmlcov-3.12/index.html
```
	2025-09-29 11:27:31 -07:00
Ashwin Bharambe	fdb144f009	revert: feat(ci): use @next branch from llama-stack-client (#3593 ) Reverts llamastack/llama-stack#3576 When I edit Stainless and codegen succeeds, the `next` branch is updated directly. It provides us no chance to see if there might be something unideal going on. If something is wrong, all CI will start breaking immediately. This is not ideal. I will likely create another staging branch `next-release` or something to accommodate the special workflow that Stainless requires.	2025-09-29 10:41:04 -07:00
ehhuang	8ab6684a94	chore: introduce write queue for response_store (#3497 ) # What does this PR do? Mirroring the same changes that were used for inference_store: https://github.com/llamastack/llama-stack/pull/3383 Will follow up with a shared internal API for managing these write queues. ## Test Plan existing tests	2025-09-29 10:36:16 -07:00
Matthew Farrellee	7c466a7ec5	chore: skip nvidia datastore tests when nvidia datastore is not enabled (#3590 ) # What does this PR do? the nvidia datastore tests were running when the datastore was not configured. they would always fail. this introduces a skip when the nvidia datastore is not configured. ## Test Plan ci	2025-09-29 05:15:58 -04:00
dependabot[bot]	90bb9cfb0a	chore(github-deps): bump actions/cache from 4.2.4 to 4.3.0 (#3577 ) Bumps [actions/cache](https://github.com/actions/cache) from 4.2.4 to 4.3.0.

Release notes (sourced from actions/cache's releases, https://github.com/actions/cache/releases):

## v4.3.0
What's Changed
- Add note on runner versions by @GhadimiR in actions/cache#1642 (https://redirect.github.com/actions/cache/pull/1642)
- Prepare `v4.3.0` release by @Link- in actions/cache#1655 (https://redirect.github.com/actions/cache/pull/1655)

New Contributors
- @GhadimiR made their first contribution in actions/cache#1642

Full Changelog: https://github.com/actions/cache/compare/v4...v4.3.0

Changelog (sourced from actions/cache's changelog, https://github.com/actions/cache/blob/main/RELEASES.md):

4.3.0
- Bump `@actions/cache` to v4.1.0 (https://redirect.github.com/actions/toolkit/pull/2132)

4.2.4
- Bump `@actions/cache` to v4.0.5

4.2.3
- Bump `@actions/cache` to v4.0.3 (obfuscates SAS token in debug logs for cache entries)

4.2.2
- Bump `@actions/cache` to v4.0.2

4.2.1
- Bump `@actions/cache` to v4.0.1

4.2.0
TLDR; The cache backend service has been rewritten from the ground up for improved performance and reliability. actions/cache (https://github.com/actions/cache) now integrates with the new cache service (v2) APIs.
The new service will gradually roll out as of February 1st, 2025. The legacy service will also be sunset on the same date. Changes in this release are fully backward compatible.
We are deprecating some versions of this action. We recommend upgrading to version `v4` or `v3` as soon as possible before February 1st, 2025. (Upgrade instructions below).
If you are using pinned SHAs, please use the SHAs of versions `v4.2.0` or `v3.4.0`.
If you do not upgrade, all workflow runs using any of the deprecated actions/cache will fail.
Upgrading to the recommended versions will not break your workflows.

4.1.2
- Add GitHub Enterprise Cloud instances hostname filters to inform API endpoint choices - #1474 (https://redirect.github.com/actions/cache/pull/1474)
- Security fix: Bump braces from 3.0.2 to 3.0.3 - #1475 (https://redirect.github.com/actions/cache/pull/1475)

4.1.1
- Restore original behavior of `cache-hit` output - #1467 (https://redirect.github.com/actions/cache/pull/1467)

4.1.0
- Ensure `cache-hit` output is set when a cache is missed - #1404 (https://redirect.github.com/actions/cache/pull/1404)
- Deprecate `save-always` input - #1452 (https://redirect.github.com/actions/cache/pull/1452)

... (truncated)

Commits:
- `0057852` (`0057852bfa`) Merge pull request #1655 from actions/Link-/prepare-4.3.0
- `4f5ea67` (`4f5ea67f1c`) Update licensed cache
- `9fcad95` (`9fcad95d03`) Upgrade actions/cache to 4.1.0 and prepare 4.3.0 release
- `638ed79` (`638ed79f9d`) Merge pull request #1642 from actions/GhadimiR-patch-1
- `3862dcc` (`3862dccb17`) Add note on runner versions
- See full diff in compare view (`0400d5f644...0057852bfa`)

Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-09-29 10:04:58 +02:00
dependabot[bot]	9fdfd3a2ad	chore(ui-deps): bump tw-animate-css from 1.2.9 to 1.4.0 in /llama_stack/ui (#3583 ) Bumps [tw-animate-css](https://github.com/Wombosvideo/tw-animate-css) from 1.2.9 to 1.4.0. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/Wombosvideo/tw-animate-css/releases">tw-animate-css's releases</a>.</em></p> <blockquote> <h2>v1.4.0</h2> <h2>Changelog</h2> <p>902e37a019ffd165ba078e0b3c02634526c54bf0: fix: remove support for prefix, add new export for prefixed version. Closes <a href="https://redirect.github.com/Wombosvideo/tw-animate-css/issues/58">#58</a>. fab2a5bf817605be1976e159976718a83489fc1c: chore: bump version to 1.4.0 and update dependencies c20dc32e2b532a8e74546879b4ce7d9ce89ba710: fix(build): make transform.ts accept two arguments</p> <h2>⚠️ BREAKING CHANGE ⚠️</h2> <p>Support for Tailwind CSS's prefix option was moved to <code>tw-animate-css/prefix</code> because it was breaking the <code>--spacing</code> function. Users requiring prefixes should replace their import:</p> <pre lang="diff"><code>- import "tw-animate-css"; + import "tw-animate-css/prefix"; </code></pre> <p><em>I do not plan to introduce breaking changes like this to non-major releases in the future. But because more people use spacing rather than prefixes, reverting the previous version's (obviously breaking) change seems reasonable.</em></p> <h2>v1.3.8</h2> <h2>Changelog</h2> <ul> <li>b5ff23a: fix: add support for global CSS variable prefix. 
Closes <a href="https://redirect.github.com/Wombosvideo/tw-animate-css/issues/48">#48</a></li> <li>03e5f12: feat: add support for ng-primitives height variables <a href="https://redirect.github.com/Wombosvideo/tw-animate-css/issues/56">#56</a> (thanks <a href="https://github.com/immohammadjaved"><code>@immohammadjaved</code></a>)</li> <li>b076cfb: docs: fix various issues in accordion and collapsible docs</li> <li>9485e33: chore: bump version to 1.3.8 and update dependencies</li> </ul> <h2>⚠️ BREAKING CHANGE ⚠️</h2> <p>Adding support for prefixes broke custom spacing. It is recommended that you skip this version if you do not use Tailwind CSS's prefix option, and use v1.4.0 instead. If you are actually using prefixes, you can use a special version supporting prefixes:</p> <pre lang="diff"><code>- import "tw-animate-css"; /* Version with spacing support / + import "tw-animate-css/prefix"; / Version with prefix support */ </code></pre> <p><em>I do not plan to fix the incompatibility between the spacing and prefix versions due to time constraints. Feel free to investigate and open a pull request if you manage to fix it.</em></p> <h2>v1.3.7</h2> <h2>Changelog</h2> <ul> <li>80dbfcc: feat: add utilities for blur transitions <a href="https://redirect.github.com/Wombosvideo/tw-animate-css/issues/54">#54</a> (thanks <a href="https://github.com/coffeeispower"><code>@coffeeispower</code></a>)</li> <li>dc294f9: docs: add upcoming changes warning</li> <li>c640bb8: chore: update dependencies and package manager version</li> <li>9e63e34: chore: bump version to 1.3.7</li> </ul> <h2>v1.3.6</h2> <h2>Changelog</h2> <ul> <li>58f3396: fix: allow changing animation parameters for ready-to-use animations</li> <li>8313476: chore: update dependencies nd package manager version</li> <li>f81346c: chore: bump version to 1.3.6</li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... 
(truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="`c20dc32e2b`"><code>c20dc32</code></a> fix(build): make transform.ts accept two arguments</li> <li><a href="`fab2a5bf81`"><code>fab2a5b</code></a> chore: bump version to 1.4.0 and update dependencies</li> <li><a href="`902e37a019`"><code>902e37a</code></a> fix: remove support for prefix, add new export for prefixed version</li> <li><a href="`9485e33d99`"><code>9485e33</code></a> chore: bump version to 1.3.8 and update dependencies</li> <li><a href="`b076cfb04a`"><code>b076cfb</code></a> docs: fix various issues in accordion and collapsible docs</li> <li><a href="`03e5f12418`"><code>03e5f12</code></a> feat: add support for ng-primitives height variables (<a href="https://redirect.github.com/Wombosvideo/tw-animate-css/issues/56">#56</a>)</li> <li><a href="`b5ff23a0d5`"><code>b5ff23a</code></a> fix: add support for global CSS variable prefix. Closes <a href="https://redirect.github.com/Wombosvideo/tw-animate-css/issues/48">#48</a></li> <li><a href="`9e63e34286`"><code>9e63e34</code></a> chore: bump version to 1.3.7</li> <li><a href="`c640bb8933`"><code>c640bb8</code></a> chore: update dependencies and package manager version</li> <li><a href="`dc294f990a`"><code>dc294f9</code></a> docs: add upcoming changes warning</li> <li>Additional commits viewable in <a href="https://github.com/Wombosvideo/tw-animate-css/compare/v1.2.9...v1.4.0">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=tw-animate-css&package-manager=npm_and_yarn&previous-version=1.2.9&new-version=1.4.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-09-29 10:03:26 +02:00
dependabot[bot]	d95853d784	chore(ui-deps): bump shiki from 1.29.2 to 3.13.0 in /llama_stack/ui (#3585 ) Bumps [shiki](https://github.com/shikijs/shiki/tree/HEAD/packages/shiki) from 1.29.2 to 3.13.0. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/shikijs/shiki/releases">shiki's releases</a>.</em></p> <blockquote> <h2>v3.13.0</h2> <h3> 🚀 Features</h3> <ul> <li><strong>transformers</strong>: Render indent guides - by <a href="https://github.com/KazariEX"><code>@KazariEX</code></a> and <a href="https://github.com/antfu"><code>@antfu</code></a> in <a href="https://redirect.github.com/shikijs/shiki/issues/1060">shikijs/shiki#1060</a> <a href="`aecd1617`"><!-- raw HTML omitted -->(aecd1)<!-- raw HTML omitted --></a></li> </ul> <h5> <a href="https://github.com/shikijs/shiki/compare/v3.12.3...v3.13.0">View changes on GitHub</a></h5> <h2>v3.12.3</h2> <h3> 🐞 Bug Fixes</h3> <ul> <li><code>@shikijs/twoslash</code> version specifier - by <a href="https://github.com/9romise"><code>@9romise</code></a> in <a href="https://redirect.github.com/shikijs/shiki/issues/1078">shikijs/shiki#1078</a> <a href="`a1cdea41`"><!-- raw HTML omitted -->(a1cde)<!-- raw HTML omitted --></a></li> </ul> <h5> <a href="https://github.com/shikijs/shiki/compare/v3.12.2...v3.12.3">View changes on GitHub</a></h5> <h2>v3.12.2</h2> <h3> 🐞 Bug Fixes</h3> <ul> <li><strong>twoslash</strong>: Fix <code>onTwoslashError</code> return value handling - by <a href="https://github.com/Karibash"><code>@Karibash</code></a> in <a href="https://redirect.github.com/shikijs/shiki/issues/1070">shikijs/shiki#1070</a> <a href="`e86b0a7c`"><!-- raw HTML omitted -->(e86b0)<!-- raw HTML omitted --></a></li> </ul> <h5> <a href="https://github.com/shikijs/shiki/compare/v3.12.1...v3.12.2">View changes on GitHub</a></h5> <h2>v3.12.1</h2> <p><em>No significant changes</em></p> <h5> <a href="https://github.com/shikijs/shiki/compare/v3.12.0...v3.12.1">View changes on GitHub</a></h5> 
<h2>v3.12.0</h2> <h3> 🚀 Features</h3> <ul> <li><strong>vitepress-twoslash</strong>: <ul> <li>Improve UX for option customization - by <a href="https://github.com/9romise"><code>@9romise</code></a> in <a href="https://redirect.github.com/shikijs/shiki/issues/1066">shikijs/shiki#1066</a> <a href="`e3cfdeca`"><!-- raw HTML omitted -->(e3cfd)<!-- raw HTML omitted --></a></li> <li>Twoslash inline type cache for markdown - by <a href="https://github.com/serkodev"><code>@serkodev</code></a> and <a href="https://github.com/antfu"><code>@antfu</code></a> in <a href="https://redirect.github.com/shikijs/shiki/issues/1063">shikijs/shiki#1063</a> <a href="`dc7fbc70`"><!-- raw HTML omitted -->(dc7fb)<!-- raw HTML omitted --></a></li> </ul> </li> </ul> <h3> 🐞 Bug Fixes</h3> <ul> <li><strong>remove-notation-escape</strong>: Correct escape sequence - by <a href="https://github.com/sor4chi"><code>@sor4chi</code></a> in <a href="https://redirect.github.com/shikijs/shiki/issues/1065">shikijs/shiki#1065</a> <a href="`22d0c780`"><!-- raw HTML omitted -->(22d0c)<!-- raw HTML omitted --></a></li> </ul> <h5> <a href="https://github.com/shikijs/shiki/compare/v3.11.0...v3.12.0">View changes on GitHub</a></h5> <h2>v3.11.0</h2> <h3> 🚀 Features</h3> <ul> <li><strong>core</strong>: Add <code>enforce</code> options to <code>ShikiTransformer</code> - by <a href="https://github.com/serkodev"><code>@serkodev</code></a> and <a href="https://github.com/antfu"><code>@antfu</code></a> in <a href="https://redirect.github.com/shikijs/shiki/issues/1062">shikijs/shiki#1062</a> <a href="`8ad05bd8`"><!-- raw HTML omitted -->(8ad05)<!-- raw HTML omitted --></a></li> </ul> <h5> <a href="https://github.com/shikijs/shiki/compare/v3.10.0...v3.11.0">View changes on GitHub</a></h5> <h2>v3.10.0</h2> <h3> 🚀 Features</h3> <ul> <li>Add funding links to playground - by <a href="https://github.com/jtbandes"><code>@jtbandes</code></a> in <a href="https://redirect.github.com/shikijs/shiki/issues/1054">shikijs/shiki#1054</a> 
<a href="`e36eb4d8`"><!-- raw HTML omitted -->(e36eb)<!-- raw HTML omitted --></a></li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="`fd7326a82f`"><code>fd7326a</code></a> chore: release v3.13.0</li> <li><a href="`5cbb05219e`"><code>5cbb052</code></a> chore: release v3.12.3</li> <li><a href="`e462618190`"><code>e462618</code></a> chore: release v3.12.2</li> <li><a href="`793d71e68f`"><code>793d71e</code></a> chore: release v3.12.1</li> <li><a href="`9260f3fd10`"><code>9260f3f</code></a> chore: release v3.12.0</li> <li><a href="`d05f39b1e8`"><code>d05f39b</code></a> chore: release v3.11.0</li> <li><a href="`bda1a76743`"><code>bda1a76</code></a> chore: release v3.10.0</li> <li><a href="`09921f1cb8`"><code>09921f1</code></a> chore: release v3.9.2</li> <li><a href="`854eddf2ed`"><code>854eddf</code></a> chore: release v3.9.1</li> <li><a href="`950ede5ae5`"><code>950ede5</code></a> chore: release v3.9.0</li> <li>Additional commits viewable in <a href="https://github.com/shikijs/shiki/commits/v3.13.0/packages/shiki">compare view</a></li> </ul> </details> <details> <summary>Maintainer changes</summary> <p>This version was pushed to npm by [GitHub Actions](<a href="https://www.npmjs.com/~GitHub">https://www.npmjs.com/~GitHub</a> Actions), a new releaser for shiki since your current version.</p> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=shiki&package-manager=npm_and_yarn&previous-version=1.29.2&new-version=3.13.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-09-29 10:02:51 +02:00
Sébastien Han	2a34226727	revert: do not use MySecretStr We don't need this if we can set it to empty string. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-09-29 09:58:41 +02:00
Sébastien Han	bc64635835	feat: load config class when doing variable substitution When using bash-style environment variable substitution in a distribution template, we process the string and convert it to the type associated with the provider's config class. This allows us to return the proper type. This is crucial for API keys, since they are no longer plain strings but SecretStr. If the key is unset we will get an empty string which will result in a Pydantic error like: ``` ERROR 2025-09-25 21:40:44,565 __main__:527 core::server: Error creating app: 1 validation error for AnthropicConfig api_key Input should be a valid string For further information visit https://errors.pydantic.dev/2.11/v/string_type ``` Signed-off-by: Sébastien Han <seb@redhat.com>	2025-09-29 09:55:19 +02:00
Sébastien Han	4af141292f	chore: use empty SecretStr values as default Better than using SecretStr \| None so we centralize the null handling. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-09-29 09:55:00 +02:00
Sébastien Han	c4cb6aa8d9	fix: prevent telemetry from leaking sensitive info Prevent sensitive information from being logged in telemetry output by assigning the SecretStr type to sensitive fields. API keys and passwords from the KV store are now covered. All providers have been converted. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-09-29 09:54:41 +02:00
Ashwin Bharambe	8dc9fd6844	feat(ci): use @next branch from llama-stack-client (#3576 ) When we update Stainless (editor changes), the `next` branch gets updated. Eventually when one decides on a release, you land changes into `main`. This is the Stainless workflow. This PR makes sure we follow that workflow by pulling from the `next` branch for our integration tests.	2025-09-27 12:56:51 -07:00
Tami Takamiya	65f7b81e98	feat: Add items and title to ToolParameter/ToolParamDefinition (#3003 ) # What does this PR do? Add items and title to ToolParameter/ToolParamDefinition. Adding items will resolve the issue that occurs with Gemini LLM when an MCP tool has array-type properties. ## Test Plan Unit test cases will be added. 
--------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Kai Wu <kaiwu@meta.com> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-09-27 11:35:29 -07:00
Sébastien Han	1a8d3ed315	chore: MANIFEST maintenance (#3454 ) `b4789c59` chore: exclude ci-test distro from the package `86a85da8` chore: re-add files in the package commit `b4789c5941` Author: Sébastien Han <seb@redhat.com> Date: Tue Sep 16 14:34:06 2025 +0200 chore: exclude ci-test distro from the package This is a CI artifact, we shouldn't package it. Proof it works, when building ci-tests is not added: ``` adding 'llama_stack/core/utils/serialize.py' adding 'llama_stack/distributions/__init__.py' adding 'llama_stack/distributions/template.py' adding 'llama_stack/distributions/dell/__init__.py' adding 'llama_stack/distributions/dell/build.yaml' adding 'llama_stack/distributions/dell/dell.py' adding 'llama_stack/distributions/dell/run-with-safety.yaml' adding 'llama_stack/distributions/dell/run.yaml' adding 'llama_stack/distributions/meta-reference-gpu/__init__.py' adding 'llama_stack/distributions/meta-reference-gpu/build.yaml' adding 'llama_stack/distributions/meta-reference-gpu/meta_reference.py' adding 'llama_stack/distributions/meta-reference-gpu/run-with-safety.yaml' adding 'llama_stack/distributions/meta-reference-gpu/run.yaml' adding 'llama_stack/distributions/nvidia/__init__.py' adding 'llama_stack/distributions/nvidia/build.yaml' adding 'llama_stack/distributions/nvidia/nvidia.py' adding 'llama_stack/distributions/nvidia/run-with-safety.yaml' adding 'llama_stack/distributions/nvidia/run.yaml' adding 'llama_stack/distributions/open-benchmark/__init__.py' adding 'llama_stack/distributions/open-benchmark/build.yaml' adding 'llama_stack/distributions/open-benchmark/open_benchmark.py' adding 'llama_stack/distributions/open-benchmark/run.yaml' adding 'llama_stack/distributions/postgres-demo/__init__.py' adding 'llama_stack/distributions/postgres-demo/build.yaml' adding 'llama_stack/distributions/postgres-demo/postgres_demo.py' adding 'llama_stack/distributions/postgres-demo/run.yaml' adding 'llama_stack/distributions/starter/__init__.py' adding 
'llama_stack/distributions/starter/build.yaml' adding 'llama_stack/distributions/starter/run.yaml' adding 'llama_stack/distributions/starter/starter.py' adding 'llama_stack/distributions/starter-gpu/__init__.py' adding 'llama_stack/distributions/starter-gpu/build.yaml' adding 'llama_stack/distributions/starter-gpu/run.yaml' adding 'llama_stack/distributions/starter-gpu/starter_gpu.py' adding 'llama_stack/distributions/watsonx/__init__.py' adding 'llama_stack/distributions/watsonx/build.yaml' adding 'llama_stack/distributions/watsonx/run.yaml' adding 'llama_stack/distributions/watsonx/watsonx.py' adding 'llama_stack/models/__init__.py' adding 'llama_stack/models/llama/__init__.py' ``` Signed-off-by: Sébastien Han <seb@redhat.com> commit `86a85da877` Author: Sébastien Han <seb@redhat.com> Date: Tue Sep 16 14:45:37 2025 +0200 chore: re-add files in the package These files were not added anymore since the path changed. Signed-off-by: Sébastien Han <seb@redhat.com> --------- Signed-off-by: Sébastien Han <seb@redhat.com>	2025-09-27 11:28:11 -07:00
ehhuang	c392f3a0f4	chore: remove extra logging (#3574 ) # What does this PR do? This is already logged by the console processor as INFO ## Test Plan	2025-09-27 11:22:54 -07:00
Matthew Farrellee	0d94f3e2c0	chore: recordings for fireworks (inference + openai) (#3573 ) # What does this PR do? recorded for: ./scripts/integration-tests.sh --stack-config server:ci-tests --suite base --setup fireworks --subdirs inference --pattern openai ## Test Plan ./scripts/integration-tests.sh --stack-config server:ci-tests --suite base --setup fireworks --subdirs inference --pattern openai	2025-09-27 11:22:30 -07:00
Matthew Farrellee	53b15725b6	chore(apis): unpublish deprecated /v1/inference apis (#3297 ) # What does this PR do? unpublish (make unavailable to users) the following apis - - `/v1/inference/completion`, replaced by `/v1/openai/v1/completions` - `/v1/inference/chat-completion`, replaced by `/v1/openai/v1/chat/completions` - `/v1/inference/embeddings`, replaced by `/v1/openai/v1/embeddings` - `/v1/inference/batch-completion`, replaced by `/v1/openai/v1/batches` - `/v1/inference/batch-chat-completion`, replaced by `/v1/openai/v1/batches` note: the implementations are still available for internal use, e.g. agents uses chat-completion.	2025-09-27 11:20:06 -07:00
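The replacements listed in the commit message above can be written down as a lookup table. This is only a minimal sketch for readers migrating client code: the paths come from the commit message, while the names `DEPRECATED_TO_REPLACEMENT` and `replacement_for` are hypothetical and are not part of llama-stack.

```python
from typing import Optional

# Paths copied from the commit message; the table and helper names are
# hypothetical and exist only for illustration.
DEPRECATED_TO_REPLACEMENT = {
    "/v1/inference/completion": "/v1/openai/v1/completions",
    "/v1/inference/chat-completion": "/v1/openai/v1/chat/completions",
    "/v1/inference/embeddings": "/v1/openai/v1/embeddings",
    "/v1/inference/batch-completion": "/v1/openai/v1/batches",
    "/v1/inference/batch-chat-completion": "/v1/openai/v1/batches",
}


def replacement_for(path: str) -> Optional[str]:
    """Return the replacement path for an unpublished API, or None if the
    path is not one of the deprecated /v1/inference endpoints."""
    return DEPRECATED_TO_REPLACEMENT.get(path)
```

Both batch endpoints map to the same replacement, so the table is intentionally not invertible.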
Matthew Farrellee	60484c5c4e	chore(api): remove batch inference (#3261 ) # What does this PR do? APIs removed: - POST /v1/batch-inference/completion - POST /v1/batch-inference/chat-completion - POST /v1/inference/batch-completion - POST /v1/inference/batch-chat-completion note - - batch-completion & batch-chat-completion were only implemented for inference=inline::meta-reference - batch-inference were not implemented	2025-09-26 14:35:34 -07:00
Matthew Farrellee	b48d5cfed7	feat(internal): add image_url download feature to OpenAIMixin (#3516 ) # What does this PR do? simplify Ollama inference adapter by - - moving image_url download code to OpenAIMixin - being a ModelRegistryHelper instead of having one (mypy blocks check_model_availability method assignment) ## Test Plan - add unit tests for new download feature - add integration tests for openai_chat_completion w/ image_url (close test gap)	2025-09-26 17:32:16 -04:00
github-actions[bot]	4487b88ffe	build: Bump version to 0.2.23	2025-09-26 21:11:51 +00:00
Matthew Farrellee	7a25be633c	fix: Revert "fix: Added a bug fix when registering new models" (#3473 ) The commit to be reverted is a public API behavior change that we should not support. Instead of allowing silent updates (the caller cannot see the log messages), we should send an error to the caller stating that they must first unregister the model before reusing the same name w/ a different backend.	2025-09-26 16:19:21 -04:00
Matthew Farrellee	da5ea107fc	fix: ensure ModelRegistryHelper init for together and fireworks (#3572 ) # What does this PR do? address - ``` ERROR 2025-09-26 10:44:29,450 main:527 core::server: Error creating app: 'FireworksInferenceAdapter' object has no attribute 'alias_to_provider_id_map' ``` ## Test Plan manual startup w/ valid together & fireworks api keys	2025-09-26 16:18:32 -04:00
Ben Browning	b6e2934f7b	fix: Gracefully handle errors when listing MCP tools (#2544 ) # What does this PR do? When listing (and lazily indexing) tools, it's possible for an error to get thrown by individual toolgroups if for example an MCP toolgroup is unable to connect to its `mcp_endpoint`. 
This logs a warning in the server when that happens, logs a full stack trace of the error if debug logging is enabled, and just returns the list of tools from all working toolgroups instead of throwing an error to the client when a single toolgroup is temporarily or permanently misbehaving. The exception to the above is authentication errors, which we specifically send all the way back to the client as that's how we indicate to the client that it needs to provide authentication data for the remote MCP servers. Closes #2540 ## Test Plan A new unit test was added to test this exception handling, which is run as part of our regular test suite but also manually run to specifically verify this fix via: ``` uv run pytest -sv --asyncio-mode=auto \ tests/unit/distribution/routers/test_routing_tables.py ``` To verify the additional debug logging is printing properly: ``` LLAMA_STACK_LOGGING=core=debug \ uv run pytest -sv --asyncio-mode=auto \ tests/unit/distribution/routers/test_routing_tables.py ``` The mcp integration tests were run as below (and by CI): ``` ollama run llama3.2:3b ENABLE_OLLAMA="ollama" \ OLLAMA_INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \ LLAMA_STACK_CONFIG=starter \ uv run pytest -sv tests/integration/tool_runtime/test_mcp.py \ --text-model meta-llama/Llama-3.2-3B-Instruct ``` --------- Signed-off-by: Ben Browning <bbrownin@redhat.com> Signed-off-by: Sébastien Han <seb@redhat.com> Co-authored-by: Sébastien Han <seb@redhat.com>	2025-09-26 18:09:48 +02:00
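The behavior described in the commit message above (warn and skip a failing toolgroup, but let authentication errors reach the client) can be sketched roughly as follows. This is a self-contained illustration under stated assumptions, not the code from the PR: `AuthenticationRequiredError`, `list_all_tools`, and the shape of the `toolgroups` mapping are all hypothetical stand-ins.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Dict, List

logger = logging.getLogger("toolgroups")


class AuthenticationRequiredError(Exception):
    """Hypothetical stand-in for the error that must reach the client."""


async def list_all_tools(
    toolgroups: Dict[str, Callable[[], Awaitable[List[str]]]],
) -> List[str]:
    """Collect tools from every toolgroup, skipping groups that fail."""
    tools: List[str] = []
    for name, list_tools in toolgroups.items():
        try:
            tools.extend(await list_tools())
        except AuthenticationRequiredError:
            # Propagate so the client knows it must supply credentials.
            raise
        except Exception as exc:
            # One misbehaving toolgroup should not break the whole listing.
            logger.warning("Skipping toolgroup %s: %s", name, exc)
            # The full stack trace is emitted only when debug logging is on.
            logger.debug("Traceback for toolgroup %s", name, exc_info=True)
    return tools
```

Catching the authentication error first and re-raising it keeps the broad `except Exception` handler from swallowing it.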

2769 commits