llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-06 02:30:58 +00:00

Author	SHA1	Message	Date
Eric Huang	eacb7226a6	merge commit for archive created by Sapling	2025-09-30 14:27:57 -07:00
Eric Huang	4f7c177c62	fix: don't pass default response format in Responses # What does this PR do? ## Test Plan	2025-09-30 14:27:09 -07:00
grs	d350e3662b	feat: add support for require_approval argument when creating response (#3608 ) # What does this PR do? This PR adds support for the require_approval on an mcp tool definition passed to create response in the Responses API. This allows the caller to indicate whether they want to approve calls to that server, or let them be called without approval. Closes #3443 ## Test Plan Tested both approval and denial. Added automated integration test for both cases. --------- Signed-off-by: Gordon Sim <gsim@redhat.com> Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>	2025-09-30 14:18:34 -07:00
Alexey Rybak	0837fa7bef	docs: update safety notebook (#3617 ) # What does this PR do? * Updates the safety guide in Zero to Hero series to use Moderations API and the latest safety models * Fixes an image link Closes #2557 ## Test Plan * Manual testing	2025-09-30 14:11:12 -07:00
Alexey Rybak	c4c980b056	docs: frontpage update (#3620 ) # What does this PR do? * Adds canonical project information and links to client SDK / k8s operator / app examples repos to the front page * Fixes some button rendering errors Closes #3618 ## Test Plan Local rebuild of the documentation server	2025-09-30 14:11:00 -07:00
Ashwin Bharambe	606f4cf281	fix(expires_after): make sure multipart/form-data is properly parsed (#3612 ) https://github.com/llamastack/llama-stack/pull/3604 broke multipart form data field parsing for the Files API since it changed its shape -- so as to match the API exactly to the OpenAI spec even in the generated client code. The underlying reason is that multipart/form-data cannot transport structured nested fields. Each field must be str-serialized. The client (specifically the OpenAI client whose behavior we must match), transports sub-fields as `expires_after[anchor]` and `expires_after[seconds]`, etc. We must be able to handle these fields somehow on the server without compromising the shape of the YAML spec. This PR "fixes" this by adding a dependency to convert the data. The main trade-off here is that we must add this `Depends()` annotation on every provider implementation for Files. This is a headache, but a much more reasonable one (in my opinion) given the alternatives. ## Test Plan Tests as shown in https://github.com/llamastack/llama-stack/pull/3604#issuecomment-3351090653 pass.	2025-09-30 16:14:03 -04:00
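Note on the commit above: the bracketed field names it describes (`expires_after[anchor]`, `expires_after[seconds]`) can be folded back into a nested structure on the server. The following is only a minimal sketch of that idea, not the dependency the PR actually adds. The helper name `nest_bracketed_fields`, the `purpose` field, and the sample values are assumptions made for illustration.

```python
import re

# Hypothetical helper (not the llama-stack implementation): folds flat
# multipart field names such as "expires_after[anchor]" into nested dicts.
_BRACKET_FIELD = re.compile(r"^(?P<parent>[^\[\]]+)\[(?P<child>[^\[\]]+)\]$")


def nest_bracketed_fields(form: dict[str, str]) -> dict[str, object]:
    """Return a copy of `form` with one level of `parent[child]` keys nested."""
    nested: dict[str, object] = {}
    for name, value in form.items():
        match = _BRACKET_FIELD.match(name)
        if match is None:
            # Plain field: copy through unchanged.
            nested[name] = value
            continue
        parent = nested.setdefault(match["parent"], {})
        if not isinstance(parent, dict):
            raise ValueError(f"field {match['parent']!r} is both plain and nested")
        # Values stay strings, because multipart/form-data only carries text.
        parent[match["child"]] = value
    return nested


# Sample values below are illustrative assumptions, not taken from the PR.
flat_form = {
    "purpose": "assistants",
    "expires_after[anchor]": "created_at",
    "expires_after[seconds]": "3600",
}
print(nest_bracketed_fields(flat_form))
```

The nested values remain strings, so any conversion (for example, `seconds` to an integer) would still have to happen in a later validation step.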
Ashwin Bharambe	73de235ef1	fix(eval): use client.alpha for eval tests	2025-09-30 13:02:33 -07:00
slekkala1	cc64093ae4	feat(api): Add Vector Store File batches api stub (#3615) # What does this PR do? Adding api stubs for vector store file batches apis https://github.com/llamastack/llama-stack/issues/3533 API Ref: https://platform.openai.com/docs/api-reference/vector-stores-file-batches ## Test Plan CI	2025-09-30 12:07:33 -07:00
ehhuang	4c04d9250b	Merge `f034004ae6` into sapling-pr-archive-ehhuang	2025-09-30 12:04:57 -07:00
Eric Huang	f034004ae6	fix: don't pass default response format in Responses # What does this PR do? ## Test Plan	2025-09-30 12:04:51 -07:00
ehhuang	d1021dc1c3	Merge `f387e4023f` into sapling-pr-archive-ehhuang	2025-09-30 11:34:06 -07:00
Eric Huang	f387e4023f	fix: don't pass default response format in Responses # What does this PR do? ## Test Plan	2025-09-30 11:33:58 -07:00
Eric Huang	5c0ffea6c7	merge commit for archive created by Sapling	2025-09-30 11:31:33 -07:00
Eric Huang	28cc185cbb	fix: don't pass default response format in Responses # What does this PR do? ## Test Plan	2025-09-30 11:31:26 -07:00
ehhuang	48818eb754	Merge `a03f0cabfd` into sapling-pr-archive-ehhuang	2025-09-30 11:28:37 -07:00
Eric Huang	a03f0cabfd	fix: don't pass default response format in Responses # What does this PR do? ## Test Plan	2025-09-30 11:28:32 -07:00
ehhuang	8254c96135	Merge `0cc072dcaf` into sapling-pr-archive-ehhuang	2025-09-30 11:24:44 -07:00
Eric Huang	0cc072dcaf	fix: don't pass default response format in Responses # What does this PR do? ## Test Plan	2025-09-30 11:24:28 -07:00
Charlie Doern	1e25a72ece	feat(api): level /agents as `v1alpha` (#3610) # What does this PR do? agents is likely to be deprecated in favor of responses. Let's level it as alpha to indicate the lack of long-term support. Keep the v1 route for backwards compatibility. Closes #3611 Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-09-30 11:15:04 -07:00
Matthew Farrellee	2de4e6c900	feat: use /v1/chat/completions for safety model inference (#3591 ) # What does this PR do? migrate safety api implementation from /inference/chat-completion to /v1/chat/completions ## Test Plan ci w/ recordings --------- Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-09-30 11:01:44 -07:00
Matthew Farrellee	cb33f45c11	chore: unpublish /inference/chat-completion (#3609 ) # What does this PR do? BREAKING CHANGE: removes /inference/chat-completion route and updates relevant documentation ## Test Plan 🤷	2025-09-30 11:00:42 -07:00
Kai Wu	62e302613f	feat: add llamastack + CrewAI integration example notebook (#3275) # What does this PR do? Add llamastack + CrewAI integration example notebook ## Test Plan Tested in a local Jupyter notebook and it works.	2025-09-30 10:23:57 -07:00
ehhuang	6cce553c93	fix: mcp tool with array type should include items (#3602) # What does this PR do? Fixes error:
```
[ERROR] Error executing endpoint route='/v1/openai/v1/responses' method='post': Error code: 400 - {'error': {'message': "Invalid schema for function 'pods_exec': In context=('properties', 'command'), array schema missing items.", 'type': 'invalid_request_error', 'param': 'tools[7].function.parameters', 'code': 'invalid_function_parameters'}}
```
From script:
```
#!/usr/bin/env python3
"""
Script to test Responses API with kubernetes-mcp-server.

This script:
1. Connects to the llama stack server
2. Uses the Responses API with MCP tools
3. Asks for the list of Kubernetes namespaces using the kubernetes-mcp-server
"""

import json

from openai import OpenAI

# Connect to the llama stack server
base_url = "http://localhost:8321/v1/openai/v1"
client = OpenAI(base_url=base_url, api_key="fake")

# Define the MCP tool pointing to the kubernetes-mcp-server
# The kubernetes-mcp-server is running on port 3000 with SSE endpoint at /sse
mcp_server_url = "http://localhost:3000/sse"
tools = [
    {
        "type": "mcp",
        "server_label": "k8s",
        "server_url": mcp_server_url,
    }
]

# Create a response request asking for k8s namespaces
print("Sending request to list Kubernetes namespaces...")
print(f"Using MCP server at: {mcp_server_url}")
print("Available tools will be listed automatically by the MCP server.")
print()

response = client.responses.create(
    # model="meta-llama/Llama-3.2-3B-Instruct",  # Using the vllm model
    model="openai/gpt-4o",
    input="what are all the Kubernetes namespaces? Use tool call to `namespaces_list`. make sure to adhere to the tool calling format.",
    tools=tools,
    stream=False,
)

print("\n" + "=" * 80)
print("RESPONSE OUTPUT:")
print("=" * 80)

# Print the output
for i, output in enumerate(response.output):
    print(f"\n[Output {i + 1}] Type: {output.type}")
    if output.type == "mcp_list_tools":
        print(f" Server: {output.server_label}")
        print(f" Tools available: {[t.name for t in output.tools]}")
    elif output.type == "mcp_call":
        print(f" Tool called: {output.name}")
        print(f" Arguments: {output.arguments}")
        print(f" Result: {output.output}")
        if output.error:
            print(f" Error: {output.error}")
    elif output.type == "message":
        print(f" Role: {output.role}")
        print(f" Content: {output.content}")

print("\n" + "=" * 80)
print("FINAL RESPONSE TEXT:")
print("=" * 80)
print(response.output_text)
```
## Test Plan new unit tests; script now runs successfully	2025-09-29 23:11:41 -07:00
Eric Huang	754e58fbcb	merge commit for archive created by Sapling	2025-09-29 22:58:38 -07:00
Eric Huang	19ca0d0d9c	fix: mcp tool with array type should include items # What does this PR do? ## Test Plan	2025-09-29 22:58:31 -07:00
Ashwin Bharambe	56b625d18a	feat(openai_movement)!: Change URL structures to kill /openai/v1 (part 2) (#3605 )	2025-09-29 22:57:37 -07:00
ehhuang	e753dc56dd	Merge `b2694a3620` into sapling-pr-archive-ehhuang	2025-09-29 22:43:27 -07:00
Eric Huang	b2694a3620	fix: mcp tool with array type should include items # What does this PR do? ## Test Plan	2025-09-29 22:43:21 -07:00
ehhuang	fd9cc55732	Merge `d874597908` into sapling-pr-archive-ehhuang	2025-09-29 22:39:15 -07:00
Eric Huang	d874597908	fix: mcp tool with array type should include items # What does this PR do? ## Test Plan	2025-09-29 22:39:07 -07:00
Eric Huang	c6d005be3c	merge commit for archive created by Sapling	2025-09-29 22:13:17 -07:00
Eric Huang	be97c9f9df	fix: mcp tool with array type should include items # What does this PR do? ## Test Plan	2025-09-29 22:13:11 -07:00
Eric Huang	61837bc683	merge commit for archive created by Sapling	2025-09-29 22:11:39 -07:00
Eric Huang	fad9f6c4c9	fix: mcp tool with array type should include items # What does this PR do? ## Test Plan	2025-09-29 22:11:21 -07:00
Ashwin Bharambe	3a09f00cdb	feat(files): fix expires_after API shape (#3604 ) This was just quite incorrect. See source here: https://platform.openai.com/docs/api-reference/files/create	2025-09-29 21:29:15 -07:00
Ashwin Bharambe	5e7fed8bbb	feat(openai_movement): Change URL structures to kill /openai/v1 (part 1) (#3587) The `/v1/openai/v1` prefix is annoying and now unnecessary given our clearer focus on how to think about the API surface. Let's kill it for the 0.3.0 update. To make client-side changes feasible, we will do this in two parts. This part adds a new route (sans `/openai/v1`) so the existing client continues to work since the server supports both. The next PR will be client-side (Stainless) changes which I will be making shortly. The final PR will remove the `/openai/v1` routes. Note that all these changes will happen rapidly within this release cycle. The entire set _will be backwards incompatible_.	2025-09-29 16:14:35 -07:00
Eric Huang	92a3de581f	merge commit for archive created by Sapling	2025-09-29 15:57:01 -07:00
Eric Huang	1b308fd872	fix: mcp tool with array type should include items # What does this PR do? ## Test Plan	2025-09-29 15:56:54 -07:00
Eric Huang	437a8a4e7c	merge commit for archive created by Sapling	2025-09-29 15:53:21 -07:00
Eric Huang	cd1f6410ce	fix: mcp tool with array type should include items # What does this PR do? ## Test Plan	2025-09-29 15:53:14 -07:00
ehhuang	91898e6598	Merge `b1cbfe99f9` into sapling-pr-archive-ehhuang	2025-09-29 15:52:57 -07:00
Eric Huang	b1cbfe99f9	fix: mcp tool with array type should include items # What does this PR do? ## Test Plan	2025-09-29 15:52:45 -07:00
Michael Dawson	ddf3f1735a	fix: ensure usage is requested if telemetry is enabled (#3571) # What does this PR do? Refs: https://github.com/llamastack/llama-stack/issues/3420 When telemetry is enabled, the router unconditionally expects the usage attribute to be available and fails if it is not present. Usage is not currently being requested by litellm_openai_mixin.py for streaming requests when using the Responses API, which means that providers like vertexai fail if telemetry is enabled and streaming is used. This is part of the required fix. The other part is in liteLLM; will plan to submit a PR for that soon. ## Test Plan I applied this change along with the change for litellm in a llama stack deployment and validated that I could make streaming requests through the Responses API to a gemini model and they would succeed instead of failing due to the missing usage attribute when telemetry is enabled. Signed-off-by: Michael Dawson <midawson@redhat.com>	2025-09-29 14:09:08 -07:00
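Note on the commit above: in OpenAI-compatible chat completion requests, usage for a streamed response is requested with `stream_options={"include_usage": True}`. The sketch below shows how request parameters might be adjusted when telemetry is on. The function name `with_usage_if_needed` and the `telemetry_enabled` flag are hypothetical and are not taken from `litellm_openai_mixin.py`.

```python
def with_usage_if_needed(params: dict, telemetry_enabled: bool) -> dict:
    """Return request params that ask for usage on streaming calls.

    Hypothetical helper: only `stream` and `stream_options.include_usage`
    are real OpenAI-compatible request fields; everything else is assumed.
    """
    updated = dict(params)
    if telemetry_enabled and updated.get("stream"):
        # Preserve any stream options the caller already set.
        stream_options = dict(updated.get("stream_options") or {})
        stream_options["include_usage"] = True
        updated["stream_options"] = stream_options
    return updated


request = {"model": "example-model", "stream": True}  # model name is a placeholder
print(with_usage_if_needed(request, telemetry_enabled=True))
```

Non-streaming requests are passed through unchanged, since usage is normally returned with the full response in that case.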
slekkala1	455579a88e	fix: Remove deprecated user param in OpenAIResponseObject (#3596 ) # What does this PR do? Just removing the deprecated User param in `OpenAIResponseObject` Closing https://github.com/llamastack/llama-stack/issues/3482 ## Test Plan CI	2025-09-29 13:55:59 -07:00
Matthew Farrellee	e9eb004bf8	fix: remove inference.completion from docs (#3589 ) # What does this PR do? now that /v1/inference/completion has been removed, no docs should refer to it this cleans up remaining references ## Test Plan ci Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-09-29 13:14:41 -07:00
Alexey Rybak	498be131a1	docs: update image paths (#3599 ) # What does this PR do? * Updates image paths for images in docs/resources/ to proper static image locations ## Test Plan * `npm run build` builds documentation properly	2025-09-29 13:14:05 -07:00
Matthew Farrellee	7c888fc0da	feat: update eval runner to use openai endpoints (#3588 ) # What does this PR do? move the eval=inline::meta-reference implementation to use openai_completion/openai_chat_completion note: this breaks backward compatibility if eval setup used sampling params' repetition_penalty or strategy ## Test Plan ci w/ new recordings Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-09-29 13:13:53 -07:00
Matthew Farrellee	45f438c027	chore: skip safety tests when shield not available (#3592 ) # What does this PR do? we skip embedding tests when the embedding_model_id isn't provided. same for completion / chat tests when text_model_id isn't given. instead of failing safety tests when a shield_id isn't provided, we'll skip them too. ## Test Plan ci Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-09-29 13:11:37 -07:00
Charlie Doern	aac42ddcc2	feat(api): level inference/rerank and remove experimental (#3565 ) # What does this PR do? inference/rerank is the one route in the API intended to not be deprecated. Level it as v1alpha. Additionally, remove `experimental` and opt to instead use `v1alpha` which itself implies an experimental state based on the original proposal Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-09-29 12:42:09 -07:00
Matthew Farrellee	975ead1d6a	chore(api): remove deprecated embeddings impls (#3301) # What does this PR do? remove deprecated embeddings implementations	2025-09-29 14:45:09 -04:00


2913 commits