llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-08-15 14:08:00 +00:00

Author	SHA1	Message	Date
Nathan Weinberg	90b33aac3a	Merge `f22cd3fac7` into `61582f327c`	2025-08-14 13:54:59 -04:00
Ashwin Bharambe	61582f327c	fix(ci): update triggers for the workflows (#3152 )	2025-08-14 10:27:25 -07:00
Derek Higgins	c15cc7ed77	fix: use ChatCompletionMessageFunctionToolCall (#3142 ) The OpenAI compatibility layer was incorrectly importing ChatCompletionMessageToolCallParam instead of the ChatCompletionMessageFunctionToolCall class. This caused "Cannot instantiate typing.Union" errors when processing agent requests with tool calls. Closes: #3141 Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-08-14 10:27:00 -07:00
Ashwin Bharambe	ee7631b6cf	Revert "feat: add batches API with OpenAI compatibility" (#3149 ) Reverts llamastack/llama-stack#3088 The PR broke integration tests.	2025-08-14 10:08:54 -07:00
Matthew Farrellee	de692162af	feat: add batches API with OpenAI compatibility (#3088 ) Add complete batches API implementation with protocol, providers, and tests: Core Infrastructure: - Add batches API protocol using OpenAI Batch types directly - Add Api.batches enum value and protocol mapping in resolver - Add OpenAI "batch" file purpose support - Include proper error handling (ConflictError, ResourceNotFoundError) Reference Provider: - Add ReferenceBatchesImpl with full CRUD operations (create, retrieve, cancel, list) - Implement background batch processing with configurable concurrency - Add SQLite KVStore backend for persistence - Support /v1/chat/completions endpoint with request validation Comprehensive Test Suite: - Add unit tests for provider implementation with validation - Add integration tests for end-to-end batch processing workflows - Add error handling tests for validation, malformed inputs, and edge cases Configuration: - Add max_concurrent_batches and max_concurrent_requests_per_batch options - Add provider documentation with sample configurations Test with - ``` $ uv run llama stack build --image-type venv --providers inference=YOU_PICK,files=inline::localfs,batches=inline::reference --run & $ LLAMA_STACK_CONFIG=http://localhost:8321 uv run pytest tests/unit/providers/batches tests/integration/batches --text-model YOU_PICK ``` addresses #3066	2025-08-14 09:42:02 -04:00
ehhuang	46ff302d87	chore: Remove Trendshift badge from README (#3137 ) ## Summary - This links to a scammy looking website with ads. ## Test plan	2025-08-13 18:38:34 -07:00
Ashwin Bharambe	e1e161553c	feat(responses): add MCP argument streaming and content part events (#3136 ) # What does this PR do? Adds content part streaming events to the OpenAI-compatible Responses API to support more granular streaming of response content. This introduces: 1. New schema types for content parts: `OpenAIResponseContentPart` with variants for text output and refusals 2. New streaming event types: - `OpenAIResponseObjectStreamResponseContentPartAdded` for when content parts begin - `OpenAIResponseObjectStreamResponseContentPartDone` for when content parts complete 3. Implementation in the reference provider to emit these events during streaming responses. Also emits MCP arguments just like function call ones. ## Test Plan Updated existing streaming tests to verify content part events are properly emitted	2025-08-13 16:34:26 -07:00
Ashwin Bharambe	8638537d14	feat(responses): stream progress of tool calls (#3135 ) # What does this PR do? Enhances tool execution streaming by adding support for real-time progress events during tool calls. This implementation adds streaming events for MCP and web search tools, including in-progress, searching, completed, and failed states. The refactored `_execute_tool_call` method now returns an async iterator that yields streaming events throughout the tool execution lifecycle. ## Test Plan Updated the integration test `test_response_streaming_multi_turn_tool_execution` to verify the presence and structure of new streaming events, including: - Checking for MCP in-progress and completed events - Verifying that progress events contain required fields (item_id, output_index, sequence_number) - Ensuring completed events have the necessary sequence_number field	2025-08-13 16:31:25 -07:00
Ashwin Bharambe	5b312a80b9	feat(responses): improve streaming for function calls (#3124 ) Emit streaming events for function calls ## Test Plan Improved the test case	2025-08-13 11:23:27 -07:00
ehhuang	d6ae54723d	chore: setup for performance benchmarking (#3096 ) # What does this PR do? 1. Added a simple mock openai-compat server that serves chat/completion 2. Add a benchmark server in EKS that includes mock inference server 3. Add locust (https://locust.io/) file for load testing ## Test Plan bash apply.sh kubectl port-forward service/locust-web-ui 8089:8089 Go to localhost:8089 to start a load test <img width="1392" height="334" alt="image" src="https://github.com/user-attachments/assets/d6aa3deb-583a-42ed-889b-751262b8e91c" /> <img width="1362" height="881" alt="image" src="https://github.com/user-attachments/assets/6a28b9b4-05e6-44e2-b504-07e60c12d35e" />	2025-08-13 10:58:22 -07:00
ehhuang	2f51273215	fix: huge speed boost (#3132 ) # What does this PR do? make llama stack fast again ## Test Plan	2025-08-13 09:51:35 -07:00
slekkala1	25e0553eed	chore: Change moderations api response to Provider returned categories (#3098 ) # What does this PR do? To be compliant with model policies for LLAMA, just return the categories as is from provider, we will lose the OAI compat in moderations api response. ## Test Plan `SAFETY_MODEL=llama-guard3:8b LLAMA_STACK_CONFIG=starter uv run pytest -v tests/integration/safety/test_safety.py --text-model=llama3.2:3b-instruct-fp16 --embedding-model=all-MiniLM-L6-v2 --safety-shield=ollama`	2025-08-13 09:47:35 -07:00
Ashwin Bharambe	a9081d87b9	feat(ci): update Recording workflow trigger and concurrency group	2025-08-13 09:36:13 -07:00
IAN MILLER	0950168f26	refactor: replace hardcoded status codes by httpx.codes (#3131 ) # What does this PR do? The purpose of this PR is to eliminate hardcoded status codes in the server's responses and replace them with `httpx.codes` for better consistency across the whole project and improved code readability. ## Test Plan Run `./scripts/unit-tests.sh`	2025-08-13 08:43:41 -07:00
Kelly Brown	0cbd93c5cc	docs: Update blocks formatting in docs/source files (#3120 ) Description: The standard markdown [!NOTE] format is not supported on Sphinx generated documentation, replacing those instances. Also updating other Notes, Tips and Warning blocks throughout the source docs WIP: Working to update the provider code gen	2025-08-13 08:06:31 -07:00
IAN MILLER	c9b78602d3	refactor: modify DELETE API endpoints by returning HTTP 204 No Content + empty body instead of 200 OK + response body with null (#3112 ) # What does this PR do? The purpose of this PR is to make the behavior of DELETE API endpoints consistent with standard RESTful conventions and eliminate confusion for API consumers. Old Behavior ``` HTTP Status: 200 OK Response Body: null ``` E.g. `curl -X DELETE http://localhost:8321/v1/shields/test-shield` `null% ` `INFO 2025-08-12 16:11:57,932 console_span_processor:65 telemetry: 15:11:57.929 [INFO] ::1:59805 - "DELETE /v1/shields/test-shield HTTP/1.1" 200 ` Updated Behavior ``` HTTP Status: 204 No Content Response Body: empty (no body) ``` E.g. `curl -X DELETE http://localhost:8321/v1/shields/test-shield` `INFO 2025-08-12 16:18:16,645 console_span_processor:62 telemetry: 15:18:16.637 [INFO] ::1:60283 - "DELETE /v1/shields/test-shield HTTP/1.1" 204 ` Closes #3090 ## Test Plan Run `./scripts/unit-tests.sh`	2025-08-13 07:56:26 -07:00
Francisco Arceo	92aca434a7	fix: Fix list_sessions() (#3114 ) # What does this PR do? 1. Updates `AgentPersistence.list_sessions()` to properly filter out `Turn` keys from `Session` keys. 2. Adds a suite of unit tests to confirm the `list_sessions()` behavior and tests the failed sample in https://github.com/meta-llama/llama-stack/issues/3048 ## Fixes https://github.com/meta-llama/llama-stack/issues/3048 ## Test Plan Unit tests added. --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-08-13 07:46:26 -07:00
Krzysztof Malczuk	5bd6cb52fb	fix: github action canceling valid tasks for checking semantic pr title (#3127 ) # What does this PR do? This PR changes the group name from github.ref to github.event.pull_request_number. The reason for this is that github.ref does not act as a unique identifier in the pull_request_target event and is unique only in pull_request. The GitHub action was getting canceled because the group name was not unique in the concurrency section. Closes #3102 ## Test Plan To test this I created a fake GitHub action and ran it through act to see what the github.ref variable produced and what alternatives can be used. This confirmed that github.ref was not unique and that github.event.pull_request_number is unique to the PR.	2025-08-13 07:14:03 -07:00
Chacksu	fffdab4f5c	fix: Dell distribution missing kvstore (#3113 ) # What does this PR do? - Added kvstore config to ChromaDB provider config for Dell distribution similar to [starter config](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/distributions/starter/run.yaml#L110-L112) - Fixed [error](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/inference/_generated/_async_client.py#L3424-L3425) getting endpoint information by adding `hf-inference` as the provider to the `AsyncInferenceClient` (TGI client).
## Test Plan ``` export INFERENCE_PORT=8181 export DEH_URL=http://0.0.0.0:$INFERENCE_PORT export INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct export CHROMADB_HOST=localhost export CHROMADB_PORT=8000 export CHROMA_URL=http://$CHROMADB_HOST:$CHROMADB_PORT export CUDA_VISIBLE_DEVICES=0 export LLAMA_STACK_PORT=8321 export HF_TOKEN=[redacted] # TGI Server docker run --rm -it \ --pull always \ --network host \ -v $HOME/.cache/huggingface:/data \ -e HF_TOKEN=$HF_TOKEN \ -e PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True \ -p $INFERENCE_PORT:$INFERENCE_PORT \ --gpus all \ ghcr.io/huggingface/text-generation-inference:latest \ --dtype float16 \ --usage-stats off \ --sharded false \ --cuda-memory-fraction 0.8 \ --model-id meta-llama/Llama-3.2-3B-Instruct \ --port $INFERENCE_PORT \ --hostname 0.0.0.0 # Chrome DB docker run --rm -it \ --name chromadb \ --net=host -p 8000:8000 \ -v ~/chroma:/chroma/chroma \ -e IS_PERSISTENT=TRUE \ -e ANONYMIZED_TELEMETRY=FALSE \ chromadb/chroma:latest # Llama Stack llama stack run dell \ --port $LLAMA_STACK_PORT \ --env INFERENCE_MODEL=$INFERENCE_MODEL \ --env DEH_URL=$DEH_URL \ --env CHROMA_URL=$CHROMA_URL ``` --------- Co-authored-by: Connor Hack <connorhack@fb.com> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-08-13 06:18:25 -07:00
Kelly Brown	6358d0a478	docs: reorganize contributor guide (#3110 ) Description: Restructures contribution guide and move some sections into categories <img width="1399" height="527" alt="Screenshot 2025-08-12 at 9 28 44 AM" src="https://github.com/user-attachments/assets/404e23b4-0001-4174-b662-593e0173ef7d" />	2025-08-12 16:17:03 -07:00
Ashwin Bharambe	3d90117891	chore(tests): fix responses and vector_io tests (#3119 ) Some fixes to MCP tests. And a bunch of fixes for Vector providers. I also enabled a bunch of Vector IO tests to be used with `LlamaStackLibraryClient` ## Test Plan Run Responses tests with llama stack library client: ``` pytest -s -v tests/integration/non_ci/responses/ --stack-config=server:starter \ --text-model openai/gpt-4o \ --embedding-model=sentence-transformers/all-MiniLM-L6-v2 \ -k "client_with_models" ``` Do the same with `-k openai_client` The rest should be taken care of by CI.	2025-08-12 16:15:53 -07:00
Ashwin Bharambe	1721aafc1f	feat(responses): type file results properly (#3117 ) Another thing our tests implicitly depended on.	2025-08-12 10:39:09 -07:00
Ashwin Bharambe	4fec49dfdb	feat(responses): add include parameter (#3115 ) Well our Responses tests use it so we better include it in the API, no? I discovered it because I want to make sure `llama-stack-client` can be used always instead of `openai-python` as the client (we do want to be _truly_ compatible.)	2025-08-12 10:24:01 -07:00
Nathan Weinberg	6812aa1e1e	chore: bump min python version in docs and tests (#3103 ) # What does this PR do? the minimum python version for the project was bumped to 3.12 a couple months ago, but there remains some artifacts in the repo suggesting we support >=3.10 Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-08-12 08:52:57 -07:00
dependabot[bot]	88c4fdc5d7	chore(python-deps): bump chromadb from 1.0.15 to 1.0.16 (#3083 ) Bumps [chromadb](https://github.com/chroma-core/chroma) from 1.0.15 to 1.0.16. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/chroma-core/chroma/releases">chromadb's releases</a>.</em></p> <blockquote> <h2>1.0.16</h2> <p>Version: <code>1.0.16</code> Git ref: <code>refs/tags/1.0.16</code> Build Date: <code>2025-08-08T00:26</code> PIP Package: <code>chroma-1.0.16.tar.gz</code> Github Container Registry Image: <code>:1.0.16</code> DockerHub Image: <code>:1.0.16</code></p> <h2>What's Changed</h2> <ul> <li>[ENH]: add cache mount & tolerations to garbage collector template in Helm chart by <a href="https://github.com/codetheweb"><code>@codetheweb</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5016">chroma-core/chroma#5016</a></li> <li>[DOC] Fix docs typo by <a href="https://github.com/itaismith"><code>@itaismith</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5018">chroma-core/chroma#5018</a></li> <li>[CLN] Change GenericQuotaError from 429 to 422 by <a href="https://github.com/drewkim"><code>@drewkim</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5022">chroma-core/chroma#5022</a></li> <li>[CHORE] Fix type error in batch_utils by <a href="https://github.com/jairad26"><code>@jairad26</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5024">chroma-core/chroma#5024</a></li> <li>[ENH] Add block-level metrics by <a href="https://github.com/tanujnay112"><code>@tanujnay112</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/4801">chroma-core/chroma#4801</a></li> <li>[ENH]: return error on /add if embeddings are not provided by <a href="https://github.com/codetheweb"><code>@codetheweb</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5033">chroma-core/chroma#5033</a></li> <li>[DOC] Docs Polish 
07/2025 by <a href="https://github.com/itaismith"><code>@itaismith</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5032">chroma-core/chroma#5032</a></li> <li>[DOC] Flatten public txt files by <a href="https://github.com/itaismith"><code>@itaismith</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5040">chroma-core/chroma#5040</a></li> <li>[ENH]: require embeddings & require min embedding dimension on /add by <a href="https://github.com/codetheweb"><code>@codetheweb</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5037">chroma-core/chroma#5037</a></li> <li>[ENH] - Adds in dark mode support for hero image by <a href="https://github.com/tjkrusinskichroma"><code>@tjkrusinskichroma</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5042">chroma-core/chroma#5042</a></li> <li>[BLD] Use 8core runners for all our windows jobs by <a href="https://github.com/eculver"><code>@eculver</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5027">chroma-core/chroma#5027</a></li> <li>[TST] More benchmark queries for regex by <a href="https://github.com/Sicheng-Pan"><code>@Sicheng-Pan</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/4910">chroma-core/chroma#4910</a></li> <li>[BUG]: refactor otel/tracing initialization in the frontend to be independent of hosted entry point by <a href="https://github.com/c-gamble"><code>@c-gamble</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5028">chroma-core/chroma#5028</a></li> <li>[BUG] js client: handle 422 billing errors as QuotaExceeded instead of ChromaConnectionError by <a href="https://github.com/philipithomas"><code>@philipithomas</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5049">chroma-core/chroma#5049</a></li> <li>[BUG] RLS should use 32MB GRPC payload size limit by <a 
href="https://github.com/Sicheng-Pan"><code>@Sicheng-Pan</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5044">chroma-core/chroma#5044</a></li> <li>[BUG] Sync protoc arch and version in dockerfile by <a href="https://github.com/Sicheng-Pan"><code>@Sicheng-Pan</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5045">chroma-core/chroma#5045</a></li> <li>[BLD] Fix windows runner label by <a href="https://github.com/eculver"><code>@eculver</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5052">chroma-core/chroma#5052</a></li> <li>[PERF]: Prefetch segments in get and query by <a href="https://github.com/sanketkedia"><code>@sanketkedia</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5053">chroma-core/chroma#5053</a></li> <li>[PERF]: Parallelize fetching blocks for brute force regex by <a href="https://github.com/sanketkedia"><code>@sanketkedia</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5051">chroma-core/chroma#5051</a></li> <li>[RELEASE] JS 3.0.7 by <a href="https://github.com/itaismith"><code>@itaismith</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5059">chroma-core/chroma#5059</a></li> <li>[ENH] Add a delete_many call to the storage API. by <a href="https://github.com/rescrv"><code>@rescrv</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5020">chroma-core/chroma#5020</a></li> <li>[ENH] Consume delete_many from the wal3 garbage collector. 
by <a href="https://github.com/rescrv"><code>@rescrv</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5021">chroma-core/chroma#5021</a></li> <li>[ENH]: limit number of concurrent get_all_block_ids() when using buffer_unordered() by <a href="https://github.com/codetheweb"><code>@codetheweb</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5062">chroma-core/chroma#5062</a></li> <li>[ENH]: use new <code>delete_many()</code> storage method in DeleteUnusedFiles operator by <a href="https://github.com/codetheweb"><code>@codetheweb</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5061">chroma-core/chroma#5061</a></li> <li>[BUG]: Disable aws stalled stream protection by <a href="https://github.com/tanujnay112"><code>@tanujnay112</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5063">chroma-core/chroma#5063</a></li> <li>[DOC] Update manage collections docs with correct delete collection info by <a href="https://github.com/jairad26"><code>@jairad26</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5066">chroma-core/chroma#5066</a></li> <li>[BUG] Improve wal3 robustness with better shutdown handling and error recovery by <a href="https://github.com/rescrv"><code>@rescrv</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5046">chroma-core/chroma#5046</a></li> <li>[ENH] Do not do any mutations of the manifest from within GC. 
by <a href="https://github.com/rescrv"><code>@rescrv</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5050">chroma-core/chroma#5050</a></li> <li>[CHORE]: enable change notifier otel/tracing by <a href="https://github.com/c-gamble"><code>@c-gamble</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5073">chroma-core/chroma#5073</a></li> <li>[CHORE] Add pprof server to query service by <a href="https://github.com/eculver"><code>@eculver</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5072">chroma-core/chroma#5072</a></li> <li>[ENH]: Dedup inserts to the same key in foyer by <a href="https://github.com/sanketkedia"><code>@sanketkedia</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5074">chroma-core/chroma#5074</a></li> <li>[ENH] "Failed to fetch: status: NotFound" be gone. by <a href="https://github.com/rescrv"><code>@rescrv</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5064">chroma-core/chroma#5064</a></li> <li>[CLN] Remove the the top most spammy log lines from rls/wal3. by <a href="https://github.com/rescrv"><code>@rescrv</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5071">chroma-core/chroma#5071</a></li> <li>[DOC] Fix badge in readme by <a href="https://github.com/kylediaz"><code>@kylediaz</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5025">chroma-core/chroma#5025</a></li> <li>[ENH] A tool for patching logs that were deleted before a new manifest was installed. 
by <a href="https://github.com/rescrv"><code>@rescrv</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5083">chroma-core/chroma#5083</a></li> <li>[BUG] Add billing errors to JS client by <a href="https://github.com/itaismith"><code>@itaismith</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5084">chroma-core/chroma#5084</a></li> <li>[CHORE]: Add s3 get metrics and pod name to tracing spans by <a href="https://github.com/tanujnay112"><code>@tanujnay112</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5086">chroma-core/chroma#5086</a></li> <li>[RELEASE] JS 3.0.8 by <a href="https://github.com/itaismith"><code>@itaismith</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5087">chroma-core/chroma#5087</a></li> <li>[ENH] A tool to purge the cache. by <a href="https://github.com/rescrv"><code>@rescrv</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5085">chroma-core/chroma#5085</a></li> <li>[DOC] Update PR template for migration and observability by <a href="https://github.com/HammadB"><code>@HammadB</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5089">chroma-core/chroma#5089</a></li> <li>[CHORE]: Fix s3 get metric name by <a href="https://github.com/tanujnay112"><code>@tanujnay112</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5091">chroma-core/chroma#5091</a></li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... 
(truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="`dff3a786db`"><code>dff3a78</code></a> [RELEASE] CLI 1.1.5, Python 1.0.16, JS 3.0.11 (<a href="https://redirect.github.com/chroma-core/chroma/issues/5227">#5227</a>)</li> <li><a href="`f60f932b8d`"><code>f60f932</code></a> [ENH]: Increase nprobe for smaller collections (<a href="https://redirect.github.com/chroma-core/chroma/issues/5226">#5226</a>)</li> <li><a href="`f593a43b5d`"><code>f593a43</code></a> [ENH] Add <code>InsertRecordSet</code> to JS client (<a href="https://redirect.github.com/chroma-core/chroma/issues/5225">#5225</a>)</li> <li><a href="`76a14c226a`"><code>76a14c2</code></a> [DOC] Made light/dark mode for Chroma logo (<a href="https://redirect.github.com/chroma-core/chroma/issues/5215">#5215</a>)</li> <li><a href="`d80817ede4`"><code>d80817e</code></a> [ENH]: Add more tracing in the filter path (<a href="https://redirect.github.com/chroma-core/chroma/issues/5219">#5219</a>)</li> <li><a href="`73abfdc51a`"><code>73abfdc</code></a> [ENH] Handle when the garbage doesn't overlap the manifest. 
(<a href="https://redirect.github.com/chroma-core/chroma/issues/5207">#5207</a>)</li> <li><a href="`fa392226ba`"><code>fa39222</code></a> [BUG] Revert accidentally commited code (<a href="https://redirect.github.com/chroma-core/chroma/issues/5205">#5205</a>)</li> <li><a href="`815c3ac561`"><code>815c3ac</code></a> [ENH]: Fix CI flake with adaptive nsearch (<a href="https://redirect.github.com/chroma-core/chroma/issues/5203">#5203</a>)</li> <li><a href="`ea66d6929c`"><code>ea66d69</code></a> [BUG] Switch to rust-tls (<a href="https://redirect.github.com/chroma-core/chroma/issues/5204">#5204</a>)</li> <li><a href="`04aeb22139`"><code>04aeb22</code></a> [ENH]: Calculate cache weight of block size instead of hardcoding (<a href="https://redirect.github.com/chroma-core/chroma/issues/5201">#5201</a>)</li> <li>Additional commits viewable in <a href="https://github.com/chroma-core/chroma/compare/1.0.15...1.0.16">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=chromadb&package-manager=uv&previous-version=1.0.15&new-version=1.0.16)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-12 08:44:39 -07:00
dependabot[bot]	393f3714b0	chore(python-deps): bump torch from 2.7.1 to 2.8.0 (#3082 ) Bumps [torch](https://github.com/pytorch/pytorch) from 2.7.1 to 2.8.0. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/pytorch/pytorch/releases">torch's releases</a>.</em></p> <blockquote> <h1>PyTorch 2.8.0 Release Notes</h1> <ul> <li><a href="https://github.com/pytorch/pytorch/blob/HEAD/#highlights">Highlights</a></li> <li><a href="https://github.com/pytorch/pytorch/blob/HEAD/#backwards-incompatible-changes">Backwards Incompatible Changes</a></li> <li><a href="https://github.com/pytorch/pytorch/blob/HEAD/#deprecations">Deprecations</a></li> <li><a href="https://github.com/pytorch/pytorch/blob/HEAD/#new-features">New Features</a></li> <li><a href="https://github.com/pytorch/pytorch/blob/HEAD/#improvements">Improvements</a></li> <li><a href="https://github.com/pytorch/pytorch/blob/HEAD/#bug-fixes">Bug fixes</a></li> <li><a href="https://github.com/pytorch/pytorch/blob/HEAD/#performance">Performance</a></li> <li><a href="https://github.com/pytorch/pytorch/blob/HEAD/#documentation">Documentation</a></li> <li><a href="https://github.com/pytorch/pytorch/blob/HEAD/#developers">Developers</a></li> </ul> <h1>Highlights</h1> <!-- raw HTML omitted --> </blockquote> <p>... 
(truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="`ba56102387`"><code>ba56102</code></a> Cherrypick: Add the RunLLM widget to the website (<a href="https://redirect.github.com/pytorch/pytorch/issues/159592">#159592</a>)</li> <li><a href="`c525a02c89`"><code>c525a02</code></a> [dynamo, docs] cherry pick torch.compile programming model docs into 2.8 (<a href="https://redirect.github.com/pytorch/pytorch/issues/15">#15</a>...</li> <li><a href="`a1cb3cc05d`"><code>a1cb3cc</code></a> [Release Only] Remove nvshmem from list of preload libraries (<a href="https://redirect.github.com/pytorch/pytorch/issues/158925">#158925</a>)</li> <li><a href="`c76b2356bc`"><code>c76b235</code></a> Move out super large one off foreach_copy test (<a href="https://redirect.github.com/pytorch/pytorch/issues/158880">#158880</a>)</li> <li><a href="`20a0e225a0`"><code>20a0e22</code></a> Revert "[Dynamo] Allow inlining into AO quantization modules (<a href="https://redirect.github.com/pytorch/pytorch/issues/152934">#152934</a>)" (<a href="https://redirect.github.com/pytorch/pytorch/issues/158">#158</a>...</li> <li><a href="`9167ac8c75`"><code>9167ac8</code></a> [MPS] Switch Cholesky decomp to column wise (<a href="https://redirect.github.com/pytorch/pytorch/issues/158237">#158237</a>)</li> <li><a href="`5534685c62`"><code>5534685</code></a> [MPS] Reimplement <code>tri[ul]</code> as Metal shaders (<a href="https://redirect.github.com/pytorch/pytorch/issues/158867">#158867</a>)</li> <li><a href="`d19e08d74b`"><code>d19e08d</code></a> Cherry pick PR 158746 (<a href="https://redirect.github.com/pytorch/pytorch/issues/158801">#158801</a>)</li> <li><a href="`a6c044ab9a`"><code>a6c044a</code></a> [cherry-pick] Unify torch.tensor and torch.ops.aten.scalar_tensor behavior (#...</li> <li><a href="`620ebd0646`"><code>620ebd0</code></a> [Dynamo] Use proper sources for constructing dataclass defaults (<a 
href="https://redirect.github.com/pytorch/pytorch/issues/158689">#158689</a>)</li> <li>Additional commits viewable in <a href="https://github.com/pytorch/pytorch/compare/v2.7.1...v2.8.0">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=torch&package-manager=uv&previous-version=2.7.1&new-version=2.8.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-12 08:44:24 -07:00
Matthew Farrellee	b70e2f1f09	fix(dep): update to openai >= 1.99.6 and use new Function location (#3087 ) # What does this PR do? closes #3072 ## Test Plan ci	2025-08-12 08:40:32 -07:00
Mustafa Elbehery	4a13ef45e9	fix: Implement missing `run_moderation` method in `PromptGuardSafetyImpl` (#3101 ) # What does this PR do? This PR addresses an issue where `PromptGuardSafetyImpl` was an incomplete implementation of an abstract class. The class was missing the required `run_moderation` method from its parent interface. Currently, running `pre-commit` locally fails with the error below. ``` llama_stack/providers/inline/safety/prompt_guard/__init__.py:15: error: Cannot instantiate abstract class "PromptGuardSafetyImpl" with abstract attribute "run_moderation" [abstract] Found 1 error in 1 file (checked 410 source files) ``` This PR fixes the issue as follows: - Added the missing `run_moderation` method to `PromptGuardSafetyImpl` - The method raises `NotImplementedError` with a message indicating that this functionality is not implemented for PromptGuard - This allows the class to be instantiated while clearly indicating the limitation Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>	2025-08-12 08:32:52 -07:00
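The failure mode described in the commit above can be reproduced in a few lines. This is a minimal sketch, not the llama-stack code: `SafetyInterface`, `IncompleteImpl`, and `PromptGuardLikeImpl` are hypothetical stand-ins, and the methods are synchronous for brevity, which the real interface may not be.

```python
from abc import ABC, abstractmethod


class SafetyInterface(ABC):
    """Hypothetical stand-in for the parent safety interface."""

    @abstractmethod
    def run_shield(self, text: str) -> str: ...

    @abstractmethod
    def run_moderation(self, text: str) -> str: ...


class IncompleteImpl(SafetyInterface):
    """Mirrors the pre-fix state: run_moderation is never overridden."""

    def run_shield(self, text: str) -> str:
        return "ok"


class PromptGuardLikeImpl(SafetyInterface):
    """Mirrors the fix: the method exists but reports that it is unsupported."""

    def run_shield(self, text: str) -> str:
        return "ok"

    def run_moderation(self, text: str) -> str:
        raise NotImplementedError("run_moderation is not implemented for PromptGuard")
```

Instantiating `IncompleteImpl` raises `TypeError` at runtime, which is the same problem mypy reports statically with the `[abstract]` error code. `PromptGuardLikeImpl` can be instantiated, and only a call to `run_moderation` fails.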
Nathan Weinberg	f22cd3fac7	chore: add issue template for docs updates to further refine our issue backlog, create a template for maintainers/contributors/users to open issues related to our documentation Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-08-12 06:43:14 -04:00
Nathan Weinberg	19123ca957	refactor: standardize InferenceRouter model handling (#2965 )	2025-08-12 04:20:39 -06:00
Ashwin Bharambe	803114180b	chore(logging)!: use comma as a delimiter (#3095 ) Using commas is much more shell-friendly. A semi-colon is a statement delimiter and must be escaped. This change is backwards incompatible but I imagine not many people are using this. I could be wrong. Looking for feedback.	2025-08-11 11:51:43 -07:00
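To illustrate why a comma delimiter is convenient, here is a minimal sketch of parsing a comma-separated logging specification. The `category=level` pair format and the name `parse_logging_spec` are assumptions made for illustration; the commit message only states that the delimiter changed to a comma.

```python
def parse_logging_spec(spec: str) -> dict[str, str]:
    """Parse a hypothetical 'category=level,category=level' string."""
    levels: dict[str, str] = {}
    for pair in spec.split(","):
        pair = pair.strip()
        if not pair:
            # Tolerate trailing or doubled commas.
            continue
        category, sep, level = pair.partition("=")
        if not sep or not category.strip() or not level.strip():
            raise ValueError(f"expected category=level, got {pair!r}")
        levels[category.strip().lower()] = level.strip().lower()
    return levels
```

A value such as `server=debug,core=info` can be assigned to an environment variable on a shell command line without quoting, whereas the same value written with `;` would have to be quoted or escaped because the shell treats `;` as a statement separator.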
Francisco Arceo	f7adf58b1b	docs: Add documentation on how to contribute a Vector DB provider and update testing documentation (#3093 ) # What does this PR do? - Adds documentation on how to contribute a Vector DB provider. - Updates the testing section to be a little friendlier to navigate. - Also added new shortcut for search so that `/` and `⌘ K` or `ctrl+K` trigger search <img width="1903" height="1346" alt="Screenshot 2025-08-11 at 10 10 12 AM" src="https://github.com/user-attachments/assets/6995b3b8-a2ab-4200-be72-c5b03a784a29" /> <img width="1915" height="1438" alt="Screenshot 2025-08-11 at 10 10 25 AM" src="https://github.com/user-attachments/assets/1f54d30e-5be1-4f27-b1e9-3c3537dcb8e9" /> Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-08-11 11:11:09 -07:00
Mustafa Elbehery	b5b5f5b9ae	chore: add `mypy` prompt guard (#2678 ) # What does this PR do? This PR adds static type coverage to `llama-stack`. Part of https://github.com/meta-llama/llama-stack/issues/2647 Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>	2025-08-11 08:40:40 -07:00
Francisco Arceo	7448a4a88c	chore: Updating UI Sidebar (#3081 ) # What does this PR do? This updates the sidebar to look a little more like other popular ones. <img width="1913" height="1352" alt="Screenshot 2025-08-08 at 11 25 31 PM" src="https://github.com/user-attachments/assets/00738412-1101-48ec-8864-cde4a8733ec1" /> Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-08-11 07:39:52 -07:00
Matthew Farrellee	8faff92591	chore: remove redundant code in unregister_toolgroup (#3092 ) # What does this PR do? removes redundant code ## Test Plan ci	2025-08-11 07:38:54 -07:00
Eran Cohen	a4bad6c0b4	feat: Add Google Vertex AI inference provider support (#2841 ) # What does this PR do? - Add new Vertex AI remote inference provider with litellm integration - Support for Gemini models through Google Cloud Vertex AI platform - Uses Google Cloud Application Default Credentials (ADC) for authentication - Added VertexAI models: gemini-2.5-flash, gemini-2.5-pro, gemini-2.0-flash. - Updated provider registry to include vertexai provider - Updated starter template to support Vertex AI configuration - Added comprehensive documentation and sample configuration relates to https://github.com/meta-llama/llama-stack/issues/2747 Signed-off-by: Eran Cohen <eranco@redhat.com> Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>	2025-08-11 08:22:04 -04:00
Francisco Arceo	78a59a4dbe	chore: Adding GitHub Stars, trends, and contributor shout out to README (#3079 ) # What does this PR do? Updates README to add 1. GitHub badge highlighting Llama Stack as #1 Repo of the Day 2. GitHub Star History (cumulative stars chart) 3. Contributor shout out Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-08-10 21:11:14 -04:00
Varsha	69dc789e15	docs: Add unsupported search mode info about FAISS (#3089 )	2025-08-10 17:34:34 -06:00
Varsha	ce72a28525	docs: Update doc on search modes for Milvus (#3078 ) # What does this PR do? Update Milvus doc on using search modes. <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>	2025-08-10 18:48:36 -04:00
Vlastimil Eliáš	1677d6bffd	feat: Flash-Lite 2.0 and 2.5 models added to Gemini inference provider (#3058 ) PR adds Flash-Lite 2.0 and 2.5 models to the Gemini inference provider Closes #3046 ## Test Plan I was not able to locate any existing test for this provider, so I performed manual testing. But the change is really trivial and straightforward.	2025-08-08 13:48:15 -07:00
ehhuang	0b5a794c27	fix: telemetry logger spams when queue is full (#3070 ) # What does this PR do? ## Test Plan Ran a stress test on chat completion endpoint locally: For 10 concurrent users over 3 minutes: Before: <img width="1440" height="201" alt="image" src="https://github.com/user-attachments/assets/24e0d580-186e-4e24-931e-2b936c5859b6" /> After: <img width="1434" height="204" alt="image" src="https://github.com/user-attachments/assets/4b806d88-f822-41e9-b25a-018cc4bec866" /> (Will send scripts in a future PR.)	2025-08-08 13:47:36 -07:00
Francisco Arceo	9b70bb9d4b	feat(ui): Adding Vector Store Files to Admin UI (#3041 ) # What does this PR do? This PR updates the UI to create new: 1. `/files/{file_id}` 2. `files/{file_id}/contents` 3. `files/{file_id}/contents/{content_id}` The list of files is clickable, which brings the user to the File Details page. The File Details page shows all of the content. The content details page shows the individual chunk/content parsed. These only use our existing OpenAI compatible APIs. I have a separate branch where I expose the embedding and the portal is correctly populated. I included the FE rendering code for that in this PR. 1. `vector-stores/{vector_store_id}/files/{file_id}` <img width="1913" height="1351" alt="Screenshot 2025-08-06 at 10 20 12 PM" src="https://github.com/user-attachments/assets/08010d5e-60c8-4bd9-9f3e-a2731ed1ad55" /> 2. `vector-stores/{vector_store_id}/files/{file_id}/contents` <img width="1920" height="1272" alt="Screenshot 2025-08-06 at 10 21 23 PM" src="https://github.com/user-attachments/assets/3b91e67b-5d64-4fe6-91b6-18f14587e850" /> 3. `vector-stores/{vector_store_id}/files/{file_id}/contents/{content_id}` <img width="1916" height="1273" alt="Screenshot 2025-08-06 at 10 21 45 PM" src="https://github.com/user-attachments/assets/d38ca996-e8d9-460c-9e39-7ff0cb5ec0dd" /> ## Test Plan I tested this locally and reviewed the code. I generated a significant share of the code with Claude and some manual intervention. After this, I'll begin adding tests to the UI. --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-08-08 07:44:06 -07:00
Jiayi Ni	9e78f2da96	docs: fix the docs for NVIDIA Inference Provider (#3055 ) # What does this PR do? Fix the NVIDIA inference docs by updating API methods, model IDs, and embedding example. ## Test Plan N/A	2025-08-08 11:27:55 +02:00
Ashwin Bharambe	e90fe25890	fix(tests): move llama stack client init back to fixture (#3071 ) See inline comments	2025-08-07 15:29:53 -07:00
Ashwin Bharambe	5f1ddd35e4	chore(tests): refactor and move responses tests away from verifications (#3068 ) This PR kills the verifications infrastructure which is no longer used. It was relocated to the `llama-stack-evals` (https://github.com/meta-llama/llama-stack-evals) repository previously. Responses tests used this infrastructure but that wasn't quite necessary, just a little useful back when @bbrownin introduced the tests. On Discord, we agreed that tests can be moved to our regular integrations test infra. ## Test Plan Some tests currently do fail (although they run!) I will send a follow-up PR which makes them all pass.	2025-08-07 13:48:16 -07:00
Dean Wampler	342550c1e2	docs: Added comment about a known limitation of AgentEventLogger (#2930 ) # What does this PR do? `AgentEventLogger` only supports streaming responses, so I suggest adding a comment near the bottom of `demo_script.py` letting the user know this, e.g., if they change the `stream` value to `False` in the call to `create_turn`, they need to comment out the logging lines. See https://github.com/llamastack/llama-stack-client-python/issues/15 <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> --------- Signed-off-by: Dean Wampler <dean.wampler@ibm.com>	2025-08-07 10:09:57 -07:00
Varsha	e3928e6a29	feat: Implement hybrid search in Milvus (#2644 ) # What does this PR do? This PR implements hybrid search for Milvus DB based on the inbuilt milvus support. To test: ``` pytest tests/unit/providers/vector_io/remote/test_milvus.py -v -s --tb=long --disable-warnings --asyncio-mode=auto ``` Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>	2025-08-07 09:42:03 +02:00
Nathan Weinberg	5a2d323eca	docs: add use of custom exceptions to code style guide (#3049 ) # What does this PR do? Adds a blurb to the `CONTRIBUTING.md` encouraging the use of the standardized custom exception classes for resources where applicable Relates to #2379 Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-08-06 14:12:08 -07:00
slekkala1	26d3d25c87	feat: Add moderations create api (#3020 ) # What does this PR do? This PR adds an OpenAI-compatible moderations API. It is currently implemented only for the Llama Guard safety provider. Image support, expansion to other safety providers, and deprecation of run_shield will be next steps. ## Test Plan Added 2 new tests for safe/unsafe text prompt examples for the new OpenAI-compatible moderations API usage `SAFETY_MODEL=llama-guard3:8b LLAMA_STACK_CONFIG=starter uv run pytest -v tests/integration/safety/test_safety.py --text-model=llama3.2:3b-instruct-fp16 --embedding-model=all-MiniLM-L6-v2 --safety-shield=ollama` (Had some issue with previous PR https://github.com/meta-llama/llama-stack/pull/2994 while updating and accidentally closed it, so reopened a new one.)	2025-08-06 13:51:23 -07:00
Charlie Doern	0caef40e0d	fix: telemetry fixes (inference and core telemetry) (#2733 ) # What does this PR do? I found a few issues while adding new metrics for various APIs: Currently metrics are only propagated in `chat_completion` and `completion`. Since most providers use the `openai_..` routes as the default in `llama-stack-client inference chat-completion`, metrics are currently not working as expected. In order to get them working the following had to be done: 1. get the completion as usual 2. use new `openai_` versions of the metric gathering functions which use `.usage` from the `OpenAI..` response types to gather the metrics which are already populated. 3. define a `stream_generator` which counts the tokens and computes the metrics (only for stream=True) 4. add metrics to response NOTE: I could not add metrics to `openai_completion` where stream=True because that ONLY returns an `OpenAICompletion`, not an AsyncGenerator that we can manipulate. acquire the lock, and add event to the span as the other `_log_...` methods do. Some new output: `llama-stack-client inference chat-completion --message hi` <img width="2416" height="425" alt="Screenshot 2025-07-16 at 8 28 20 AM" src="https://github.com/user-attachments/assets/ccdf1643-a184-4ddd-9641-d426c4d51326" /> and in the client: <img width="763" height="319" alt="Screenshot 2025-07-16 at 8 28 32 AM" src="https://github.com/user-attachments/assets/6bceb811-5201-47e9-9e16-8130f0d60007" /> These were not previously being recorded, nor were they being printed to the server, due to the improper console sink handling --------- Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-08-06 13:37:40 -07:00

2472 commits