Swapna Lekkala
9da8bd240e
update run.yaml
2025-08-14 11:56:23 -07:00
Swapna Lekkala
6eb7442eda
skip tests
2025-08-14 11:56:23 -07:00
Swapna Lekkala
ccf12b318b
cleanup
2025-08-14 11:56:23 -07:00
Swapna Lekkala
5c9fcf0f2b
minor change
2025-08-14 11:56:23 -07:00
Swapna Lekkala
7d2dc3e801
minor change
2025-08-14 11:56:23 -07:00
Swapna Lekkala
a64451ad42
minor change
2025-08-14 11:56:23 -07:00
Swapna Lekkala
3eef4c6d0f
minor change
2025-08-14 11:56:23 -07:00
Swapna Lekkala
006e5cbfdf
update run.yaml
2025-08-14 11:56:23 -07:00
Swapna Lekkala
6a5f20e2b5
clean up
2025-08-14 11:56:23 -07:00
Swapna Lekkala
ef15c74307
feat: Moderation api for Code Scanner Provider
2025-08-14 11:56:23 -07:00
Ashwin Bharambe
61582f327c
fix(ci): update triggers for the workflows ( #3152 )
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / discover-tests (push) Successful in 8s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 10s
Python Package Build Test / build (3.12) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 14s
Unit Tests / unit-tests (3.12) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 17s
Python Package Build Test / build (3.13) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 20s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 12s
Unit Tests / unit-tests (3.13) (push) Failing after 12s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 23s
Update ReadTheDocs / update-readthedocs (push) Failing after 13s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 21s
Test External API and Providers / test-external (venv) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 26s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 25s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 19s
Pre-commit / pre-commit (push) Successful in 1m39s
2025-08-14 10:27:25 -07:00
Derek Higgins
c15cc7ed77
fix: use ChatCompletionMessageFunctionToolCall ( #3142 )
...
The OpenAI compatibility layer was incorrectly importing
ChatCompletionMessageToolCallParam instead of the
ChatCompletionMessageFunctionToolCall class. This caused "Cannot
instantiate typing.Union" errors when processing agent requests with
tool calls.
Closes : #3141
Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-08-14 10:27:00 -07:00
Ashwin Bharambe
ee7631b6cf
Revert "feat: add batches API with OpenAI compatibility" ( #3149 )
...
Reverts llamastack/llama-stack#3088
The PR broke integration tests.
2025-08-14 10:08:54 -07:00
Matthew Farrellee
de692162af
feat: add batches API with OpenAI compatibility ( #3088 )
...
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / discover-tests (push) Successful in 12s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 15s
Python Package Build Test / build (3.12) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 25s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 23s
Python Package Build Test / build (3.13) (push) Failing after 17s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 29s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 25s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 28s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 29s
Unit Tests / unit-tests (3.12) (push) Failing after 20s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 12s
Test External API and Providers / test-external (venv) (push) Failing after 22s
Unit Tests / unit-tests (3.13) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 24s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 27s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 24s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 24s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 25s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 27s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 24s
Update ReadTheDocs / update-readthedocs (push) Failing after 38s
Pre-commit / pre-commit (push) Successful in 1m53s
Add complete batches API implementation with protocol, providers, and
tests:
Core Infrastructure:
- Add batches API protocol using OpenAI Batch types directly
- Add Api.batches enum value and protocol mapping in resolver
- Add OpenAI "batch" file purpose support
- Include proper error handling (ConflictError, ResourceNotFoundError)
Reference Provider:
- Add ReferenceBatchesImpl with full CRUD operations (create, retrieve,
cancel, list)
- Implement background batch processing with configurable concurrency
- Add SQLite KVStore backend for persistence
- Support /v1/chat/completions endpoint with request validation
Comprehensive Test Suite:
- Add unit tests for provider implementation with validation
- Add integration tests for end-to-end batch processing workflows
- Add error handling tests for validation, malformed inputs, and edge
cases
Configuration:
- Add max_concurrent_batches and max_concurrent_requests_per_batch
options
- Add provider documentation with sample configurations
Test with -
```
$ uv run llama stack build --image-type venv --providers inference=YOU_PICK,files=inline::localfs,batches=inline::reference --run &
$ LLAMA_STACK_CONFIG=http://localhost:8321 uv run pytest tests/unit/providers/batches tests/integration/batches --text-model YOU_PICK
```
addresses #3066
2025-08-14 09:42:02 -04:00
ehhuang
46ff302d87
chore: Remove Trendshift badge from README ( #3137 )
...
Integration Tests (Replay) / discover-tests (push) Successful in 5s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 8s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 13s
Python Package Build Test / build (3.12) (push) Failing after 11s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 13s
Python Package Build Test / build (3.13) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 17s
Update ReadTheDocs / update-readthedocs (push) Failing after 11s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 18s
Unit Tests / unit-tests (3.13) (push) Failing after 13s
Test External API and Providers / test-external (venv) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 49s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 51s
Unit Tests / unit-tests (3.12) (push) Failing after 51s
Pre-commit / pre-commit (push) Successful in 1m36s
## Summary
- This links to a scammy looking website with ads.
## Test plan
2025-08-13 18:38:34 -07:00
Ashwin Bharambe
e1e161553c
feat(responses): add MCP argument streaming and content part events ( #3136 )
...
# What does this PR do?
Adds content part streaming events to the OpenAI-compatible Responses API to support more granular streaming of response content. This introduces:
1. New schema types for content parts: `OpenAIResponseContentPart` with variants for text output and refusals
2. New streaming event types:
- `OpenAIResponseObjectStreamResponseContentPartAdded` for when content parts begin
- `OpenAIResponseObjectStreamResponseContentPartDone` for when content parts complete
3. Implementation in the reference provider to emit these events during streaming responses. Also emits MCP arguments just like function call ones.
## Test Plan
Updated existing streaming tests to verify content part events are properly emitted
2025-08-13 16:34:26 -07:00
Ashwin Bharambe
8638537d14
feat(responses): stream progress of tool calls ( #3135 )
...
# What does this PR do?
Enhances tool execution streaming by adding support for real-time progress events during tool calls. This implementation adds streaming events for MCP and web search tools, including in-progress, searching, completed, and failed states.
The refactored `_execute_tool_call` method now returns an async iterator that yields streaming events throughout the tool execution lifecycle.
## Test Plan
Updated the integration test `test_response_streaming_multi_turn_tool_execution` to verify the presence and structure of new streaming events, including:
- Checking for MCP in-progress and completed events
- Verifying that progress events contain required fields (item_id, output_index, sequence_number)
- Ensuring completed events have the necessary sequence_number field
2025-08-13 16:31:25 -07:00
Ashwin Bharambe
5b312a80b9
feat(responses): improve streaming for function calls ( #3124 )
...
Test Llama Stack Build / build-single-provider (push) Failing after 5s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 10s
Test Llama Stack Build / generate-matrix (push) Successful in 9s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 13s
Python Package Build Test / build (3.13) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 11s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 8s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 7s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 21s
Python Package Build Test / build (3.12) (push) Failing after 9s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 15s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 29s
Unit Tests / unit-tests (3.12) (push) Failing after 8s
Test External API and Providers / test-external (venv) (push) Failing after 13s
Update ReadTheDocs / update-readthedocs (push) Failing after 8s
Unit Tests / unit-tests (3.13) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 25s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 24s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 25s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 26s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 22s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 17s
Pre-commit / pre-commit (push) Successful in 1m10s
Test Llama Stack Build / build (push) Failing after 12s
Emit streaming events for function calls
## Test Plan
Improved the test case
2025-08-13 11:23:27 -07:00
ehhuang
d6ae54723d
chore: setup for performance benchmarking ( #3096 )
...
# What does this PR do?
1. Added a simple mock openai-compat server that serves chat/completion
2. Add a benchmark server in EKS that includes mock inference server
3. Add locust (https://locust.io/ ) file for load testing
## Test Plan
bash apply.sh
kubectl port-forward service/locust-web-ui 8089:8089
Go to localhost:8089 to start a load test
<img width="1392" height="334" alt="image"
src="https://github.com/user-attachments/assets/d6aa3deb-583a-42ed-889b-751262b8e91c "
/>
<img width="1362" height="881" alt="image"
src="https://github.com/user-attachments/assets/6a28b9b4-05e6-44e2-b504-07e60c12d35e "
/>
2025-08-13 10:58:22 -07:00
ehhuang
2f51273215
fix: huge speed boost ( #3132 )
...
# What does this PR do?
make llama stack fast again
## Test Plan
2025-08-13 09:51:35 -07:00
slekkala1
25e0553eed
chore: Change moderations api response to Provider returned categories ( #3098 )
...
# What does this PR do?
To be compliant with model policies for LLAMA, just return the
categories as is from provider, we will lose the OAI compat in
moderations api response.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
`SAFETY_MODEL=llama-guard3:8b LLAMA_STACK_CONFIG=starter uv run pytest
-v tests/integration/safety/test_safety.py
--text-model=llama3.2:3b-instruct-fp16
--embedding-model=all-MiniLM-L6-v2 --safety-shield=ollama`
2025-08-13 09:47:35 -07:00
Ashwin Bharambe
a9081d87b9
feat(ci): update Recording workflow trigger and concurrency group
2025-08-13 09:36:13 -07:00
IAN MILLER
0950168f26
refactor: replace hardcoded status codes by httpx.codes ( #3131 )
...
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
The purpose of this PR is to eliminate hardcoded status codes in
server's responses and replace it by `httpx.codes` functionality for
better consistency across the whole project and improvement in code
readability.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Run `./scripts/unit-tests.sh`
2025-08-13 08:43:41 -07:00
Kelly Brown
0cbd93c5cc
docs: Update blocks formatting in docs/source files ( #3120 )
...
**Description:**
The standard markdown [!NOTE] format is not supported on Sphinx
generated documentation, replacing those instances. Also updating other
Notes, Tips and Warning blocks throughout the source docs
WIP: Working to update the provider code gen
2025-08-13 08:06:31 -07:00
IAN MILLER
c9b78602d3
refactor: modify DELETE API endpoints by returning HTTP 204 No Content + empty body instead of 200 OK + response body with null ( #3112 )
...
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
The purpose of this PR is to make the behavior DELETE API endpoints be
consistent with standard RESTful conventions and eliminate confusion for
API consumers.
Old Behavior
```
HTTP Status: 200 OK
Response Body: null
```
Eg. `curl -X DELETE http://localhost:8321/v1/shields/test-shield `
`null% `
`INFO 2025-08-12 16:11:57,932 console_span_processor:65 telemetry:
15:11:57.929 [INFO] ::1:59805 - "DELETE /v1/shields/test-shield
HTTP/1.1" 200 `
Updated Behavior
```
HTTP Status: 204 No Content
Response Body: empty (no body)
```
Eg. `curl -X DELETE http://localhost:8321/v1/shields/test-shield `
`INFO 2025-08-12 16:18:16,645 console_span_processor:62 telemetry:
15:18:16.637 [INFO] ::1:60283 - "DELETE /v1/shields/test-shield
HTTP/1.1" 204 `
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes #3090
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Run `./scripts/unit-tests.sh`
2025-08-13 07:56:26 -07:00
Francisco Arceo
92aca434a7
fix: Fix list_sessions() ( #3114 )
...
# What does this PR do?
1. Updates `AgentPersistence.list_sessions()` to properly filter out
`Turn` keys from `Session` keys.
2. Adds a suite of unit tests to confirm the `list_sessions()` behavior
and tests the failed sample in
https://github.com/meta-llama/llama-stack/issues/3048
## Fixes https://github.com/meta-llama/llama-stack/issues/3048
## Test Plan
Unit tests added.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-08-13 07:46:26 -07:00
Krzysztof Malczuk
5bd6cb52fb
fix: github action canceling valid tasks for checking semantic pr title ( #3127 )
...
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR changes the group name from github.ref to
github.even.pull_request_number. The reason for this is that github.ref
does not act as a unique identifier in the pull_request_target event and
only is unique in pull_request. The github action was getting canceled
was because the group name was not unique in the concurrency section.
<!-- If resolving an issue, uncomment and update the line below -->
Closes #3102
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
To test this I have created a fake github action and ran it trough act
to see what the github.ref variable produced and what alternatives can
be used. This confirmed that the github.ref was not unique and that
github.event.pull_request_number is unique to the PR.
2025-08-13 07:14:03 -07:00
Chacksu
fffdab4f5c
fix: Dell distribution missing kvstore ( #3113 )
...
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 7s
Integration Tests (Replay) / discover-tests (push) Successful in 9s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 11s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 16s
Test Llama Stack Build / generate-matrix (push) Successful in 6s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 27s
Test Llama Stack Build / build-single-provider (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 26s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 24s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 29s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 15s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 9s
Python Package Build Test / build (3.13) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 14s
Python Package Build Test / build (3.12) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 16s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 10s
Test External API and Providers / test-external (venv) (push) Failing after 11s
Unit Tests / unit-tests (3.12) (push) Failing after 13s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 11s
Test Llama Stack Build / build (push) Failing after 8s
Unit Tests / unit-tests (3.13) (push) Failing after 37s
Pre-commit / pre-commit (push) Successful in 1m44s
# What does this PR do?
- Added kvstore config to ChromaDB provider config for Dell distribution
similar to [starter
config](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/distributions/starter/run.yaml#L110-L112 )
- Fixed
[error](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/inference/_generated/_async_client.py#L3424-L3425 )
getting endpoint information by adding `hf-inference` as the provider to
the `AsyncInferenceClient` (TGI client).
## Test Plan
```
export INFERENCE_PORT=8181
export DEH_URL=http://0.0.0.0:$INFERENCE_PORT
export INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct
export CHROMADB_HOST=localhost
export CHROMADB_PORT=8000
export CHROMA_URL=http://$CHROMADB_HOST:$CHROMADB_PORT
export CUDA_VISIBLE_DEVICES=0
export LLAMA_STACK_PORT=8321
export HF_TOKEN=[redacted]
# TGI Server
docker run --rm -it \
--pull always \
--network host \
-v $HOME/.cache/huggingface:/data \
-e HF_TOKEN=$HF_TOKEN \
-e PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True \
-p $INFERENCE_PORT:$INFERENCE_PORT \
--gpus all \
ghcr.io/huggingface/text-generation-inference:latest \
--dtype float16 \
--usage-stats off \
--sharded false \
--cuda-memory-fraction 0.8 \
--model-id meta-llama/Llama-3.2-3B-Instruct \
--port $INFERENCE_PORT \
--hostname 0.0.0.0
# Chrome DB
docker run --rm -it \
--name chromadb \
--net=host -p 8000:8000 \
-v ~/chroma:/chroma/chroma \
-e IS_PERSISTENT=TRUE \
-e ANONYMIZED_TELEMETRY=FALSE \
chromadb/chroma:latest
# Llama Stack
llama stack run dell \
--port $LLAMA_STACK_PORT \
--env INFERENCE_MODEL=$INFERENCE_MODEL \
--env DEH_URL=$DEH_URL \
--env CHROMA_URL=$CHROMA_URL
```
---------
Co-authored-by: Connor Hack <connorhack@fb.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-08-13 06:18:25 -07:00
Kelly Brown
6358d0a478
docs: reorganize contributor guide ( #3110 )
...
Test Llama Stack Build / generate-matrix (push) Successful in 7s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 22s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 10s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 25s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 24s
Python Package Build Test / build (3.13) (push) Failing after 5s
Test Llama Stack Build / build-single-provider (push) Failing after 11s
Python Package Build Test / build (3.12) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 22s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 19s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 25s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 23s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 22s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 24s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 28s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 19s
Update ReadTheDocs / update-readthedocs (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 26s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 18s
Unit Tests / unit-tests (3.12) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 18s
Unit Tests / unit-tests (3.13) (push) Failing after 15s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 12s
Test External API and Providers / test-external (venv) (push) Failing after 17s
Test Llama Stack Build / build (push) Failing after 11s
Pre-commit / pre-commit (push) Successful in 1m48s
**Description:**
Restructures contribution guide and move some sections into categories
<img width="1399" height="527" alt="Screenshot 2025-08-12 at 9 28 44 AM"
src="https://github.com/user-attachments/assets/404e23b4-0001-4174-b662-593e0173ef7d "
/>
2025-08-12 16:17:03 -07:00
Ashwin Bharambe
3d90117891
chore(tests): fix responses and vector_io tests ( #3119 )
...
Some fixes to MCP tests. And a bunch of fixes for Vector providers.
I also enabled a bunch of Vector IO tests to be used with
`LlamaStackLibraryClient`
## Test Plan
Run Responses tests with llama stack library client:
```
pytest -s -v tests/integration/non_ci/responses/ --stack-config=server:starter \
--text-model openai/gpt-4o \
--embedding-model=sentence-transformers/all-MiniLM-L6-v2 \
-k "client_with_models"
```
Do the same with `-k openai_client`
The rest should be taken care of by CI.
2025-08-12 16:15:53 -07:00
Ashwin Bharambe
1721aafc1f
feat(responses): type file results properly ( #3117 )
...
Python Package Build Test / build (3.13) (push) Failing after 3s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 10s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 13s
Test Llama Stack Build / generate-matrix (push) Successful in 8s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 19s
Python Package Build Test / build (3.12) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 12s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 16s
Test Llama Stack Build / build-single-provider (push) Failing after 10s
Unit Tests / unit-tests (3.12) (push) Failing after 12s
Test External API and Providers / test-external (venv) (push) Failing after 15s
Unit Tests / unit-tests (3.13) (push) Failing after 12s
Update ReadTheDocs / update-readthedocs (push) Failing after 10s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 30s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 28s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 26s
Test Llama Stack Build / build (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 17s
Pre-commit / pre-commit (push) Successful in 1m16s
Another thing our tests implicitly depended on.
2025-08-12 10:39:09 -07:00
Ashwin Bharambe
4fec49dfdb
feat(responses): add include parameter ( #3115 )
...
Well our Responses tests use it so we better include it in the API, no?
I discovered it because I want to make sure `llama-stack-client` can be
used always instead of `openai-python` as the client (we do want to be
_truly_ compatible.)
2025-08-12 10:24:01 -07:00
Nathan Weinberg
6812aa1e1e
chore: bump min python version in docs and tests ( #3103 )
...
# What does this PR do?
the minimum python version for the project was bumped to 3.12 a couple
months ago, but there remains some artifacts in the repo suggesting we
support >=3.10
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-08-12 08:52:57 -07:00
dependabot[bot]
88c4fdc5d7
chore(python-deps): bump chromadb from 1.0.15 to 1.0.16 ( #3083 )
...
Bumps [chromadb](https://github.com/chroma-core/chroma ) from 1.0.15 to
1.0.16.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/chroma-core/chroma/releases ">chromadb's
releases</a>.</em></p>
<blockquote>
<h2>1.0.16</h2>
<p>Version: <code>1.0.16</code>
Git ref: <code>refs/tags/1.0.16</code>
Build Date: <code>2025-08-08T00:26</code>
PIP Package: <code>chroma-1.0.16.tar.gz</code>
Github Container Registry Image: <code>:1.0.16</code>
DockerHub Image: <code>:1.0.16</code></p>
<h2>What's Changed</h2>
<ul>
<li>[ENH]: add cache mount & tolerations to garbage collector
template in Helm chart by <a
href="https://github.com/codetheweb "><code>@codetheweb</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5016 ">chroma-core/chroma#5016</a></li>
<li>[DOC] Fix docs typo by <a
href="https://github.com/itaismith "><code>@itaismith</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5018 ">chroma-core/chroma#5018</a></li>
<li>[CLN] Change GenericQuotaError from 429 to 422 by <a
href="https://github.com/drewkim "><code>@drewkim</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5022 ">chroma-core/chroma#5022</a></li>
<li>[CHORE] Fix type error in batch_utils by <a
href="https://github.com/jairad26 "><code>@jairad26</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5024 ">chroma-core/chroma#5024</a></li>
<li>[ENH] Add block-level metrics by <a
href="https://github.com/tanujnay112 "><code>@tanujnay112</code></a> in
<a
href="https://redirect.github.com/chroma-core/chroma/pull/4801 ">chroma-core/chroma#4801</a></li>
<li>[ENH]: return error on /add if embeddings are not provided by <a
href="https://github.com/codetheweb "><code>@codetheweb</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5033 ">chroma-core/chroma#5033</a></li>
<li>[DOC] Docs Polish 07/2025 by <a
href="https://github.com/itaismith "><code>@itaismith</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5032 ">chroma-core/chroma#5032</a></li>
<li>[DOC] Flatten public txt files by <a
href="https://github.com/itaismith "><code>@itaismith</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5040 ">chroma-core/chroma#5040</a></li>
<li>[ENH]: require embeddings & require min embedding dimension on
/add by <a
href="https://github.com/codetheweb "><code>@codetheweb</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5037 ">chroma-core/chroma#5037</a></li>
<li>[ENH] - Adds in dark mode support for hero image by <a
href="https://github.com/tjkrusinskichroma "><code>@tjkrusinskichroma</code></a>
in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5042 ">chroma-core/chroma#5042</a></li>
<li>[BLD] Use 8core runners for all our windows jobs by <a
href="https://github.com/eculver "><code>@eculver</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5027 ">chroma-core/chroma#5027</a></li>
<li>[TST] More benchmark queries for regex by <a
href="https://github.com/Sicheng-Pan "><code>@Sicheng-Pan</code></a> in
<a
href="https://redirect.github.com/chroma-core/chroma/pull/4910 ">chroma-core/chroma#4910</a></li>
<li>[BUG]: refactor otel/tracing initialization in the frontend to be
independent of hosted entry point by <a
href="https://github.com/c-gamble "><code>@c-gamble</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5028 ">chroma-core/chroma#5028</a></li>
<li>[BUG] js client: handle 422 billing errors as QuotaExceeded instead
of ChromaConnectionError by <a
href="https://github.com/philipithomas "><code>@philipithomas</code></a>
in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5049 ">chroma-core/chroma#5049</a></li>
<li>[BUG] RLS should use 32MB GRPC payload size limit by <a
href="https://github.com/Sicheng-Pan "><code>@Sicheng-Pan</code></a> in
<a
href="https://redirect.github.com/chroma-core/chroma/pull/5044 ">chroma-core/chroma#5044</a></li>
<li>[BUG] Sync protoc arch and version in dockerfile by <a
href="https://github.com/Sicheng-Pan "><code>@Sicheng-Pan</code></a> in
<a
href="https://redirect.github.com/chroma-core/chroma/pull/5045 ">chroma-core/chroma#5045</a></li>
<li>[BLD] Fix windows runner label by <a
href="https://github.com/eculver "><code>@eculver</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5052 ">chroma-core/chroma#5052</a></li>
<li>[PERF]: Prefetch segments in get and query by <a
href="https://github.com/sanketkedia "><code>@sanketkedia</code></a> in
<a
href="https://redirect.github.com/chroma-core/chroma/pull/5053 ">chroma-core/chroma#5053</a></li>
<li>[PERF]: Parallelize fetching blocks for brute force regex by <a
href="https://github.com/sanketkedia "><code>@sanketkedia</code></a> in
<a
href="https://redirect.github.com/chroma-core/chroma/pull/5051 ">chroma-core/chroma#5051</a></li>
<li>[RELEASE] JS 3.0.7 by <a
href="https://github.com/itaismith "><code>@itaismith</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5059 ">chroma-core/chroma#5059</a></li>
<li>[ENH] Add a delete_many call to the storage API. by <a
href="https://github.com/rescrv "><code>@rescrv</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5020 ">chroma-core/chroma#5020</a></li>
<li>[ENH] Consume delete_many from the wal3 garbage collector. by <a
href="https://github.com/rescrv "><code>@rescrv</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5021 ">chroma-core/chroma#5021</a></li>
<li>[ENH]: limit number of concurrent get_all_block_ids() when using
buffer_unordered() by <a
href="https://github.com/codetheweb "><code>@codetheweb</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5062 ">chroma-core/chroma#5062</a></li>
<li>[ENH]: use new <code>delete_many()</code> storage method in
DeleteUnusedFiles operator by <a
href="https://github.com/codetheweb "><code>@codetheweb</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5061 ">chroma-core/chroma#5061</a></li>
<li>[BUG]: Disable aws stalled stream protection by <a
href="https://github.com/tanujnay112 "><code>@tanujnay112</code></a> in
<a
href="https://redirect.github.com/chroma-core/chroma/pull/5063 ">chroma-core/chroma#5063</a></li>
<li>[DOC] Update manage collections docs with correct delete collection
info by <a
href="https://github.com/jairad26 "><code>@jairad26</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5066 ">chroma-core/chroma#5066</a></li>
<li>[BUG] Improve wal3 robustness with better shutdown handling and
error recovery by <a
href="https://github.com/rescrv "><code>@rescrv</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5046 ">chroma-core/chroma#5046</a></li>
<li>[ENH] Do not do any mutations of the manifest from within GC. by <a
href="https://github.com/rescrv "><code>@rescrv</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5050 ">chroma-core/chroma#5050</a></li>
<li>[CHORE]: enable change notifier otel/tracing by <a
href="https://github.com/c-gamble "><code>@c-gamble</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5073 ">chroma-core/chroma#5073</a></li>
<li>[CHORE] Add pprof server to query service by <a
href="https://github.com/eculver "><code>@eculver</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5072 ">chroma-core/chroma#5072</a></li>
<li>[ENH]: Dedup inserts to the same key in foyer by <a
href="https://github.com/sanketkedia "><code>@sanketkedia</code></a> in
<a
href="https://redirect.github.com/chroma-core/chroma/pull/5074 ">chroma-core/chroma#5074</a></li>
<li>[ENH] "Failed to fetch: status: NotFound" be gone. by <a
href="https://github.com/rescrv "><code>@rescrv</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5064 ">chroma-core/chroma#5064</a></li>
<li>[CLN] Remove the the top most spammy log lines from rls/wal3. by <a
href="https://github.com/rescrv "><code>@rescrv</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5071 ">chroma-core/chroma#5071</a></li>
<li>[DOC] Fix badge in readme by <a
href="https://github.com/kylediaz "><code>@kylediaz</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5025 ">chroma-core/chroma#5025</a></li>
<li>[ENH] A tool for patching logs that were deleted before a new
manifest was installed. by <a
href="https://github.com/rescrv "><code>@rescrv</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5083 ">chroma-core/chroma#5083</a></li>
<li>[BUG] Add billing errors to JS client by <a
href="https://github.com/itaismith "><code>@itaismith</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5084 ">chroma-core/chroma#5084</a></li>
<li>[CHORE]: Add s3 get metrics and pod name to tracing spans by <a
href="https://github.com/tanujnay112 "><code>@tanujnay112</code></a> in
<a
href="https://redirect.github.com/chroma-core/chroma/pull/5086 ">chroma-core/chroma#5086</a></li>
<li>[RELEASE] JS 3.0.8 by <a
href="https://github.com/itaismith "><code>@itaismith</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5087 ">chroma-core/chroma#5087</a></li>
<li>[ENH] A tool to purge the cache. by <a
href="https://github.com/rescrv "><code>@rescrv</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5085 ">chroma-core/chroma#5085</a></li>
<li>[DOC] Update PR template for migration and observability by <a
href="https://github.com/HammadB "><code>@HammadB</code></a> in <a
href="https://redirect.github.com/chroma-core/chroma/pull/5089 ">chroma-core/chroma#5089</a></li>
<li>[CHORE]: Fix s3 get metric name by <a
href="https://github.com/tanujnay112 "><code>@tanujnay112</code></a> in
<a
href="https://redirect.github.com/chroma-core/chroma/pull/5091 ">chroma-core/chroma#5091</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="dff3a786db
"><code>dff3a78</code></a>
[RELEASE] CLI 1.1.5, Python 1.0.16, JS 3.0.11 (<a
href="https://redirect.github.com/chroma-core/chroma/issues/5227 ">#5227</a>)</li>
<li><a
href="f60f932b8d
"><code>f60f932</code></a>
[ENH]: Increase nprobe for smaller collections (<a
href="https://redirect.github.com/chroma-core/chroma/issues/5226 ">#5226</a>)</li>
<li><a
href="f593a43b5d
"><code>f593a43</code></a>
[ENH] Add <code>InsertRecordSet</code> to JS client (<a
href="https://redirect.github.com/chroma-core/chroma/issues/5225 ">#5225</a>)</li>
<li><a
href="76a14c226a
"><code>76a14c2</code></a>
[DOC] Made light/dark mode for Chroma logo (<a
href="https://redirect.github.com/chroma-core/chroma/issues/5215 ">#5215</a>)</li>
<li><a
href="d80817ede4
"><code>d80817e</code></a>
[ENH]: Add more tracing in the filter path (<a
href="https://redirect.github.com/chroma-core/chroma/issues/5219 ">#5219</a>)</li>
<li><a
href="73abfdc51a
"><code>73abfdc</code></a>
[ENH] Handle when the garbage doesn't overlap the manifest. (<a
href="https://redirect.github.com/chroma-core/chroma/issues/5207 ">#5207</a>)</li>
<li><a
href="fa392226ba
"><code>fa39222</code></a>
[BUG] Revert accidentally commited code (<a
href="https://redirect.github.com/chroma-core/chroma/issues/5205 ">#5205</a>)</li>
<li><a
href="815c3ac561
"><code>815c3ac</code></a>
[ENH]: Fix CI flake with adaptive nsearch (<a
href="https://redirect.github.com/chroma-core/chroma/issues/5203 ">#5203</a>)</li>
<li><a
href="ea66d6929c
"><code>ea66d69</code></a>
[BUG] Switch to rust-tls (<a
href="https://redirect.github.com/chroma-core/chroma/issues/5204 ">#5204</a>)</li>
<li><a
href="04aeb22139
"><code>04aeb22</code></a>
[ENH]: Calculate cache weight of block size instead of hardcoding (<a
href="https://redirect.github.com/chroma-core/chroma/issues/5201 ">#5201</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/chroma-core/chroma/compare/1.0.15...1.0.16 ">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores )
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-12 08:44:39 -07:00
dependabot[bot]
393f3714b0
chore(python-deps): bump torch from 2.7.1 to 2.8.0 ( #3082 )
...
Bumps [torch](https://github.com/pytorch/pytorch ) from 2.7.1 to 2.8.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/pytorch/pytorch/releases ">torch's
releases</a>.</em></p>
<blockquote>
<h1>PyTorch 2.8.0 Release Notes</h1>
<ul>
<li><a
href="https://github.com/pytorch/pytorch/blob/HEAD/#highlights ">Highlights</a></li>
<li><a
href="https://github.com/pytorch/pytorch/blob/HEAD/#backwards-incompatible-changes ">Backwards
Incompatible Changes</a></li>
<li><a
href="https://github.com/pytorch/pytorch/blob/HEAD/#deprecations ">Deprecations</a></li>
<li><a
href="https://github.com/pytorch/pytorch/blob/HEAD/#new-features ">New
Features</a></li>
<li><a
href="https://github.com/pytorch/pytorch/blob/HEAD/#improvements ">Improvements</a></li>
<li><a
href="https://github.com/pytorch/pytorch/blob/HEAD/#bug-fixes ">Bug
fixes</a></li>
<li><a
href="https://github.com/pytorch/pytorch/blob/HEAD/#performance ">Performance</a></li>
<li><a
href="https://github.com/pytorch/pytorch/blob/HEAD/#documentation ">Documentation</a></li>
<li><a
href="https://github.com/pytorch/pytorch/blob/HEAD/#developers ">Developers</a></li>
</ul>
<h1>Highlights</h1>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="ba56102387
"><code>ba56102</code></a>
Cherrypick: Add the RunLLM widget to the website (<a
href="https://redirect.github.com/pytorch/pytorch/issues/159592 ">#159592</a>)</li>
<li><a
href="c525a02c89
"><code>c525a02</code></a>
[dynamo, docs] cherry pick torch.compile programming model docs into 2.8
(<a
href="https://redirect.github.com/pytorch/pytorch/issues/15 ">#15</a>...</li>
<li><a
href="a1cb3cc05d
"><code>a1cb3cc</code></a>
[Release Only] Remove nvshmem from list of preload libraries (<a
href="https://redirect.github.com/pytorch/pytorch/issues/158925 ">#158925</a>)</li>
<li><a
href="c76b2356bc
"><code>c76b235</code></a>
Move out super large one off foreach_copy test (<a
href="https://redirect.github.com/pytorch/pytorch/issues/158880 ">#158880</a>)</li>
<li><a
href="20a0e225a0
"><code>20a0e22</code></a>
Revert "[Dynamo] Allow inlining into AO quantization modules (<a
href="https://redirect.github.com/pytorch/pytorch/issues/152934 ">#152934</a>)"
(<a
href="https://redirect.github.com/pytorch/pytorch/issues/158 ">#158</a>...</li>
<li><a
href="9167ac8c75
"><code>9167ac8</code></a>
[MPS] Switch Cholesky decomp to column wise (<a
href="https://redirect.github.com/pytorch/pytorch/issues/158237 ">#158237</a>)</li>
<li><a
href="5534685c62
"><code>5534685</code></a>
[MPS] Reimplement <code>tri[ul]</code> as Metal shaders (<a
href="https://redirect.github.com/pytorch/pytorch/issues/158867 ">#158867</a>)</li>
<li><a
href="d19e08d74b
"><code>d19e08d</code></a>
Cherry pick PR 158746 (<a
href="https://redirect.github.com/pytorch/pytorch/issues/158801 ">#158801</a>)</li>
<li><a
href="a6c044ab9a
"><code>a6c044a</code></a>
[cherry-pick] Unify torch.tensor and torch.ops.aten.scalar_tensor
behavior (#...</li>
<li><a
href="620ebd0646
"><code>620ebd0</code></a>
[Dynamo] Use proper sources for constructing dataclass defaults (<a
href="https://redirect.github.com/pytorch/pytorch/issues/158689 ">#158689</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/pytorch/pytorch/compare/v2.7.1...v2.8.0 ">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores )
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-12 08:44:24 -07:00
Matthew Farrellee
b70e2f1f09
fix(dep): update to openai >= 1.99.6 and use new Function location ( #3087 )
...
# What does this PR do?
closes #3072
## Test Plan
ci
2025-08-12 08:40:32 -07:00
Mustafa Elbehery
4a13ef45e9
fix: Implement missing run_moderation
method in PromptGuardSafetyImpl
( #3101 )
...
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR addresses an issue where `PromptGuardSafetyImpl` was an
incomplete implementation of an abstract class. The class was missing
the required run_moderation method from its parent interface.
Currently, running `pre-commit` locally fails with the error below.
```
llama_stack/providers/inline/safety/prompt_guard/__init__.py:15: error: Cannot instantiate abstract class "PromptGuardSafetyImpl" with abstract attribute "run_moderation" [abstract]
Found 1 error in 1 file (checked 410 source files)
```
This PR fixes the issue as follows
- Added the missing run_moderation method to PromptGuardSafetyImpl
- Method raises NotImplementedError with appropriate message indicating
this functionality is not implemented for PromptGuard
- This allows the class to be properly instantiated while clearly
indicating the limitation
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
2025-08-12 08:32:52 -07:00
Nathan Weinberg
19123ca957
refactor: standardize InferenceRouter model handling ( #2965 )
Integration Tests (Replay) / discover-tests (push) Successful in 3s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Python Package Build Test / build (3.12) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 15s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 19s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 21s
Python Package Build Test / build (3.13) (push) Failing after 16s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 23s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 29s
Test External API and Providers / test-external (venv) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 25s
Unit Tests / unit-tests (3.12) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 27s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 21s
Unit Tests / unit-tests (3.13) (push) Failing after 27s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 29s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 22s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 25s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 22s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 24s
Pre-commit / pre-commit (push) Successful in 1m19s
2025-08-12 04:20:39 -06:00
Ashwin Bharambe
803114180b
chore(logging)!: use comma as a delimiter ( #3095 )
...
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 14s
Test Llama Stack Build / generate-matrix (push) Successful in 11s
Test Llama Stack Build / build-single-provider (push) Failing after 16s
Python Package Build Test / build (3.12) (push) Failing after 11s
Unit Tests / unit-tests (3.13) (push) Failing after 15s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 18s
Update ReadTheDocs / update-readthedocs (push) Failing after 12s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 29s
Test External API and Providers / test-external (venv) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 34s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 26s
Integration Tests (Replay) / discover-tests (push) Successful in 31s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 18s
Unit Tests / unit-tests (3.12) (push) Failing after 30s
Python Package Build Test / build (3.13) (push) Failing after 25s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 22s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 32s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 33s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 40s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 40s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 42s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 44s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 32s
Pre-commit / pre-commit (push) Successful in 1m24s
Test Llama Stack Build / build (push) Failing after 54s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 13s
Using commas is much more shell-friendly. A semi-colon is a statement
delimiter and must be escaped.
This change is backwards incompatible but I imagine not many people are
using this. I could be wrong. Looking for feedback.
2025-08-11 11:51:43 -07:00
Francisco Arceo
f7adf58b1b
docs: Add documentation on how to contribute a Vector DB provider and update testing documentation ( #3093 )
...
# What does this PR do?
- Adds documentation on how to contribute a Vector DB provider.
- Updates the testing section to be a little friendlier to navigate.
- Also added new shortcut for search so that `/` and `⌘ K` or `ctrl+K`
trigger search
<img width="1903" height="1346" alt="Screenshot 2025-08-11 at 10 10
12 AM"
src="https://github.com/user-attachments/assets/6995b3b8-a2ab-4200-be72-c5b03a784a29 "
/>
<img width="1915" height="1438" alt="Screenshot 2025-08-11 at 10 10
25 AM"
src="https://github.com/user-attachments/assets/1f54d30e-5be1-4f27-b1e9-3c3537dcb8e9 "
/>
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-08-11 11:11:09 -07:00
Mustafa Elbehery
b5b5f5b9ae
chore: add mypy
prompt guard ( #2678 )
...
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR adds static type coverage to `llama-stack`
Part of https://github.com/meta-llama/llama-stack/issues/2647
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
2025-08-11 08:40:40 -07:00
Francisco Arceo
7448a4a88c
chore: Updating UI Sidebar ( #3081 )
...
# What does this PR do?
This updates the sidebar to look a little more like other popular ones.
<img width="1913" height="1352" alt="Screenshot 2025-08-08 at 11 25
31 PM"
src="https://github.com/user-attachments/assets/00738412-1101-48ec-8864-cde4a8733ec1 "
/>
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-08-11 07:39:52 -07:00
Matthew Farrellee
8faff92591
chore: remove redundant code in unregister_toolgroup ( #3092 )
...
# What does this PR do?
removes redundant code
## Test Plan
ci
2025-08-11 07:38:54 -07:00
Eran Cohen
a4bad6c0b4
feat: Add Google Vertex AI inference provider support ( #2841 )
...
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 10s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 12s
Python Package Build Test / build (3.13) (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 10s
Test Llama Stack Build / generate-matrix (push) Successful in 8s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 13s
Test External API and Providers / test-external (venv) (push) Failing after 11s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 17s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 10s
Test Llama Stack Build / build-single-provider (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 8s
Unit Tests / unit-tests (3.12) (push) Failing after 10s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 26s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 15s
Update ReadTheDocs / update-readthedocs (push) Failing after 9s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 18s
Test Llama Stack Build / build (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 47s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 49s
Unit Tests / unit-tests (3.13) (push) Failing after 39s
Pre-commit / pre-commit (push) Successful in 1m37s
# What does this PR do?
- Add new Vertex AI remote inference provider with litellm integration
- Support for Gemini models through Google Cloud Vertex AI platform
- Uses Google Cloud Application Default Credentials (ADC) for
authentication
- Added VertexAI models: gemini-2.5-flash, gemini-2.5-pro,
gemini-2.0-flash.
- Updated provider registry to include vertexai provider
- Updated starter template to support Vertex AI configuration
- Added comprehensive documentation and sample configuration
<!-- If resolving an issue, uncomment and update the line below -->
relates to https://github.com/meta-llama/llama-stack/issues/2747
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Eran Cohen <eranco@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
2025-08-11 08:22:04 -04:00
Francisco Arceo
78a59a4dbe
chore: Adding GitHub Stars, trends, and contributor shout out to README ( #3079 )
...
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 2s
Integration Tests (Replay) / discover-tests (push) Successful in 6s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 6s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Python Package Build Test / build (3.13) (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 13s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 16s
Python Package Build Test / build (3.12) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 16s
Update ReadTheDocs / update-readthedocs (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 14s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 15s
Test External API and Providers / test-external (venv) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 16s
Unit Tests / unit-tests (3.12) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 50s
Unit Tests / unit-tests (3.13) (push) Failing after 48s
Pre-commit / pre-commit (push) Successful in 1m54s
# What does this PR do?
Updates READMe to add
1. GitHub badge highlighting Llama Stack as #1 Repo of the Day
2. GitHub Star History (cumulative stars chart)
3. Contributor shout out
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-08-10 21:11:14 -04:00
Varsha
69dc789e15
docs: Add unsupported search mode info about FAISS ( #3089 )
2025-08-10 17:34:34 -06:00
Varsha
ce72a28525
docs: Update doc on search modes for Milvus ( #3078 )
...
# What does this PR do?
Update Milvus doc on using search modes.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
2025-08-10 18:48:36 -04:00
Vlastimil Eliáš
1677d6bffd
feat: Flash-Lite 2.0 and 2.5 models added to Gemini inference provider ( #3058 )
...
Integration Tests (Replay) / discover-tests (push) Successful in 4s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 11s
Python Package Build Test / build (3.12) (push) Failing after 8s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 15s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 12s
Python Package Build Test / build (3.13) (push) Failing after 10s
Unit Tests / unit-tests (3.12) (push) Failing after 9s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 13s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 19s
Test External API and Providers / test-external (venv) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 59s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 1m1s
Unit Tests / unit-tests (3.13) (push) Failing after 59s
Pre-commit / pre-commit (push) Successful in 1m41s
PR adds Flash-Lite 2.0 and 2.5 models to the Gemini inference provider
Closes #3046
## Test Plan
I was not able to locate any existing test for this provider, so I
performed manual testing. But the change is really trivial and
straightforward.
2025-08-08 13:48:15 -07:00
ehhuang
0b5a794c27
fix: telemetry logger spams when queue is full ( #3070 )
...
# What does this PR do?
## Test Plan
Ran a stress test on chat completion endpoint locally:
For 10 concurrent users over 3 minutes:
Before:
<img width="1440" height="201" alt="image"
src="https://github.com/user-attachments/assets/24e0d580-186e-4e24-931e-2b936c5859b6 "
/>
After:
<img width="1434" height="204" alt="image"
src="https://github.com/user-attachments/assets/4b806d88-f822-41e9-b25a-018cc4bec866 "
/>
(Will send scripts in a future PR.)
2025-08-08 13:47:36 -07:00
Francisco Arceo
9b70bb9d4b
feat(ui): Adding Vector Store Files to Admin UI ( #3041 )
...
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 4s
Integration Tests (Replay) / discover-tests (push) Successful in 3s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Python Package Build Test / build (3.12) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 16s
Unit Tests / unit-tests (3.13) (push) Failing after 12s
Test External API and Providers / test-external (venv) (push) Failing after 13s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 20s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 20s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 20s
Python Package Build Test / build (3.13) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 57s
Unit Tests / unit-tests (3.12) (push) Failing after 55s
Pre-commit / pre-commit (push) Successful in 2m10s
# What does this PR do?
This PR updates the UI to create new:
1. `/files/{file_id}`
2. `files/{file_id}/contents`
3. `files/{file_id}/contents/{content_id}`
The list of files are clickable which brings the user to the FIles
Detail page
The File Details page shows all of the content
The content details page shows the individual chunk/content parsed
These only use our existing OpenAI compatible APIs. I have a separate
branch where I expose the embedding and the portal is correctly
populated. I included the FE rendering code for that in this PR.
1. `vector-stores/{vector_store_id}/files/{file_id}`
<img width="1913" height="1351" alt="Screenshot 2025-08-06 at 10 20
12 PM"
src="https://github.com/user-attachments/assets/08010d5e-60c8-4bd9-9f3e-a2731ed1ad55 "
/>
2. `vector-stores/{vector_store_id}/files/{file_id}/contents`
<img width="1920" height="1272" alt="Screenshot 2025-08-06 at 10 21
23 PM"
src="https://github.com/user-attachments/assets/3b91e67b-5d64-4fe6-91b6-18f14587e850 "
/>
3.
`vector-stores/{vector_store_id}/files/{file_id}/contents/{content_id}`
<img width="1916" height="1273" alt="Screenshot 2025-08-06 at 10 21
45 PM"
src="https://github.com/user-attachments/assets/d38ca996-e8d9-460c-9e39-7ff0cb5ec0dd "
/>
## Test Plan
I tested this locally and reviewed the code. I generated a significant
share of the code with Claude and some manual intervention. After this,
I'll begin adding tests to the UI.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-08-08 07:44:06 -07:00