https://github.com/meta-llama/llama-stack-client-python/pull/238 updated
llama-stack-client to also support OpenAI endpoints for embeddings,
files, and vector-stores. This updates the tests to cover all configs --
the OpenAI SDK, the Llama Stack SDK, and library-as-client.
Also updated the `search` functionality's return response to match OpenAI's.
## Test Plan
```
pytest -sv --stack-config=http://localhost:8321 tests/integration/vector_io/test_openai_vector_stores.py --embedding-model all-MiniLM-L6-v2
```
# What does this PR do?
Fixes the weaviate provider's `query_vector` function for the case where the distance
between the query embedding and an embedding within the vector db is 0
(identical vectors). Catches `ZeroDivisionError` and then sets `score`
to infinity, which represents maximum similarity.
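A minimal sketch of the pattern (illustrative only; names here are assumptions, not the provider's actual code):
```python
# Convert a distance into a similarity score, treating a zero distance
# (identical vectors) as maximum similarity.
def distance_to_score(distance: float) -> float:
    try:
        return 1.0 / distance
    except ZeroDivisionError:
        return float("inf")
```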
Closes [#2381]
## Test Plan
Check out this PR and execute the code below; it will no longer raise a
`ZeroDivisionError` exception.
```python
from llama_stack_client import LlamaStackClient

base_url = "http://localhost:8321"
client = LlamaStackClient(base_url=base_url)

models = client.models.list()
embedding_model = next(m for m in models if m.model_type == "embedding").identifier
embedding_dimension = 384

_ = client.vector_dbs.register(
    vector_db_id="foo_db",
    embedding_model=embedding_model,
    embedding_dimension=embedding_dimension,
    provider_id="weaviate",
)

chunk = {
    "content": "foo",
    "mime_type": "text/plain",
    "metadata": {"document_id": "foo-id"},
}

client.vector_io.insert(vector_db_id="foo_db", chunks=[chunk])
client.vector_io.query(vector_db_id="foo_db", query="foo")
```
# What does this PR do?
The test wasn't using the correct virtual environment. Also increases the
console width for logs.
Signed-off-by: Sébastien Han <seb@redhat.com>
Extracts common OpenAI vector-store code into its own mixin so that all
providers can share the same core logic.
This also makes it easy for Llama Stack to support both the OpenAI vector-stores
API and the Llama Stack APIs in the interim, so that both share the same
underlying vector-dbs.
Each provider contains storage-specific logic to `create / edit / delete
/ list` vector dbs, while the plumbing logic is standardized in the
common code.
Ensured that this works well with both faiss and sqlite-vec.
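A rough sketch of the shape of such a mixin (class and method names here are illustrative assumptions, not the actual Llama Stack code):
```python
# Illustrative: shared OpenAI vector-store plumbing lives in a mixin, while each
# provider supplies only its storage-specific primitives.
from abc import ABC, abstractmethod


class OpenAIVectorStoreMixin(ABC):
    """Common vector-store logic shared by all providers (hypothetical)."""

    @abstractmethod
    async def _save_store(self, store_id: str, store_info: dict) -> None:
        """Provider-specific persistence (faiss, sqlite-vec, ...)."""

    @abstractmethod
    async def _delete_store(self, store_id: str) -> None:
        """Provider-specific deletion."""

    async def openai_create_vector_store(self, name: str, **kwargs) -> dict:
        # Standardized plumbing: build the store record once, then delegate
        # the storage-specific write to the concrete provider.
        store_info = {"id": f"vs_{name}", "name": name, "object": "vector_store"}
        await self._save_store(store_info["id"], store_info)
        return store_info

    async def openai_delete_vector_store(self, store_id: str) -> dict:
        await self._delete_store(store_id)
        return {"id": store_id, "deleted": True, "object": "vector_store.deleted"}
```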
### Test Plan
```
llama stack run starter
pytest -sv --stack-config http://localhost:8321 tests/integration/vector_io/test_openai_vector_stores.py --embedding-model all-MiniLM-L6-v2
```
Adding OpenAI compat `/v1/vector-store` APIs.
This PR implements the `faiss` provider, with follow-up PRs coming for
other providers.
Added routes to create, update, delete, and list vector stores, as well as
a route to search a vector store.
Inserting into vector stores is missing and will be a follow-up diff.
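For illustration, these routes could be exercised with the OpenAI Python SDK pointed at the Llama Stack server's OpenAI-compatible base URL (the URL mirrors the one used elsewhere in these notes; the vector-store methods assume a recent `openai` SDK release, so treat this as a sketch):
```python
from openai import OpenAI

# Placeholder API key; the base URL is the Llama Stack OpenAI-compat endpoint.
client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

store = client.vector_stores.create(name="my_store")      # create
print([vs.id for vs in client.vector_stores.list()])      # list
results = client.vector_stores.search(                    # search
    vector_store_id=store.id, query="foo"
)
client.vector_stores.delete(vector_store_id=store.id)     # delete
```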
### Test Plan
- Added new integration test for testing the faiss provider
```
pytest -sv --stack-config http://localhost:8321 tests/integration/vector_io/test_openai_vector_stores.py --embedding-model all-MiniLM-L6-v2
```
# What does this PR do?
This loosens up the tool call function name and arguments checks in
`tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls`
because the small models we use in CI cannot reliably get the tool call
function name or arguments exactly right.
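For illustration only (the assertions here are hypothetical, not the test's actual code), a loosened check might verify that a tool call was produced and that its arguments parse as JSON, instead of requiring an exact string match:
```python
import json


def assert_tool_call_loosely(choice) -> None:
    """Relaxed checks on a chat-completion choice (illustrative only)."""
    tool_call = choice.message.tool_calls[0]
    assert tool_call.function.name          # some function name was produced
    args = json.loads(tool_call.function.arguments or "{}")
    assert isinstance(args, dict)           # arguments are valid JSON
```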
Closes #2345
## Test Plan
I ran this flaky test in a loop, let it run many dozens of times, and
didn't observe any flakes after the changes. Previously it flaked quite
regularly.
```
while uv run pytest -s -v \
'tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[llama_stack_client-txt=3B-False]' \
--stack-config=http://localhost:8321 \
--text-model="meta-llama/Llama-3.2-3B-Instruct" \
  --embedding-model=all-MiniLM-L6-v2; do sleep 0.1; done
```
Signed-off-by: Ben Browning <bbrownin@redhat.com>
# What does this PR do?
Fixes the pgvector provider's `query_vector` function for the case where the distance
between the query embedding and an embedding within the vector db is 0
(identical vectors). Catches `ZeroDivisionError` and then sets `score`
to infinity, which represents maximum similarity.
Closes [#2381]
## Test Plan
Check out this PR and execute the code below; it will no longer raise a
`ZeroDivisionError` exception.
```python
from llama_stack_client import LlamaStackClient

base_url = "http://localhost:8321"
client = LlamaStackClient(base_url=base_url)

models = client.models.list()
embedding_model = next(m for m in models if m.model_type == "embedding").identifier
embedding_dimension = 384

_ = client.vector_dbs.register(
    vector_db_id="foo_db",
    embedding_model=embedding_model,
    embedding_dimension=embedding_dimension,
    provider_id="pgvector",
)

chunk = {
    "content": "foo",
    "mime_type": "text/plain",
    "metadata": {"document_id": "foo-id"},
}

client.vector_io.insert(vector_db_id="foo_db", chunks=[chunk])
client.vector_io.query(vector_db_id="foo_db", query="foo")
```
# What does this PR do?
Adds a try/except to the faiss provider's `query_vector` function for when the distance
between the query embedding and an embedding within the vector db is 0
(identical vectors). Catches `ZeroDivisionError` and then appends `(1.0
/ sys.float_info.min)` to `scores` to represent maximum similarity.
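A minimal sketch of that pattern (illustrative; not the provider's actual code):
```python
import sys

# Convert raw distances into similarity scores, guarding the zero-distance
# (identical vector) case with the largest representable score.
def distances_to_scores(distances: list[float]) -> list[float]:
    scores: list[float] = []
    for d in distances:
        try:
            scores.append(1.0 / d)
        except ZeroDivisionError:
            scores.append(1.0 / sys.float_info.min)
    return scores
```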
Closes [#2381]
## Test Plan
Check out this PR and execute the code below; it will no longer raise a
`ZeroDivisionError` exception.
```python
from llama_stack_client import LlamaStackClient

base_url = "http://localhost:8321"
client = LlamaStackClient(base_url=base_url)

models = client.models.list()
embedding_model = next(m for m in models if m.model_type == "embedding").identifier
embedding_dimension = 384

_ = client.vector_dbs.register(
    vector_db_id="foo_db",
    embedding_model=embedding_model,
    embedding_dimension=embedding_dimension,
    provider_id="faiss",
)

chunk = {
    "content": "foo",
    "mime_type": "text/plain",
    "metadata": {"document_id": "foo-id"},
}

client.vector_io.insert(vector_db_id="foo_db", chunks=[chunk])
client.vector_io.query(vector_db_id="foo_db", query="foo")
```
### Running unit tests
`uv run pytest tests/unit/rag/test_rag_query.py -v`
---------
Signed-off-by: Ben Browning <bbrownin@redhat.com>
Co-authored-by: Ben Browning <bbrownin@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
Adds a health status check for the remote vLLM provider.
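As a rough sketch only (the function name, endpoint, and return shape are assumptions, not the provider's actual implementation), such a check might probe the remote vLLM server and report a status:
```python
import httpx


async def check_health(base_url: str) -> dict:
    """Probe the remote vLLM server's /health endpoint (hypothetical helper)."""
    try:
        async with httpx.AsyncClient() as client:
            resp = await client.get(f"{base_url}/health", timeout=5.0)
            resp.raise_for_status()
        return {"status": "OK"}
    except Exception as exc:
        return {"status": "Error", "message": str(exc)}
```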
## Test Plan
The PR includes a unit test covering the added health check implementation.
# What does this PR do?
Instead of downloading the models each time, we now have a single Ollama
container baked with the models already pulled and ready to use.
This removes the CI flakiness around model pulling.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
## Test Plan
```
LLAMA_STACK_CONFIG=fireworks pytest -s -v \
  tests/integration/files/test_files.py::test_openai_client_basic_operations
```
Trying to run the `llama` CLI after installing the wheel fails with this error:
```
Traceback (most recent call last):
File "/tmp/tmp.wdZath9U6j/.venv/bin/llama", line 4, in <module>
from llama_stack.cli.llama import main
File "/tmp/tmp.wdZath9U6j/.venv/lib/python3.10/site-packages/llama_stack/__init__.py", line 7, in <module>
from llama_stack.distribution.library_client import ( # noqa: F401
ModuleNotFoundError: No module named 'llama_stack.distribution.library_client'
```
This PR fixes it by ensuring that all sub-directories of `llama_stack`
are also included.
Also, fixes the missing `fastapi` dependency issue.
# What does this PR do?
Expand the test matrix to include Python 3.10, 3.11, and 3.12 to ensure
the project runs correctly on these versions. This will give us
confidence to begin considering an increase to the project's minimum
supported Python version.
Signed-off-by: Sébastien Han <seb@redhat.com>
The non-streaming version is just a small layer on top of the streaming
version - just pluck off the final `response.completed` event and return
that as the response!
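A minimal sketch of that layering, assuming an async stream of events where the last one has `type == "response.completed"` and carries the full response (names are illustrative, not the actual implementation):
```python
from typing import Any, AsyncIterator


async def non_streaming_response(events: AsyncIterator[Any]) -> Any:
    """Drain the streaming generator and return the final completed response."""
    final = None
    async for event in events:
        if getattr(event, "type", None) == "response.completed":
            final = event.response
    if final is None:
        raise RuntimeError("stream ended without a response.completed event")
    return final
```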
This PR also includes a couple other changes which I ended up making
while working on it on a flight:
- changes to `ollama` so it does not pull embedding models
unconditionally
- a small fix to library client to make the stream and non-stream cases
a bit more symmetric
This PR fixes a runtime import error caused by missing OpenTelemetry
dependencies during `llama stack run`.
Specifically, the following imports fail if `opentelemetry-sdk` and
`opentelemetry-exporter-otlp-proto-http` are not present in the
environment:
```python
from opentelemetry import metrics, trace
from opentelemetry.exporter.otlp.proto.http.metric_exporter import OTLPMetricExporter
```
See
[llama_stack/providers/inline/telemetry/meta_reference/telemetry.py#L10-L19](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/inline/telemetry/meta_reference/telemetry.py#L10-L19)
This PR resolves the issue by adding both packages to the
`SERVER_DEPENDENCIES` list:
```python
"opentelemetry-sdk",
"opentelemetry-exporter-otlp-proto-http",
```
### Reproduction Steps
```bash
llama stack build --config llama.yaml --image-type venv --image-name fun-with-lamas
llama stack run ~/.llama/distributions/fun-with-lamas/fun-with-lamas-run.yaml
```
Results in:
```
ModuleNotFoundError: No module named 'opentelemetry'
```
or
```
ModuleNotFoundError: No module named 'opentelemetry.exporter'
```
Signed-off-by: Jose Angel Morena <jmorenas@redhat.com>
Co-authored-by: raghotham <rsm@meta.com>
# What does this PR do?
This adds some initial content documenting our OpenAI-compatible APIs
- Responses, Chat Completions, Completions, and Models - along with
instructions on how to use them via the OpenAI or Llama Stack clients and
some simple examples for each.
It's not a lot of content, but it's a start so that users have some idea
how to get going as we continue to work on these APIs.
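For instance, one of the patterns the docs cover is pointing the official OpenAI Python client at a Llama Stack server's OpenAI-compatible base URL (the URL matches the one used elsewhere in these notes; the model id and API key are placeholders):
```python
from openai import OpenAI

# Llama Stack exposes OpenAI-compatible routes under /v1/openai/v1.
client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct",  # placeholder model id
    messages=[{"role": "user", "content": "Who are you?"}],
)
print(response.choices[0].message.content)
```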
## Test Plan
I generated the docs site locally and verified things render properly. I
also ran each code example to ensure it works as expected. And, I asked
my AI code assistant to do a quick spell-check and review of the docs
and it didn't flag any obvious errors.
---------
Signed-off-by: Ben Browning <bbrownin@redhat.com>
Co-authored-by: Francisco Arceo <farceo@redhat.com>
This allows a set of rules to be defined for determining access to
resources. The rules are (loosely) based on the cedar policy format.
A rule defines a list of actions either to permit or to forbid. It may
specify a principal or a resource that must match for the rule to take
effect. It may also specify a condition, either a 'when' or an 'unless',
with additional constraints as to where the rule applies.
A list of rules is held for each type to be protected and tried in order
to find a match. If a match is found, the request is permitted or
forbidden depending on the type of rule. If no match is found, the
request is denied. If no rules are specified for a given type, a rule
that allows any action as long as the resource attributes match the user
attributes is added (i.e. the previous behaviour is the default).
Some examples in yaml:
```
model:
- permit:
principal: user-1
actions: [create, read, delete]
comment: user-1 has full access to all models
- permit:
principal: user-2
actions: [read]
resource: model-1
comment: user-2 has read access to model-1 only
- permit:
actions: [read]
when:
user_in: resource.namespaces
comment: any user has read access to models with matching attributes
vector_db:
- forbid:
actions: [create, read, delete]
unless:
user_in: role::admin
comment: only user with admin role can use vector_db resources
```
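For illustration, the evaluation order described above could be sketched roughly as follows (the data shapes and names are assumptions, not the actual implementation; conditions are modeled as opaque predicates):
```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AccessRule:
    """One permit/forbid rule, loosely mirroring the YAML above (hypothetical)."""
    effect: str                                          # "permit" or "forbid"
    actions: list[str]
    principal: Optional[str] = None                      # must match the requesting user, if set
    resource: Optional[str] = None                       # must match the resource id, if set
    when: Optional[Callable[[str, str], bool]] = None    # rule applies only if this holds
    unless: Optional[Callable[[str, str], bool]] = None  # rule applies only if this does not hold


def is_permitted(rules: list[AccessRule], action: str, principal: str, resource_id: str) -> bool:
    """Try rules in order; the first match decides, and no match means deny."""
    for rule in rules:
        if action not in rule.actions:
            continue
        if rule.principal is not None and rule.principal != principal:
            continue
        if rule.resource is not None and rule.resource != resource_id:
            continue
        if rule.when is not None and not rule.when(principal, resource_id):
            continue
        if rule.unless is not None and rule.unless(principal, resource_id):
            continue
        return rule.effect == "permit"
    return False
```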
---------
Signed-off-by: Gordon Sim <gsim@redhat.com>
# What does this PR do?
This adds the missing `text` parameter to the Responses API, which is how
users control structured outputs. All we do with that parameter is map
it to the corresponding chat completion `response_format`.
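Roughly, usage looks like the following (the field layout follows the OpenAI Responses API; treat the exact shape as a sketch, and the model id and base URL as placeholders taken from the test plan below):
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

# Ask for structured output via the Responses API `text` parameter; internally
# this gets mapped onto the chat completion response_format.
response = client.responses.create(
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",
    input="Give me a city and its country as JSON.",
    text={
        "format": {
            "type": "json_schema",
            "name": "city",
            "schema": {
                "type": "object",
                "properties": {"city": {"type": "string"}, "country": {"type": "string"}},
                "required": ["city", "country"],
            },
        }
    },
)
print(response.output_text)
```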
## Test Plan
The new unit tests exercise the various permutations allowed for this
property, while a couple of new verification tests actually use it for
real to verify the model outputs are following the format as expected.
Unit tests:
`python -m pytest -s -v
tests/unit/providers/agents/meta_reference/test_openai_responses.py`
Verification tests:
```
llama stack run llama_stack/templates/together/run.yaml
pytest -s -vv 'tests/verifications/openai_api/test_responses.py' \
--base-url=http://localhost:8321/v1/openai/v1 \
--model meta-llama/Llama-4-Scout-17B-16E-Instruct
```
Note that the verification tests can only be run with a real Llama Stack
server (as opposed to using the library client via
`--provider=stack:together`) because the Llama Stack python client is
not yet updated to accept this text field.
Signed-off-by: Ben Browning <bbrownin@redhat.com>
# What does this PR do?
This PR fixes a bug where running a known template by name using:
`llama stack run ollama`
would fail with the following error:
`ValueError: Config file ollama does not exist`
Closes #2291
## Test Plan
`llama stack run ollama` should work
# What does this PR do?
TSIA
Added Files provider to the fireworks template. Might want to add to all
templates as a follow-up.
## Test Plan
```
llama-stack pytest tests/unit/files/test_files.py
llama-stack llama stack build --template fireworks --image-type conda --run
LLAMA_STACK_CONFIG=http://localhost:8321 pytest -s -v tests/integration/files/
```
# What does this PR do?
The chat completion ids generated by Ollama are not unique enough to use
with stored chat completions, as they rely on only three digits of
randomness to produce unique values - e.g. `chatcmpl-373`. This causes
frequent collisions in chat completion id values from Ollama, which
creates issues in our SQL storage of chat completions by id, where
ids are expected to actually be unique.
So, this adjusts Ollama responses to use uuids as unique ids. This does
mean we're replacing the ids generated natively by Ollama. If we don't
wish to do this, we'll either need to relax the unique constraint on our
chat completions id field in the inference storage or convince Ollama
upstream to use something closer to uuid values here.
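A minimal sketch of the substitution (illustrative; not the adapter's actual code):
```python
import uuid

# Replace Ollama's low-entropy id (e.g. "chatcmpl-373") with a UUID-based one
# before storing the chat completion.
def make_unique_chat_completion_id() -> str:
    return f"chatcmpl-{uuid.uuid4()}"
```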
Closes #2315
## Test Plan
I tested by running the openai completion / chat completion integration
tests in a loop. Without this change, I regularly get unique id
collisions. With this change, I do not. We sometimes see flakes from
these unique id collisions in our CI tests, and this will resolve those.
```
INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \
llama stack run llama_stack/templates/ollama/run.yaml
while true; do \
INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \
pytest -s -v \
tests/integration/inference/test_openai_completion.py \
--stack-config=http://localhost:8321 \
--text-model="meta-llama/Llama-3.2-3B-Instruct"; \
done
```
Signed-off-by: Ben Browning <bbrownin@redhat.com>
I think the implementation needs more simplification. Spent way too much
time trying to get the tests to pass with models not cooperating :(
Finally had to switch to claude-sonnet to get things to pass reliably.
### Test Plan
```
export TAVILY_SEARCH_API_KEY=...
export OPENAI_API_KEY=...
uv run pytest -p no:warnings \
-s -v tests/verifications/openai_api/test_responses.py \
--provider=stack:starter \
--model openai/gpt-4o
```
# What does this PR do?
1. Remove the openai dep.
2. Temporarily update llama-stack-client to the Stainless-sync'd branch, as
the responses/inputitems API wasn't included in the last push. This will
automatically be updated to the next version in the release.
## Test Plan
Run `npm run dev` and go to any responses details page.
Fixes the provider to use the `stream` var correctly.
Before
```
curl --request POST \
--url http://localhost:8321/v1/openai/v1/chat/completions \
--header 'content-type: application/json' \
--data '{
"model": "meta-llama/Llama-4-Scout-17B-16E-Instruct",
"messages": [
{
"role": "user",
"content": "Who are you?"
}
]
}'
{"detail":"Internal server error: An unexpected error occurred."}
```
After
```
curl --request POST \
--url http://localhost:8321/v1/openai/v1/chat/completions \
--header 'content-type: application/json' \
--data '{
"model": "accounts/fireworks/models/llama4-scout-instruct-basic",
"messages": [
{
"role": "user",
"content": "Who are you?"
}
]
}'
{"id":"chatcmpl-97978538-271d-4c73-8d4d-c509bfb6c87e","choices":[{"message":{"role":"assistant","content":"I'm an AI assistant designed by Meta. I'm here to answer your questions, share interesting ideas and maybe even surprise you with a fresh perspective. What's on your mind?","name":null,"tool_calls":null},"finish_reason":"stop","index":0,"logprobs":null}],"object":"chat.completion","created":1748896403,"model":"accounts/fireworks/models/llama4-scout-instruct-basic"}%
```
This Kubernetes cluster has:
- vLLM for serving an inference model
- vLLM for serving a safety model
- Postgres DB (for metadata and other state for the Llama Stack distro)
- Chroma DB for Vector IO (memory)
Perhaps most importantly, this was me trying to learn Kubernetes for the
first time.
## Test Plan
Run `sh apply.sh` against an EKS cluster, then `kubectl
port-forward service/llama-stack-service 8321:8321`; after many
attempts, we finally have:
<img width="1589" alt="image"
src="https://github.com/user-attachments/assets/c69f242d-6aaa-4def-9f7c-172113b8bfc1"
/>
<img width="1978" alt="image"
src="https://github.com/user-attachments/assets/cf678404-f551-4fa5-9077-bebe3e8e8ae8"
/>