Mirror of https://github.com/meta-llama/llama-stack.git
Synced 2025-10-04 12:07:34 +00:00 · 258 commits

`47b640370e` feat(tests): introduce a test "suite" concept to encompass dirs, options (#3339)
Our integration tests need to be grouped because each group often needs a specific set of models it works with. We separated vision tests for this reason, and we have a separate set of tests for the "Responses" API. This PR makes that system a bit more official, so it is easy to target these groups and apply all of the testing infrastructure (for example, record-replay) uniformly across them.

Three suites are declared:
- base
- vision
- responses

Note that our CI currently runs the "base" and "vision" suites. You can use the `--suite` option when running pytest (or any of the testing scripts or workflows). For example:
```
OLLAMA_URL=http://localhost:11434 \
pytest -s -v tests/integration/ --stack-config starter --suite vision
```
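For context, here is a minimal sketch of how such a `--suite` option can be wired up in a `conftest.py`. The suite names match the PR; the directory mapping and hook bodies are illustrative assumptions, not the actual implementation.

```python
# conftest.py -- hypothetical sketch of a pytest "suite" option.
# Assumed mapping of suite names to the test roots they cover.
SUITES = {
    "base": "tests/integration/inference",
    "vision": "tests/integration/inference/test_vision.py",
    "responses": "tests/integration/non_ci/responses",
}


def pytest_addoption(parser):
    parser.addoption("--suite", default="base", choices=sorted(SUITES))


def pytest_collection_modifyitems(config, items):
    # Keep only tests whose node id falls under the chosen suite's root.
    root = SUITES[config.getoption("--suite")]
    kept = [item for item in items if item.nodeid.startswith(root)]
    dropped = [item for item in items if not item.nodeid.startswith(root)]
    if dropped:
        config.hook.pytest_deselected(items=dropped)
        items[:] = kept
```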

`0c2757a05b` chore(sambanova test): skip with_n tests for sambanova, it is not implemented server-side (#3342)
# What does this PR do?
Skip a test that cannot pass for SambaNova; see https://docs-legacy.sambanova.ai/sambastudio/latest/open-ai-api.html#_example_requests_using_openai_client
## Test Plan
CI

`df1526991f` feat(batches, completions): add /v1/completions support to /v1/batches (#3309)
# What does this PR do?
Add support for /v1/completions to the /v1/batches API.
## Test Plan
CI

`e2fe39aee1` feat!: Migrate Vector DB IDs to Vector Store IDs (breaking change) (#3253)
# What does this PR do?
This change migrates VectorDB ID generation to Vector Stores. It is a breaking change for **_some users_** who may have application code using the `vector_db_id` parameter from the request of the VectorDB protocol instead of the `VectorDB.identifier` from the response.

By default we will now create a Vector Store every time we register a VectorDB. The caveat with this approach is that it maps `vector_db_id` → `vector_store.name`. This is a reasonable tradeoff to transition users towards OpenAI Vector Stores. As an added benefit, registering VectorDBs will make them appear in the VectorStores admin UI.

### Why?
This PR makes the `POST` API call to `/v1/vector-dbs` swap the `vector_db_id` parameter in the **request body** into the VectorStore's name field and sets `vector_db_id` to the generated vector store id (e.g., `vs_038247dd-4bbb-4dbb-a6be-d5ecfd46cfdb`). That means users would have to do something like the following in their application code:

```python
res = client.vector_dbs.register(
    vector_db_id="my-vector-db-id",
    embedding_model="ollama/all-minilm:l6-v2",
    embedding_dimension=384,
)
vector_db_id = res.identifier
```

The rest of their code would then behave as before, including `VectorIO`'s insert protocol using `vector_db_id` in the request.

An alternative implementation would be to simply delete the `vector_db_id` parameter in `VectorDB`, but the end result would still require users to write `vector_db_id = res.identifier`, since `VectorStores.create()` generates the ID for you. So this approach felt like the easiest way to migrate users towards VectorStores (subsequent PRs will trigger `files.create()` and `vector_stores.files.create()`).

## Test Plan
Unit tests and integration tests have been added.

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

`3a7ac4227d` chore: unbreak inference store test (#3340)
# What does this PR do?
The inference store writes were moved to `asyncio.create_task` and are no longer awaited.

## Test Plan
```
❯ OLLAMA_URL=http://localhost:11434 LLAMA_STACK_CONFIG=server:starter uv run --with pytest-repeat pytest tests/integration/inference --text-model="ollama/llama3.2:3b-instruct-fp16" -vvs -k "test_inference_store_tool_calls and 3b-instruct-fp16-True" --count=10
platform darwin -- Python 3.12.3, pytest-8.4.1, pluggy-1.6.0
collected 970 items / 950 deselected / 20 selected

tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=ollama/llama3.2:3b-instruct-fp16-True-1-10] PASSED
[... all 10 repeats PASSED for both the openai_client and client_with_models fixtures ...]

Terminating llama stack server process...
Server process and children terminated gracefully
```
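For background, a minimal sketch of the fire-and-forget write pattern this refers to; the class and method names here are illustrative, not the actual store API:

```python
import asyncio


class InferenceStore:
    """Illustrative stand-in for the real inference store."""

    def __init__(self):
        self._rows = []
        self._pending: set[asyncio.Task] = set()

    async def _write(self, row):
        await asyncio.sleep(0)  # stands in for real DB I/O
        self._rows.append(row)

    def record(self, row):
        # Fire-and-forget: the caller returns before the write lands, so a
        # test reading the store right away must poll/retry rather than
        # assume immediate visibility.
        task = asyncio.create_task(self._write(row))
        self._pending.add(task)
        task.add_done_callback(self._pending.discard)
```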

`02f6e0f531` fix(tests): set inference mode to be replay by default (#3326)
`construct_stack()` relies on the environment variable to know when to set up the patching infrastructure.

`c3d3a0b833` feat(tests): auto-merge all model list responses and unify recordings (#3320)
One needed to specify record-replay related environment variables when running integration tests. We could not use defaults because integration tests could be run against Ollama instances running different models. For example, text and vision tests needed separate instances of Ollama, because a single instance typically cannot serve both of these models under the standard CI worker configuration on GitHub. As a result, `client.list()` as returned by the Ollama client would differ between these runs, and we'd end up overwriting responses.

This PR "solves" it by adding a small amount of complexity: we store model-list responses specially, keyed by the hashes of the models they return. At replay time, we merge all of them and pretend that we have the union of all models available (see the sketch below).

## Test Plan
Re-recorded all the tests using `scripts/integration-tests.sh --inference-mode record`, including the vision tests.
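A minimal sketch of the merge-at-replay idea described above; the hash function and storage layout are assumptions, not the recorder's actual code:

```python
import hashlib
import json


def models_digest(models: list[dict]) -> str:
    # Key a recorded model-list response by the set of models it contains.
    canonical = json.dumps(sorted(m["name"] for m in models))
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]


def merged_model_list(recordings: dict[str, list[dict]]) -> list[dict]:
    # At replay time, pretend the union of all recorded models is available.
    seen: dict[str, dict] = {}
    for models in recordings.values():
        for m in models:
            seen.setdefault(m["name"], m)
    return sorted(seen.values(), key=lambda m: m["name"])
```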

`478b4ff1e6` chore(migrate apis): move VectorDBWithIndex from embeddings to openai_embeddings (#3294)
# What does this PR do?
Migrates `VectorDBWithIndex` to use `openai_embeddings`. Part of #2365.
## Test Plan
Existing unit tests.

`3370d8e557` feat(files, s3, expiration): add expires_after support to S3 files provider (#3283)

`3130ca0a78` feat: implement keyword, vector and hybrid search inside vector stores for PGVector provider (#3064)
# What does this PR do?
The purpose of this task is to implement `openai/v1/vector_stores/{vector_store_id}/search` for the PGVector provider. It involves implementing vector similarity search, keyword search and hybrid search for `PGVectorIndex`. Closes #3006.

## Test Plan
Run unit tests:
```
./scripts/unit-tests.sh
```
Run integration tests for OpenAI vector stores:
1. Export env vars:
```
export ENABLE_PGVECTOR=true
export PGVECTOR_HOST=localhost
export PGVECTOR_PORT=5432
export PGVECTOR_DB=llamastack
export PGVECTOR_USER=llamastack
export PGVECTOR_PASSWORD=llamastack
```
2. Create the DB:
```
psql -h localhost -U postgres -c "CREATE ROLE llamastack LOGIN PASSWORD 'llamastack';"
psql -h localhost -U postgres -c "CREATE DATABASE llamastack OWNER llamastack;"
psql -h localhost -U llamastack -d llamastack -c "CREATE EXTENSION IF NOT EXISTS vector;"
```
3. Install sentence-transformers:
```
uv pip install sentence-transformers
```
4. Run:
```
uv run --group test pytest -s -v --stack-config="inference=inline::sentence-transformers,vector_io=remote::pgvector" --embedding-model sentence-transformers/all-MiniLM-L6-v2 tests/integration/vector_io/test_openai_vector_stores.py
```
Inspect PGVector vector stores (optional): `psql llamastack` followed by `\z` shows one `vector_store_vs_<uuid>` table per created vector store (23 rows in the run above), alongside the `llamastack_kvstore`, `metadata_store`, and `vector_store_pgvector_main` tables.

Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
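For intuition, here is a hedged sketch of how hybrid search can fuse vector and keyword rankings. It uses reciprocal rank fusion as an illustration; the provider's actual scoring may differ:

```python
def reciprocal_rank_fusion(
    vector_hits: list[str], keyword_hits: list[str], k: int = 60
) -> list[str]:
    # Combine two ranked lists of chunk IDs; items ranked highly in either
    # list (or present in both) float to the top of the fused ranking.
    scores: dict[str, float] = {}
    for hits in (vector_hits, keyword_hits):
        for rank, chunk_id in enumerate(hits):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)


# e.g. reciprocal_rank_fusion(["c1", "c2"], ["c2", "c3"]) ranks "c2" first.
```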

`7ca8233889` feat(testing): remove SQLite dependency from inference recorder (#3254)
Recording files use a predictable naming format, making the SQLite index redundant. The binary SQLite file was causing frequent git conflicts. Simplify by calculating file paths directly from request hashes.

Signed-off-by: Derek Higgins <derekh@redhat.com>
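A minimal sketch of the hash-to-path scheme this describes; the exact hash inputs and directory layout are assumptions:

```python
import hashlib
import json
from pathlib import Path


def recording_path(base_dir: Path, endpoint: str, body: dict) -> Path:
    # Deterministic path derived from the request itself -- no index needed.
    payload = json.dumps({"endpoint": endpoint, "body": body}, sort_keys=True)
    digest = hashlib.sha256(payload.encode()).hexdigest()[:16]
    return base_dir / "responses" / f"{digest}.json"
```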

`cffc4edf47` feat: Add optional idempotency support to batches API (#3171)
Implements optional idempotency for batch creation using an `idem_tok` parameter:

* **Core idempotency**: same token + same parameters returns the existing batch
* **Conflict detection**: same token + different parameters raises HTTP 409 ConflictError
* **Metadata order independence**: different key ordering doesn't affect idempotency

**API changes:**
- Add optional `idem_tok` parameter to the `create_batch()` method
- Enhanced API documentation with idempotency extensions

**Implementation:**
- Reference provider supports idempotent batch creation
- ConflictError for proper HTTP 409 status code mapping
- Comprehensive parameter validation

**Testing:**
- Unit tests: focused tests covering core scenarios with parametrized conflict detection
- Integration tests: tests validating real OpenAI client behavior

This enables client-side retry safety and prevents duplicate batch creation when using the same idempotency token, following common REST API practice (see the sketch below). Closes #3144.
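A hedged sketch of the token-based idempotency check described above; the store layout and error type are illustrative, not the reference provider's code:

```python
import json


class ConflictError(Exception):
    """Maps to HTTP 409 in this illustrative sketch."""


_batches: dict[str, dict] = {}  # idempotency token -> stored batch


def _canonical(params: dict) -> str:
    # Sort keys so metadata ordering does not affect idempotency.
    return json.dumps(params, sort_keys=True)


def create_batch(params: dict, idem_tok: str | None = None) -> dict:
    if idem_tok is None:
        return {"id": f"batch_{len(_batches)}", "params": params}
    existing = _batches.get(idem_tok)
    if existing is not None:
        if _canonical(existing["params"]) != _canonical(params):
            raise ConflictError("same idempotency token, different parameters")
        return existing  # same token + same params: return the existing batch
    batch = {"id": f"batch_{idem_tok}", "params": params}
    _batches[idem_tok] = batch
    return batch
```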

`3b9278f254` feat: implement query_metrics (#3074)
# What does this PR do?
`query_metrics` currently has no implementation, meaning once a metric is emitted there is no way in llama stack to query it from the store. This implements `query_metrics` for the meta_reference provider in a style similar to `query_traces`: it uses the trace_store to format an SQL query and execute it. The parameters for the query are `metric.METRIC_NAME`, `start_time`, and `end_time`, plus any other matchers if they are provided.

This required client-side changes, since the client had no `query_metrics` or any associated resources, so any tests here will fail; manual execution logs are provided below for the new tests. Metrics are ordered by timestamp. Additionally, `unit` was added to the `MetricDataPoint` class, since it adds much more context to the metric being queried.

Depends on https://github.com/llamastack/llama-stack-client-python/pull/260

## Test Plan
```python
import time
import uuid


def create_http_client():
    from llama_stack_client import LlamaStackClient

    return LlamaStackClient(base_url="http://localhost:8321")


client = create_http_client()

response = client.telemetry.query_metrics(metric_name="total_tokens", start_time=0)
print(response)
```
```
╰─ python3.12 ~/telemetry.py
INFO:httpx:HTTP Request: POST http://localhost:8322/v1/telemetry/metrics/total_tokens "HTTP/1.1 200 OK"
[TelemetryQueryMetricsResponse(data=None, metric='total_tokens', labels=[], values=[{'timestamp': 1753999514, 'value': 34.0, 'unit': 'tokens'}, {'timestamp': 1753999816, 'value': 34.0, 'unit': 'tokens'}, {'timestamp': 1753999881, 'value': 34.0, 'unit': 'tokens'}, {'timestamp': 1753999956, 'value': 34.0, 'unit': 'tokens'}, {'timestamp': 1754000200, 'value': 34.0, 'unit': 'tokens'}, ... , {'timestamp': 1754596582, 'value': 36.0, 'unit': 'tokens'}])]
```
Tests were added for each currently documented metric in llama stack using this new function. Integration tests pass locally with replay mode and the linked client changes (screenshot attached: https://github.com/user-attachments/assets/d482ab06-dcff-4f0c-a1f1-f870670ee9bc).

---------

Signed-off-by: Charlie Doern <cdoern@redhat.com>

`da73f1a180` fix: ensure assistant message is followed by tool call message as expected by openai (#3224)
# What does this PR do?
As described in #3134, a langchain example works against OpenAI's responses impl, but not against llama stack's. This turned out to be due to the order of the inputs. The langchain example has the two function call outputs first, followed by each call result in turn. This seems to be valid, as it is accepted by OpenAI's impl. However, in llama stack these inputs are converted to chat completion inputs, and the resulting order for that API is not accepted by OpenAI. This PR fixes the issue by ensuring that the converted chat completion inputs are in the expected order (see the sketch below). Closes #3134.

## Test Plan
Added unit and integration tests. Verified this fixes the original issue as reported.

---------

Signed-off-by: Gordon Sim <gsim@redhat.com>
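To illustrate the ordering constraint, a hedged sketch of what the converted chat-completion messages must look like; the message shapes follow the OpenAI chat format, and the concrete tool and arguments are made up for illustration:

```python
# Each assistant message that makes tool calls must be immediately followed
# by one "tool" message per tool_call_id, before any later assistant turn.
messages = [
    {"role": "user", "content": "What is the weather in Paris and Tokyo?"},
    {
        "role": "assistant",
        "tool_calls": [
            {"id": "call_1", "type": "function",
             "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'}},
            {"id": "call_2", "type": "function",
             "function": {"name": "get_weather", "arguments": '{"city": "Tokyo"}'}},
        ],
    },
    {"role": "tool", "tool_call_id": "call_1", "content": "18C, cloudy"},
    {"role": "tool", "tool_call_id": "call_2", "content": "24C, clear"},
]
```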

`deffaa9e4e` fix: fix the error type in embedding test case (#3197)
# What does this PR do?
Currently the embedding integration test cases fail due to a misalignment in the error type. This PR fixes the embedding integration test by fixing the error type.
## Test Plan
```
pytest -s -v tests/integration/inference/test_embedding.py --stack-config="inference=nvidia" --embedding-model="nvidia/llama-3.2-nv-embedqa-1b-v2" --env NVIDIA_API_KEY={nvidia_api_key} --env NVIDIA_BASE_URL="https://integrate.api.nvidia.com"
```

`1790fc0f25` feat: Remove initialize() Method from LlamaStackAsLibrary (#2979)
# What does this PR do?
This PR removes the required explicit `initialize()` call from the library client. Previously, `client.initialize()` had to be invoked by the user. To improve the developer experience and avoid runtime errors, this PR initializes the client implicitly upon use. It also prevents multiple initializations of the same client, while maintaining backward compatibility. Concretely:

- Automatic initialization: the constructor calls `initialize_impl()` automatically, so the client is fully initialized after `__init__` completes.
- Repeated initialization after the client has been successfully initialized is prevented.
- The `initialize()` method still exists but is now a no-op (see the sketch below).

Fixes https://github.com/meta-llama/llama-stack/issues/2946

---------

Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
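A minimal sketch of the initialize-once pattern described here; the class and method names are abbreviated for illustration and are not the actual client code:

```python
class LibraryClient:
    def __init__(self, config: str):
        self._initialized = False
        self._initialize_impl(config)  # constructor initializes automatically

    def _initialize_impl(self, config: str) -> None:
        if self._initialized:
            return  # prevent repeated initialization
        # ... build providers, routers, etc. ...
        self._initialized = True

    def initialize(self) -> None:
        # Kept for backward compatibility; now a no-op.
        pass
```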

`14082b22af` fix: handle mcp tool calls in previous response correctly (#3155)
# What does this PR do?
Handles MCP tool calls in a previous response. Closes #3105.
## Test Plan
Made a call to create a response with a tool call, then made a second call with the first linked through `previous_response_id`. Did not get an error. Also added a unit test.

Signed-off-by: Gordon Sim <gsim@redhat.com>

`c2c859a6b0` chore(files tests): update files integration tests and fix inline::localfs (#3195)
- update files=inline::localfs to raise ResourceNotFoundError instead of ValueError
- only skip tests when no files provider is available
- directly use openai_client and llama_stack_client where appropriate
- check for correct behavior of non-existent file
- xfail the isolation test; no implementation supports it

Test plan:
```
$ uv run ./scripts/integration-tests.sh --stack-config server:ci-tests --provider ollama --test-subdirs files
...
tests/integration/files/test_files.py::test_openai_client_basic_operations PASSED [ 25%]
tests/integration/files/test_files.py::test_files_authentication_isolation XFAIL [ 50%]
tests/integration/files/test_files.py::test_files_authentication_shared_attributes PASSED [ 75%]
tests/integration/files/test_files.py::test_files_authentication_anonymous_access PASSED [100%]
==================================== 3 passed, 1 xfailed in 1.03s =====================================
```
Previously:
```
$ uv run llama stack build --image-type venv --providers files=inline::localfs --run &
...
$ ./scripts/integration-tests.sh --stack-config http://localhost:8321 --provider ollama --test-subdirs files
...
(8 parametrized tests: 2 passed, 6 skipped in 1.31s -- the authentication tests were skipped rather than exercised)
```

`3f8df167f3` chore(pre-commit): add pre-commit hook to enforce llama_stack logger usage (#3061)
# What does this PR do?
This PR adds a step in pre-commit to enforce using the `llama_stack` logger. Currently, various parts of the code base use different loggers. As a custom `llama_stack` logger exists and is used in the codebase, it is better to standardize its utilization.

Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>

`8cc4925f7d` chore: Enable keyword search for Milvus inline (#3073)
# What does this PR do?
With https://github.com/milvus-io/milvus-lite/pull/294, Milvus Lite supports keyword search using BM25. When keyword search was introduced we had explicitly disabled it for inline Milvus. This PR removes the need for that check and enables `inline::milvus` for tests.

## Test Plan
Run llama stack with `inline::milvus` enabled:
```
pytest tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes --stack-config=http://localhost:8321 --embedding-model=all-MiniLM-L6-v2 -v
```
```
collected 3 items

tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-vector] PASSED [ 33%]
tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-keyword] PASSED [ 66%]
tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-hybrid] PASSED [100%]

========================= 3 passed in 4.75s =========================
```
Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>

`eb07a0f86a` fix(ci, tests): ensure uv environments in CI are kosher, record tests (#3193)
I started this PR trying to unbreak a newly broken test, `test_agent_name`. This test was broken all along but did not show up because during testing we were pulling the "non-updated" llama stack client. See this comment: https://github.com/llamastack/llama-stack/pull/3119#discussion_r2270988205

While fixing this, I encountered a large amount of badness in our CI workflow definitions.

- We weren't passing `LLAMA_STACK_DIR` or `LLAMA_STACK_CLIENT_DIR` overrides to `llama stack build` at all in some cases.
- Even when we did, we used `uv run` liberally. The first thing `uv run` does is "sync" the project environment, which undoes any mutations we might have made ourselves. But we make many mutations to these environments in our CI runners, the most important being `llama stack build`, where we install distro dependencies. As a result, when you tried to run the integration tests, you would see old, strange versions.

## Test Plan
Re-record using:
```
sh scripts/integration-tests.sh --stack-config ci-tests \
   --provider ollama --test-pattern test_agent_name --inference-mode record
```
Then re-run with `--inference-mode replay`. But: eventually this test turned out to be quite flaky for telemetry reasons. I haven't investigated it for now and just disabled it, sadly, since we have a release to push out.

`7519ab4024` feat: Code scanner Provider impl for moderations api (#3100)
# What does this PR do?
Add CodeScanner implementations.
## Test Plan
```
SAFETY_MODEL=CodeScanner LLAMA_STACK_CONFIG=starter uv run pytest -v tests/integration/safety/test_safety.py --text-model=llama3.2:3b-instruct-fp16 --embedding-model=all-MiniLM-L6-v2 --safety-shield=ollama
```
This PR needs to land after https://github.com/meta-llama/llama-stack/pull/3098.

`5e7c2250be` test(recording): add a script to schedule recording workflow (#3170)
See comment here:
https://github.com/llamastack/llama-stack/pull/3162#issuecomment-3192859097
-- TL;DR it is quite complex to invoke the recording workflow correctly
for an end developer writing tests. This script simplifies the work.
No more manual GitHub UI navigation!
## Script Functionality
- Auto-detects your current branch and associated PR
- Finds the right repository context (works from forks!)
- Runs the workflow where it can actually commit back
- Validates prerequisites and provides helpful error messages
## How to Use
First ensure you are on the branch which introduced a new test and want
it recorded. **Make sure you have pushed this branch remotely, easiest
is to create a PR.**
```
# Record tests for current branch
./scripts/github/schedule-record-workflow.sh
# Record specific test subdirectories
./scripts/github/schedule-record-workflow.sh --test-subdirs "agents,inference"
# Record with vision tests enabled
./scripts/github/schedule-record-workflow.sh --run-vision-tests
# Record tests matching a pattern
./scripts/github/schedule-record-workflow.sh --test-pattern "test_streaming"
```
## Test Plan
Ran `./scripts/github/schedule-record-workflow.sh -s inference -k
tool_choice` which started

`914c7be288` feat: add batches API with OpenAI compatibility (with inference replay) (#3162)
Add complete batches API implementation with protocol, providers, and tests.

Core Infrastructure:
- Add batches API protocol using OpenAI Batch types directly
- Add Api.batches enum value and protocol mapping in resolver
- Add OpenAI "batch" file purpose support
- Include proper error handling (ConflictError, ResourceNotFoundError)

Reference Provider:
- Add ReferenceBatchesImpl with full CRUD operations (create, retrieve, cancel, list)
- Implement background batch processing with configurable concurrency
- Add SQLite KVStore backend for persistence
- Support /v1/chat/completions endpoint with request validation

Comprehensive Test Suite:
- Add unit tests for provider implementation with validation
- Add integration tests for end-to-end batch processing workflows
- Add error handling tests for validation, malformed inputs, and edge cases

Configuration:
- Add max_concurrent_batches and max_concurrent_requests_per_batch options
- Add provider documentation with sample configurations

Test with:
```
$ uv run llama stack build --image-type venv --providers inference=YOU_PICK,files=inline::localfs,batches=inline::reference --run &
$ LLAMA_STACK_CONFIG=http://localhost:8321 uv run pytest tests/unit/providers/batches tests/integration/batches --text-model YOU_PICK
```
Addresses #3066

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
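Since the API uses OpenAI Batch types directly, client usage should mirror the OpenAI batches flow. A hedged sketch, where the base URL, API key, and input file are placeholders:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

# Upload a JSONL file of chat-completion requests with the "batch" purpose.
batch_file = client.files.create(file=open("requests.jsonl", "rb"), purpose="batch")

batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(client.batches.retrieve(batch.id).status)
```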

`0e8bb94bf3` feat(ci): make recording workflow simpler, more parameterizable (#3169)
# What does this PR do?
Recording tests has become a nightmare. This is the first part of making that process simpler by making it _less_ automatic; I tried to be too clever earlier. It simplifies the record-integration-tests workflow to use workflow-dispatch inputs instead of PR labels. No more opaque stuff: just go to the GitHub UI and run the workflow with inputs. I will soon add a helper script for this as well.

Other changes to aid re-running just the small set of things you need to re-record:
- Replaces the `test-types` JSON array parameter with a more intuitive `test-subdirs` comma-separated list. The JSON array was only there for the matrix.
- Adds a new `test-pattern` parameter to allow filtering tests using pytest's `-k` option.

## Test Plan
Note that this PR is in a fork, not the source repository.
- Replay tests on this PR are green
- Manually [ran](

`f66ae3b3b1` docs(tests): Add a bunch of documentation for our testing systems (#3139)
# What does this PR do?
Creates a structured testing documentation section with multiple detailed pages:
- Testing overview explaining the record-replay architecture
- Integration testing guide with practical usage examples
- Record-replay system technical documentation
- Guide for writing effective tests
- Troubleshooting guide for common testing issues

Hopefully this makes things a bit easier.

`01b2afd4b5` fix(tests): record missing tests for test_responses_store (#3163)
# What does this PR do?
Updates test recordings.
## Test Plan
Started ollama serving the 3.2:3b model. Then ran the server:
```
LLAMA_STACK_TEST_INFERENCE_MODE=record \
  LLAMA_STACK_TEST_RECORDING_DIR=tests/integration/recordings/ \
  SQLITE_STORE_DIR=$(mktemp -d) \
  OLLAMA_URL=http://localhost:11434 \
  llama stack build --template starter --image-type venv --run
```
Then ran the tests which needed recording:
```
pytest -sv tests/integration/agents/test_openai_responses.py \
  --stack-config=server:starter \
  --text-model ollama/llama3.2:3b-instruct-fp16 -k test_responses_store
```
Then restarted the server with `LLAMA_STACK_TEST_INFERENCE_MODE=replay`, re-ran the tests, and verified they passed.

`8ed69978f9` refactor(tests): make the responses tests nicer (#3161)
# What does this PR do?
A _bunch_ of cleanup for the Responses tests:
- Got rid of YAML test cases, moved them to simple pydantic models
- Split the large monolithic test file into multiple focused test files:
  - `test_basic_responses.py` for basic and image response tests
  - `test_tool_responses.py` for tool-related tests
  - `test_file_search.py` for file search specific tests
- Added a `StreamingValidator` helper class to standardize streaming response validation (see the sketch below)

## Test Plan
Run the tests:
```
pytest -s -v tests/integration/non_ci/responses/ \
  --stack-config=starter \
  --text-model openai/gpt-4o \
  --embedding-model=sentence-transformers/all-MiniLM-L6-v2 \
  -k "client_with_models"
```
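A hedged sketch of what such a `StreamingValidator` helper might look like; the method names and checks are illustrative, not the actual helper:

```python
class StreamingValidator:
    """Illustrative helper: collect streaming chunks, then assert invariants."""

    def __init__(self, chunks):
        self.chunks = list(chunks)
        self.types = [c.type for c in self.chunks]

    def assert_basic_event_sequence(self):
        assert self.types and self.types[0] == "response.created"
        assert self.types[-1] == "response.completed"

    def assert_has_event(self, event_type: str):
        assert event_type in self.types, f"missing {event_type} in stream"
```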

`ba664474de` feat(responses): add mcp list tool streaming event (#3159)
# What does this PR do?
Adds proper streaming events for MCP tool listing (`mcp_list_tools.in_progress` and `mcp_list_tools.completed`). Also refactors things a bit more.
## Test Plan
Verified existing integration tests pass with the refactored code. The test `test_response_streaming_multi_turn_tool_execution` has been updated to check for the new MCP list tools streaming events.

`ee7631b6cf` Revert "feat: add batches API with OpenAI compatibility" (#3149)
Reverts llamastack/llama-stack#3088. The PR broke integration tests.

`de692162af` feat: add batches API with OpenAI compatibility (#3088)
Add complete batches API implementation with protocol, providers, and tests (this is the commit later reverted in #3149 and re-landed as #3162 above; the description is the same).

Core Infrastructure:
- Add batches API protocol using OpenAI Batch types directly
- Add Api.batches enum value and protocol mapping in resolver
- Add OpenAI "batch" file purpose support
- Include proper error handling (ConflictError, ResourceNotFoundError)

Reference Provider:
- Add ReferenceBatchesImpl with full CRUD operations (create, retrieve, cancel, list)
- Implement background batch processing with configurable concurrency
- Add SQLite KVStore backend for persistence
- Support /v1/chat/completions endpoint with request validation

Comprehensive Test Suite:
- Add unit tests for provider implementation with validation
- Add integration tests for end-to-end batch processing workflows
- Add error handling tests for validation, malformed inputs, and edge cases

Configuration:
- Add max_concurrent_batches and max_concurrent_requests_per_batch options
- Add provider documentation with sample configurations

Test with:
```
$ uv run llama stack build --image-type venv --providers inference=YOU_PICK,files=inline::localfs,batches=inline::reference --run &
$ LLAMA_STACK_CONFIG=http://localhost:8321 uv run pytest tests/unit/providers/batches tests/integration/batches --text-model YOU_PICK
```
Addresses #3066

`e1e161553c` feat(responses): add MCP argument streaming and content part events (#3136)
# What does this PR do?
Adds content part streaming events to the OpenAI-compatible Responses API to support more granular streaming of response content. This introduces:

1. New schema types for content parts: `OpenAIResponseContentPart` with variants for text output and refusals
2. New streaming event types:
   - `OpenAIResponseObjectStreamResponseContentPartAdded` for when content parts begin
   - `OpenAIResponseObjectStreamResponseContentPartDone` for when content parts complete
3. Implementation in the reference provider to emit these events during streaming responses

Also emits MCP arguments just like function call ones. A consumer-side sketch follows below.

## Test Plan
Updated existing streaming tests to verify content part events are properly emitted.
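A hedged sketch of how a client might consume these content-part events; the event type strings follow the OpenAI Responses streaming convention the PR mirrors, and the attribute names are assumptions:

```python
# Illustrative consumer loop over a Responses API event stream.
def print_stream(events):
    for event in events:
        if event.type == "response.content_part.added":
            print(f"[part {event.content_index} started]")
        elif event.type == "response.output_text.delta":
            print(event.delta, end="")
        elif event.type == "response.content_part.done":
            print(f"\n[part {event.content_index} done]")
```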
||
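A sketch of consuming the new events from a streaming response; the wire-level event type strings are assumed to mirror OpenAI's `response.content_part.*` names, and the server URL and model are placeholders:

```
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

stream = client.responses.create(
    model="YOUR-MODEL",  # placeholder
    input="Write one sentence about streaming.",
    stream=True,
)
for event in stream:
    if event.type == "response.content_part.added":
        print("content part started")
    elif event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)  # text arrives incrementally
    elif event.type == "response.content_part.done":
        print("\ncontent part finished")
```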
|
8638537d14
|
feat(responses): stream progress of tool calls (#3135)
# What does this PR do?
Enhances tool execution streaming by adding support for real-time progress events during tool calls. This adds streaming events for MCP and web search tools, covering the in-progress, searching, completed, and failed states. The refactored `_execute_tool_call` method now returns an async iterator that yields streaming events throughout the tool execution lifecycle.

## Test Plan
Updated the integration test `test_response_streaming_multi_turn_tool_execution` to verify the presence and structure of the new streaming events, including:
- Checking for MCP in-progress and completed events
- Verifying that progress events contain the required fields (item_id, output_index, sequence_number)
- Ensuring completed events carry the necessary sequence_number field |
||
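A sketch of the kind of assertions described above; the event type strings (`response.mcp_call.*`) are assumptions inferred from the states named in the PR:

```
def check_tool_progress_events(events: list) -> None:
    in_progress = [e for e in events if e.type == "response.mcp_call.in_progress"]
    completed = [e for e in events if e.type == "response.mcp_call.completed"]
    assert in_progress and completed, "expected MCP progress events"
    for event in in_progress:
        # Progress events must carry these three fields.
        for field in ("item_id", "output_index", "sequence_number"):
            assert getattr(event, field, None) is not None
    for event in completed:
        # Completed events must at least carry a sequence_number.
        assert event.sequence_number is not None
```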
|
5b312a80b9
|
feat(responses): improve streaming for function calls (#3124)
Some checks failed
Test Llama Stack Build / build-single-provider (push) Failing after 5s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 10s
Test Llama Stack Build / generate-matrix (push) Successful in 9s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 13s
Python Package Build Test / build (3.13) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 11s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 8s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 7s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 21s
Python Package Build Test / build (3.12) (push) Failing after 9s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 15s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 29s
Unit Tests / unit-tests (3.12) (push) Failing after 8s
Test External API and Providers / test-external (venv) (push) Failing after 13s
Update ReadTheDocs / update-readthedocs (push) Failing after 8s
Unit Tests / unit-tests (3.13) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 25s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 24s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 25s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 26s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 22s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 17s
Pre-commit / pre-commit (push) Successful in 1m10s
Test Llama Stack Build / build (push) Failing after 12s
Emit streaming events for function calls.

## Test Plan
Improved the test case |
||
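A sketch of accumulating the streamed function-call arguments; the event names are assumed to follow OpenAI's `response.function_call_arguments.*` convention, and the server URL and model are placeholders:

```
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

stream = client.responses.create(
    model="YOUR-MODEL",  # placeholder
    input="What is the weather in Paris?",
    tools=[{
        "type": "function",
        "name": "get_weather",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }],
    stream=True,
)
arguments = ""
for event in stream:
    if event.type == "response.function_call_arguments.delta":
        arguments += event.delta  # arrives incrementally instead of all at once
    elif event.type == "response.function_call_arguments.done":
        print("final arguments:", arguments)
```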
|
3d90117891
|
chore(tests): fix responses and vector_io tests (#3119)
Some fixes to MCP tests, and a bunch of fixes for Vector providers. I also enabled a bunch of Vector IO tests to be used with `LlamaStackLibraryClient`.

## Test Plan
Run Responses tests with the llama stack library client:
```
pytest -s -v tests/integration/non_ci/responses/ --stack-config=server:starter \
  --text-model openai/gpt-4o \
  --embedding-model=sentence-transformers/all-MiniLM-L6-v2 \
  -k "client_with_models"
```
Do the same with `-k openai_client`. The rest should be taken care of by CI. |
||
|
a4bad6c0b4
|
feat: Add Google Vertex AI inference provider support (#2841)
Some checks failed
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 10s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 12s
Python Package Build Test / build (3.13) (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 10s
Test Llama Stack Build / generate-matrix (push) Successful in 8s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 13s
Test External API and Providers / test-external (venv) (push) Failing after 11s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 17s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 10s
Test Llama Stack Build / build-single-provider (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 8s
Unit Tests / unit-tests (3.12) (push) Failing after 10s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 26s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 15s
Update ReadTheDocs / update-readthedocs (push) Failing after 9s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 18s
Test Llama Stack Build / build (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 47s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 49s
Unit Tests / unit-tests (3.13) (push) Failing after 39s
Pre-commit / pre-commit (push) Successful in 1m37s
# What does this PR do?
- Add a new Vertex AI remote inference provider with litellm integration
- Support Gemini models through the Google Cloud Vertex AI platform
- Use Google Cloud Application Default Credentials (ADC) for authentication
- Add Vertex AI models: gemini-2.5-flash, gemini-2.5-pro, gemini-2.0-flash
- Update the provider registry to include the vertexai provider
- Update the starter template to support Vertex AI configuration
- Add comprehensive documentation and a sample configuration

relates to https://github.com/meta-llama/llama-stack/issues/2747

## Test Plan

Signed-off-by: Eran Cohen <eranco@redhat.com> Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com> |
||
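A hedged usage sketch, assuming the stack is already configured with a GCP project and Application Default Credentials (`gcloud auth application-default login`); the `vertexai/` model id prefix is an assumption based on the provider name:

```
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

# ADC handles authentication server-side; no Google API key is passed here.
response = client.chat.completions.create(
    model="vertexai/gemini-2.5-flash",
    messages=[{"role": "user", "content": "Name one Google Cloud region."}],
)
print(response.choices[0].message.content)
```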
|
e90fe25890
|
fix(tests): move llama stack client init back to fixture (#3071)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 2s
Integration Tests (Replay) / discover-tests (push) Successful in 3s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Python Package Build Test / build (3.13) (push) Failing after 4s
Python Package Build Test / build (3.12) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 10s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 13s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 10s
Test External API and Providers / test-external (venv) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 16s
Unit Tests / unit-tests (3.12) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 50s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 54s
Unit Tests / unit-tests (3.13) (push) Failing after 47s
Pre-commit / pre-commit (push) Successful in 1m44s
See inline comments |
||
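A sketch of the pattern, assuming a session-scoped pytest fixture; the fixture name and default URL are illustrative, not the repo's exact code:

```
import os

import pytest
from llama_stack_client import LlamaStackClient


@pytest.fixture(scope="session")
def llama_stack_client():
    # One client per test session, constructed lazily on first use.
    return LlamaStackClient(
        base_url=os.environ.get("LLAMA_STACK_CONFIG", "http://localhost:8321")
    )
```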
|
5f1ddd35e4
|
chore(tests): refactor and move responses tests away from verifications (#3068)
This PR kills the verifications infrastructure, which is no longer used; it was previously relocated to the `llama-stack-evals` (https://github.com/meta-llama/llama-stack-evals) repository. Responses tests used this infrastructure, but that wasn't strictly necessary; it was just a little useful back when @bbrownin introduced the tests. On Discord, we agreed that the tests can be moved to our regular integration test infra.

## Test Plan
Some tests currently do fail (although they run!). I will send a follow-up PR which makes them all pass. |
||
|
e3928e6a29
|
feat: Implement hybrid search in Milvus (#2644)
Some checks failed
Integration Tests (Replay) / discover-tests (push) Successful in 5s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 10s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Python Package Build Test / build (3.13) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 10s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 16s
Python Package Build Test / build (3.12) (push) Failing after 10s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 15s
Unit Tests / unit-tests (3.13) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 8s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 8s
Unit Tests / unit-tests (3.12) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 11s
Test External API and Providers / test-external (venv) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 19s
Pre-commit / pre-commit (push) Successful in 57s
# What does this PR do?
This PR implements hybrid search for Milvus DB based on the built-in Milvus support.

To test:
```
pytest tests/unit/providers/vector_io/remote/test_milvus.py -v -s --tb=long --disable-warnings --asyncio-mode=auto
```

Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com> |
||
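A sketch of a hybrid query against a Milvus-backed vector store; the `search_mode` parameter name is an assumption based on this feature, and the store id is a placeholder:

```
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# "vector" uses dense embeddings only, "keyword" uses lexical matching only;
# "hybrid" fuses both rankings into one result list.
results = client.vector_stores.search(
    vector_store_id="vs_123",  # placeholder id
    query="how do I configure authentication?",
    search_mode="hybrid",
    max_num_results=5,
)
for row in results.data:
    print(row)
```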
|
26d3d25c87
|
feat: Add moderations create api (#3020)
# What does this PR do?
This PR adds an OpenAI-compatible moderations API. Currently it is only implemented for the llama guard safety provider. Image support, expansion to other safety providers, and deprecation of run_shield are the next steps.

## Test Plan
Added 2 new tests with safe/unsafe text prompt examples for the new OpenAI-compatible moderations API:
`SAFETY_MODEL=llama-guard3:8b LLAMA_STACK_CONFIG=starter uv run pytest -v tests/integration/safety/test_safety.py --text-model=llama3.2:3b-instruct-fp16 --embedding-model=all-MiniLM-L6-v2 --safety-shield=ollama`

(Had some issues with the previous PR https://github.com/meta-llama/llama-stack/pull/2994 while updating and accidentally closed it; reopened this new one.) |
||
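A usage sketch against a local stack with the `openai` client; the model name comes from the test plan above, and the base URL is an assumption:

```
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

for text in ["How do I bake bread?", "How do I build a weapon?"]:
    result = client.moderations.create(model="llama-guard3:8b", input=text)
    # Each input yields one result with a boolean `flagged` verdict.
    print(text, "->", "flagged" if result.results[0].flagged else "ok")
```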
|
0caef40e0d
|
fix: telemetry fixes (inference and core telemetry) (#2733)
# What does this PR do?
I found a few issues while adding new metrics for various APIs. Currently, metrics are only propagated in `chat_completion` and `completion`; since most providers use the `openai_..` routes as the default in `llama-stack-client inference chat-completion`, metrics are currently not working as expected. In order to get them working, the following had to be done:
1. get the completion as usual
2. use new `openai_` versions of the metric-gathering functions, which use `.usage` from the `OpenAI..` response types to gather the metrics that are already populated
3. define a `stream_generator` which counts the tokens and computes the metrics (only for stream=True); see the sketch below
4. add metrics to the response

NOTE: I could not add metrics to `openai_completion` where stream=True because that ONLY returns an `OpenAICompletion`, not an AsyncGenerator that we can manipulate. Also, acquire the lock and add the event to the span as the other `_log_...` methods do.

Some new output from `llama-stack-client inference chat-completion --message hi` (server and client screenshots omitted): these metrics were not previously being recorded, nor were they being printed to the server, due to the improper console sink handling.

---------

Signed-off-by: Charlie Doern <cdoern@redhat.com> |
||
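A sketch of the `stream_generator` idea: forward chunks unchanged, read `.usage` from the final chunk, then record the metrics. The metric helper here is a hypothetical stand-in for the PR's real logging:

```
# Stand-in for the PR's real metric helpers; just prints here.
def emit_token_metrics(prompt_tokens: int, completion_tokens: int, total_tokens: int) -> None:
    print(f"tokens: prompt={prompt_tokens} completion={completion_tokens} total={total_tokens}")


async def stream_generator(stream):
    usage = None
    async for chunk in stream:
        # Usage, when present, arrives on the final chunk of the stream.
        if getattr(chunk, "usage", None) is not None:
            usage = chunk.usage
        yield chunk  # forward to the caller unchanged
    if usage is not None:
        emit_token_metrics(usage.prompt_tokens, usage.completion_tokens, usage.total_tokens)
```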
|
c252dfa3ef
|
fix(ci): allow tests to skip llama stack client instantiation (#3052)
Some checks failed
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 7s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 6s
Python Package Build Test / build (3.12) (push) Failing after 4s
Test Llama Stack Build / generate-matrix (push) Successful in 9s
Unit Tests / unit-tests (3.12) (push) Failing after 6s
Test Llama Stack Build / build-single-provider (push) Failing after 11s
Python Package Build Test / build (3.13) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 20s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 18s
Update ReadTheDocs / update-readthedocs (push) Failing after 5s
Unit Tests / unit-tests (3.13) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 18s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 14s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 23s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 16s
Test External API and Providers / test-external (venv) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 20s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 15s
Pre-commit / pre-commit (push) Successful in 1m16s
Test Llama Stack Build / build (push) Failing after 8s
|
||
|
e9fced773a
|
refactor: introduce common 'ResourceNotFoundError' exception (#3032)
# What does this PR do?
1. Introduce a new base custom exception class `ResourceNotFoundError`
2. All other "not found" exception classes now inherit from `ResourceNotFoundError`

Closes #3030

Signed-off-by: Nathan Weinberg <nweinber@redhat.com> |
||
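A sketch of the pattern: one base class produces a uniform message, and each "not found" error just names its resource type. The constructor fields and hint text are illustrative:

```
class ResourceNotFoundError(ValueError):
    """Base for all 'resource not found' errors."""

    def __init__(self, resource_id: str, resource_type: str, hint: str) -> None:
        super().__init__(f"{resource_type} '{resource_id}' not found. {hint}")


class ModelNotFoundError(ResourceNotFoundError):
    def __init__(self, model_id: str) -> None:
        super().__init__(model_id, "Model", "Use the models list command to see what is available.")
```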
|
7f834339ba
|
chore(misc): make tests and starter faster (#3042)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 9s
Python Package Build Test / build (3.12) (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 12s
Test Llama Stack Build / generate-matrix (push) Successful in 11s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 14s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 22s
Test External API and Providers / test-external (venv) (push) Failing after 14s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 15s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 22s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 14s
Unit Tests / unit-tests (3.13) (push) Failing after 14s
Test Llama Stack Build / build-single-provider (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 18s
Unit Tests / unit-tests (3.12) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 18s
Test Llama Stack Build / build (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 16s
Python Package Build Test / build (3.13) (push) Failing after 53s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 59s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 1m1s
Update ReadTheDocs / update-readthedocs (push) Failing after 1m6s
Pre-commit / pre-commit (push) Successful in 1m53s
A bunch of miscellaneous cleanup focusing on tests, which ended up speeding up the starter distro substantially:
- Pulled llama stack client init for tests into `pytest_sessionstart` so it does not clobber output
- Profiling of that told me where we were doing lots of heavy imports for starter, so I made them lazy (see the sketch below); starter now starts 20+ seconds faster on my Mac
- A few other smallish refactors for `compat_client` |
||
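A sketch of the lazy-import trick mentioned above: move a heavy dependency inside the function that needs it, so startup does not pay the import cost. The module and function names are illustrative:

```
def embed(text: str):
    # Deferred import: only paid for when embedding is actually used,
    # not when the server process starts.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    return model.encode(text)
```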
|
bb6b6041d6
|
chore: fix: integration tests failures marked as successful (#3039) | ||
|
cc87995e2b
|
chore: rename templates to distributions (#3035)
As the title says: Distributions is in, Templates is out. `llama stack build --template` --> `llama stack build --distro`. For backward compatibility, the previous option is kept but results in a warning (see the sketch below). Updated `server.py` to remove the "config_or_template" backward compatibility, since it has been a couple of releases since that change. |
||
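A sketch of the backward-compatibility shim, as an illustrative argparse version rather than the CLI's exact code:

```
import argparse
import warnings

parser = argparse.ArgumentParser(prog="llama stack build")
parser.add_argument("--distro", help="distribution to build")
parser.add_argument("--template", help="deprecated; use --distro")
args = parser.parse_args()

if args.template:
    # Old flag still works, but nudges users toward the new one.
    warnings.warn("--template is deprecated, use --distro instead", DeprecationWarning)
    args.distro = args.distro or args.template
```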
|
4411e6e362
|
chore(ci): remove reportlab dep (#3033)
# What does this PR do?
Remove the reportlab dep; change dynamic PDF generation into a pre-computed PDF.

## Test Plan
ci |
||
|
e5b542dd8e
|
feat: switch to async completion in LiteLLM OpenAI mixin (#3029)
Some checks failed
Integration Tests (Replay) / discover-tests (push) Successful in 3s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 13s
Unit Tests / unit-tests (3.12) (push) Failing after 11s
Python Package Build Test / build (3.13) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 17s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 16s
Python Package Build Test / build (3.12) (push) Failing after 17s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 21s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 24s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 29s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 27s
Test External API and Providers / test-external (venv) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 25s
Unit Tests / unit-tests (3.13) (push) Failing after 25s
Pre-commit / pre-commit (push) Successful in 1m10s
|
||
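A sketch of the switch this commit describes: calling litellm's async entry point `acompletion` instead of the blocking `completion`, so the server's event loop is not stalled during inference. Model and prompt are placeholders:

```
import asyncio

import litellm


async def main() -> None:
    # Before: litellm.completion(...) blocked the event loop.
    response = await litellm.acompletion(
        model="openai/gpt-4o-mini",  # placeholder
        messages=[{"role": "user", "content": "hi"}],
    )
    print(response.choices[0].message.content)


asyncio.run(main())
```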
|
dbfc15123e
|
test: Implement vector store search test (#3001)
Some checks failed
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 11s
Test Llama Stack Build / generate-matrix (push) Successful in 8s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 13s
Python Package Build Test / build (3.12) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 16s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 18s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 9s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 8s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 14s
Python Package Build Test / build (3.13) (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 17s
Test Llama Stack Build / build-single-provider (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 20s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 22s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 17s
Unit Tests / unit-tests (3.12) (push) Failing after 5s
Test Llama Stack Build / build (push) Failing after 5s
Test External API and Providers / test-external (venv) (push) Failing after 7s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 5s
Unit Tests / unit-tests (3.13) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 45s
Update ReadTheDocs / update-readthedocs (push) Failing after 35s
Pre-commit / pre-commit (push) Successful in 1m30s
# What does this PR do?
Implement a vector store search test.

## Test Plan
```
pytest tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes --stack-config=http://localhost:8321 --embedding-model=all-MiniLM-L6-v2 -v
```

Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com> |
||
|
6ac710f3b0
|
fix(recording): endpoint resolution (#3013)
Some checks failed
Integration Tests (Replay) / discover-tests (push) Successful in 5s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 15s
Integration Tests (Replay) / run-replay-mode-tests (push) Failing after 10s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 19s
Python Package Build Test / build (3.12) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 15s
Test External API and Providers / test-external (venv) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 18s
Python Package Build Test / build (3.13) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 18s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 23s
Unit Tests / unit-tests (3.12) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 56s
Unit Tests / unit-tests (3.13) (push) Failing after 52s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 55s
Pre-commit / pre-commit (push) Successful in 1m49s
# What does this PR do? ## Test Plan |