llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-06 02:30:58 +00:00

Author	SHA1	Message	Date
Eric Huang	ce9a62aa84	chore: introduce write queue for response_store	2025-09-21 20:37:59 -07:00
Sébastien Han	d3600b92d1	fix: force milvus-lite installation for inline::milvus (#3488) # What does this PR do? pymilvus recently made `milvus-lite` an optional dependency to their package. If someone wants to use the inline provider we must include the extra dependency. For more details see: https://github.com/milvus-io/pymilvus/pull/2976 Signed-off-by: Sébastien Han <seb@redhat.com>	2025-09-19 16:12:08 -04:00
ehhuang	4c2fcb6b51	chore: refactor server.main (#3462) # What does this PR do? As shown in #3421, we can scale stack to handle more RPS with k8s replicas. This PR enables multi process stack with uvicorn --workers so that we can achieve the same scaling without being in k8s. To achieve that we refactor main to split out the app construction logic. This method needs to be non-async. We created a new `Stack` class to house impls and have a `start()` method to be called in lifespan to start background tasks instead of starting them in the old `construct_stack`. This way we avoid having to manage an event loop manually. ## Test Plan CI > uv run --with llama-stack python -m llama_stack.core.server.server benchmarking/k8s-benchmark/stack_run_config.yaml works. 
> LLAMA_STACK_CONFIG=benchmarking/k8s-benchmark/stack_run_config.yaml uv run uvicorn llama_stack.core.server.server:create_app --port 8321 --workers 4 works.	2025-09-18 21:11:13 -07:00
Charlie Doern	8422bd102a	feat: combine ProviderSpec datatypes (#3378) # What does this PR do? currently `RemoteProviderSpec` has an `AdapterSpec` embedded in it. Remove `AdapterSpec`, and put its leftover fields into `RemoteProviderSpec`. Additionally, many of the fields were duplicated between `InlineProviderSpec` and `RemoteProviderSpec`. Move these to `ProviderSpec` so they are shared. 
Fixup the distro codegen to use `RemoteProviderSpec` directly rather than `remote_provider_spec` which took an AdapterSpec and returned a full provider spec ## Test Plan existing distro tests should pass. Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-09-18 16:10:00 +02:00
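The flattening this commit describes (shared fields hoisted onto `ProviderSpec`, the former adapter fields living directly on `RemoteProviderSpec`) might look roughly like the sketch below. It uses plain dataclasses for brevity, and every field name is an illustrative assumption, not the real llama-stack schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProviderSpec:
    # Fields common to inline and remote providers are declared once (illustrative names).
    api: str
    provider_type: str
    config_class: str
    module: Optional[str] = None
    pip_packages: List[str] = field(default_factory=list)


@dataclass
class InlineProviderSpec(ProviderSpec):
    # Inline-only field (hypothetical).
    container_image: Optional[str] = None


@dataclass
class RemoteProviderSpec(ProviderSpec):
    # A field that previously sat on a nested adapter object (hypothetical).
    adapter_type: str = ""
```

With this shape a remote provider is declared by constructing `RemoteProviderSpec` directly, with no intermediate wrapper object.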
Jiayi Ni	e66103c09d	fix: add missing files provider to NVIDIA distribution (#3479) # What does this PR do? The rag-runtime tool requires files API as a dependency, but the NVIDIA distribution was missing the files provider configuration. Thus, when running: ``` llama stack build --distro nvidia --image-type venv ``` And then: ``` llama stack run {path_to_distribution_config} --image-type venv ``` It would raise an error: ``` RuntimeError: Failed to resolve 'tool_runtime' provider 'rag-runtime' of type 'inline::rag-runtime': required dependency 'files' is not available. Please add a 'files' provider to your configuration or check if the provider is properly configured. ``` This PR fixes the issue by adding missing files provider to NVIDIA distribution. ## Test Plan N/A	2025-09-18 13:49:46 +02:00
Matthew Farrellee	ea396a54cd	chore: update the ollama inference impl to use OpenAIMixin for openai-compat functions (#3395) # What does this PR do? update Ollama inference provider to use OpenAIMixin for openai-compat endpoints ## Test Plan ci	2025-09-18 13:09:57 +02:00
Matthew Farrellee	521865c388	feat: include all models from provider's /v1/models (#3471) # What does this PR do? this replaces the static model listing for any provider using OpenAIMixin currently - - anthropic - azure openai - gemini - groq - llama-api - nvidia - openai - sambanova - tgi - vertexai - vllm - not changed: together has its own impl ## Test Plan - new unit tests - manual for llama-api, openai, groq, gemini ``` for provider in llama-openai-compat openai groq gemini; do uv run llama stack build --image-type venv --providers inference=remote::provider --run & uv run --with llama-stack-client llama-stack-client models list | grep Total ``` results (17 sep 2025): - llama-api: 4 - openai: 86 - groq: 21 - gemini: 66 closes #3467	2025-09-18 05:17:11 -04:00
Akram Ben Aissi	4842145202	feat: Add dynamic authentication token forwarding support for vLLM (#3388) # What does this PR do? Add dynamic authentication token forwarding support for vLLM provider This enables per-request authentication tokens for vLLM providers, supporting use cases like RAG operations where different requests may need different authentication tokens. The implementation follows the same pattern as other providers like Together AI, Fireworks, and Passthrough. - Add LiteLLMOpenAIMixin that manages the vllm_api_token properly Usage: - Static: VLLM_API_TOKEN env var or config.api_token - Dynamic: X-LlamaStack-Provider-Data header with vllm_api_token All existing functionality is preserved while adding new dynamic capabilities. ## Test Plan ``` curl -X POST "http://localhost:8000/v1/chat/completions" -H "Authorization: Bearer my-dynamic-token" \ -H "X-LlamaStack-Provider-Data: {\"vllm_api_token\": \"Bearer my-dynamic-token\", \"vllm_url\": \"http://dynamic-server:8000\"}" \ -H "Content-Type: application/json" \ -d '{"model": "llama-3.1-8b", "messages": [{"role": "user", "content": "Hello!"}]}' ``` --------- Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>	2025-09-18 11:13:55 +02:00
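The static versus dynamic token usage listed in this commit can be illustrated with a small resolver. This is a sketch, not the provider's code: the function name is hypothetical, and the precedence shown (a per-request header value wins over the static token) is an assumption that the commit message does not spell out.

```python
import json
from typing import Mapping, Optional

PROVIDER_DATA_HEADER = "X-LlamaStack-Provider-Data"


def resolve_vllm_token(headers: Mapping[str, str], static_token: Optional[str]) -> Optional[str]:
    """Return the per-request token if the header carries one, else the static token."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    raw = next(
        (value for name, value in headers.items() if name.lower() == PROVIDER_DATA_HEADER.lower()),
        None,
    )
    if raw:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            data = {}  # malformed header: fall back to the static token
        if isinstance(data, dict):
            token = data.get("vllm_api_token")
            if token:
                return token
    return static_token
```

For example, `resolve_vllm_token({}, "static")` falls through to the static value, which would typically come from `VLLM_API_TOKEN` or `config.api_token`.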
Doug Edgar	42c23b45f6	feat: update qdrant hash function from SHA-1 to SHA-256 (#3477) # What does this PR do? Updates the qdrant provider's convert_id function to use a FIPS-validated cryptographic hashing function, so that llama-stack is considered to be `Designed for FIPS`. 
The standard library `uuid.uuid5()` function uses SHA-1 under the hood, which is not FIPS-validated. This commit uses an approach similar to the one merged in #3423. Closes #3476. ## Test Plan Unit tests from scripts/unit-tests.sh were run to verify that the tests pass. A small test script can display the data flow: ```python import hashlib import uuid # Input _id = "chunk_abc123" print(_id) # Step 1: Format and encode hash_input = f"qdrant_id:{_id}".encode() print(hash_input) # Result: b'qdrant_id:chunk_abc123' # Step 2: SHA-256 hash sha256_hash = hashlib.sha256(hash_input).hexdigest() print(sha256_hash) # Result: "184893a6eafeaac487cb9166351e8625b994d50f3456d8bc6cea32a014a27151" # Step 3: Create UUID from first 32 chars uuid_string = str(uuid.UUID(sha256_hash[:32])) print(uuid_string) # sha256_hash[:32] = "184893a6eafeaac487cb9166351e8625" # Final result: "184893a6-eafe-aac4-87cb-9166351e8625" ``` Signed-off-by: Doug Edgar <dedgar@redhat.com>	2025-09-17 15:10:10 -07:00
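The three steps in the test script above can be folded into a single helper. The sketch below follows the same data flow (prefix, SHA-256, first 32 hex characters as a UUID). The function body is reconstructed from the script, not copied from the provider, so treat the exact signature as an assumption.

```python
import hashlib
import uuid


def convert_id(_id: str) -> str:
    """Map an arbitrary string id to a deterministic UUID-formatted string via SHA-256."""
    # Step 1: namespace the id and encode it as bytes.
    hash_input = f"qdrant_id:{_id}".encode()
    # Step 2: hash with SHA-256 (64 hex characters).
    sha256_hash = hashlib.sha256(hash_input).hexdigest()
    # Step 3: the first 32 hex characters (128 bits) form the UUID.
    return str(uuid.UUID(sha256_hash[:32]))
```

Unlike `uuid.uuid5()`, this does not set the RFC version or variant bits. The result is simply 128 bits of the digest in UUID formatting, which is enough for a stable point id.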
Alexey Rybak	9fe8097ca4	docs: update documentation links (#3459) # What does this PR do? * Updates documentation links from readthedocs to llamastack.github.io ## Test Plan * Manual testing	2025-09-17 10:37:35 -07:00
Francisco Arceo	9acf49753e	fix: Fixing prompts import warning (#3455) # What does this PR do? Fixes this warning in llama stack build: ```bash WARNING 2025-09-15 15:29:02,197 llama_stack.core.distribution:149 core: Failed to import module prompts: No module named 'llama_stack.providers.registry.prompts'" ``` ## Test Plan Test added --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-09-17 10:24:58 +02:00
Omar Abdelwahab	e0e2b1bd0e	fix: Added a bug fix when registering new models (#3453) # What does this PR do? Modified the code in registry.py. The key changes are: 1. Removed the `return False` statement 2. Added a warning log message that includes the object type, identifier, and provider_id for better debugging. 3. The method now continues with the registration process instead of early returning. --------- Co-authored-by: Omar Abdelwahab <omara@fb.com>	2025-09-16 19:09:06 -07:00
github-actions[bot]	ececc323d3	build: Bump version to 0.2.22	2025-09-16 19:44:03 +00:00
Matthew Farrellee	49d4a5cc84	feat: add embedding and dynamic model support to Together inference adapter (#3458 ) # What does this PR do? adds embedding and dynamic model support to Together inference adapter - updated to use OpenAIMixin - workarounds for Together api quirks - recordings for together suite when subdirs=inference,pattern=openai ## Test Plan ``` $ TOGETHER_API_KEY=_NONE_ ./scripts/integration-tests.sh --stack-config server:ci-tests --setup together --subdirs inference --pattern openai ... tests/integration/inference/test_openai_completion.py::test_openai_completion_non_streaming[txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:completion:sanity] instantiating llama_stack_client Port 8321 is already in use, assuming server is already running... llama_stack_client instantiated in 0.121s PASSED [ 2%] tests/integration/inference/test_openai_completion.py::test_openai_completion_non_streaming_suffix[txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:completion:suffix] SKIPPED [ 4%] tests/integration/inference/test_openai_completion.py::test_openai_completion_streaming[txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:completion:sanity] PASSED [ 6%] tests/integration/inference/test_openai_completion.py::test_openai_completion_prompt_logprobs[txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-1] SKIPPED [ 8%] tests/integration/inference/test_openai_completion.py::test_openai_completion_guided_choice[txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free] SKIPPED [ 10%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:non_streaming_01] PASSED [ 12%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:streaming_01] 
PASSED [ 14%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:streaming_01] SKIPPED [ 17%] tests/integration/inference/test_openai_completion.py::test_inference_store[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-True] PASSED [ 19%] tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-True] PASSED [ 21%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming_with_file[txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free] SKIPPED [ 23%] tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_single_string[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 25%] tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_multiple_strings[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 27%] tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_with_encoding_format_float[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 29%] tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_with_dimensions[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] SKIPPED [ 31%] tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_with_user_parameter[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] SKIPPED [ 34%] tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_empty_list_error[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 36%] 
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_invalid_model_error[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 38%] tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_different_inputs_different_outputs[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 40%] tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_with_encoding_format_base64[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] SKIPPED [ 42%] tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_base64_batch_processing[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] SKIPPED [ 44%] tests/integration/inference/test_openai_completion.py::test_openai_completion_prompt_logprobs[txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-0] SKIPPED [ 46%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:non_streaming_02] PASSED [ 48%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:streaming_02] PASSED [ 51%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:streaming_02] SKIPPED [ 53%] tests/integration/inference/test_openai_completion.py::test_inference_store[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-False] PASSED [ 55%] tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-False] PASSED [ 57%] 
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_single_string[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 59%] tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_multiple_strings[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 61%] tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_with_encoding_format_float[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 63%] tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_with_dimensions[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] SKIPPED [ 65%] tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_with_user_parameter[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] SKIPPED [ 68%] tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_empty_list_error[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 70%] tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_invalid_model_error[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 72%] tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_different_inputs_different_outputs[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 74%] tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_with_encoding_format_base64[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] SKIPPED [ 76%] tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_base64_batch_processing[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] SKIPPED [ 78%] 
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:non_streaming_01] PASSED [ 80%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:streaming_01] PASSED [ 82%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:streaming_01] SKIPPED [ 85%] tests/integration/inference/test_openai_completion.py::test_inference_store[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-True] PASSED [ 87%] tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-True] PASSED [ 89%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:non_streaming_02] PASSED [ 91%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:streaming_02] PASSED [ 93%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:streaming_02] SKIPPED [ 95%] tests/integration/inference/test_openai_completion.py::test_inference_store[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-False] PASSED [ 97%] 
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-False] PASSED [100%] ============================================ 30 passed, 17 skipped, 50 deselected, 4 warnings in 21.96s ============================================= ```	2025-09-16 11:53:41 -07:00
Sébastien Han	65d45c7318	chore: various watsonx fixes (#3428 ) # What does this PR do? use a logger * update the distro to add the Files API otherwise it won't start since it is a dependency of vector * clarify project_id and api_key requirements * disable openai compatible calls since the endpoint returns 404 * disable text_inference structured format tests * fixed openai client initialization ## Test Plan Execute text_inference: ``` WATSONX_API_KEY=... WATSONX_PROJECT_ID=... python -m llama_stack.core.server.server llama_stack/distributions/watsonx/run.yaml LLAMA_STACK_CONFIG=http://localhost:8321 uv run --group test pytest -vvvv -ra --text-model watsonx/meta-llama/llama-3-3-70b-instruct tests/integration/inference/test_text_inference.py ============================================= test session starts ============================================== platform darwin -- Python 3.12.8, pytest-8.4.2, pluggy-1.6.0 -- /Users/leseb/Documents/AI/llama-stack/.venv/bin/python3 cachedir: .pytest_cache metadata: {'Python': '3.12.8', 'Platform': 'macOS-15.6.1-arm64-arm-64bit', 'Packages': {'pytest': '8.4.2', 'pluggy': '1.6.0'}, 'Plugins': {'anyio': '4.9.0', 'html': '4.1.1', 'socket': '0.7.0', 'asyncio': '1.1.0', 'json-report': '1.5.0', 'timeout': '2.4.0', 'metadata': '3.1.1', 'cov': '6.2.1', 'nbval': '0.11.0', 'hydra-core': '1.3.2'}} rootdir: /Users/leseb/Documents/AI/llama-stack configfile: pyproject.toml plugins: anyio-4.9.0, html-4.1.1, socket-0.7.0, asyncio-1.1.0, json-report-1.5.0, timeout-2.4.0, metadata-3.1.1, cov-6.2.1, nbval-0.11.0, hydra-core-1.3.2 asyncio: mode=Mode.AUTO, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function collected 20 items tests/integration/inference/test_text_inference.py::test_text_completion_non_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:completion:sanity] PASSED [ 5%] 
tests/integration/inference/test_text_inference.py::test_text_completion_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:completion:sanity] PASSED [ 10%] tests/integration/inference/test_text_inference.py::test_text_completion_stop_sequence[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:completion:stop_sequence] XFAIL [ 15%] tests/integration/inference/test_text_inference.py::test_text_completion_log_probs_non_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:completion:log_probs] XFAIL [ 20%] tests/integration/inference/test_text_inference.py::test_text_completion_log_probs_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:completion:log_probs] XFAIL [ 25%] tests/integration/inference/test_text_inference.py::test_text_completion_structured_output[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:completion:structured_output] SKIPPED structured output) [ 30%] tests/integration/inference/test_text_inference.py::test_text_chat_completion_non_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:non_streaming_01] PASSED [ 35%] tests/integration/inference/test_text_inference.py::test_text_chat_completion_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:streaming_01] PASSED [ 40%] tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_tool_calling_and_non_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_calling] PASSED [ 45%] tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_tool_calling_and_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_calling] PASSED [ 50%] tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_tool_choice_required[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_calling] PASSED [ 55%] 
tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_tool_choice_none[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_calling] PASSED [ 60%] tests/integration/inference/test_text_inference.py::test_text_chat_completion_structured_output[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:structured_output] SKIPPEDstructured output) [ 65%] tests/integration/inference/test_text_inference.py::test_text_chat_completion_tool_calling_tools_not_in_request[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_calling_tools_absent-True] PASSED [ 70%] tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_multi_turn_tool_calling[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:text_then_tool] XFAIL [ 75%] tests/integration/inference/test_text_inference.py::test_text_chat_completion_non_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:non_streaming_02] PASSED [ 80%] tests/integration/inference/test_text_inference.py::test_text_chat_completion_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:streaming_02] PASSED [ 85%] tests/integration/inference/test_text_inference.py::test_text_chat_completion_tool_calling_tools_not_in_request[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_calling_tools_absent-False] PASSED [ 90%] tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_multi_turn_tool_calling[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_then_answer] XFAIL [ 95%] tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_multi_turn_tool_calling[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:array_parameter] XFAIL [100%] =========================================== short test summary info 
============================================ SKIPPED [2] tests/integration/inference/test_text_inference.py:49: Model watsonx/meta-llama/llama-3-3-70b-instruct hosted by remote::watsonx doesn't support json_schema structured output XFAIL tests/integration/inference/test_text_inference.py::test_text_completion_stop_sequence[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:completion:stop_sequence] - remote::watsonx doesn't support 'stop' parameter yet XFAIL tests/integration/inference/test_text_inference.py::test_text_completion_log_probs_non_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:completion:log_probs] - remote::watsonx doesn't support log probs yet XFAIL tests/integration/inference/test_text_inference.py::test_text_completion_log_probs_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:completion:log_probs] - remote::watsonx doesn't support log probs yet XFAIL tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_multi_turn_tool_calling[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:text_then_tool] - Not tested for non-llama4 models yet XFAIL tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_multi_turn_tool_calling[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_then_answer] - Not tested for non-llama4 models yet XFAIL tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_multi_turn_tool_calling[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:array_parameter] - Not tested for non-llama4 models yet ============================ 12 passed, 2 skipped, 6 xfailed, 14 warnings in 36.88s ============================ ``` --------- Signed-off-by: Sébastien Han <seb@redhat.com>	2025-09-16 13:55:10 +02:00
Matthew Farrellee	f4ab154ade	feat: add dynamic model registration support to TGI inference (#3417 ) # What does this PR do? adds dynamic model support to TGI add new overwrite_completion_id feature to OpenAIMixin to deal with TGI always returning id="" ## Test Plan tgi: `docker run --gpus all --shm-size 1g -p 8080:80 -v /data:/data ghcr.io/huggingface/text-generation-inference --model-id Qwen/Qwen3-0.6B` stack: `TGI_URL=http://localhost:8080 uv run llama stack build --image-type venv --distro ci-tests --run` test: `./scripts/integration-tests.sh --stack-config http://localhost:8321 --setup tgi --subdirs inference --pattern openai`	2025-09-15 15:52:40 -04:00
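The `overwrite_completion_id` behaviour mentioned in the TGI commit above could look roughly like the following. This is a minimal sketch, not the OpenAIMixin implementation: the `Completion` dataclass, the `ensure_completion_id` helper, and the `chatcmpl-` prefix are all hypothetical names chosen for illustration.

```python
import uuid
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Completion:
    """Hypothetical, simplified stand-in for a completion response."""

    id: str
    text: str


def ensure_completion_id(completion: Completion, overwrite: bool) -> Completion:
    """Return a copy with a generated id when overwriting is enabled.

    Assumption: a backend that always returns id="" would run with
    overwrite=True so that every stored response gets a distinct id.
    """
    if not overwrite:
        return completion
    return replace(completion, id=f"chatcmpl-{uuid.uuid4().hex}")


# A backend response with an empty id, as described for TGI.
raw = Completion(id="", text="hello")
fixed = ensure_completion_id(raw, overwrite=True)
untouched = ensure_completion_id(raw, overwrite=False)
```

Generating the id on the provider side keeps downstream stores, which key on the completion id, from seeing the same empty key for every response.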
IAN MILLER	ab321739f2	feat: create HTTP DELETE API endpoints to unregister ScoringFn and Benchmark resources in Llama Stack (#3371 ) # What does this PR do? This PR provides functionality for users to unregister ScoringFn and Benchmark resources for `scoring` and `eval` APIs. Closes #3051 ## Test Plan Updated integration and unit tests via CI workflow	2025-09-15 12:43:38 -07:00
Matthew Farrellee	01bdcce4d2	chore(recorder): update mocks to be closer to non-mock environment (#3442 ) # What does this PR do? the @required_args decorator in openai-python is masking the async nature of the {AsyncCompletions,chat.AsyncCompletions}.create method. see https://github.com/openai/openai-python/issues/996 this means two things: (0) we cannot use iscoroutine in the recorder to detect async vs non-async, and (1) our mocks are inappropriately introducing identifiable async. for (0), we update the iscoroutine check w/ detection of /v1/models, which is the only non-async function we mock & record. for (1), we could leave everything as is and assume (0) will catch errors. to be defensive, we update the unit tests to mock below the create methods, allowing the true openai-python create() methods to be tested.	2025-09-15 15:25:53 -04:00
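The masking effect described in the recorder commit above can be reproduced with the standard library alone. The sketch below is an illustration under assumptions: `sync_wrapper` is a hypothetical stand-in for a decorator that wraps an `async def` in a plain `def`, and it is not the openai-python `@required_args` code.

```python
import asyncio
import functools
import inspect


def sync_wrapper(func):
    """Hypothetical decorator that wraps the target in a plain `def`."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Calls the async function and hands back its coroutine object
        # without awaiting it, so callers can still `await` the result.
        return func(*args, **kwargs)

    return wrapper


async def create_plain(prompt: str) -> str:
    return f"echo: {prompt}"


@sync_wrapper
async def create_wrapped(prompt: str) -> str:
    return f"echo: {prompt}"


# The undecorated function is recognised as a coroutine function.
plain_is_async = inspect.iscoroutinefunction(create_plain)
# The decorated one is not, even though calling it yields a coroutine.
wrapped_is_async = inspect.iscoroutinefunction(create_wrapped)
# The wrapped function still behaves asynchronously when awaited.
result = asyncio.run(create_wrapped("hi"))
```

Because introspection reports the wrapper rather than the wrapped function, a recorder has to decide sync versus async some other way, for example by endpoint as the commit describes.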
dependabot[bot]	b6cb817897	chore(ui-deps): bump @radix-ui/react-select from 2.2.5 to 2.2.6 in /llama_stack/ui (#3437 ) Bumps [@radix-ui/react-select](https://github.com/radix-ui/primitives) from 2.2.5 to 2.2.6. <details> <summary>Commits</summary> <ul> <li>See full diff in <a href="https://github.com/radix-ui/primitives/commits">compare view</a></li> </ul> </details> <br /> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-09-15 09:46:14 +02:00
dependabot[bot]	36fd97e306	chore(ui-deps): bump next from 15.3.3 to 15.5.3 in /llama_stack/ui (#3438 ) Bumps [next](https://github.com/vercel/next.js) from 15.3.3 to 15.5.3. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/vercel/next.js/releases">next's releases</a>.</em></p> <blockquote> <h2>v15.5.3</h2> <blockquote> <p>[!NOTE]<br /> This release is backporting bug fixes. It does <strong>not</strong> include all pending features/changes on canary.</p> </blockquote> <h3>Core Changes</h3> <ul> <li>fix: validation return types of pages API routes (<a href="https://redirect.github.com/vercel/next.js/issues/83069">#83069</a>)</li> <li>fix: relative paths in dev in validator.ts (<a href="https://redirect.github.com/vercel/next.js/issues/83073">#83073</a>)</li> <li>fix: remove satisfies keyword from type validation to preserve old TS compatibility (<a href="https://redirect.github.com/vercel/next.js/issues/83071">#83071</a>)</li> </ul> <h3>Credits</h3> <p>Huge thanks to <a href="https://github.com/bgub"><code>@bgub</code></a> for helping!</p> <h2>v15.5.2</h2> <blockquote> <p>[!NOTE]<br /> This release is backporting bug fixes. It does <strong>not</strong> include all pending features/changes on canary.</p> </blockquote> <h3>Core Changes</h3> <ul> <li>fix: disable unknownatrules lint rule entirely (<a href="https://redirect.github.com/vercel/next.js/issues/83059">#83059</a>)</li> <li>revert: add ?dpl to fonts in /_next/static/media (<a href="https://redirect.github.com/vercel/next.js/issues/83062">#83062</a>)</li> </ul> <h3>Credits</h3> <p>Huge thanks to <a href="https://github.com/bgub"><code>@bgub</code></a> and <a href="https://github.com/ztanner"><code>@ztanner</code></a> for helping!</p> <h2>v15.5.1</h2> <blockquote> <p>[!NOTE]<br /> This release is backporting bug fixes. 
It does <strong>not</strong> include all pending features/changes on canary.</p> </blockquote> <h3>Core Changes</h3> <ul> <li>fix: aliased navigations should apply scroll handling (<a href="https://redirect.github.com/vercel/next.js/issues/82900">#82900</a>)</li> <li>Turbopack: fix invalid NFT entry with file behind symlink (<a href="https://redirect.github.com/vercel/next.js/issues/82887">#82887</a>)</li> <li>fix: typesafe linking to route handlers and pages API routes (<a href="https://redirect.github.com/vercel/next.js/issues/82858">#82858</a>)</li> <li>fix: change "noUnknownAtRules" to "warn" for Biome (<a href="https://redirect.github.com/vercel/next.js/issues/82974">#82974</a>)</li> <li>fix: add path normalization to getRelativePath for Windows (<a href="https://redirect.github.com/vercel/next.js/issues/82918">#82918</a>)</li> <li>feat: add typesafety with config.typedRoutes to redirect() and permanentRedirect() (<a href="https://redirect.github.com/vercel/next.js/issues/82860">#82860</a>)</li> <li>fix: avoid importing types that will be unused (<a href="https://redirect.github.com/vercel/next.js/issues/82856">#82856</a>)</li> <li>fix: update the config.api.responseLimit type (<a href="https://redirect.github.com/vercel/next.js/issues/82852">#82852</a>)</li> <li>fix: update validation return types (<a href="https://redirect.github.com/vercel/next.js/issues/82854">#82854</a>)</li> </ul> <h3>Credits</h3> <p>Huge thanks to <a href="https://github.com/bgub"><code>@bgub</code></a>, <a href="https://github.com/mischnic"><code>@mischnic</code></a>, and <a href="https://github.com/ztanner"><code>@ztanner</code></a> for helping!</p> <h2>v15.5.1-canary.39</h2> <h3>Core Changes</h3> <ul> <li>[metadata] change the metadata routes params to promises: <a href="https://redirect.github.com/vercel/next.js/issues/83560">#83560</a></li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... 
(truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="`07d1cbc9c6`"><code>07d1cbc</code></a> v15.5.3</li> <li><a href="`db56d77595`"><code>db56d77</code></a> [backport] fix: validation return types of pages API routes (<a href="https://redirect.github.com/vercel/next.js/issues/83069">#83069</a>) (<a href="https://redirect.github.com/vercel/next.js/issues/83580">#83580</a>)</li> <li><a href="`7a806231f8`"><code>7a80623</code></a> [backport] fix: relative paths in dev in validator.ts (<a href="https://redirect.github.com/vercel/next.js/issues/83073">#83073</a>) (<a href="https://redirect.github.com/vercel/next.js/issues/83190">#83190</a>)</li> <li><a href="`fddaeb85a0`"><code>fddaeb8</code></a> [backport] fix: remove <code>satisfies</code> keyword from type validation to preserve o...</li> <li><a href="`497ec6aa08`"><code>497ec6a</code></a> v15.5.2</li> <li><a href="`bc72f41a2e`"><code>bc72f41</code></a> [backport] revert: add ?dpl to fonts in <code>/_next/static/media</code> (<a href="https://redirect.github.com/vercel/next.js/issues/83062">#83062</a>) (<a href="https://redirect.github.com/vercel/next.js/issues/83066">#83066</a>)</li> <li><a href="`c8faf6800b`"><code>c8faf68</code></a> [backport] fix: disable unknownatrules lint rule entirely (<a href="https://redirect.github.com/vercel/next.js/issues/83059">#83059</a>) (<a href="https://redirect.github.com/vercel/next.js/issues/83060">#83060</a>)</li> <li><a href="`cc68ced552`"><code>cc68ced</code></a> v15.5.1</li> <li><a href="`1ce9857276`"><code>1ce9857</code></a> [backport] fix: update validation return types (<a href="https://redirect.github.com/vercel/next.js/issues/82854">#82854</a>) (<a href="https://redirect.github.com/vercel/next.js/issues/83027">#83027</a>)</li> <li><a href="`b93c894717`"><code>b93c894</code></a> [backport] fix: update the config.api.responseLimit type (<a href="https://redirect.github.com/vercel/next.js/issues/82852">#82852</a>) (<a 
href="https://redirect.github.com/vercel/next.js/issues/83028">#83028</a>)</li> <li>Additional commits viewable in <a href="https://github.com/vercel/next.js/compare/v15.3.3...v15.5.3">compare view</a></li> </ul> </details> <br /> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-09-15 09:46:05 +02:00
Matthew Farrellee	6787755c0c	chore(recorder): add support for NOT_GIVEN (#3430 ) # What does this PR do? the recorder mocks the openai-python interface. the openai-python interface allows NOT_GIVEN as an input option. this change properly handles NOT_GIVEN. ## Test Plan ci (coverage for chat, completions, embeddings)	2025-09-13 11:11:38 -07:00
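One plausible way to handle a "parameter omitted" sentinel in a recorder is to drop it before computing the request key, so that an omitted parameter and an absent one match the same recording. The sketch below is an assumption about the general technique, not the recorder's code: `NotGiven`, `drop_not_given`, and `request_key` are hypothetical names, and the sentinel is defined locally rather than imported from openai-python.

```python
import hashlib
import json


class NotGiven:
    """Hypothetical stand-in for a 'parameter omitted' sentinel type."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = NotGiven()


def drop_not_given(params: dict) -> dict:
    """Return a copy of params without entries whose value is the sentinel."""
    return {k: v for k, v in params.items() if not isinstance(v, NotGiven)}


def request_key(params: dict) -> str:
    """Stable key for a request body; omitted parameters do not affect it."""
    body = json.dumps(drop_not_given(params), sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


with_sentinel = request_key({"model": "m", "temperature": NOT_GIVEN})
without_param = request_key({"model": "m"})
```

Filtering first also avoids trying to JSON-serialize the sentinel object, which would otherwise raise a `TypeError`.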
Matthew Farrellee	3de9ad0a87	chore(recorder, tests): add test for openai /v1/models (#3426 ) # What does this PR do? - [x] adds a test for the recorder's handling of /v1/models - [x] adds a fix for /v1/models handling ## Test Plan ci	2025-09-12 14:59:56 -07:00
Doug Edgar	f67081d2d6	feat: migrate to FIPS-validated cryptographic algorithms (#3423 ) # What does this PR do? Migrates MD5 and SHA-1 hash algorithms to SHA-256. In particular, replaces: - MD5 in chunk ID generation. - MD5 in file verification. - SHA-1 in model identifier digests. And updates all related test expectations. Original discussion: https://github.com/llamastack/llama-stack/discussions/3413 Closes #3424. ## Test Plan Unit tests from scripts/unit-tests.sh were updated to match the new hash output and were run to verify that they pass. Signed-off-by: Doug Edgar <dedgar@redhat.com>	2025-09-12 11:18:19 +02:00
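The kind of change described in the FIPS commit above can be sketched with `hashlib`. This is an illustration only: the `chunk_id` and `file_digest` helpers, the `document_id:text` input format, and the UUID-shaped output are assumptions, not the project's actual ID scheme.

```python
import hashlib
import uuid


def chunk_id(document_id: str, chunk_text: str) -> str:
    """Derive a deterministic chunk ID from SHA-256 instead of MD5.

    Assumption: the ID is UUID-shaped, so the first 128 bits (32 hex
    characters) of the SHA-256 digest are used.
    """
    payload = f"{document_id}:{chunk_text}".encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    return str(uuid.UUID(digest[:32]))


def file_digest(data: bytes) -> str:
    """SHA-256 hex digest for file verification (replacing an MD5 check)."""
    return hashlib.sha256(data).hexdigest()


first = chunk_id("doc-1", "hello world")
second = chunk_id("doc-1", "hello world")
other = chunk_id("doc-2", "hello world")
empty_digest = file_digest(b"")
```

Any IDs or checksums persisted under the old algorithm will not match the new output, which is why the commit also updates test expectations.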
Matthew Farrellee	8ef1189be7	chore: update the vLLM inference impl to use OpenAIMixin for openai-compat functions (#3404 ) # What does this PR do? 
update vLLM inference provider to use OpenAIMixin for openai-compat functions inference recordings from Qwen3-0.6B and vLLM 0.8.3 - ``` docker run --gpus all -v ~/.cache/huggingface:/root/.cache/huggingface -p 8000:8000 --ipc=host \ vllm/vllm-openai:latest \ --model Qwen/Qwen3-0.6B --enable-auto-tool-choice --tool-call-parser hermes ``` ## Test Plan ``` ./scripts/integration-tests.sh --stack-config server:ci-tests --setup vllm --subdirs inference ```	2025-09-11 09:04:38 -04:00
Francisco Arceo	d15368a302	chore: Updating documentation, adding exception handling for Vector Stores in RAG Tool, more tests on migration, and migrate off of inference_api for context_retriever for RAG (#3367 ) # What does this PR do? - Updating documentation on migration from RAG Tool to Vector Stores and Files APIs - Adding exception handling for Vector Stores in RAG Tool - Add more tests on migration from RAG Tool to Vector Stores - Migrate off of inference_api for context_retriever for RAG ## Test Plan Integration and unit tests added Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-09-11 14:20:11 +02:00
Sébastien Han	f31bcc11bc	feat: add Azure OpenAI inference provider support (#3396 ) # What does this PR do? Llama-stack now supports a new OpenAI compatible endpoint with Azure OpenAI. The starter distro has been updated to add the new remote inference provider. A few tests have been modified and improved. ## Test Plan Deploy a model in the Azure portal then: ``` $ AZURE_API_KEY=... AZURE_API_BASE=... uv run llama stack build --image-type venv --providers inference=remote::azure --run ... $ LLAMA_STACK_CONFIG=http://localhost:8321 uv run --group test pytest -v -ra --text-model azure/gpt-4.1 tests/integration/inference/test_openai_completion.py ... Results: ``` ============================================= test session starts ============================================== platform darwin -- Python 3.12.8, pytest-8.4.1, pluggy-1.6.0 -- /Users/leseb/Documents/AI/llama-stack/.venv/bin/python3 cachedir: .pytest_cache metadata: {'Python': '3.12.8', 'Platform': 'macOS-15.6.1-arm64-arm-64bit', 'Packages': {'pytest': '8.4.1', 'pluggy': '1.6.0'}, 'Plugins': {'anyio': '4.9.0', 'html': '4.1.1', 'socket': '0.7.0', 'asyncio': '1.1.0', 'json-report': '1.5.0', 'timeout': '2.4.0', 'metadata': '3.1.1', 'cov': '6.2.1', 'nbval': '0.11.0', 'hydra-core': '1.3.2'}} rootdir: /Users/leseb/Documents/AI/llama-stack configfile: pyproject.toml plugins: anyio-4.9.0, html-4.1.1, socket-0.7.0, asyncio-1.1.0, json-report-1.5.0, timeout-2.4.0, metadata-3.1.1, cov-6.2.1, nbval-0.11.0, hydra-core-1.3.2 asyncio: mode=Mode.AUTO, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function collected 27 items tests/integration/inference/test_openai_completion.py::test_openai_completion_non_streaming[txt=azure/gpt-5-mini-inference:completion:sanity] SKIPPED [ 3%] tests/integration/inference/test_openai_completion.py::test_openai_completion_non_streaming_suffix[txt=azure/gpt-5-mini-inference:completion:suffix] SKIPPED [ 7%] 
tests/integration/inference/test_openai_completion.py::test_openai_completion_streaming[txt=azure/gpt-5-mini-inference:completion:sanity] SKIPPED [ 11%] tests/integration/inference/test_openai_completion.py::test_openai_completion_prompt_logprobs[txt=azure/gpt-5-mini-1] SKIPPED [ 14%] tests/integration/inference/test_openai_completion.py::test_openai_completion_guided_choice[txt=azure/gpt-5-mini] SKIPPED [ 18%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[openai_client-txt=azure/gpt-5-mini-inference:chat_completion:non_streaming_01] PASSED [ 22%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[openai_client-txt=azure/gpt-5-mini-inference:chat_completion:streaming_01] PASSED [ 25%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[openai_client-txt=azure/gpt-5-mini-inference:chat_completion:streaming_01] PASSED [ 29%] tests/integration/inference/test_openai_completion.py::test_inference_store[openai_client-txt=azure/gpt-5-mini-True] PASSED [ 33%] tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=azure/gpt-5-mini-True] PASSED [ 37%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming_with_file[txt=azure/gpt-5-mini] SKIPPEDed files.) 
[ 40%] tests/integration/inference/test_openai_completion.py::test_openai_completion_prompt_logprobs[txt=azure/gpt-5-mini-0] SKIPPED [ 44%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[openai_client-txt=azure/gpt-5-mini-inference:chat_completion:non_streaming_02] PASSED [ 48%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[openai_client-txt=azure/gpt-5-mini-inference:chat_completion:streaming_02] PASSED [ 51%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[openai_client-txt=azure/gpt-5-mini-inference:chat_completion:streaming_02] PASSED [ 55%] tests/integration/inference/test_openai_completion.py::test_inference_store[openai_client-txt=azure/gpt-5-mini-False] PASSED [ 59%] tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=azure/gpt-5-mini-False] PASSED [ 62%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[client_with_models-txt=azure/gpt-5-mini-inference:chat_completion:non_streaming_01] PASSED [ 66%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[client_with_models-txt=azure/gpt-5-mini-inference:chat_completion:streaming_01] PASSED [ 70%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[client_with_models-txt=azure/gpt-5-mini-inference:chat_completion:streaming_01] PASSED [ 74%] tests/integration/inference/test_openai_completion.py::test_inference_store[client_with_models-txt=azure/gpt-5-mini-True] PASSED [ 77%] tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=azure/gpt-5-mini-True] PASSED [ 81%] 
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[client_with_models-txt=azure/gpt-5-mini-inference:chat_completion:non_streaming_02] PASSED [ 85%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[client_with_models-txt=azure/gpt-5-mini-inference:chat_completion:streaming_02] PASSED [ 88%] tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[client_with_models-txt=azure/gpt-5-mini-inference:chat_completion:streaming_02] PASSED [ 92%] tests/integration/inference/test_openai_completion.py::test_inference_store[client_with_models-txt=azure/gpt-5-mini-False] PASSED [ 96%] tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=azure/gpt-5-mini-False] PASSED [100%] =========================================== short test summary info ============================================ SKIPPED [3] tests/integration/inference/test_openai_completion.py:63: Model azure/gpt-5-mini hosted by remote::azure doesn't support OpenAI completions. SKIPPED [3] tests/integration/inference/test_openai_completion.py:118: Model azure/gpt-5-mini hosted by remote::azure doesn't support vllm extra_body parameters. SKIPPED [1] tests/integration/inference/test_openai_completion.py:124: Model azure/gpt-5-mini hosted by remote::azure doesn't support chat completion calls with base64 encoded files. ================================== 20 passed, 7 skipped, 2 warnings in 51.77s ================================== ``` Signed-off-by: Sébastien Han <seb@redhat.com>	2025-09-11 13:48:38 +02:00
Matthew Farrellee	c2d281e01b	chore(replay): improve replay robustness with un-validated construction (#3414 ) # What does this PR do? some providers do not produce spec compliant outputs. when this happens the replay infra will fail to construct the proper types and will return a dict to the client. the client likely does not expect a dict. this was discovered with tgi, which returns finish_reason="" when valid values are "stop", "length" or "content_filter" ## Test Plan ci	2025-09-11 13:48:19 +02:00
Sumanth Kamenani	2838d5a20f	fix: AWS Bedrock inference profile ID conversion for region-specific endpoints (#3386 ) Fixes #3370 AWS switched to requiring region-prefixed inference profile IDs instead of foundation model IDs for on-demand throughput. This was causing ValidationException errors. Added auto-detection based on boto3 client region to convert model IDs like meta.llama3-1-70b-instruct-v1:0 to us.meta.llama3-1-70b-instruct-v1:0 depending on the detected region. Also handles edge cases like ARNs, case insensitive regions, and None regions. Tested with this request. ```json { "model_id": "meta.llama3-1-8b-instruct-v1:0", "messages": [ { "role": "system", "content": "You are a helpful assistant." }, { "role": "user", "content": "tell me a riddle" } ], "sampling_params": { "strategy": { "type": "top_p", "temperature": 0.7, "top_p": 0.9 }, "max_tokens": 512 } } ``` <img width="1488" height="878" alt="image" src="https://github.com/user-attachments/assets/0d61beec-3869-4a31-8f37-9f554c280b88" />	2025-09-11 11:41:53 +02:00
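The conversion described in the Bedrock commit above can be sketched as a small pure function. This is a sketch under stated assumptions, not the provider's implementation: the function name `to_inference_profile_id` and the region-to-prefix table (`us`, `eu`, `ap` mapping to `apac`) are assumptions for illustration; only the `meta.llama3-1-70b-instruct-v1:0` to `us.meta.llama3-1-70b-instruct-v1:0` example and the listed edge cases come from the commit message.

```python
from typing import Optional

# Assumed mapping from the leading part of an AWS region name to a
# profile prefix; the real provider may use a different table.
REGION_PREFIXES = {"us": "us", "eu": "eu", "ap": "apac"}


def to_inference_profile_id(model_id: str, region: Optional[str]) -> str:
    """Prefix a foundation model ID with a region-based profile prefix.

    Leaves the ID unchanged for ARNs, unknown or missing regions, and
    IDs that already carry the prefix.
    """
    if region is None or model_id.startswith("arn:"):
        return model_id
    # Region names are matched case-insensitively, e.g. "US-WEST-2".
    geo = region.lower().split("-", 1)[0]
    prefix = REGION_PREFIXES.get(geo)
    if prefix is None or model_id.startswith(f"{prefix}."):
        return model_id
    return f"{prefix}.{model_id}"


converted = to_inference_profile_id("meta.llama3-1-70b-instruct-v1:0", "us-east-1")
```

Keeping the conversion free of boto3 calls makes the edge cases easy to unit test; the caller would pass in the region it read from the client.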
Sébastien Han	8e05c68d15	chore: remove openai dependency from providers (#3398 ) # What does this PR do? The openai package is already a dependency of the llama-stack project itself, so let the project dictate which openai version we need and avoid potential breakage from unsatisfiable dependency resolution. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-09-11 10:19:59 +02:00
Ashwin Bharambe	0c7f49490c	fix(inference_store): on duplicate chat completion IDs, replace (#3408 ) # What does this PR do? Duplicate chat completion IDs can be generated during tests especially if they are replaying recorded responses across different tests. No need to warn or error under those circumstances. In the wild, this is not likely to happen at all (no evidence) so we aren't really hiding any problem.	2025-09-10 14:34:18 -07:00
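The "replace on duplicate ID" behaviour described in the inference_store commit above can be illustrated with SQLite's `INSERT OR REPLACE`. This is a minimal sketch assuming a hypothetical `chat_completions` table with the completion ID as primary key; the actual store goes through the project's own SqlStore layer, which may express the same idea differently.

```python
import sqlite3

# In-memory database with a hypothetical, simplified schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chat_completions (id TEXT PRIMARY KEY, body TEXT)")


def store_completion(db: sqlite3.Connection, completion_id: str, body: str) -> None:
    """Insert a completion, replacing any existing row with the same id."""
    db.execute(
        "INSERT OR REPLACE INTO chat_completions (id, body) VALUES (?, ?)",
        (completion_id, body),
    )
    db.commit()


# Replayed recordings can produce the same id twice; the second write wins
# instead of raising an IntegrityError.
store_completion(conn, "chatcmpl-1", "first")
store_completion(conn, "chatcmpl-1", "second")
rows = conn.execute("SELECT id, body FROM chat_completions").fetchall()
```

With a plain `INSERT`, the second call would fail on the primary-key constraint, which is the warning or error path the commit removes.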
dependabot[bot]	d4e45cd5f1	chore(ui-deps): bump tailwindcss from 4.1.6 to 4.1.13 in /llama_stack/ui (#3362 )

Bumps [tailwindcss](https://github.com/tailwindlabs/tailwindcss/tree/HEAD/packages/tailwindcss) from 4.1.6 to 4.1.13.

Release notes, sourced from [tailwindcss's releases](https://github.com/tailwindlabs/tailwindcss/releases) and [tailwindcss's changelog](https://github.com/tailwindlabs/tailwindcss/blob/main/CHANGELOG.md):

## v4.1.13 (2025-09-03)

### Changed

- Drop warning from browser build (#18731)
- Drop exact duplicate declarations when emitting CSS (#18809)

### Fixed

- Don't transition `visibility` when using `transition` (#18795)
- Discard matched variants with unknown named values (#18799)
- Discard matched variants with non-string values (#18799)
- Show suggestions for known `matchVariant` values (#18798)
- Replace deprecated `clip` with `clip-path` in `sr-only` (#18769)
- Hide internal fields from completions in `matchUtilities` (#18820)
- Ignore `.vercel` folders by default (can be overridden by `@source …` rules) (#18855)
- Consider variants starting with `@-` to be invalid (e.g. `@-2xl:flex`) (#18869)
- Do not allow custom variants to start or end with a `-` or `_` (#18867, #18872)
- Upgrade: Migrate `aria` theme keys to `@custom-variant` (#18815)
- Upgrade: Migrate `data` theme keys to `@custom-variant` (#18816)
- Upgrade: Migrate `supports` theme keys to `@custom-variant` (#18817)

## v4.1.12 (2025-08-13)

### Fixed

- Don't consider the global important state in `@apply` (#18404)
- Add missing suggestions for `flex-<number>` utilities (#18642)
- Fix trailing `)` from interfering with extraction in Clojure keywords (#18345)
- Detect classes inside Elixir charlist, word list, and string sigils (#18432)
- Track source locations through `@plugin` and `@config` (#18345)
- Allow boolean values of `process.env.DEBUG` in `@tailwindcss/node` (#18485)
- Ignore consecutive semicolons in the CSS parser (#18532)
- Center the dropdown icon added to an input with a paired datalist by default (#18511)
- Extract candidates in Slang templates (#18565)
- Improve error messages when encountering invalid functional utility names (#18568)
- Discard CSS AST objects with `false` or `undefined` properties (#18571)
- Allow users to disable URL rebasing in `@tailwindcss/postcss` via `transformAssetUrls: false` (#18321)
- Fix false-positive migrations in `addEventListener` and JavaScript variable names (#18718)
- Fix Standalone CLI showing default Bun help when run via symlink on Windows (#18723)
- Read from `--border-color-` theme keys in `divide-` utilities for backwards compatibility (#18704)
- Don't scan `.hdr` and `.exr` files for classes by default (#18734)

## v4.1.11 (2025-06-26)

### Fixed

- Add heuristic to skip candidate migrations inside `emit(…)` (#18330)
- Extract candidates with variants in Clojure/ClojureScript keywords (#18338)
- Document `--watch=always` in the CLI's usage (#18337)
- Add support for Vite 7 to `@tailwindcss/vite` (#18384)

## v4.1.10

... (truncated)

Commits:

- `1334c99` Prepare v4.1.13 release (#18868)
- `65dc530` Do not allow variants to end with `-` or `_` (#18872)
- `54c3f30` Do not allow variants to start with `-` (#18867)
- `494051c` Consider variants starting with `@-` to be invalid (e.g. `@-2xl:flex`) (#18869)
- `c318329` chore: remove redundant words (#18853)
- `ddc84b0` update test after prettier change
- `f1331a8` run prettier
- `e5513b6` Fix missing code block delimiters in comment blocks (#18837)
- `5e2a160` Drop exact duplicate declarations from output CSS within a style rule (#18809)
- `b1fb02a` Hide internal fields from completions in `matchUtilities` (#18820)
- Additional commits viewable in [compare view](https://github.com/tailwindlabs/tailwindcss/commits/v4.1.13/packages/tailwindcss)

Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-09-10 13:18:14 -07:00
ehhuang	e980436a2e	chore: introduce write queue for inference_store (#3383 )

# What does this PR do?

Adds a write worker queue for writes to inference store. This avoids overwhelming request processing with slow inference writes.

## Test Plan

Benchmark:

```
cd /docs/source/distributions/k8s-benchmark

# start mock server
python openai-mock-server.py --port 8000

# start stack server
LLAMA_STACK_LOGGING="all=WARNING" uv run --with llama-stack python -m llama_stack.core.server.server docs/source/distributions/k8s-benchmark/stack_run_config.yaml

# run benchmark script
uv run python3 benchmark.py --duration 120 --concurrent 50 --base-url=http://localhost:8321/v1/openai/v1 --model=vllm-inference/meta-llama/Llama-3.2-3B-Instruct
```

## RPS from 21 -> 57
	2025-09-10 11:57:42 -07:00
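The idea behind the write queue in #3383 can be sketched as follows: request handlers enqueue a record and return, while a background task performs the slow write. This is only an illustration under stated assumptions; the `WriteQueue` class, its method names, and the single-worker design are hypothetical and are not the llama-stack implementation.

```python
import asyncio
import contextlib
import logging

logger = logging.getLogger(__name__)


class WriteQueue:
    """Buffers writes and applies them in one background task (illustrative only)."""

    def __init__(self, write_fn, max_size=1000):
        self._write_fn = write_fn  # coroutine function that persists one item
        self._queue = asyncio.Queue(maxsize=max_size)
        self._worker = None

    async def start(self):
        # Must be called from a running event loop.
        self._worker = asyncio.create_task(self._run())

    async def enqueue(self, item):
        # Returns once the item is queued; waits only when the queue is full.
        await self._queue.put(item)

    async def _run(self):
        while True:
            item = await self._queue.get()
            try:
                await self._write_fn(item)
            except Exception:
                # A failed write is logged and skipped so the worker keeps running.
                logger.exception("write failed")
            finally:
                self._queue.task_done()

    async def shutdown(self):
        # Drain pending writes, then stop the worker task.
        await self._queue.join()
        if self._worker is not None:
            self._worker.cancel()
            with contextlib.suppress(asyncio.CancelledError):
                await self._worker


async def demo():
    written = []

    async def slow_write(item):
        await asyncio.sleep(0.01)  # stands in for a slow database write
        written.append(item)

    queue = WriteQueue(slow_write)
    await queue.start()
    for i in range(3):
        await queue.enqueue(i)  # the request path only pays for the enqueue
    await queue.shutdown()
    return written
```

A bounded queue is used here so that a stalled store applies back-pressure instead of growing memory without limit; whether the real change makes the same trade-off is not stated in the commit message.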
Francisco Arceo	a6b1588dc6	revert: Fireworks chat completion broken due to telemetry (#3402 ) Reverts llamastack/llama-stack#3392	2025-09-10 11:53:38 -07:00
ehhuang	f6bf36343d	chore: logging perf improvements (#3393 )

# What does this PR do?

- Use BackgroundLogger when logging metric events.
- Reuse event loop in BackgroundLogger

## Test Plan

```
cd /docs/source/distributions/k8s-benchmark

# start mock server
python openai-mock-server.py --port 8000

# start stack server
LLAMA_STACK_LOGGING="all=WARNING" uv run --with llama-stack python -m llama_stack.core.server.server docs/source/distributions/k8s-benchmark/stack_run_config.yaml

# run benchmark script
uv run python3 benchmark.py --duration 120 --concurrent 50 --base-url=http://localhost:8321/v1/openai/v1 --model=vllm-inference/meta-llama/Llama-3.2-3B-Instruct
```

### RPS from 57 -> 62
	2025-09-10 11:52:23 -07:00
slekkala1	935b8e28de	fix: Fireworks chat completion broken due to telemetry (#3392 )

# What does this PR do?

Fix Fireworks chat completion, which was broken because telemetry expected response.usage.

Closes https://github.com/llamastack/llama-stack/issues/3391

## Test Plan

1. `uv run --with llama-stack llama stack build --distro starter --image-type venv --run`

Try

```
curl -X POST http://0.0.0.0:8321/v1/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fireworks/accounts/fireworks/models/llama-v3p1-8b-instruct",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

```
{"id":"chatcmpl-ee922a08-0df0-4974-b0d3-b322113e8bc0","choices":[{"message":{"role":"assistant","content":"Hello! How can I assist you today?","name":null,"tool_calls":null},"finish_reason":"stop","index":0,"logprobs":null}],"object":"chat.completion","created":1757456375,"model":"fireworks/accounts/fireworks/models/llama-v3p1-8b-instruct"}
```

Without the fix, the request fails as described in https://github.com/llamastack/llama-stack/issues/3391

Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>	2025-09-10 08:48:01 -07:00
Sébastien Han	c86e45496e	ci: Re-enable pre-commit to fail (#3399 ) If pre-commit fails, the workflow must fail. --------- Signed-off-by: Sébastien Han <seb@redhat.com>	2025-09-10 10:00:46 -04:00
Matthew Farrellee	0e27016cf2	chore: update the vertexai inference impl to use openai-python for openai-compat functions (#3377 )

# What does this PR do?

Update the VertexAI inference provider to use openai-python for openai-compat functions.

## Test Plan

```
$ VERTEX_AI_PROJECT=... uv run llama stack build --image-type venv --providers inference=remote::vertexai --run
...
$ LLAMA_STACK_CONFIG=http://localhost:8321 uv run --group test pytest -v -ra --text-model vertexai/vertex_ai/gemini-2.5-flash tests/integration/inference/test_openai_completion.py
...
```

I don't have an account to test this. `get_api_key` may also need to be updated per https://cloud.google.com/vertex-ai/generative-ai/docs/start/openai

---------

Signed-off-by: Sébastien Han <seb@redhat.com> Co-authored-by: Sébastien Han <seb@redhat.com>	2025-09-10 15:39:29 +02:00
Cesare Pompeiano	1c23aeb937	feat: Add vector_db_id to chunk metadata (#3304 )

# What does this PR do?

When running RAG in a multi vector DB setting, it can be difficult to trace where retrieved chunks originate from. This PR adds the `vector_db_id` into each chunk’s metadata, making it easier to understand which database a given chunk came from. This is helpful for debugging and for analyzing retrieval behavior of multiple DBs.

Relevant code:

```python
for vector_db_id, result in zip(vector_db_ids, results):
    for chunk, score in zip(result.chunks, result.scores):
        if not hasattr(chunk, "metadata") or chunk.metadata is None:
            chunk.metadata = {}
        chunk.metadata["vector_db_id"] = vector_db_id
        chunks.append(chunk)
        scores.append(score)
```

## Test Plan

* Ran Llama Stack in debug mode.
* Verified that `vector_db_id` was added to each chunk’s metadata.
* Confirmed that the metadata was printed in the console when using the RAG tool.

---------

Co-authored-by: are-ces <cpompeia@redhat.com> Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>	2025-09-10 11:19:21 +02:00
Matthew Farrellee	dd1f946b3e	feat: include a default inference store during llama stack build (#3373 )

# What does this PR do?

enables completions storage when using `llama stack build --providers`

- GET /v1/chat/completions
- GET /v1/chat/completions/{id}

todo: llama stack build and distro codegen should use the same code paths

## Test Plan

ci	2025-09-09 15:54:58 -07:00
ehhuang	9d3a234bf3	chore: remove unused variable (#3389 ) # What does this PR do? ## Test Plan	2025-09-09 15:51:20 -07:00
github-actions[bot]	28696c3f30	build: Bump version to 0.2.21	2025-09-08 22:30:03 +00:00
Ashwin Bharambe	30468d0c43	fix(deps): bump datasets versions for all providers (#3382 ) Not doing so results in errors of the kind you see in: `4989026435`	2025-09-08 15:13:42 -07:00
Derek Higgins	ef02b9ea10	fix: environment variable typo in inference recorder error message (#3374 ) The error message was referencing LLAMA_STACK_INFERENCE_MODE instead of the correct LLAMA_STACK_TEST_INFERENCE_MODE environment variable.	2025-09-08 17:51:38 +02:00
Francisco Arceo	ad6ea7fb91	feat: Adding OpenAI Prompts API (#3319 )

# What does this PR do?

This PR adds support for OpenAI Prompts API. Note, OpenAI does not explicitly expose the Prompts API but instead makes it available in the Responses API and in the [Prompts Dashboard](https://platform.openai.com/docs/guides/prompting#create-a-prompt).

I have added the following APIs:

- CREATE
- GET
- LIST
- UPDATE
- Set Default Version

The Set Default Version API is made available only in the Prompts Dashboard and configures which prompt version is returned in the GET (the latest version is the default).

Overall, the expected functionality in Responses will look like this:

```python
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    prompt={
        "id": "pmpt_68b0c29740048196bd3a6e6ac3c4d0e20ed9a13f0d15bf5e",
        "version": "2",
        "variables": {
            "city": "San Francisco",
            "age": 30,
        }
    }
)
```

### Resolves https://github.com/llamastack/llama-stack/issues/3276

## Test Plan

Unit tests added. Integration tests can be added after client generation.

## Next Steps

1. Update Responses API to support Prompt API
2. I'll enhance the UI to implement the Prompt Dashboard.
3. Add cache for lower latency

---------

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-09-08 11:05:13 -04:00
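The versioning rule stated in #3319 (GET returns the latest version unless a default version has been set) can be sketched with a small in-memory store. Everything here is hypothetical: the `PromptStore` class, its method names, and the choice to keep a pinned default after later updates are assumptions for illustration, not the API added by the PR.

```python
class PromptStore:
    """In-memory versioned prompts (illustrative only)."""

    def __init__(self):
        self._versions = {}  # prompt_id -> list of texts; list index + 1 is the version
        self._default = {}   # prompt_id -> explicitly pinned default version

    def create(self, prompt_id, text):
        self._versions[prompt_id] = [text]
        return 1

    def update(self, prompt_id, text):
        # Each update appends a new version and returns its number.
        self._versions[prompt_id].append(text)
        return len(self._versions[prompt_id])

    def set_default_version(self, prompt_id, version):
        if not 1 <= version <= len(self._versions[prompt_id]):
            raise ValueError(f"unknown version {version} for {prompt_id}")
        self._default[prompt_id] = version

    def get(self, prompt_id, version=None):
        versions = self._versions[prompt_id]
        # Explicit version wins, then the pinned default, then the latest version.
        chosen = version or self._default.get(prompt_id, len(versions))
        return chosen, versions[chosen - 1]


store = PromptStore()
store.create("weather", "Describe the weather in {city}.")
store.update("weather", "Describe today's weather in {city} for someone aged {age}.")
latest = store.get("weather")       # no default set, so the latest version is returned
store.set_default_version("weather", 1)
pinned = store.get("weather")       # the pinned default is returned
```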
Akram Ben Aissi	072dca0609	feat: Add Kubernetes auth provider to use SelfSubjectReview and kubernetes api server (#2559 )

# What does this PR do?

Add Kubernetes authentication provider support

- Add KubernetesAuthProvider class for token validation using Kubernetes SelfSubjectReview API
- Add KubernetesAuthProviderConfig with configurable API server URL, TLS settings, and claims mapping
- Implement authentication via POST requests to /apis/authentication.k8s.io/v1/selfsubjectreviews endpoint
- Add support for parsing Kubernetes SelfSubjectReview response format to extract user information
- Add KUBERNETES provider type to AuthProviderType enum
- Update create_auth_provider factory function to handle 'kubernetes' provider type
- Add comprehensive unit tests for KubernetesAuthProvider functionality
- Add documentation with configuration examples and usage instructions

The provider validates tokens by sending SelfSubjectReview requests to the Kubernetes API server and extracts user information from the userInfo structure in the response.

## Test Plan

What This Verifies:

- Authentication header validation
- Token validation with Kubernetes SelfSubjectReview and kubernetes server API endpoint
- Error handling for invalid tokens and HTTP errors
- Request payload structure and headers

```
python -m pytest tests/unit/server/test_auth.py -k "kubernetes" -v
```

Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>	2025-09-08 11:25:10 +02:00
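The token-validation flow described in #2559 can be sketched in two steps: build a POST to the SelfSubjectReview endpoint carrying the caller's bearer token, then read `status.userInfo` from the response. The sketch below uses only the standard library and does not contact a cluster; the helper names `build_review_request` and `parse_user_info`, and the shape of the returned dictionary, are assumptions for illustration and do not reflect the `KubernetesAuthProvider` code.

```python
import json
import urllib.request

# Endpoint path quoted in the commit message.
SELF_SUBJECT_REVIEW_PATH = "/apis/authentication.k8s.io/v1/selfsubjectreviews"


def build_review_request(api_server_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the SelfSubjectReview request for a bearer token."""
    body = {"apiVersion": "authentication.k8s.io/v1", "kind": "SelfSubjectReview"}
    return urllib.request.Request(
        url=api_server_url.rstrip("/") + SELF_SUBJECT_REVIEW_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_user_info(review: dict) -> dict:
    """Extract the user identity from a decoded SelfSubjectReview response."""
    user_info = review.get("status", {}).get("userInfo", {})
    if not user_info.get("username"):
        raise ValueError("SelfSubjectReview response has no userInfo.username")
    # Hypothetical output shape; a real provider would apply its claims mapping here.
    return {
        "principal": user_info["username"],
        "groups": user_info.get("groups", []),
    }


request = build_review_request("https://kubernetes.example:6443/", "example-token")
sample_response = {"status": {"userInfo": {"username": "alice", "groups": ["dev"]}}}
user = parse_user_info(sample_response)
```

Sending the request (for example with `urllib.request.urlopen` and an `ssl` context configured for the cluster CA) is left out so the sketch stays runnable without a cluster.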
dependabot[bot]	e1b81ce1fc	chore(ui-deps): bump @radix-ui/react-dropdown-menu from 2.1.14 to 2.1.16 in /llama_stack/ui (#3361 ) Bumps [@radix-ui/react-dropdown-menu](https://github.com/radix-ui/primitives) from 2.1.14 to 2.1.16. Commits: see full diff in [compare view](https://github.com/radix-ui/primitives/commits). Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-09-08 09:59:44 +02:00
dependabot[bot]	e508aef320	chore(ui-deps): bump lucide-react from 0.510.0 to 0.542.0 in /llama_stack/ui (#3363) Bumps [lucide-react](https://github.com/lucide-icons/lucide/tree/HEAD/packages/lucide-react) from 0.510.0 to 0.542.0.
Release notes (sourced from [lucide-react's releases](https://github.com/lucide-icons/lucide/releases)):
## Version 0.542.0
## What's Changed
- feat(docs): add MDN Web Docs & Nuxt to showcase by @karsa-mistmere in lucide-icons/lucide#3590
- feat(icons): added `list-chevrons-down-up` icon by @juliankellydesign in lucide-icons/lucide#3492
## New Contributors
- @juliankellydesign made their first contribution in lucide-icons/lucide#3492
Full Changelog: https://github.com/lucide-icons/lucide/compare/0.541.0...0.542.0
## Version 0.541.0
## What's Changed
- feat(packages/lucide): added support for providing a custom root element by @karsa-mistmere in lucide-icons/lucide#3543
- fix(icons): optimized `chrome` icon & renamed to `chromium` by @jguddas in lucide-icons/lucide#3572
- fix(icons): changed `wallpaper` icon by @jguddas in lucide-icons/lucide#3566
- fix(icons): optimized `cog` icon by @jguddas in lucide-icons/lucide#3548
- fix(icons): changed `building` icon by @karsa-mistmere in lucide-icons/lucide#3510
- feat(dpi-preview): add previous version for easier comparison by @jguddas in lucide-icons/lucide#3532
- feat(icons): added 'panel-dashed' variants + update tags on existing icons by @irvineacosta in lucide-icons/lucide#3500
Full Changelog: https://github.com/lucide-icons/lucide/compare/0.540.0...0.541.0
## Version 0.540.0
## What's Changed
- fix(license): add full text of Feather license by @jguddas in lucide-icons/lucide#3530
- fix(icons): changed `umbrella` icon by @karsa-mistmere in lucide-icons/lucide#3490
- docs(site): added official statement on brand logos in Lucide by @karsa-mistmere in lucide-icons/lucide#3541
- fix(icons): changed `camera` icon by @karsa-mistmere in lucide-icons/lucide#3539
- feat(icons): added `rose` icon by @jguddas in lucide-icons/lucide#1972
Full Changelog: https://github.com/lucide-icons/lucide/compare/0.539.0...0.540.0
## Version 0.539.0
## What's Changed
- feat(icons): added `brick-wall-shield` icon by @karsa-mistmere in lucide-icons/lucide#3476
Full Changelog: https://github.com/lucide-icons/lucide/compare/0.538.0...0.539.0
## Version 0.538.0
## What's Changed
- fix(icons): changed `apple` icon by @karsa-mistmere in lucide-icons/lucide#3505
- fix(icons): changed `store` icon by @karsa-mistmere in lucide-icons/lucide#3501
- fix(icons): changed `mic-off` icon by @lieonlion in lucide-icons/lucide#2823
- chore(deps): bump astro from 5.5.2 to 5.12.8 by @dependabot[bot] in lucide-icons/lucide#3523
- fix(icons): deprecate rail-symbol by @jguddas in lucide-icons/lucide#2862
- feat(icons): added `kayak` icon by @jpjacobpadilla in lucide-icons/lucide#3054
... (truncated)
Commits:
- `e71198d` chore: icon alias improvements (#2861)
- `3e644fd` chore(scripts): Refactor scripts to typescript (#3316)
- `19fa01b` build(deps-dev): bump vite from 6.3.2 to 6.3.4 (#3181)
- See full diff in [compare view](https://github.com/lucide-icons/lucide/commits/0.542.0/packages/lucide-react)
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-09-08 09:59:24 +02:00
dependabot[bot]	91c7c4570e	chore(ui-deps): bump sonner from 2.0.6 to 2.0.7 in /llama_stack/ui (#3364) Bumps [sonner](https://github.com/emilkowalski/sonner) from 2.0.6 to 2.0.7.
Release notes (sourced from [sonner's releases](https://github.com/emilkowalski/sonner/releases)):
## v2.0.7
Sonner now supports multiple `<Toaster />` components, see more [here](https://sonner.emilkowal.ski/toaster#multiple-toasters).
## What's Changed
- feat: add testId prop for individual toast components by @b-like-bahar in emilkowalski/sonner#660
- feat(toaster): add support for multiple toasters with unique identifiers by @taroj1205 in emilkowalski/sonner#665
- fix: tests by @emilkowalski in emilkowalski/sonner#677
## New Contributors
- @b-like-bahar made their first contribution in emilkowalski/sonner#660
- @taroj1205 made their first contribution in emilkowalski/sonner#665
Full Changelog: https://github.com/emilkowalski/sonner/compare/v2.0.6...v2.0.7
Commits:
- `3ba7aa1` v2.0.7
- `0604827` fix: tests (#677)
- `c50fe92` fix tests
- `0600a5c` feat(toaster): add support for multiple toasters with unique identifiers (#665)
- `c14bf44` feat: add testId prop for individual toast components (#660)
- See full diff in [compare view](https://github.com/emilkowalski/sonner/compare/v2.0.6...v2.0.7)
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-09-08 09:59:02 +02:00
dependabot[bot]	fe134d90e5	chore(ui-deps): bump react-dom and @types/react-dom in /llama_stack/ui (#3360) Bumps [react-dom](https://github.com/facebook/react/tree/HEAD/packages/react-dom) and [@types/react-dom](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-dom). These dependencies needed to be updated together.
Updates `react-dom` from 19.1.0 to 19.1.1
Release notes (sourced from [react-dom's releases](https://github.com/facebook/react/releases)):
## 19.1.1 (July 28, 2025)
### React
- Fixed Owner Stacks to work with ES2015 function.name semantics (#33680 by @hoxyq)
Changelog (sourced from [react-dom's changelog](https://github.com/facebook/react/blob/main/CHANGELOG.md)):
## 19.1.1 (July 28, 2025)
### React
- Fixed Owner Stacks to work with ES2015 function.name semantics (#33680 by @hoxyq)
Commits:
- `87e33ca` Set release versions to 19.1.1
- `b793948` Bump next prerelease version numbers (#32782)
- See full diff in [compare view](https://github.com/facebook/react/commits/v19.1.1/packages/react-dom)
Updates `@types/react-dom` from 19.1.5 to 19.1.9
Commits:
- See full diff in [compare view](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-dom)
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-09-08 09:58:45 +02:00
Matthew Farrellee	6a35bd7bb6	chore: update the anthropic inference impl to use openai-python for openai-compat functions (#3366) # What does this PR do? update the Anthropic inference provider to use openai-python for the openai-compat endpoints ## Test Plan ci Co-authored-by: raghotham <rsm@meta.com>	2025-09-07 14:00:42 -07:00

1607 commits