Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-12-03 18:00:36 +00:00)

16 commits
---

`cfee63bd0d` **feat: Add search_mode support to OpenAI vector store API (#2500)**
# What does this PR do?

Add a `search_mode` parameter (vector/keyword/hybrid) to the `openai_search_vector_store` method. Fixes OpenAPI code generation by using `str` instead of a `Literal` type.

Closes: #2459

## Test Plan

Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
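As an illustration of the new parameter (not part of the PR), a minimal sketch that exercises each search mode against a locally running stack. It assumes the OpenAI-compatible base path used in the other test plans in this log; the store id, query, and exact payload shape are assumptions.

```python
# Hedged sketch: exercise the new search_mode values against a local Llama Stack
# server. The endpoint path follows the OpenAI-compatible vector store search API;
# the payload shape (and "search_mode" placement) is an assumption based on this
# PR's description, and "vs_123" is a hypothetical vector store id.
import requests

BASE_URL = "http://localhost:8321/v1/openai/v1"

for mode in ("vector", "keyword", "hybrid"):
    resp = requests.post(
        f"{BASE_URL}/vector_stores/vs_123/search",
        json={"query": "Which providers support vector_io?", "search_mode": mode},
        timeout=30,
    )
    resp.raise_for_status()
    print(mode, "->", len(resp.json().get("data", [])), "results")
```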
---

`6832e8a658` **feat: remove score_threshold constraint (#2479)**
# What does this PR do?

See inline comment. This fixes the following failing test:
```
_ test_openai_vector_store_search_with_high_score_filter[llama_stack_client-meta-llama/Llama-3.3-70B-Instruct-meta-llama/Llama-4-Scout-17B-16E-Instruct-all-MiniLM-L6-v2-None-None] _
llama-stack/llama_stack/distribution/library_client.py:98: in convert_to_pydantic
    return TypeAdapter(annotation).validate_python(value)
.venv/lib/python3.10/site-packages/pydantic/type_adapter.py:421: in validate_python
    return self.validator.validate_python(
E   pydantic_core._pydantic_core.ValidationError: 1 validation error for nullable[SearchRankingOptions]
E   score_threshold
E     Input should be less than or equal to 1 [type=less_than_equal, input_value=1.3458905661753127, input_type=float]
E     For further information visit https://errors.pydantic.dev/2.11/v/less_than_equal

The above exception was the direct cause of the following exception:

llama-stack/tests/integration/vector_io/test_openai_vector_stores.py:376: in test_openai_vector_store_search_with_high_score_filter
    search_response = compat_client.vector_stores.search(
.venv/lib/python3.10/site-packages/llama_stack_client/resources/vector_stores/vector_stores.py:356: in search
    return self._post(
.venv/lib/python3.10/site-packages/llama_stack_client/_base_client.py:1232: in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
llama-stack/llama_stack/distribution/library_client.py:177: in request
    result = loop.run_until_complete(self.async_client.request(*args, **kwargs))
/opt/hostedtoolcache/Python/3.10.18/x64/lib/python3.10/asyncio/base_events.py:649: in run_until_complete
    return future.result()
llama-stack/llama_stack/distribution/library_client.py:292: in request
    response = await self._call_non_streaming(
llama-stack/llama_stack/distribution/library_client.py:313: in _call_non_streaming
    body = self._convert_body(path, options.method, body)
llama-stack/llama_stack/distribution/library_client.py:425: in _convert_body
    converted_body[param_name] = convert_to_pydantic(param.annotation, value)
llama-stack/llama_stack/distribution/library_client.py:112: in convert_to_pydantic
    raise ValueError(f"Failed to convert parameter {value} into {annotation}: {e}") from e
E   ValueError: Failed to convert parameter {'score_threshold': 1.3458905661753127} into llama_stack.apis.vector_io.vector_io.SearchRankingOptions | None: 1 validation error for nullable[SearchRankingOptions]
E   score_threshold
E     Input should be less than or equal to 1 [type=less_than_equal, input_value=1.3458905661753127, input_type=float]
E     For further information visit https://errors.pydantic.dev/2.11/v/less_than_equal
```
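For context, a minimal sketch (not from the PR) of the kind of call that previously tripped the "less than or equal to 1" constraint and should now validate. Client construction and argument names are assumptions inferred from the traceback above; the store id is hypothetical.

```python
# Hedged sketch: with the constraint removed, a score_threshold above 1.0 should
# no longer fail SearchRankingOptions validation. Argument names are assumptions
# based on the traceback above; "vs_123" is a hypothetical vector store id.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")
response = client.vector_stores.search(
    vector_store_id="vs_123",
    query="What is the capital of France?",
    ranking_options={"score_threshold": 1.35},  # previously rejected by the le=1 constraint
)
print(response)
```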
## Test Plan
---

`f394c7f2d9` **feat: Add missing Vector Store Files API surface (#2468)**
# What does this PR do?

This adds the ability to list, retrieve, update, and delete Vector Store Files. It implements these new APIs for the faiss and sqlite-vec providers, since those are the two that also have the rest of the vector store files implementation.

Closes #2445

## Test Plan

### test_openai_vector_stores Integration Tests

There are a number of new integration tests added, which I ran for each provider as outlined below.

faiss (from ollama distro):

```
INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \
llama stack run llama_stack/templates/ollama/run.yaml

LLAMA_STACK_CONFIG=http://localhost:8321 \
pytest -sv tests/integration/vector_io/test_openai_vector_stores.py \
  --embedding-model=all-MiniLM-L6-v2
```

sqlite-vec (from starter distro):

```
llama stack run llama_stack/templates/starter/run.yaml

LLAMA_STACK_CONFIG=http://localhost:8321 \
pytest -sv tests/integration/vector_io/test_openai_vector_stores.py \
  --embedding-model=all-MiniLM-L6-v2
```

### file_search verification tests

I also ensured the file_search verification tests continue to work, both for faiss and sqlite-vec.

faiss (ollama distro):

```
INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \
llama stack run llama_stack/templates/ollama/run.yaml

pytest -sv tests/verifications/openai_api/test_responses.py \
  -k'file_search' \
  --base-url=http://localhost:8321/v1/openai/v1 \
  --model=meta-llama/Llama-3.2-3B-Instruct
```

sqlite-vec (starter distro):

```
llama stack run llama_stack/templates/starter/run.yaml

pytest -sv tests/verifications/openai_api/test_responses.py \
  -k'file_search' \
  --base-url=http://localhost:8321/v1/openai/v1 \
  --model=together/meta-llama/Llama-3.2-3B-Instruct-Turbo
```

---------

Signed-off-by: Ben Browning <bbrownin@redhat.com>
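To make the new surface concrete, a minimal sketch (not part of the PR) hitting the file sub-resource routes over HTTP. Paths mirror the OpenAI Vector Store Files API; the ids and response fields used here are assumptions.

```python
# Hedged sketch: list, retrieve, and delete files attached to a vector store via
# the OpenAI-compatible routes described above. Ids are hypothetical and the
# response field names are assumptions.
import requests

BASE_URL = "http://localhost:8321/v1/openai/v1"
store, file_id = "vs_123", "file_abc"

files = requests.get(f"{BASE_URL}/vector_stores/{store}/files", timeout=30).json()
print([f.get("id") for f in files.get("data", [])])

one = requests.get(f"{BASE_URL}/vector_stores/{store}/files/{file_id}", timeout=30).json()
print(one.get("status"))

requests.delete(f"{BASE_URL}/vector_stores/{store}/files/{file_id}", timeout=30).raise_for_status()
```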
---

`db2cd9e8f3` **feat: support filters in file search (#2472)**
# What does this PR do?

Move to use vector_stores.search for the file search tool in Responses, which supports filters.

Closes #2435

## Test Plan

Added an e2e test with filters.

```
llama stack run llama_stack/templates/fireworks/run.yaml

pytest -sv tests/verifications/openai_api/test_responses.py \
  -k 'file_search and filters' \
  --base-url=http://localhost:8321/v1/openai/v1 \
  --model=meta-llama/Llama-3.3-70B-Instruct
```
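For illustration, a hedged sketch (not from the PR) of a Responses request that uses the `file_search` tool with a filter, pointed at the stack's OpenAI-compatible base URL from the test plan above. The filter schema follows OpenAI's comparison-filter shape; the vector store id and metadata key are hypothetical.

```python
# Hedged sketch: file_search tool with an attribute filter via the OpenAI client
# pointed at a local Llama Stack server. The vector store id ("vs_123") and the
# metadata key ("year") are hypothetical; the filter shape is an assumption based
# on OpenAI's comparison filters.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")
response = client.responses.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    input="Summarize the 2024 quarterly report.",
    tools=[
        {
            "type": "file_search",
            "vector_store_ids": ["vs_123"],
            "filters": {"type": "eq", "key": "year", "value": 2024},
        }
    ],
)
print(response.output_text)
```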
---

`941f505eb0` **feat: File search tool for Responses API (#2426)**
# What does this PR do?

This is an initial working prototype of wiring up the `file_search` builtin tool for the Responses API to our existing rag knowledge search tool. This is me seeing what I could pull together on top of the bits we already have merged.

This may not be the ideal way to implement this, and things like how I shuffle the vector store ids from the original response API tool request to the actual tool execution feel a bit hacky (grep for `tool_kwargs["vector_db_ids"]` in `_execute_tool_call` to see what I mean).

## Test Plan

I stubbed in some new tests to exercise this using text and pdf documents. Note that this is currently under tests/verification only because it sometimes flakes with tool calling of the small Llama-3.2-3B model we run in CI (and that I use as an example below). We'd want to make the test a bit more robust in some way if we moved this over to tests/integration and ran it in CI.

### OpenAI SaaS (to verify test correctness)

```
pytest -sv tests/verifications/openai_api/test_responses.py \
  -k 'file_search' \
  --base-url=https://api.openai.com/v1 \
  --model=gpt-4o
```

### Fireworks with faiss vector store

```
llama stack run llama_stack/templates/fireworks/run.yaml

pytest -sv tests/verifications/openai_api/test_responses.py \
  -k 'file_search' \
  --base-url=http://localhost:8321/v1/openai/v1 \
  --model=meta-llama/Llama-3.3-70B-Instruct
```

### Ollama with faiss vector store

This sometimes flakes on Ollama because the quantized small model doesn't always choose to call the tool to answer the user's question. But, it often works.

```
ollama run llama3.2:3b

INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \
llama stack run ./llama_stack/templates/ollama/run.yaml \
  --image-type venv \
  --env OLLAMA_URL="http://0.0.0.0:11434"

pytest -sv tests/verifications/openai_api/test_responses.py \
  -k'file_search' \
  --base-url=http://localhost:8321/v1/openai/v1 \
  --model=meta-llama/Llama-3.2-3B-Instruct
```

### OpenAI provider with sqlite-vec vector store

```
llama stack run ./llama_stack/templates/starter/run.yaml --image-type venv

pytest -sv tests/verifications/openai_api/test_responses.py \
  -k 'file_search' \
  --base-url=http://localhost:8321/v1/openai/v1 \
  --model=openai/gpt-4o-mini
```

### Ensure existing vector store integration tests still pass

```
ollama run llama3.2:3b

INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \
llama stack run ./llama_stack/templates/ollama/run.yaml \
  --image-type venv \
  --env OLLAMA_URL="http://0.0.0.0:11434"

LLAMA_STACK_CONFIG=http://localhost:8321 \
pytest -sv tests/integration/vector_io \
  --text-model "meta-llama/Llama-3.2-3B-Instruct" \
  --embedding-model=all-MiniLM-L6-v2
```

---------

Signed-off-by: Ben Browning <bbrownin@redhat.com>
---

`0bc1747ed8` **feat: update search for vector_stores (#2441)**
Updated the `search` functionality's return response to match OpenAI.

## Test Plan

```
pytest -sv --stack-config=http://localhost:8321 tests/integration/vector_io/test_openai_vector_stores.py --embedding-model all-MiniLM-L6-v2
```
---

`de37a04c3e` **fix: set appropriate defaults for params (#2434)**
Set parameter defaults to `None` (annotated as `| None`); otherwise they get marked as required params in the OpenAPI spec.
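A small sketch of the pattern this fix applies (illustrative only, not the actual diff, and the signature is hypothetical): parameters annotated `| None` with a `None` default come out as optional in the generated OpenAPI spec instead of required.

```python
# Hedged sketch of the described pattern (hypothetical signature, not the actual
# API): `| None` annotations with a None default are emitted as optional fields
# in the generated OpenAPI spec rather than required ones.
async def openai_search_vector_store(
    vector_store_id: str,
    query: str,
    search_mode: str | None = None,      # optional in the spec
    max_num_results: int | None = None,  # optional in the spec
):
    ...
```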
---

`d55100d9b7` **feat: OpenAIVectorIOMixin for vector_stores common logic (#2427)**
Extracts common OpenAI vector-store code into its own mixin so that all providers can share the same core logic. This also makes it easy for Llama Stack to support both vector-stores and Llama Stack APIs in the interim, so that both share the same underlying vector-dbs.

Each provider contains storage-specific logic to `create / edit / delete / list` vector dbs, while the plumbing logic is standardized in the common code. Ensured that this works well with both faiss and sqlite-vec.

### Test Plan

```
llama stack run starter

pytest -sv --stack-config http://localhost:8321 tests/integration/vector_io/test_openai_vector_stores.py --embedding-model all-MiniLM-L6-v2
```
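As a rough sketch of the described split (class and method names are illustrative assumptions, not the real OpenAIVectorIOMixin interface): the mixin owns the OpenAI-compatible plumbing while each provider supplies the storage-specific hooks.

```python
# Hedged sketch of the mixin split described above; names are illustrative, not
# the actual OpenAIVectorIOMixin API. The mixin holds shared OpenAI-compat logic,
# providers implement the storage-specific persistence.
from abc import ABC, abstractmethod


class VectorStoreCommonMixin(ABC):
    async def openai_create_vector_store(self, name: str) -> dict:
        store = {"id": f"vs_{name}", "name": name, "object": "vector_store"}
        await self._save_store(store)  # provider-specific persistence
        return store

    @abstractmethod
    async def _save_store(self, store: dict) -> None: ...


class FaissVectorIOAdapter(VectorStoreCommonMixin):
    def __init__(self) -> None:
        self._stores: dict[str, dict] = {}

    async def _save_store(self, store: dict) -> None:
        self._stores[store["id"]] = store  # in-memory stand-in for faiss-backed storage
```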
---

`5ac43268e8` **feat: Add OpenAI compat /v1/vector_store APIs (#2423)**
Adding OpenAI compat `/v1/vector-store` APIs. This PR implements the `faiss` provider, with follow-up PRs coming for other providers.

Added routes to create, update, delete, and list vector stores, plus a route to search a vector store. Inserting into vector stores is missing and will be a follow-up diff.

### Test Plan

- Added a new integration test for the faiss provider

```
pytest -sv --stack-config http://localhost:8321 tests/integration/vector_io/test_openai_vector_stores.py --embedding-model all-MiniLM-L6-v2
```
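To show the new routes in action, a minimal hedged sketch (not from the PR) that creates a vector store and then lists stores. It assumes the stack's OpenAI-compatible base path used elsewhere in this log; the payload and response fields are assumptions.

```python
# Hedged sketch: create and list vector stores over the OpenAI-compatible routes
# added here. The base path and the payload/response field names are assumptions.
import requests

BASE_URL = "http://localhost:8321/v1/openai/v1"

created = requests.post(
    f"{BASE_URL}/vector_stores",
    json={"name": "docs"},
    timeout=30,
).json()
print("created:", created.get("id"))

listing = requests.get(f"{BASE_URL}/vector_stores", timeout=30).json()
print("stores:", [s.get("name") for s in listing.get("data", [])])
```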
---

`f328436831` **feat: Enable ingestion of precomputed embeddings (#2317)**
---

`bb5fca9521` **chore: more API validators (#2165)**
# What does this PR do?

We added:

* make sure docstrings are present with 'params' and 'returns'
* fail if someone sets 'returns: None'
* fix the failing APIs

Signed-off-by: Sébastien Han <seb@redhat.com>
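For illustration only (the exact validator rules live in the repo's tooling, and this method is hypothetical), a docstring in the style the checks above describe: every parameter documented and a meaningful returns section rather than "returns: None".

```python
# Hedged sketch of an API docstring that would satisfy the described checks:
# params documented and a non-None :returns: section. The function, argument
# names, and return shape are hypothetical.
async def query_vector_db(vector_db_id: str, query: str) -> dict:
    """Query a vector database for chunks relevant to a query string.

    :param vector_db_id: The identifier of the vector database to query.
    :param query: The natural-language query to embed and search with.
    :returns: A response dict containing the matching chunks and their scores.
    """
    return {"chunks": [], "scores": []}
```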
---

`9e6561a1ec` **chore: enable pyupgrade fixes (#1806)**
# What does this PR do?

The goal of this PR is code base modernization. Schema reflection code needed a minor adjustment to handle UnionTypes and collections.abc.AsyncIterator. (Both are preferred for the latest Python releases.)

Note to reviewers: almost all changes here are automatically generated by pyupgrade. Some additional unused imports were cleaned up. The only change worth noting can be found under `docs/openapi_generator` and `llama_stack/strong_typing/schema.py`, where reflection code was updated to deal with "newer" types.

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
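A small before/after sketch of the kind of rewrite pyupgrade applies here (illustrative, not an actual diff from the PR; function names are made up):

```python
# Hedged sketch of typical pyupgrade rewrites behind this change: typing.Optional
# and typing.Union become PEP 604 unions, and typing.AsyncIterator becomes
# collections.abc.AsyncIterator.
from collections.abc import AsyncIterator

# Before (older style):
#   from typing import AsyncIterator, Optional, Union
#   def get_chunk(stream_id: Optional[str]) -> Union[str, bytes]: ...
#   async def stream_tokens() -> AsyncIterator[str]: ...


def get_chunk(stream_id: str | None) -> str | bytes:
    return stream_id or b""


async def stream_tokens() -> AsyncIterator[str]:
    yield "token"
```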
---

`515c16e352` **chore: mypy violations cleanup for inline::{telemetry,tool_runtime,vector_io} (#1711)**
# What does this PR do?

Clean up mypy violations for inline::{telemetry,tool_runtime,vector_io}. This also makes the API accept a tool call result without any content (as the RAG tool may already produce).

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
---

`314ee09ae3` **chore: move all Llama Stack types from llama-models to llama-stack (#1098)**
llama-models should have extremely minimal cruft. Its sole purpose should be didactic: show the simplest implementation of the llama models and document the prompt formats, etc. This PR is the complement to https://github.com/meta-llama/llama-models/pull/279

## Test Plan

Ensure all `llama` CLI `model` sub-commands work:

```bash
llama model list
llama model download --model-id ...
llama model prompt-format -m ...
```

Ran tests:

```bash
cd tests/client-sdk
LLAMA_STACK_CONFIG=fireworks pytest -s -v inference/
LLAMA_STACK_CONFIG=fireworks pytest -s -v vector_io/
LLAMA_STACK_CONFIG=fireworks pytest -s -v agents/
```

Create a fresh venv with `uv venv && source .venv/bin/activate` and run `llama stack build --template fireworks --image-type venv` followed by `llama stack run together --image-type venv` (the server runs).

Also checked that the OpenAPI generator can run and there is no change in the generated files as a result.

```bash
cd docs/openapi_generator
sh run_openapi_generator.sh
```
---

`a63a43c646` **[memory refactor][6/n] Update naming and routes (#839)**
Making a few small naming changes as per feedback:

- RAGToolRuntime methods are called `insert` and `query` to keep them more general
- The tool names are changed to non-namespaced forms `insert_into_memory` and `query_from_memory`
- The REST endpoints are more REST-ful
---

`3ae8585b65` **[memory refactor][1/n] Rename Memory -> VectorIO, MemoryBanks -> VectorDBs (#828)**
See https://github.com/meta-llama/llama-stack/issues/827 for the broader design. This is the first part:

- delete other kinds of memory banks (keyvalue, keyword, graph) for now; we will introduce a keyvalue store API as part of this design but not use it in the RAG tool yet
- renaming of the APIs