mirror of https://github.com/meta-llama/llama-stack.git, synced 2025-10-26 09:15:40 +00:00

16 commits

| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|  | 3b83032555 | feat(registry): more flexible model lookup (#2859) This PR updates model registration and lookup behavior to be slightly more general / flexible. See https://github.com/meta-llama/llama-stack/issues/2843 for more details. Note that this change is backwards compatible given the design of the `lookup_model()` method. ## Test Plan Added unit tests | ||
|  | c8f274347d | chore: Adding Access Control for OpenAI Vector Stores methods (#2772) # What does this PR do? Refactors the vector store routing logic by moving OpenAI-compatible vector store operations from the `VectorIORouter` to the `VectorDBsRoutingTable`. Closes https://github.com/meta-llama/llama-stack/issues/2761 ## Test Plan Added unit tests to cover new routing logic and ACL checks. --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> | ||
|  | 31b088978a | fix: Fix /vector-stores/create API when vector store with duplicate name (#2617) # What does this PR do? Resolves https://github.com/meta-llama/llama-stack/issues/2735 Currently, if you test against OpenAI's Vector Stores API the `client.vector_stores.search` call fails with an invalid vector_db during routing (see the script referenced in the clickable item under the Test Plan section). This PR ensures that `client.vector_stores.search()` is compatible with OpenAI's Vector Stores API. Two biggest changes: 1. The `name`, which was previously used as the `vector_db_id`, has been changed to be consistent with OpenAI's `vs_{uuid}` format. 2. The vector store has to be referenced by its ID; the name is not reliable because every `client.vector_stores.create` call results in a new vector store. NOTE: I believe this is a breaking change for end users as they'll need to update their VectorDB identifiers. ## Test Plan Unit tests: ```bash ./scripts/unit-tests.sh tests/unit/providers/vector_io/ -v ``` Integration tests: ```bash ENABLE_MILVUS=milvus llama stack run /Users/farceo/dev/llama-stack/llama_stack/templates/starter/run.yaml --image-type venv LLAMA_STACK_CONFIG=http://localhost:8321 pytest -sv tests/integration/vector_io/test_openai_vector_stores.py --embedding-model=all-MiniLM-L6-v2 -vv ``` Unit tests and test script below 👇 <details> <summary>Click here for script used to test OpenAI and Llama Stack Vector Store implementation</summary> ```python import json import argparse from openai import OpenAI, pagination import logging from colorama import Fore, Style, init import traceback import os # Initialize colorama for color support in terminal init(autoreset=True) # Setup basic logging logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s') DEMO_VECTOR_STORE_NAME = "Support FAQ FJA" global DEMO_VECTOR_STORE_ID global DEMO_VECTOR_STORE_ID2 def colored_print(color, text): """Prints text to the console with the specified 
color.""" print(f"{color}{text}{Style.RESET_ALL}") def log_and_print(color, message, level=logging.INFO): """Logs a message and prints it to the console with the specified color.""" logging.log(level, message) colored_print(color, message) def run_tests(client, prefix="openai"): """ Runs all tests using the provided OpenAI client and saves the output to JSON files with the given prefix. """ # Create the directory if it doesn't exist os.makedirs('openai_testing', exist_ok=True) # Default values in case tests fail global DEMO_VECTOR_STORE_ID, DEMO_VECTOR_STORE_ID2 DEMO_VECTOR_STORE_ID = None DEMO_VECTOR_STORE_ID2 = None def test_idempotent_vector_store_creation(): """ Test that creating a vector store with the same name is idempotent. """ log_and_print(Fore.BLUE, "Starting vector store creation test...") try: vector_store = client.vector_stores.create( name=DEMO_VECTOR_STORE_NAME, ) # Attempt to create the same vector store again vector_store2 = client.vector_stores.create( name=DEMO_VECTOR_STORE_NAME, ) # Check instead of assert if vector_store2.id != vector_store.id: log_and_print(Fore.YELLOW, f"FAILED IDEMPOTENCY: the same VectorStore name for {prefix.upper()} does not return the same ID", level=logging.WARNING) else: log_and_print(Fore.GREEN, f"PASSED IDEMPOTENCY: {vector_store2.id} == {vector_store.id} the same VectorStore name for {prefix.upper()} returns the same ID") vector_store_data = vector_store.to_dict() log_and_print(Fore.WHITE, f"vector_stores.create = {json.dumps(vector_store_data, indent=2)}") with open(f'openai_testing/{prefix}_vector_store_create.json', 'w') as f: json.dump(vector_store_data, f, indent=2) global DEMO_VECTOR_STORE_ID, DEMO_VECTOR_STORE_ID2 DEMO_VECTOR_STORE_ID = vector_store.id DEMO_VECTOR_STORE_ID2 = vector_store2.id return DEMO_VECTOR_STORE_ID, DEMO_VECTOR_STORE_ID2 except Exception as e: log_and_print(Fore.RED, f"Idempotent vector store creation test failed: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) # 
Create a fallback vector store ID if needed if 'vector_store' in locals() and vector_store: DEMO_VECTOR_STORE_ID = vector_store.id return DEMO_VECTOR_STORE_ID, DEMO_VECTOR_STORE_ID2 def test_vector_store_list(): """ Test listing vector stores. """ log_and_print(Fore.BLUE, "Starting vector store list test...") try: vector_stores = client.vector_stores.list() # Check instead of assert if not isinstance(vector_stores, pagination.SyncCursorPage): log_and_print(Fore.YELLOW, f"FAILED: Expected a list of vector stores, got {type(vector_stores)}", level=logging.WARNING) else: log_and_print(Fore.GREEN, "Vector store list test passed!") vector_stores_data = vector_stores.to_dict() log_and_print(Fore.WHITE, f"vector_stores.list = {json.dumps(vector_stores_data, indent=2)}") with open(f'openai_testing/{prefix}_vector_store_list.json', 'w') as f: json.dump(vector_stores_data, f, indent=2) except Exception as e: log_and_print(Fore.RED, f"Vector store list test failed: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) def test_retrieve_vector_store(): """ Test retrieving a specific vector store. 
""" log_and_print(Fore.BLUE, "Starting retrieve vector store test...") if not DEMO_VECTOR_STORE_ID: log_and_print(Fore.YELLOW, "Skipping retrieve vector store test - no vector store ID available", level=logging.WARNING) return try: vector_store = client.vector_stores.retrieve( vector_store_id=DEMO_VECTOR_STORE_ID, ) # Check instead of assert if vector_store.id != DEMO_VECTOR_STORE_ID: log_and_print(Fore.YELLOW, "FAILED: Retrieved vector store ID does not match", level=logging.WARNING) else: log_and_print(Fore.GREEN, "Retrieve vector store test passed!") vector_store_data = vector_store.to_dict() log_and_print(Fore.WHITE, f"vector_stores.retrieve = {json.dumps(vector_store_data, indent=2)}") with open(f'openai_testing/{prefix}_vector_store_retrieve.json', 'w') as f: json.dump(vector_store_data, f, indent=2) except Exception as e: log_and_print(Fore.RED, f"Retrieve vector store test failed: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) def test_modify_vector_store(): """ Test modifying a vector store. 
""" log_and_print(Fore.BLUE, "Starting modify vector store test...") if not DEMO_VECTOR_STORE_ID: log_and_print(Fore.YELLOW, "Skipping modify vector store test - no vector store ID available", level=logging.WARNING) return try: updated_vector_store = client.vector_stores.update( vector_store_id=DEMO_VECTOR_STORE_ID, name="Updated Support FAQ FJA", ) # Check instead of assert if updated_vector_store.name != "Updated Support FAQ FJA": log_and_print(Fore.YELLOW, "FAILED: Vector store name was not updated correctly", level=logging.WARNING) else: log_and_print(Fore.GREEN, "Modify vector store test passed!") updated_vector_store_data = updated_vector_store.to_dict() log_and_print(Fore.WHITE, f"vector_stores.modify = {json.dumps(updated_vector_store_data, indent=2)}") with open(f'openai_testing/{prefix}_vector_store_modify.json', 'w') as f: json.dump(updated_vector_store_data, f, indent=2) except Exception as e: log_and_print(Fore.RED, f"Modify vector store test failed: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) def test_delete_vector_store(): """ Test deleting a vector store. 
""" log_and_print(Fore.BLUE, "Starting delete vector store test...") if not DEMO_VECTOR_STORE_ID2: log_and_print(Fore.YELLOW, "Skipping delete vector store test - no second vector store ID available", level=logging.WARNING) return try: response = client.vector_stores.delete( vector_store_id=DEMO_VECTOR_STORE_ID2, ) log_and_print(Fore.GREEN, "Delete vector store test passed!") response_data = response.to_dict() log_and_print(Fore.WHITE, f"Vector store delete response = {json.dumps(response_data, indent=2)}") with open(f'openai_testing/{prefix}_vector_store_delete.json', 'w') as f: json.dump(response_data, f, indent=2) except Exception as e: log_and_print(Fore.RED, f"Delete vector store test failed: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) def test_create_vector_store_file(): log_and_print(Fore.BLUE, "Starting create vector store file test...") if not DEMO_VECTOR_STORE_ID: log_and_print(Fore.YELLOW, "Skipping create vector store file test - no vector store ID available", level=logging.WARNING) return try: # create jsonl of files as an example with open("mydata.jsonl", "w") as f: f.write('{"text": "What is the return policy?", "metadata": {"category": "support"}}\n') f.write('{"text": "How do I reset my password?", "metadata": {"category": "support"}}\n') f.write('{"text": "Where can I find my order history?", "metadata": {"category": "support"}}\n') f.write('{"text": "What are the shipping options?", "metadata": {"category": "support"}}\n') f.write('{"text": "What is your favorite banana?", "metadata": {"category": "support"}}\n') # Create a simple text file if my_data_small.txt doesn't exist if not os.path.exists("my_data_small.txt"): with open("my_data_small.txt", "w") as f: f.write("This is a test file for vector store testing.\n") created_file = client.files.create( file=open("my_data_small.txt", "rb"), purpose="assistants", ) created_file_data = created_file.to_dict() log_and_print(Fore.WHITE, f"Created file {json.dumps(created_file_data, 
indent=2)}") with open(f'openai_testing/{prefix}_file_create.json', 'w') as f: json.dump(created_file_data, f, indent=2) retrieved_files = client.files.retrieve(created_file.id) retrieved_files_data = retrieved_files.to_dict() log_and_print(Fore.WHITE, f"Retrieved file {json.dumps(retrieved_files_data, indent=2)}") with open(f'openai_testing/{prefix}_file_retrieve.json', 'w') as f: json.dump(retrieved_files_data, f, indent=2) vector_store_file = client.vector_stores.files.create( vector_store_id=DEMO_VECTOR_STORE_ID, file_id=created_file.id, ) log_and_print(Fore.GREEN, "Create vector store file test passed!") except Exception as e: log_and_print(Fore.RED, f"Create vector store file test failed: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) def test_search_vector_store(): """ Test searching a vector store. """ log_and_print(Fore.BLUE, "Starting search vector store test...") if not DEMO_VECTOR_STORE_ID: log_and_print(Fore.YELLOW, "Skipping search vector store test - no vector store ID available", level=logging.WARNING) return try: query = "What is the banana policy?" 
search_results = client.vector_stores.search( vector_store_id=DEMO_VECTOR_STORE_ID, query=query, max_num_results=10, ranking_options={ 'ranker': 'default-2024-11-15', 'score_threshold': 0.0, }, rewrite_query=False, ) # Check instead of assert if not isinstance(search_results, pagination.SyncPage): log_and_print(Fore.YELLOW, f"FAILED: Expected a list of search results, got {type(search_results)}", level=logging.WARNING) else: log_and_print(Fore.GREEN, "Search vector store test passed!") search_results_dict = search_results.to_dict() log_and_print(Fore.WHITE, f"Search results = {search_results_dict}") with open(f'openai_testing/{prefix}_vector_store_search.json', 'w') as f: json.dump(search_results_dict, f, indent=2) log_and_print(Fore.WHITE, f"vector_stores.search = {search_results.to_json()}") except Exception as e: log_and_print(Fore.RED, f"Search vector store test failed: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) # Run all tests in sequence, even if some fail test_results = [] try: result = test_idempotent_vector_store_creation() if result and len(result) == 2: DEMO_VECTOR_STORE_ID, DEMO_VECTOR_STORE_ID2 = result test_results.append(True) except Exception as e: log_and_print(Fore.RED, f"Vector store creation test failed: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) test_results.append(False) for test_func in [ test_vector_store_list, test_retrieve_vector_store, test_modify_vector_store, test_delete_vector_store, test_create_vector_store_file, test_search_vector_store ]: try: test_func() test_results.append(True) except Exception as e: log_and_print(Fore.RED, f"{test_func.__name__} failed: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) test_results.append(False) if all(test_results): log_and_print(Fore.GREEN, f"All {prefix} tests completed successfully!") else: failed_count = test_results.count(False) log_and_print(Fore.YELLOW, f"{failed_count} {prefix} test(s) failed, but script completed.") if __name__ 
== "__main__": parser = argparse.ArgumentParser(description="Run OpenAI and/or LlamaStack tests.") parser.add_argument( "--provider", type=str, default="llama", choices=["openai", "llama", "both"], help="Specify which environment to test: openai, llama, or both. Default is llama.", ) args = parser.parse_args() try: if args.provider in ("openai", "both"): openai_client = OpenAI() run_tests(openai_client, prefix="openai") if args.provider in ("llama", "both"): llama_client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none") run_tests(llama_client, prefix="llama") log_and_print(Fore.GREEN, "All tests completed!") except Exception as e: log_and_print(Fore.RED, f"Tests failed to complete: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) ``` </details> --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> | ||
|  | ac5fd57387 | chore: remove nested imports (#2515) # What does this PR do? * Given that our API packages use "import *" in `__init__.py`, we don't need to do `from llama_stack.apis.models.models` but simply `from llama_stack.apis.models`. The decision to use `import *` is debatable and should probably be revisited at some point. * Remove unneeded Ruff F401 rule * Consolidate Ruff F403 rule in the pyproject Signed-off-by: Sébastien Han <seb@redhat.com> | ||
|  | cfee63bd0d | feat: Add search_mode support to OpenAI vector store API (#2500) # What does this PR do? Add search_mode parameter (vector/keyword/hybrid) to openai_search_vector_store method. Fixes OpenAPI code generation by using str instead of Literal type. Closes: #2459 ## Test Plan Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com> | ||
|  | f394c7f2d9 | feat: Add missing Vector Store Files API surface (#2468) # What does this PR do? This adds the ability to list, retrieve, update, and delete Vector Store Files. It implements these new APIs for the faiss and sqlite-vec providers, since those are the two that also have the rest of the vector store files implementation. Closes #2445 ## Test Plan ### test_openai_vector_stores Integration Tests There are a number of new integration tests added, which I ran for each provider as outlined below. faiss (from ollama distro): ``` INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \ llama stack run llama_stack/templates/ollama/run.yaml LLAMA_STACK_CONFIG=http://localhost:8321 \ pytest -sv tests/integration/vector_io/test_openai_vector_stores.py \ --embedding-model=all-MiniLM-L6-v2 ``` sqlite-vec (from starter distro): ``` llama stack run llama_stack/templates/starter/run.yaml LLAMA_STACK_CONFIG=http://localhost:8321 \ pytest -sv tests/integration/vector_io/test_openai_vector_stores.py \ --embedding-model=all-MiniLM-L6-v2 ``` ### file_search verification tests I also ensured the file_search verification tests continue to work, both for faiss and sqlite-vec. 
faiss (ollama distro): ``` INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \ llama stack run llama_stack/templates/ollama/run.yaml pytest -sv tests/verifications/openai_api/test_responses.py \ -k'file_search' \ --base-url=http://localhost:8321/v1/openai/v1 \ --model=meta-llama/Llama-3.2-3B-Instruct ``` sqlite-vec (starter distro): ``` llama stack run llama_stack/templates/starter/run.yaml pytest -sv tests/verifications/openai_api/test_responses.py \ -k'file_search' \ --base-url=http://localhost:8321/v1/openai/v1 \ --model=together/meta-llama/Llama-3.2-3B-Instruct-Turbo ``` --------- Signed-off-by: Ben Browning <bbrownin@redhat.com> | ||
|  | fa1d986f72 | fix: remove asyncio.TimeoutError since Python update (#2476) # What does this PR do? Since we now support Python 3.11 and later, where `asyncio.TimeoutError` is an alias of the built-in `TimeoutError`, this is no longer needed. Signed-off-by: Sébastien Han <seb@redhat.com> | ||
|  | db2cd9e8f3 | feat: support filters in file search (#2472) # What does this PR do? Move to use vector_stores.search for file search tool in Responses, which supports filters. closes #2435 ## Test Plan Added e2e test with filters. myenv ❯ llama stack run llama_stack/templates/fireworks/run.yaml pytest -sv tests/verifications/openai_api/test_responses.py \ -k 'file_search and filters' \ --base-url=http://localhost:8321/v1/openai/v1 \ --model=meta-llama/Llama-3.3-70B-Instruct | ||
|  | 90d03552d4 | feat: To add health check for faiss inline vector_io provider (#2319) # What does this PR do? Adds a health check for the faiss inline vector_io provider. I tried adding `async def health(self) -> HealthResponse:` as in the inference provider, but it didn't work for the `inline->vector_io->faiss` provider. From the debug logs, I found the critical issue: health responses are stored with the API name as the key, not as a nested dictionary keyed by provider ID. This means that all providers of the same API type (e.g., "vector_io") share the same health response, and only the last one processed is visible in the API response. I've created a patch file that fixes this issue by: - Storing the original get_providers_health method - Creating a patched version that correctly maps health responses to providers - Applying the patch to the `ProviderImpl` class Not an expert, so please let me know if there is any other workaround with which I can get the health status updated directly from `faiss.py`. ## Test Plan Added unit tests to test the provider patch implementation in the PR. Adding a screenshot with the FAISS inline vector_io health status as "OK"  | ||
|  | 822307e6d5 | fix: Do not throw when listing vector stores (#2460) When trying to `list` vector_stores , if we cannot retrieve one, log an error and return all the ones that are valid. ### Test Plan ``` pytest -sv --stack-config=http://localhost:8321 tests/integration/vector_io/test_openai_vector_stores.py --embedding-model all-MiniLM-L6-v2 ``` Also tested for `--stack-config fireworks` | ||
|  | 941f505eb0 | feat: File search tool for Responses API (#2426) # What does this PR do? This is an initial working prototype of wiring up the `file_search` builtin tool for the Responses API to our existing rag knowledge search tool. This is me seeing what I could pull together on top of the bits we already have merged. This may not be the ideal way to implement this, and things like how I shuffle the vector store ids from the original response API tool request to the actual tool execution feel a bit hacky (grep for `tool_kwargs["vector_db_ids"]` in `_execute_tool_call` to see what I mean). ## Test Plan I stubbed in some new tests to exercise this using text and pdf documents. Note that this is currently under tests/verification only because it sometimes flakes with tool calling of the small Llama-3.2-3B model we run in CI (and that I use as an example below). We'd want to make the test a bit more robust in some way if we moved this over to tests/integration and ran it in CI. ### OpenAI SaaS (to verify test correctness) ``` pytest -sv tests/verifications/openai_api/test_responses.py \ -k 'file_search' \ --base-url=https://api.openai.com/v1 \ --model=gpt-4o ``` ### Fireworks with faiss vector store ``` llama stack run llama_stack/templates/fireworks/run.yaml pytest -sv tests/verifications/openai_api/test_responses.py \ -k 'file_search' \ --base-url=http://localhost:8321/v1/openai/v1 \ --model=meta-llama/Llama-3.3-70B-Instruct ``` ### Ollama with faiss vector store This sometimes flakes on Ollama because the quantized small model doesn't always choose to call the tool to answer the user's question. But, it often works. 
``` ollama run llama3.2:3b INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \ llama stack run ./llama_stack/templates/ollama/run.yaml \ --image-type venv \ --env OLLAMA_URL="http://0.0.0.0:11434" pytest -sv tests/verifications/openai_api/test_responses.py \ -k'file_search' \ --base-url=http://localhost:8321/v1/openai/v1 \ --model=meta-llama/Llama-3.2-3B-Instruct ``` ### OpenAI provider with sqlite-vec vector store ``` llama stack run ./llama_stack/templates/starter/run.yaml --image-type venv pytest -sv tests/verifications/openai_api/test_responses.py \ -k 'file_search' \ --base-url=http://localhost:8321/v1/openai/v1 \ --model=openai/gpt-4o-mini ``` ### Ensure existing vector store integration tests still pass ``` ollama run llama3.2:3b INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \ llama stack run ./llama_stack/templates/ollama/run.yaml \ --image-type venv \ --env OLLAMA_URL="http://0.0.0.0:11434" LLAMA_STACK_CONFIG=http://localhost:8321 \ pytest -sv tests/integration/vector_io \ --text-model "meta-llama/Llama-3.2-3B-Instruct" \ --embedding-model=all-MiniLM-L6-v2 ``` --------- Signed-off-by: Ben Browning <bbrownin@redhat.com> | ||
|  | 0bc1747ed8 | feat: update search for vector_stores (#2441) Updated the `search` functionality return response to match openai. ## Test Plan ``` pytest -sv --stack-config=http://localhost:8321 tests/integration/vector_io/test_openai_vector_stores.py --embedding-model all-MiniLM-L6-v2 ``` | ||
|  | de37a04c3e | fix: set appropriate defaults for params (#2434) Setting defaults to be `| None` else they get marked as required params in open-api spec. | ||
|  | d55100d9b7 | feat: OpenAIVectorIOMixin for vector_stores common logic (#2427) Extracts common OpenAI vector-store code into its own mixin so that all providers can share the same core logic. This also makes it easy for Llama Stack to support both vector-stores and Llama Stack APIs in the interim so that both share the same underlying vector-dbs. Each provider contains storage-specific logic to `create / edit / delete / list` vector dbs while the plumbing logic is standardized in the common code. Ensured that this works well with both faiss and sqlite-vec. ### Test Plan ``` llama stack run starter pytest -sv --stack-config http://localhost:8321 tests/integration/vector_io/test_openai_vector_stores.py --embedding-model all-MiniLM-L6-v2 ``` | ||
|  | 5ac43268e8 | feat: Add OpenAI compat /v1/vector_store APIs (#2423) Adding OpenAI compat `/v1/vector-store` apis. This PR implements the `faiss` provider with followup PRs coming up for other providers. Added routes to create, update, delete, list vector stores. Also added route to search a vector store. Inserting into vector stores is missing and will be a follow up diff. ### Test Plan - Added new integration test for testing the faiss provider ``` pytest -sv --stack-config http://localhost:8321 tests/integration/vector_io/test_openai_vector_stores.py --embedding-model all-MiniLM-L6-v2 ``` | ||
|  | eedf21f19c | chore: split routers into individual files (inference, tool, vector_io, eval_scoring) (#2258) |