mirror of https://github.com/meta-llama/llama-stack.git, synced 2025-10-25 17:11:12 +00:00

145 commits

| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|  | 81c7d6fa2e | chore(ci): disable post training tests (#2953) Post training tests need _much_ better thinking before we can re-enable them to be run on every single PR. Running periodically should be approached only when it is shown that the tests are reliable and as light-weight as can be; otherwise, it is just kicking the can down the road. | ||
|  | 072d20a124 | feat(test): record agents, safety and vector_io integration tests (#2952) Continue to build on top of https://github.com/meta-llama/llama-stack/pull/2941 ## Test Plan Run server with `LLAMA_STACK_TEST_INFERENCE_MODE=record` and then run the integration tests with `--stack-config=server:starter`. Then restart the server with `LLAMA_STACK_TEST_INFERENCE_MODE=replay` and re-run the tests. Verify that no request hit Ollama at any point. | ||
|  | 2e5ca3f15c | chore: move recordings one directory upwards | ||
|  | 08b4a1deb3 | feat(tests): introduce inference record/replay to increase test reliability (#2941) Implements a comprehensive recording and replay system for inference API calls that eliminates dependency on online inference providers during testing. The system treats inference as deterministic by recording real API responses and replaying them in subsequent test runs. Applies to OpenAI clients (which should cover many inference requests) as well as Ollama AsyncClient. For storing, we use a hybrid system: Sqlite for fast lookups and JSON files for easy greppability / debuggability. As expected, tests become much much faster (more than 3x in just inference testing.) ```bash LLAMA_STACK_TEST_INFERENCE_MODE=record LLAMA_STACK_TEST_RECORDING_DIR=<...> \ uv run pytest -s -v tests/integration/inference \ --stack-config=starter \ -k "not( builtin_tool or safety_with_image or code_interpreter or test_rag )" \ --text-model="ollama/llama3.2:3b-instruct-fp16" \ --embedding-model=sentence-transformers/all-MiniLM-L6-v2 ``` ```bash LLAMA_STACK_TEST_INFERENCE_MODE=replay LLAMA_STACK_TEST_RECORDING_DIR=<...> \ uv run pytest -s -v tests/integration/inference \ --stack-config=starter \ -k "not( builtin_tool or safety_with_image or code_interpreter or test_rag )" \ --text-model="ollama/llama3.2:3b-instruct-fp16" \ --embedding-model=sentence-transformers/all-MiniLM-L6-v2 ``` - `LLAMA_STACK_TEST_INFERENCE_MODE`: `live` (default), `record`, or `replay` - `LLAMA_STACK_TEST_RECORDING_DIR`: Storage location (must be specified for record or replay modes) | ||
|  | c7dc0f21b4 | fix: error on failed job, do not wait for timeout (#2945) # What does this PR do? cause post training integration test to error when job fails. ## Test Plan ci | ||
|  | 870a37ff4b | feat: add base64 encoded PDF support for OpenAI Chat Completions (#2881) # What does this PR do? OpenAI Chat Completions supports passing a base64 encoded PDF file to a model, but Llama Stack currently does not allow for this behavior. This PR extends our implementation of the OpenAI API spec to change that. Closes #2129 ## Test Plan A new functional test has been added to test the validity of such a request Signed-off-by: Nathan Weinberg <nweinber@redhat.com> | ||
|  | 52201612de | feat: implement chunk deletion for vector stores (#2701) Add support for deleting individual chunks from vector stores - Add abstract remove_chunk() method to EmbeddingIndex base class - Implement chunk deletion for Faiss provider, SQLite Vec, Milvus, PGVector - Placeholder implementations with NotImplementedError for Chroma/Qdrant/Weaviate - Integrate chunk deletion into OpenAI vector store file deletion flow - removed xfail from test_openai_vector_store_delete_file_removes_from_vector_store Closes: #2477 --------- Signed-off-by: Derek Higgins <derekh@redhat.com> Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com> | ||
|  | 9e77be1f72 | chore: Fix chroma unit tests (#2896) # What does this PR do? Enable Chroma inline unit tests and fix integration tests. <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* --> --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> | ||
|  | 4ea1f2aa9f | test: Add VLLM provider support to integration tests (#2757) - Add setup-vllm GitHub action to start VLLM container - Extend integration test matrix to support both ollama and vllm providers - Make test setup conditional based on provider type - Add provider-specific environment variables and configurations - vllm tests setup to run weekly or can be triggered manually (only ollama on PR) TODO: investigate failing tests for vllm provider (safety and post_training) Also need a proper fix for #2713 (tmp fix for this in the first commit in this PR) Closes: #1648 --------- Signed-off-by: Derek Higgins <derekh@redhat.com> | ||
|  | cd8715d327 | chore: Added openai compatible vector io endpoints for chromadb (#2489) # What does this PR do? This PR implements the openai compatible endpoints for chromadb Closes #2462 ## Test Plan Ran ollama llama stack server and ran the command `pytest -sv --stack-config=http://localhost:8321 tests/integration/vector_io/test_openai_vector_stores.py --embedding-model all-MiniLM-L6-v2` 8 failed, 27 passed, 8 skipped, 1 xfailed The failed ones are regarding files api --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> Co-authored-by: sarthakdeshpande <sarthak.deshpande@engati.com> Co-authored-by: Francisco Javier Arceo <farceo@redhat.com> Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com> | ||
|  | 9736f096f6 | chore(test): fix flaky telemetry tests (#2815) # What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> This PR fixes flaky telemetry tests <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> See https://github.com/meta-llama/llama-stack/pull/2814 ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* --> Signed-off-by: Mustafa Elbehery <melbeher@redhat.com> | ||
|  | d0208df286 | test: skip flaky telemetry tests (#2814) # What does this PR do? example error: | ||
|  | 0a6e588f68 | feat: enable auth for LocalFS Files Provider (#2773) # What does this PR do? Supports authentication for LocalFS Files provider. closes https://github.com/meta-llama/llama-stack/issues/2760 ## Test Plan CI. added tests. | ||
|  | 6d55f2f137 | feat: enable ls client for files tests (#2769) # What does this PR do? titled ## Test Plan CI | ||
|  | d7cc38e934 | fix: remove async test markers (fix pre-commit) (#2808) # What does this PR do? some async test markers are in the codebase causing pre-commit to fail due to #2744 remove these pytest fixtures ## Test Plan pre-commit passes Signed-off-by: Charlie Doern <cdoern@redhat.com> | ||
|  | 3ae4aeb344 | test: add some tests for Telemetry API (#2787) # What does this PR do? ## Test Plan ENABLE_OLLAMA=ollama LLAMA_STACK_CONFIG=starter uv run pytest tests/integration/telemetry --text-model="ollama/llama3.2:3b-instruct-fp16" | ||
|  | e1755d1ed2 | chore:  Adding OpenAI Vector Stores Files API compatibility for PGVector (#2755) # What does this PR do? Adding OpenAI Vector Stores Files API compatibility for PGVector <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan Updated CI to include PGVector --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> | ||
|  | 31b088978a | fix: Fix /vector-stores/create API when vector store with duplicate name (#2617) # What does this PR do? Resolves https://github.com/meta-llama/llama-stack/issues/2735 Currently, if you test against OpenAI's Vector Stores API, the `client.vector_stores.search` call fails with an invalid vector_db during routing (see the script referenced in the clickable item under the Test Plan section). This PR ensures that `client.vector_stores.search()` is compatible with OpenAI's Vector Stores API. Two biggest changes: 1. The `name`, which was previously used as the `vector_db_id`, has been changed to be consistent with OpenAI's `vs_{uuid}` format. 2. The vector store has to be referenced by its ID; the name is not reliable, as every `client.vector_stores.create` results in a new vector store. NOTE: I believe this is a breaking change for end users as they'll need to update their VectorDB identifiers. ## Test Plan Unit tests: ```bash ./scripts/unit-tests.sh tests/unit/providers/vector_io/ -v ``` Integration tests: ```bash ENABLE_MILVUS=milvus llama stack run /Users/farceo/dev/llama-stack/llama_stack/templates/starter/run.yaml --image-type venv LLAMA_STACK_CONFIG=http://localhost:8321 pytest -sv tests/integration/vector_io/test_openai_vector_stores.py --embedding-model=all-MiniLM-L6-v2 -vv ``` Unit tests and test script below 👇 <details> <summary>Click here for script used to test OpenAI and Llama Stack Vector Store implementation</summary> ```python import json import argparse from openai import OpenAI, pagination import logging from colorama import Fore, Style, init import traceback import os # Initialize colorama for color support in terminal init(autoreset=True) # Setup basic logging logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s') DEMO_VECTOR_STORE_NAME = "Support FAQ FJA" global DEMO_VECTOR_STORE_ID global DEMO_VECTOR_STORE_ID2 def colored_print(color, text): """Prints text to the console with the specified 
color.""" print(f"{color}{text}{Style.RESET_ALL}") def log_and_print(color, message, level=logging.INFO): """Logs a message and prints it to the console with the specified color.""" logging.log(level, message) colored_print(color, message) def run_tests(client, prefix="openai"): """ Runs all tests using the provided OpenAI client and saves the output to JSON files with the given prefix. """ # Create the directory if it doesn't exist os.makedirs('openai_testing', exist_ok=True) # Default values in case tests fail global DEMO_VECTOR_STORE_ID, DEMO_VECTOR_STORE_ID2 DEMO_VECTOR_STORE_ID = None DEMO_VECTOR_STORE_ID2 = None def test_idempotent_vector_store_creation(): """ Test that creating a vector store with the same name is idempotent. """ log_and_print(Fore.BLUE, "Starting vector store creation test...") try: vector_store = client.vector_stores.create( name=DEMO_VECTOR_STORE_NAME, ) # Attempt to create the same vector store again vector_store2 = client.vector_stores.create( name=DEMO_VECTOR_STORE_NAME, ) # Check instead of assert if vector_store2.id != vector_store.id: log_and_print(Fore.YELLOW, f"FAILED IDEMPOTENCY: the same VectorStore name for {prefix.upper()} does not return the same ID", level=logging.WARNING) else: log_and_print(Fore.GREEN, f"PASSED IDEMPOTENCY: {vector_store2.id} == {vector_store.id} the same VectorStore name for {prefix.upper()} returns the same ID") vector_store_data = vector_store.to_dict() log_and_print(Fore.WHITE, f"vector_stores.create = {json.dumps(vector_store_data, indent=2)}") with open(f'openai_testing/{prefix}_vector_store_create.json', 'w') as f: json.dump(vector_store_data, f, indent=2) global DEMO_VECTOR_STORE_ID, DEMO_VECTOR_STORE_ID2 DEMO_VECTOR_STORE_ID = vector_store.id DEMO_VECTOR_STORE_ID2 = vector_store2.id return DEMO_VECTOR_STORE_ID, DEMO_VECTOR_STORE_ID2 except Exception as e: log_and_print(Fore.RED, f"Idempotent vector store creation test failed: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) # 
Create a fallback vector store ID if needed if 'vector_store' in locals() and vector_store: DEMO_VECTOR_STORE_ID = vector_store.id return DEMO_VECTOR_STORE_ID, DEMO_VECTOR_STORE_ID2 def test_vector_store_list(): """ Test listing vector stores. """ log_and_print(Fore.BLUE, "Starting vector store list test...") try: vector_stores = client.vector_stores.list() # Check instead of assert if not isinstance(vector_stores, pagination.SyncCursorPage): log_and_print(Fore.YELLOW, f"FAILED: Expected a list of vector stores, got {type(vector_stores)}", level=logging.WARNING) else: log_and_print(Fore.GREEN, "Vector store list test passed!") vector_stores_data = vector_stores.to_dict() log_and_print(Fore.WHITE, f"vector_stores.list = {json.dumps(vector_stores_data, indent=2)}") with open(f'openai_testing/{prefix}_vector_store_list.json', 'w') as f: json.dump(vector_stores_data, f, indent=2) except Exception as e: log_and_print(Fore.RED, f"Vector store list test failed: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) def test_retrieve_vector_store(): """ Test retrieving a specific vector store. 
""" log_and_print(Fore.BLUE, "Starting retrieve vector store test...") if not DEMO_VECTOR_STORE_ID: log_and_print(Fore.YELLOW, "Skipping retrieve vector store test - no vector store ID available", level=logging.WARNING) return try: vector_store = client.vector_stores.retrieve( vector_store_id=DEMO_VECTOR_STORE_ID, ) # Check instead of assert if vector_store.id != DEMO_VECTOR_STORE_ID: log_and_print(Fore.YELLOW, "FAILED: Retrieved vector store ID does not match", level=logging.WARNING) else: log_and_print(Fore.GREEN, "Retrieve vector store test passed!") vector_store_data = vector_store.to_dict() log_and_print(Fore.WHITE, f"vector_stores.retrieve = {json.dumps(vector_store_data, indent=2)}") with open(f'openai_testing/{prefix}_vector_store_retrieve.json', 'w') as f: json.dump(vector_store_data, f, indent=2) except Exception as e: log_and_print(Fore.RED, f"Retrieve vector store test failed: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) def test_modify_vector_store(): """ Test modifying a vector store. 
""" log_and_print(Fore.BLUE, "Starting modify vector store test...") if not DEMO_VECTOR_STORE_ID: log_and_print(Fore.YELLOW, "Skipping modify vector store test - no vector store ID available", level=logging.WARNING) return try: updated_vector_store = client.vector_stores.update( vector_store_id=DEMO_VECTOR_STORE_ID, name="Updated Support FAQ FJA", ) # Check instead of assert if updated_vector_store.name != "Updated Support FAQ FJA": log_and_print(Fore.YELLOW, "FAILED: Vector store name was not updated correctly", level=logging.WARNING) else: log_and_print(Fore.GREEN, "Modify vector store test passed!") updated_vector_store_data = updated_vector_store.to_dict() log_and_print(Fore.WHITE, f"vector_stores.modify = {json.dumps(updated_vector_store_data, indent=2)}") with open(f'openai_testing/{prefix}_vector_store_modify.json', 'w') as f: json.dump(updated_vector_store_data, f, indent=2) except Exception as e: log_and_print(Fore.RED, f"Modify vector store test failed: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) def test_delete_vector_store(): """ Test deleting a vector store. 
""" log_and_print(Fore.BLUE, "Starting delete vector store test...") if not DEMO_VECTOR_STORE_ID2: log_and_print(Fore.YELLOW, "Skipping delete vector store test - no second vector store ID available", level=logging.WARNING) return try: response = client.vector_stores.delete( vector_store_id=DEMO_VECTOR_STORE_ID2, ) log_and_print(Fore.GREEN, "Delete vector store test passed!") response_data = response.to_dict() log_and_print(Fore.WHITE, f"Vector store delete response = {json.dumps(response_data, indent=2)}") with open(f'openai_testing/{prefix}_vector_store_delete.json', 'w') as f: json.dump(response_data, f, indent=2) except Exception as e: log_and_print(Fore.RED, f"Delete vector store test failed: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) def test_create_vector_store_file(): log_and_print(Fore.BLUE, "Starting create vector store file test...") if not DEMO_VECTOR_STORE_ID: log_and_print(Fore.YELLOW, "Skipping create vector store file test - no vector store ID available", level=logging.WARNING) return try: # create jsonl of files as an example with open("mydata.jsonl", "w") as f: f.write('{"text": "What is the return policy?", "metadata": {"category": "support"}}\n') f.write('{"text": "How do I reset my password?", "metadata": {"category": "support"}}\n') f.write('{"text": "Where can I find my order history?", "metadata": {"category": "support"}}\n') f.write('{"text": "What are the shipping options?", "metadata": {"category": "support"}}\n') f.write('{"text": "What is your favorite banana?", "metadata": {"category": "support"}}\n') # Create a simple text file if my_data_small.txt doesn't exist if not os.path.exists("my_data_small.txt"): with open("my_data_small.txt", "w") as f: f.write("This is a test file for vector store testing.\n") created_file = client.files.create( file=open("my_data_small.txt", "rb"), purpose="assistants", ) created_file_data = created_file.to_dict() log_and_print(Fore.WHITE, f"Created file {json.dumps(created_file_data, 
indent=2)}") with open(f'openai_testing/{prefix}_file_create.json', 'w') as f: json.dump(created_file_data, f, indent=2) retrieved_files = client.files.retrieve(created_file.id) retrieved_files_data = retrieved_files.to_dict() log_and_print(Fore.WHITE, f"Retrieved file {json.dumps(retrieved_files_data, indent=2)}") with open(f'openai_testing/{prefix}_file_retrieve.json', 'w') as f: json.dump(retrieved_files_data, f, indent=2) vector_store_file = client.vector_stores.files.create( vector_store_id=DEMO_VECTOR_STORE_ID, file_id=created_file.id, ) log_and_print(Fore.GREEN, "Create vector store file test passed!") except Exception as e: log_and_print(Fore.RED, f"Create vector store file test failed: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) def test_search_vector_store(): """ Test searching a vector store. """ log_and_print(Fore.BLUE, "Starting search vector store test...") if not DEMO_VECTOR_STORE_ID: log_and_print(Fore.YELLOW, "Skipping search vector store test - no vector store ID available", level=logging.WARNING) return try: query = "What is the banana policy?" 
search_results = client.vector_stores.search( vector_store_id=DEMO_VECTOR_STORE_ID, query=query, max_num_results=10, ranking_options={ 'ranker': 'default-2024-11-15', 'score_threshold': 0.0, }, rewrite_query=False, ) # Check instead of assert if not isinstance(search_results, pagination.SyncPage): log_and_print(Fore.YELLOW, f"FAILED: Expected a list of search results, got {type(search_results)}", level=logging.WARNING) else: log_and_print(Fore.GREEN, "Search vector store test passed!") search_results_dict = search_results.to_dict() log_and_print(Fore.WHITE, f"Search results = {search_results_dict}") with open(f'openai_testing/{prefix}_vector_store_search.json', 'w') as f: json.dump(search_results_dict, f, indent=2) log_and_print(Fore.WHITE, f"vector_stores.search = {search_results.to_json()}") except Exception as e: log_and_print(Fore.RED, f"Search vector store test failed: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) # Run all tests in sequence, even if some fail test_results = [] try: result = test_idempotent_vector_store_creation() if result and len(result) == 2: DEMO_VECTOR_STORE_ID, DEMO_VECTOR_STORE_ID2 = result test_results.append(True) except Exception as e: log_and_print(Fore.RED, f"Vector store creation test failed: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) test_results.append(False) for test_func in [ test_vector_store_list, test_retrieve_vector_store, test_modify_vector_store, test_delete_vector_store, test_create_vector_store_file, test_search_vector_store ]: try: test_func() test_results.append(True) except Exception as e: log_and_print(Fore.RED, f"{test_func.__name__} failed: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) test_results.append(False) if all(test_results): log_and_print(Fore.GREEN, f"All {prefix} tests completed successfully!") else: failed_count = test_results.count(False) log_and_print(Fore.YELLOW, f"{failed_count} {prefix} test(s) failed, but script completed.") if __name__ 
== "__main__": parser = argparse.ArgumentParser(description="Run OpenAI and/or LlamaStack tests.") parser.add_argument( "--provider", type=str, default="llama", choices=["openai", "llama", "both"], help="Specify which environment to test: openai, llama, or both. Default is llama.", ) args = parser.parse_args() try: if args.provider in ("openai", "both"): openai_client = OpenAI() run_tests(openai_client, prefix="openai") if args.provider in ("llama", "both"): llama_client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none") run_tests(llama_client, prefix="llama") log_and_print(Fore.GREEN, "All tests completed!") except Exception as e: log_and_print(Fore.RED, f"Tests failed to complete: {e}", level=logging.ERROR) logging.error(traceback.format_exc()) ``` </details> --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> | ||
|  | 6b8a8c1be9 | fix: Safety in starter (#2731) - fireworks, together do not support Llama-guard 3 8b model anymore - Need to default to ollama - current safety shields logic was not correct since the shield_id was the provider ( which had duplicates ) - Followed similar logic to models Note: Seems a bit over-engineered but this can now be extended to other providers and fits in the overall mechanism of how env_vars are used to manage starter. ### How to test ``` ENABLE_OLLAMA=ollama ENABLE_FIREWORKS=fireworks SAFETY_MODEL=llama-guard3:1b pytest -s -v tests/integration/ --stack-config starter -k 'not(supervised_fine_tune or builtin_tool_code or safety_with_image or code_interpreter_for or rag_and_code or truncation or register_and_unregister)' --text-model fireworks/meta-llama/Llama-3.3-70B-Instruct --vision-model fireworks/meta-llama/Llama-4-Scout-17B-16E-Instruct --safety-shield llama-guard3:1b --embedding-model all-MiniLM-L6-v2 ``` ### Related but not obvious in this PR In the llama-stack-ops repo, we run tests before publishing packages and docker containers. The actions in that repo were using the fireworks / together distros ( which are non-existent ) So need to update that to run with `starter` and use `ollama` specifically for safety. | ||
|  | aa2595c7c3 | fix: sambanova shields and model validation (#2693) # What does this PR do? Update Sambanova's shield registration validation to warn rather than raise when a model is not available at the base URL endpoint in use; warnings were also added for models that are not available at that endpoint. <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* --> run starter distro with Sambanova enabled | ||
|  | 30b2e6a495 | chore: default to pytest asyncio-mode=auto (#2730) # What does this PR do? previously, developers who ran `./scripts/unit-tests.sh` would get `asyncio-mode=auto`, which meant `@pytest.mark.asyncio` and `@pytest_asyncio.fixture` were redundant. developers who ran `pytest` directly would get pytest-asyncio's default (strict mode) and would run into errors, leading them to add `@pytest.mark.asyncio` / `@pytest_asyncio.fixture` to their code. with this change - - `asyncio_mode=auto` is included in `pyproject.toml` making behavior consistent for all invocations of pytest - removes all redundant `@pytest_asyncio.fixture` and `@pytest.mark.asyncio` - for good measure, requires `pytest>=8.4` and `pytest-asyncio>=1.0` ## Test Plan - `./scripts/unit-tests.sh` - `uv run pytest tests/unit` | ||
|  | d880c2df0e | fix: auth sql store: user is owner policy (#2674) # What does this PR do? The current authorized sql store implementation does not respect user.principal (only checks attributes). This PR addresses that. ## Test Plan Added test cases to integration tests. | ||
|  | 01c222e12f | ci: run all APIs integration tests (#2646) # What does this PR do? We are now automatically building the list of integration tests to run. As part of that process, eval and files are now being tested. This is pending https://github.com/meta-llama/llama-stack/pull/2628 Signed-off-by: Sébastien Han <seb@redhat.com> | ||
|  | 81109a0f72 | test: terminate server process when finished (#2700) # What does this PR do? Terminate server process for real. ## Test Plan ```ENABLE_OPENAI=openai LLAMA_STACK_CONFIG=server:starter pytest -v tests/integration/agents/test_openai_responses.py --text-model "gpt-4o-mini" -vv -s -k 'test_list_response_input_items[' && lsof -ti:8321``` observe no process printed anymore | ||
|  | 780b4c6eea | fix: llama stack run starter in conda (#2679) # What does this PR do? `llama stack run starter` in conda environment fails with ' --config is required for venv and conda environments' because it is passed as --template and start_stack.sh doesn't process template. ## Test Plan `llama stack run starter` | ||
|  | e9926564bd | fix: authorized sql store with postgres (#2641) # What does this PR do? postgres has different json extract syntax from sqlite ## Test Plan added integration test | ||
|  | 5561f1c36d | ci: error when a pipefails (#2635) # What does this PR do? The CI was failing but the error was eaten by the pipe. Now we run the task with pipefail. Signed-off-by: Sébastien Han <seb@redhat.com> | ||
|  | c4349f532b | feat: consolidate most distros into "starter" (#2516) # What does this PR do? * Removes a bunch of distros * Removed distros were added into the "starter" distribution * Doc for "starter" has been added * Partially reverts https://github.com/meta-llama/llama-stack/pull/2482 since inference providers are disabled by default and can be turned on manually via env variable. * Disables safety in starter distro Closes: https://github.com/meta-llama/llama-stack/issues/2502. ~Needs: https://github.com/meta-llama/llama-stack/pull/2482 for Ollama to work properly in the CI.~ TODO: - [ ] We can only update `install.sh` when we get a new release. - [x] Update providers documentation - [ ] Update notebooks to reference starter instead of ollama Signed-off-by: Sébastien Han <seb@redhat.com> | ||
|  | ef26259209 | feat: add llama guard 4 model (#2579) add support for Llama Guard 4 model to the llama_guard safety provider test with - 0. NVIDIA_API_KEY=... llama stack build --image-type conda --image-name env-nvidia --providers inference=remote::nvidia,safety=inline::llama-guard --run 1. llama-stack-client models register meta-llama/Llama-Guard-4-12B --provider-model-id meta/llama-guard-4-12b 2. pytest tests/integration/safety/test_llama_guard.py Co-authored-by: raghotham <rsm@meta.com> | ||
|  | 4afd619c56 | chore: Add support for vector-stores files api for Milvus (#2582) # What does this PR do? ### Summary This pull request implements support for the OpenAI Vector Store Files API for the Milvus vector store provider in `llama_stack`. It enables storing, loading, updating, and deleting file metadata and file contents in Milvus collections, allowing OpenAI vector store files to be managed directly within Milvus. ### Main Changes - **Milvus Vector Store Files API Implementation** - Implements all required methods for storing, loading, updating, and deleting vector store file metadata and contents (`_save_openai_vector_store_file`, `_load_openai_vector_store_file`, `_load_openai_vector_store_file_contents`, `_update_openai_vector_store_file`, `_delete_openai_vector_store_file_from_storage`). - Uses two Milvus collections: `openai_vector_store_files` (for metadata) and `openai_vector_store_files_contents` (for chunked file contents). - Collections are created dynamically if they do not exist, with appropriate schema definitions. - **Collection Name Sanitization** - Adds a `sanitize_collection_name` utility to ensure Milvus collection names only contain valid characters (letters, numbers, underscores). - **Testing** - Updates test skip logic to include `"inline::milvus"` for cases where the OpenAI Vector Store Files API is not supported, improving integration test accuracy. - **Other Improvements** - Passes `kvstore` to `MilvusIndex` for consistency. - Removes obsolete NotImplementedErrors and legacy code for file storage. ## Test Plan CI and tested via a test script ## Notes - `VectorDB` currently uses the `name` as the `identifier` in `openai_create_vector_store`. We need to add `name` as a field to `VectorDB` and generate the `identifier` upon creation. OpenAI is not idempotent with respect to the `name` field that they pass (i.e., you can pass the same name multiple times and OpenAI will generate a new identifier). I'll add a follow up PR for this. - The `Files` api needs to use `files-` as a prefix in the identifier. I have updated the Vector Store to use the OpenAI prefix `vs_*`. --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> | ||
|  | dae1fcd3c2 | ci: let pytest run the distro server (#2586) # What does this PR do? * Use #2580 functionality to auto-start the server with the tests * Reduce timeout to 30sec * Print server logs on errors * Pytest logs are collected to a file pytest.log Signed-off-by: Sébastien Han <seb@redhat.com> | ||
|  | fc735a414e | test: Add one-step integration testing with server auto-start (#2580) ## Summary Add support for `server:<config>` format in `--stack-config` option to enable seamless one-step integration testing. This eliminates the need to manually start servers in separate terminals before running tests. ## Key Features - **Auto-start server**: Automatically launches `llama stack run <config>` if target port is available - **Smart reuse**: Reuses existing server if port is already occupied - **Health check polling**: Waits up to 2 minutes for server readiness via `/v1/health` endpoint - **Custom port support**: Use `server:<config>:<port>` for non-default ports - **Clean output**: Server runs quietly in background without cluttering test output - **Backward compatibility**: All existing `--stack-config` formats continue to work ## Usage Examples ```bash # Auto-start server with default port 8321 pytest tests/integration/inference/ --stack-config=server:fireworks # Use custom port pytest tests/integration/safety/ --stack-config=server:together:8322 # Run multiple test suites seamlessly pytest tests/integration/inference/ tests/integration/agents/ --stack-config=server:starter ``` ## Implementation Details - Enhanced `llama_stack_client` fixture with server management - Updated documentation with cleaner organization and comprehensive examples - Added utility functions for port checking, server startup, and health verification ## Test Plan - Verified server auto-start when port 8321 is available - Verified server reuse when port 8321 is occupied - Tested health check polling via `/v1/health` endpoint - Confirmed custom port configuration works correctly - Verified backward compatibility with existing config formats ## Before/After Comparison **Before (2 steps):** ```bash # Terminal 1: Start server manually llama stack run fireworks --port 8321 # Terminal 2: Wait for startup, then run tests pytest tests/integration/inference/ --stack-config=http://localhost:8321 ``` **After (1 step):** ```bash # Single command handles everything pytest tests/integration/inference/ --stack-config=server:fireworks ``` | ||
|  | 0066135944 | chore: Enabling VectorIO Integration tests for Milvus (#2546) | ||
|  | be9bf68246 | feat: Add webmethod for deleting openai responses (#2160) # What does this PR do? This PR creates a webmethod for deleting OpenAI responses, adds an implementation for it, and adds an integration test for the OpenAI delete response method. # (Closes #2077) ## Test Plan Ran the standard tests, the pre-commit hooks, and the unit tests. # (## Documentation) For this PR I based the routes and implementation on the current get and create methods. The unit tests were not able to handle this test because the mock interface in use did not allow CRUD to be tested effectively. I instead created an integration test to match the existing ones in test_openai_responses. | ||
|  | cc19b56c87 | chore: OpenAI compatibility for Milvus (#2470) # What does this PR do? Closes https://github.com/meta-llama/llama-stack/issues/2461 ## Test Plan Tested with the `ollama` distribution template and updated the vector_io provider to: ```yaml vector_io: - provider_id: milvus provider_type: inline::milvus config: db_path: ${env.SQLITE_STORE_DIR:=~/.llama/distributions/ollama}/milvus_store.db kvstore: type: sqlite db_name: milvus_registry.db ``` Ran the stack ```bash llama stack run ./llama_stack/templates/ollama/run.yaml --image-type venv --env OLLAMA_URL="http://0.0.0.0:11434" ``` Ran the tests: ``` pytest -sv --stack-config=http://localhost:8321 tests/integration/vector_io/test_openai_vector_stores.py --embedding-model all-MiniLM-L6-v2 ``` Output passed. Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> | ||
|  | 1d3f27fe5b | fix: resume responses with tool call output (#2524) # What does this PR do? closes #2522 ## Test Plan added integration test LLAMA_STACK_CONFIG=http://localhost:8321 pytest -v tests/integration/agents/test_openai_responses.py --text-model "accounts/fireworks/models/llama-v3p3-70b-instruct" -vv -k 'function_call' | ||
|  | cfee63bd0d | feat: Add search_mode support to OpenAI vector store API (#2500) 
		
			Some checks failed
		
		
# What does this PR do? Add search_mode parameter (vector/keyword/hybrid) to openai_search_vector_store method. Fixes OpenAPI code generation by using str instead of Literal type. Closes: #2459 ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* --> Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com> | ||
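The `search_mode` change above swaps a `Literal` type for a plain `str`, which moves validation from the type annotation to runtime. A minimal sketch of what such a runtime check could look like follows; the function name, the constant, and the `"vector"` default are hypothetical and not taken from the llama-stack code.

```python
# Hypothetical sketch: names and the default value are assumptions,
# not the actual llama-stack implementation.
ALLOWED_SEARCH_MODES = ("vector", "keyword", "hybrid")


def normalize_search_mode(search_mode: str = "vector") -> str:
    """Validate a plain-str search mode at runtime instead of via typing.Literal."""
    if search_mode not in ALLOWED_SEARCH_MODES:
        raise ValueError(
            f"search_mode must be one of {ALLOWED_SEARCH_MODES}, got {search_mode!r}"
        )
    return search_mode
```

Keeping the annotation as `str` means the generated OpenAPI schema carries a simple string type, while the allowed values are still enforced when the method is called.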
|  | f394c7f2d9 | feat: Add missing Vector Store Files API surface (#2468) 
		
# What does this PR do? This adds the ability to list, retrieve, update, and delete Vector Store Files. It implements these new APIs for the faiss and sqlite-vec providers, since those are the two that also have the rest of the vector store files implementation. Closes #2445 ## Test Plan ### test_openai_vector_stores Integration Tests There are a number of new integration tests added, which I ran for each provider as outlined below. faiss (from ollama distro): ``` INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \ llama stack run llama_stack/templates/ollama/run.yaml LLAMA_STACK_CONFIG=http://localhost:8321 \ pytest -sv tests/integration/vector_io/test_openai_vector_stores.py \ --embedding-model=all-MiniLM-L6-v2 ``` sqlite-vec (from starter distro): ``` llama stack run llama_stack/templates/starter/run.yaml LLAMA_STACK_CONFIG=http://localhost:8321 \ pytest -sv tests/integration/vector_io/test_openai_vector_stores.py \ --embedding-model=all-MiniLM-L6-v2 ``` ### file_search verification tests I also ensured the file_search verification tests continue to work, both for faiss and sqlite-vec. 
faiss (ollama distro): ``` INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \ llama stack run llama_stack/templates/ollama/run.yaml pytest -sv tests/verifications/openai_api/test_responses.py \ -k'file_search' \ --base-url=http://localhost:8321/v1/openai/v1 \ --model=meta-llama/Llama-3.2-3B-Instruct ``` sqlite-vec (starter distro): ``` llama stack run llama_stack/templates/starter/run.yaml pytest -sv tests/verifications/openai_api/test_responses.py \ -k'file_search' \ --base-url=http://localhost:8321/v1/openai/v1 \ --model=together/meta-llama/Llama-3.2-3B-Instruct-Turbo ``` --------- Signed-off-by: Ben Browning <bbrownin@redhat.com> | ||
|  | 6039d922c0 | fix: allow running vector tests with embedding dimension (#2467) 
		
# What does this PR do? Do not force 384 for the embedding dimension, use the one provided by the test run. ## Test Plan ``` pytest -s -vvv tests/integration/vector_io/test_vector_io.py --stack-config=http://localhost:8321 \ -k "not(builtin_tool or safety_with_image or code_interpreter or test_rag)" \ --text-model="meta-llama/Llama-3.2-3B-Instruct" \ --embedding-model=granite-embedding-125m --embedding-dimension=768 Uninstalled 1 package in 16ms Installed 1 package in 11ms INFO 2025-06-18 10:52:03,314 tests.integration.conftest:59 tests: Setting DISABLE_CODE_SANDBOX=1 for macOS /Users/leseb/Documents/AI/llama-stack/.venv/lib/python3.10/site-packages/pytest_asyncio/plugin.py:207: PytestDeprecationWarning: The configuration option "asyncio_default_fixture_loop_scope" is unset. The event loop scope for asynchronous fixtures will default to the fixture caching scope. Future versions of pytest-asyncio will default the loop scope for asynchronous fixtures to function scope. Set the default fixture loop scope explicitly in order to avoid unexpected behavior in the future. 
Valid fixture loop scopes are: "function", "class", "module", "package", "session" warnings.warn(PytestDeprecationWarning(_DEFAULT_FIXTURE_LOOP_SCOPE_UNSET)) ================================================= test session starts ================================================= platform darwin -- Python 3.10.16, pytest-8.3.4, pluggy-1.5.0 -- /Users/leseb/Documents/AI/llama-stack/.venv/bin/python cachedir: .pytest_cache metadata: {'Python': '3.10.16', 'Platform': 'macOS-15.5-arm64-arm-64bit', 'Packages': {'pytest': '8.3.4', 'pluggy': '1.5.0'}, 'Plugins': {'cov': '6.0.0', 'html': '4.1.1', 'json-report': '1.5.0', 'timeout': '2.4.0', 'metadata': '3.1.1', 'asyncio': '0.25.3', 'anyio': '4.8.0', 'nbval': '0.11.0'}} rootdir: /Users/leseb/Documents/AI/llama-stack configfile: pyproject.toml plugins: cov-6.0.0, html-4.1.1, json-report-1.5.0, timeout-2.4.0, metadata-3.1.1, asyncio-0.25.3, anyio-4.8.0, nbval-0.11.0 asyncio: mode=strict, asyncio_default_fixture_loop_scope=None collected 8 items tests/integration/vector_io/test_vector_io.py::test_vector_db_retrieve[emb=granite-embedding-125m:dim=768] PASSED tests/integration/vector_io/test_vector_io.py::test_vector_db_register[emb=granite-embedding-125m:dim=768] PASSED tests/integration/vector_io/test_vector_io.py::test_insert_chunks[emb=granite-embedding-125m:dim=768-test_case0] PASSED tests/integration/vector_io/test_vector_io.py::test_insert_chunks[emb=granite-embedding-125m:dim=768-test_case1] PASSED tests/integration/vector_io/test_vector_io.py::test_insert_chunks[emb=granite-embedding-125m:dim=768-test_case2] PASSED tests/integration/vector_io/test_vector_io.py::test_insert_chunks[emb=granite-embedding-125m:dim=768-test_case3] PASSED tests/integration/vector_io/test_vector_io.py::test_insert_chunks[emb=granite-embedding-125m:dim=768-test_case4] PASSED tests/integration/vector_io/test_vector_io.py::test_insert_chunks_with_precomputed_embeddings[emb=granite-embedding-125m:dim=768] PASSED 
================================================== 8 passed in 5.50s ================================================== ``` Signed-off-by: Sébastien Han <seb@redhat.com> | ||
|  | 985d0b156c | feat: Add suffix to openai_completions  (#2449)
		
Code completion apps need "fill in the middle" capabilities. Added option of `suffix` to `openai_completion` to enable this. Updated ollama provider to showcase the same. ### Test Plan ``` pytest -sv --stack-config="inference=ollama" tests/integration/inference/test_openai_completion.py --text-model qwen2.5-coder:1.5b -k test_openai_completion_non_streaming_suffix ``` ### OpenAI Sample script ``` from openai import OpenAI client = OpenAI(base_url="http://localhost:8321/v1/openai/v1") response = client.completions.create( model="qwen2.5-coder:1.5b", prompt="The capital of ", suffix="is Paris.", max_tokens=10, ) print(response.choices[0].text) ``` ### Output ``` France is ____. To answer this question, we ``` | ||
|  | 554ada57b0 | chore: Add OpenAI compatibility for Ollama embeddings (#2440) # What does this PR do? This PR adds OpenAI compatibility for Ollama embeddings. Closes https://github.com/meta-llama/llama-stack/issues/2428 Summary of changes: - `llama_stack/providers/remote/inference/ollama/ollama.py` - Implements the OpenAI embeddings endpoint for Ollama, replacing the NotImplementedError with a full function that validates the model, prepares parameters, calls the client, encodes embedding data (optionally in base64), and returns a correctly structured response. - Updates import statements to include the new embedding response utilities. - `llama_stack/providers/utils/inference/litellm_openai_mixin.py` - Refactors the embedding data encoding logic to use a new shared utility (`b64_encode_openai_embeddings_response`) instead of inline base64 encoding and packing logic. - Cleans up imports accordingly. - `llama_stack/providers/utils/inference/openai_compat.py` - Adds `b64_encode_openai_embeddings_response` to handle encoding OpenAI embedding outputs (including base64 support) in a reusable way. - Adds `prepare_openai_embeddings_params` utility for standardizing embedding parameter preparation. - Updates imports to include the new embedding data class. - `tests/integration/inference/test_openai_embeddings.py` - Removes `"remote::ollama"` from the list of providers that skip OpenAI embeddings tests, since support is now implemented. ## Note There was one minor issue, which required me to override the `OpenAIEmbeddingsResponse.model` name with `self._get_model(model).identifier` name, which is very unsatisfying. ## Test Plan Unit Tests and integration tests --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> | ||
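The shared base64 encoding step mentioned above can be illustrated with a small sketch. It assumes the embedding is packed as little-endian float32 values before base64 encoding; that layout and both helper names are assumptions for illustration, not the actual `b64_encode_openai_embeddings_response` implementation.

```python
import base64
import struct


def encode_embedding_base64(embedding: list[float]) -> str:
    """Pack floats as little-endian float32 (assumed layout) and base64-encode them."""
    packed = struct.pack(f"<{len(embedding)}f", *embedding)
    return base64.b64encode(packed).decode("ascii")


def decode_embedding_base64(data: str) -> list[float]:
    """Inverse of encode_embedding_base64: base64-decode, then unpack float32 values."""
    raw = base64.b64decode(data)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))
```

Values that are not exactly representable in float32 will come back slightly rounded after a round trip, so comparisons on decoded embeddings should use a tolerance.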
|  | fef670b024 | feat: update openai tests to work with both clients (#2442) 
		
https://github.com/meta-llama/llama-stack-client-python/pull/238 updated llama-stack-client to also support OpenAI endpoints for embeddings, files, and vector stores. This updates the tests to cover all configs -- openai sdk, llama stack sdk and library-as-client. | ||
|  | 0bc1747ed8 | feat: update search for vector_stores (#2441) Updated the `search` functionality return response to match openai. ## Test Plan ``` pytest -sv --stack-config=http://localhost:8321 tests/integration/vector_io/test_openai_vector_stores.py --embedding-model all-MiniLM-L6-v2 ``` | ||
|  | 35c2817d0a | fix(weaviate): handle case where distance is 0 by setting score to infinity (#2415) 
		
# What does this PR do? Fixes the weaviate provider's `query_vector` function for the case where the distance between the query embedding and an embedding within the vector db is 0 (identical vectors). Catches `ZeroDivisionError` and sets `score` to infinity, which represents maximum similarity. Closes [#2381] ## Test Plan Check out this PR, then execute this code; there will no longer be a `ZeroDivisionError` exception: ``` from llama_stack_client import LlamaStackClient base_url = "http://localhost:8321" client = LlamaStackClient(base_url=base_url) models = client.models.list() embedding_model = ( em := next(m for m in models if m.model_type == "embedding") ).identifier embedding_dimension = 384 _ = client.vector_dbs.register( vector_db_id="foo_db", embedding_model=embedding_model, embedding_dimension=embedding_dimension, provider_id="weaviate", ) chunk = { "content": "foo", "mime_type": "text/plain", "metadata": { "document_id": "foo-id" } } client.vector_io.insert(vector_db_id="foo_db", chunks=[chunk]) client.vector_io.query(vector_db_id="foo_db", query="foo") ``` | ||
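The zero-distance handling described in the weaviate fix can be sketched in isolation. The sketch assumes the score is computed as the inverse of the distance, which is what a `ZeroDivisionError` at distance 0 suggests; the function name is hypothetical and this is not the provider's actual code.

```python
def distance_to_score(distance: float) -> float:
    """Convert a distance to a similarity score (assumed here to be 1 / distance)."""
    try:
        return 1.0 / distance
    except ZeroDivisionError:
        # Identical vectors: treat a zero distance as maximum similarity.
        return float("inf")
```

Returning infinity keeps identical vectors ranked first when results are sorted by score in descending order.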
|  | 5ac43268e8 | feat: Add OpenAI compat /v1/vector_store APIs (#2423) 
		
Adding OpenAI compat `/v1/vector-store` APIs. This PR implements the `faiss` provider, with follow-up PRs coming for other providers. Added routes to create, update, delete, and list vector stores. Also added a route to search a vector store. Inserting into vector stores is missing and will be a follow-up diff. ### Test Plan - Added new integration test for testing the faiss provider ``` pytest -sv --stack-config http://localhost:8321 tests/integration/vector_io/test_openai_vector_stores.py --embedding-model all-MiniLM-L6-v2 ``` | ||
|  | ee57e58f29 | fix: loosen tool call checks in inference store (#2420) # What does this PR do? This loosens up the tool call function name and arguments checks in `tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls` because the small models we use in CI cannot reliably get the tool call function name or arguments exactly right. Closes #2345 ## Test Plan I ran this flaking test in a loop, let it run many dozens of times, and didn't observe any flakes after the changes. Previously it flaked quite regularly. ``` while uv run pytest -s -v \ 'tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[llama_stack_client-txt=3B-False]' \ --stack-config=http://localhost:8321 \ --text-model="meta-llama/Llama-3.2-3B-Instruct" \ --embedding-model=all-MiniLM-L6-v2; do; sleep 0.1; done ``` Signed-off-by: Ben Browning <bbrownin@redhat.com> | ||
|  | 92b59a3377 | test: skip files integrations tests for library client (#2407) # What does this PR do? ## Test Plan LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/integration/files/test_files.py::test_openai_client_basic_operations | ||
|  | 3c9a10d2fe | feat: reference implementation for files API  (#2330) 
		
# What does this PR do? TSIA Added Files provider to the fireworks template. Might want to add to all templates as a follow-up. 
## Test Plan llama-stack pytest tests/unit/files/test_files.py llama-stack llama stack build --template fireworks --image-type conda --run LLAMA_STACK_CONFIG=http://localhost:8321 pytest -s -v tests/integration/files/ | ||
|  | b21050935e | feat: New OpenAI compat embeddings API (#2314) 
		
# What does this PR do? Adds a new endpoint that is compatible with OpenAI for embeddings api. `/openai/v1/embeddings` Added providers for OpenAI, LiteLLM and SentenceTransformer. 
## Test Plan ``` LLAMA_STACK_CONFIG=http://localhost:8321 pytest -sv tests/integration/inference/test_openai_embeddings.py --embedding-model all-MiniLM-L6-v2,text-embedding-3-small,gemini/text-embedding-004 ``` | ||
|  | f328436831 | feat: Enable ingestion of precomputed embeddings (#2317) 
		
|