Matthew Farrellee 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f520e244d9 
								
							 
						 
						
							
							
								
								feat: Add S3 Files Provider ( #3202 )  
							
							
							
							
							Implements a complete S3-based file storage provider for Llama Stack
with:
    
    Core Implementation:
    - S3FilesImpl class with full OpenAI Files API compatibility
    - Support for file upload, download, listing, deletion operations
    - SQLite-based metadata storage for fast queries and API compliance
    - Configurable S3 endpoints (AWS, MinIO, LocalStack support)
    
    Key Features:
    - Automatic S3 bucket creation and management
    - Metadata persistence
    - Proper error handling for S3 connectivity and permissions
    
    Dependencies:
    - Adds boto3 for AWS S3 integration
    - Adds moto[s3] for testing infrastructure
    
    Testing:
    
Unit: `./scripts/unit-tests.sh tests/unit/files tests/unit/providers/files`
    
    Integration:
    
Start MinIO: `podman run --rm -it -p 9000:9000 minio/minio server /data`
    
Start stack with S3 provider: `S3_ENDPOINT_URL=http://localhost:9000 AWS_ACCESS_KEY_ID=minioadmin AWS_SECRET_ACCESS_KEY=minioadmin S3_BUCKET_NAME=llama-stack-files uv run llama stack build --image-type venv --providers files=remote::s3 --run`
    
Run integration tests: `./scripts/integration-tests.sh --stack-config http://localhost:8321 --provider ollama --test-subdirs files`
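
To make the "SQLite-based metadata storage" item above concrete, here is a minimal sketch of how file metadata could be tracked next to the objects stored in S3. The table layout and the helper names (`open_metadata_store`, `record_upload`, `list_files`, `delete_file`) are assumptions made for illustration and are not taken from `S3FilesImpl`; only the returned fields follow the OpenAI file object shape (`id`, `object`, `bytes`, `created_at`, `filename`, `purpose`).

```python
import sqlite3
import time
import uuid


def open_metadata_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the (hypothetical) metadata table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS files (
            id TEXT PRIMARY KEY,
            filename TEXT NOT NULL,
            purpose TEXT NOT NULL,
            bytes INTEGER NOT NULL,
            created_at INTEGER NOT NULL
        )
        """
    )
    return conn


def record_upload(conn: sqlite3.Connection, filename: str, purpose: str, content: bytes) -> dict:
    """Store metadata only; in a real provider the bytes would be written to S3."""
    meta = {
        "id": f"file-{uuid.uuid4().hex}",
        "object": "file",
        "filename": filename,
        "purpose": purpose,
        "bytes": len(content),
        "created_at": int(time.time()),
    }
    conn.execute(
        "INSERT INTO files (id, filename, purpose, bytes, created_at) VALUES (?, ?, ?, ?, ?)",
        (meta["id"], meta["filename"], meta["purpose"], meta["bytes"], meta["created_at"]),
    )
    conn.commit()
    return meta


def list_files(conn: sqlite3.Connection, purpose=None) -> list:
    """Answer list queries from SQLite alone, without touching the bucket."""
    query = "SELECT id, filename, purpose, bytes, created_at FROM files"
    params = ()
    if purpose is not None:
        query += " WHERE purpose = ?"
        params = (purpose,)
    query += " ORDER BY created_at DESC, id"
    return [
        {"id": r[0], "object": "file", "filename": r[1], "purpose": r[2], "bytes": r[3], "created_at": r[4]}
        for r in conn.execute(query, params).fetchall()
    ]


def delete_file(conn: sqlite3.Connection, file_id: str) -> bool:
    """Remove one metadata row; returns False if the id was unknown."""
    cur = conn.execute("DELETE FROM files WHERE id = ?", (file_id,))
    conn.commit()
    return cur.rowcount == 1
```

Keeping metadata in a local table means listing and filtering by purpose never require a bucket scan; the trade-off is that the table and the bucket have to be kept in sync on upload and delete.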
							
						 
						
							2025-08-22 10:38:59 -04:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Matthew Farrellee 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								914c7be288 
								
							 
						 
						
							
							
								
								feat: add batches API with OpenAI compatibility (with inference replay) ( #3162 )  
							
							
							
							
							Add complete batches API implementation with protocol, providers, and
tests:
Core Infrastructure:
- Add batches API protocol using OpenAI Batch types directly
- Add Api.batches enum value and protocol mapping in resolver
- Add OpenAI "batch" file purpose support
- Include proper error handling (ConflictError, ResourceNotFoundError)
Reference Provider:
- Add ReferenceBatchesImpl with full CRUD operations (create, retrieve,
cancel, list)
- Implement background batch processing with configurable concurrency
- Add SQLite KVStore backend for persistence
- Support /v1/chat/completions endpoint with request validation
Comprehensive Test Suite:
- Add unit tests for provider implementation with validation
- Add integration tests for end-to-end batch processing workflows
- Add error handling tests for validation, malformed inputs, and edge
cases
Configuration:
- Add max_concurrent_batches and max_concurrent_requests_per_batch
options
- Add provider documentation with sample configurations
Test with -
```
$ uv run llama stack build --image-type venv --providers inference=YOU_PICK,files=inline::localfs,batches=inline::reference --run &
$ LLAMA_STACK_CONFIG=http://localhost:8321  uv run pytest tests/unit/providers/batches tests/integration/batches --text-model YOU_PICK
```
addresses #3066 
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com> 
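
The two configuration options named above, `max_concurrent_batches` and `max_concurrent_requests_per_batch`, suggest two nested concurrency limits. The following is a minimal sketch of that idea using `asyncio.Semaphore`; the function names, the `handler` callback, and the default values are hypothetical and do not come from `ReferenceBatchesImpl`.

```python
import asyncio


async def process_batch(batch_id, requests, handler, max_concurrent_requests_per_batch):
    """Run every request of one batch, with at most N requests in flight."""
    limit = asyncio.Semaphore(max_concurrent_requests_per_batch)

    async def run_one(request):
        async with limit:
            return await handler(request)

    # gather preserves the order of the input requests in its result list.
    return batch_id, await asyncio.gather(*(run_one(r) for r in requests))


async def process_batches(
    batches,
    handler,
    max_concurrent_batches=1,  # illustrative default, not the provider's
    max_concurrent_requests_per_batch=10,  # illustrative default, not the provider's
):
    """Process a dict of batch_id -> requests with two levels of bounded concurrency."""
    batch_limit = asyncio.Semaphore(max_concurrent_batches)

    async def run_batch(batch_id, requests):
        async with batch_limit:
            return await process_batch(
                batch_id, requests, handler, max_concurrent_requests_per_batch
            )

    results = await asyncio.gather(*(run_batch(b, r) for b, r in batches.items()))
    return dict(results)
```

With this shape, the worst-case number of in-flight requests is the product of the two limits, which is the figure to size against the inference provider's capacity.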
							
						 
						
							2025-08-15 15:34:15 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ee7631b6cf 
								
							 
						 
						
							
							
								
								Revert "feat: add batches API with OpenAI compatibility" ( #3149 )  
							
							
							
							
							Reverts llamastack/llama-stack#3088 
The PR broke integration tests. 
							
						 
						
							2025-08-14 10:08:54 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Matthew Farrellee 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								de692162af 
								
							 
						 
						
							
							
								
								feat: add batches API with OpenAI compatibility ( #3088 )  
							
							
							Add complete batches API implementation with protocol, providers, and
tests:
Core Infrastructure:
- Add batches API protocol using OpenAI Batch types directly
- Add Api.batches enum value and protocol mapping in resolver
- Add OpenAI "batch" file purpose support
- Include proper error handling (ConflictError, ResourceNotFoundError)
Reference Provider:
- Add ReferenceBatchesImpl with full CRUD operations (create, retrieve,
cancel, list)
- Implement background batch processing with configurable concurrency
- Add SQLite KVStore backend for persistence
- Support /v1/chat/completions endpoint with request validation
Comprehensive Test Suite:
- Add unit tests for provider implementation with validation
- Add integration tests for end-to-end batch processing workflows
- Add error handling tests for validation, malformed inputs, and edge
cases
Configuration:
- Add max_concurrent_batches and max_concurrent_requests_per_batch
options
- Add provider documentation with sample configurations
Test with -
```
$ uv run llama stack build --image-type venv --providers inference=YOU_PICK,files=inline::localfs,batches=inline::reference --run &
$ LLAMA_STACK_CONFIG=http://localhost:8321  uv run pytest tests/unit/providers/batches tests/integration/batches --text-model YOU_PICK
```
addresses #3066  
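
As an illustration of the request validation mentioned above, the sketch below checks one line of a batch input file in the OpenAI JSONL format (`custom_id`, `method`, `url`, `body`). The function name `validate_batch_line`, the error wording, and the exact set of checks are assumptions for this example; the provider's real validation may differ.

```python
import json

# Only the endpoint named in the commit message is accepted in this sketch.
SUPPORTED_ENDPOINTS = {"/v1/chat/completions"}


def validate_batch_line(line: str, line_number: int) -> list:
    """Return a list of human-readable problems for one JSONL request line."""
    try:
        request = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"line {line_number}: invalid JSON ({exc.msg})"]
    if not isinstance(request, dict):
        return [f"line {line_number}: expected a JSON object"]

    errors = []
    if not isinstance(request.get("custom_id"), str):
        errors.append(f"line {line_number}: custom_id must be a string")
    if request.get("method") != "POST":
        errors.append(f"line {line_number}: method must be POST")

    url = request.get("url")
    if not isinstance(url, str) or url not in SUPPORTED_ENDPOINTS:
        errors.append(f"line {line_number}: unsupported url {url!r}")

    body = request.get("body")
    if not isinstance(body, dict):
        errors.append(f"line {line_number}: body must be an object")
    else:
        if not isinstance(body.get("model"), str):
            errors.append(f"line {line_number}: body.model must be a string")
        messages = body.get("messages")
        if not isinstance(messages, list) or not messages:
            errors.append(f"line {line_number}: body.messages must be a non-empty list")
    return errors
```

Collecting every problem for a line, instead of stopping at the first, makes it possible to report all issues with a malformed input file in a single pass.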
							
						 
						
							2025-08-14 09:42:02 -04:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Kelly Brown 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0cbd93c5cc 
								
							 
						 
						
							
							
								
								docs: Update blocks formatting in docs/source files ( #3120 )  
							
							
							
							
							**Description:** 
The standard Markdown `[!NOTE]` format is not supported in Sphinx-generated
documentation, so this PR replaces those instances. It also updates other
Note, Tip, and Warning blocks throughout the source docs.
WIP: Working to update the provider code gen 
							
						 
						
							2025-08-13 08:06:31 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7f834339ba 
								
							 
						 
						
							
							
								
								chore(misc): make tests and starter faster ( #3042 )  
							
							
							Miscellaneous cleanup focused on tests, which also ended up
speeding up the starter distro substantially.
- Pulled llama stack client init for tests into `pytest_sessionstart` so
it does not clobber output
- Profiling that showed where we were doing lots of heavy imports
for starter, so those are now lazy
- starter now starts 20+ seconds faster on my Mac
- A few other smallish refactors for `compat_client` 
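
The lazy-import change described above can be pictured with a small sketch: defer an import until the code path that needs it actually runs. The helper `lazy_module` and the `render_table` example are hypothetical and only stand in for the heavier provider imports the commit refers to.

```python
import importlib
from functools import cache


@cache
def lazy_module(name: str):
    """Import `name` on first use and return the cached module afterwards."""
    return importlib.import_module(name)


def render_table(rows):
    """Serialize rows as CSV text; csv and io are imported on the first call."""
    # The import cost is paid here, not when this file is first loaded.
    csv = lazy_module("csv")
    io = lazy_module("io")
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()
```

Startup only pays for what it touches; the cost moves to the first call of each function, which is usually an acceptable trade for a faster server start.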
							
						 
						
							2025-08-05 14:55:05 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2665f00102 
								
							 
						 
						
							
							
								
								chore(rename): move llama_stack.distribution to llama_stack.core ( #2975 )  
							
							
							
							
							We would like to rename the term `template` to `distribution`. To
prepare for that, this is a precursor.
cc @leseb 
							
						 
						
							2025-07-30 23:30:53 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Kelly Brown 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								026caa5551 
								
							 
						 
						
							
							
								
								docs: part 1 - fix warnings in documentation generation ( #2861 )  
							
							
							
							
							**Description**
This PR removes some of the warnings when uv builds the docs
- Errors appear when generating docs about .md files not appearing in a
toctree. ~~Adding content to the `providers-gen.py` file that adds `---
orphan: true ---` to each file.~~ Added a toctree generator to the
`providers-gen.py` file, which gets rid of the errors in the builds.
- Deletes the `_openai_compat` files, an extension of PR #2849 
- Adds the `files` APIs section to the `providers` toctree on the index
page
- Manually adds `--- orphan: true ---` to the advanced APIs. I'll try
to find a way to modify the providers code gen so it adds it
automatically, but this fixes the errors.
- Adds `testing.md` to the `contributing` toctree
- Adds `starting_llama_stack_server.md` to the `distributions` toctree
There are some other warnings I'm still looking at, but this PR gets rid
of most of the toctree errors.
There's also an issue with the actual distribution-codegen that I can
investigate in another PR. Opened a bug for it here: #2873  
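
For readers unfamiliar with the toctree fix, here is a rough sketch of what a toctree generator can look like, assuming the docs are MyST Markdown built by Sphinx. The function names and the `maxdepth` default are hypothetical; this is not the code added to `providers-gen.py`.

```python
from pathlib import Path

# Built at runtime so the generated fence does not close this example's own fence.
FENCE = "`" * 3


def build_toctree(doc_names, maxdepth=1):
    """Return a MyST toctree directive listing the given document names."""
    lines = [f"{FENCE}{{toctree}}", f":maxdepth: {maxdepth}", ""]
    lines += sorted(doc_names)
    lines.append(FENCE)
    return "\n".join(lines) + "\n"


def toctree_for_directory(directory, index_name="index.md"):
    """List every Markdown file in `directory` except the index page itself."""
    names = [p.stem for p in Path(directory).glob("*.md") if p.name != index_name]
    return build_toctree(names)
```

Once every generated page is referenced from some toctree, Sphinx no longer warns that the document is not included in any toctree, so the `orphan: true` front matter is only needed for pages that are intentionally left out.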
							
						 
						
							2025-07-30 10:50:10 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Sébastien Han 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c9a49a80e8 
								
							 
						 
						
							
							
								
								docs: auto generated documentation for providers ( #2543 )  
							
							
							
							
							# What does this PR do?
Simple approach to get some provider pages in the docs.
Add or update description fields in the provider configuration class
using Pydantic’s Field, ensuring these descriptions are clear and
complete, as they will be used to auto-generate provider documentation
via ./scripts/distro_codegen.py instead of editing the docs manually.
Signed-off-by: Sébastien Han <seb@redhat.com> 
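
The idea of generating documentation from field descriptions can be sketched as follows. To stay dependency-free, this example swaps in standard-library dataclasses with `metadata={"description": ...}` as a stand-in for Pydantic's `Field(description=...)`; the `ExampleProviderConfig` class, its fields, and `render_config_table` are hypothetical and are not taken from `./scripts/distro_codegen.py`.

```python
from dataclasses import MISSING, dataclass, field, fields


@dataclass
class ExampleProviderConfig:
    """Hypothetical provider config; descriptions live next to the fields."""

    bucket_name: str = field(
        metadata={"description": "Name of the bucket that stores uploaded files"}
    )
    region: str = field(
        default="us-east-1",
        metadata={"description": "Region used when no endpoint is given"},
    )


def render_config_table(config_cls) -> str:
    """Render one Markdown table row per config field."""
    rows = ["| Field | Type | Default | Description |", "|---|---|---|---|"]
    for f in fields(config_cls):
        default = "required" if f.default is MISSING else repr(f.default)
        type_name = getattr(f.type, "__name__", str(f.type))
        description = f.metadata.get("description", "")
        rows.append(f"| `{f.name}` | `{type_name}` | {default} | {description} |")
    return "\n".join(rows)
```

Because the description sits on the field definition, the generated page cannot drift from the code the way a hand-edited document can, which is the motivation the commit message gives for the approach.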
							
						 
						
							2025-06-30 15:13:20 +02:00