Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								b1659369e8 
								
							 
						 
						
							
							
								
								Refactor persistence config to use stores key with unified backends  
							
							
							
							
							- Add StoresConfig to group all store references under persistence.stores
- Use single 'default' backend instead of separate metadata_backend/inference_backend
- Update resolver to access persistence.stores.{metadata,inference,conversations}
- All SQLite distributions now use single store.db file with shared backend 
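A minimal sketch of the layout these bullets describe, assuming a `backends` mapping keyed by name and store entries that point at a backend. Only `StoresConfig`, the `default` backend name, the three store names, and `store.db` come from the commit message; `SqliteBackend`, `StoreRef`, `PersistenceConfig`, and `resolve_db_path` are hypothetical names used for illustration, not the project's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class SqliteBackend:
    # Hypothetical backend description; the real schema may differ.
    db_path: str = "store.db"


@dataclass
class StoreRef:
    # Each store names the backend it lives on (assumed shape).
    backend: str = "default"


@dataclass
class StoresConfig:
    # Groups all store references under persistence.stores.
    metadata: StoreRef = field(default_factory=StoreRef)
    inference: StoreRef = field(default_factory=StoreRef)
    conversations: StoreRef = field(default_factory=StoreRef)


@dataclass
class PersistenceConfig:
    # A single shared "default" backend instead of one backend per store.
    backends: dict[str, SqliteBackend] = field(
        default_factory=lambda: {"default": SqliteBackend()}
    )
    stores: StoresConfig = field(default_factory=StoresConfig)


def resolve_db_path(persistence: PersistenceConfig, store_name: str) -> str:
    """Look up persistence.stores.<store_name> and return its backend's file."""
    store: StoreRef = getattr(persistence.stores, store_name)
    return persistence.backends[store.backend].db_path


persistence = PersistenceConfig()
paths = {
    name: resolve_db_path(persistence, name)
    for name in ("metadata", "inference", "conversations")
}
print(paths)
```

In this sketch all three stores resolve to the same file because they share one backend entry.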
							
						 
						
							2025-10-05 13:20:44 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								099750dd00 
								
							 
						 
						
							
							
								
								Migrate remaining distributions to unified persistence config  
							
							
							
						 
						
							2025-10-05 13:17:09 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								42414a1a1b 
								
							 
						 
						
							
							
								
								fix(logging): disable console telemetry sink by default (#3623)
							
							The current span processing dumps so much junk on the console that it
makes actual understanding of what is going on in the server impossible.
I am killing the console sink as a default. If you want, you are always
free to change your run.yaml to add it.
Before: 
<img width="1877" height="1107" alt="image"
src="https://github.com/user-attachments/assets/3a7ad261-e2ba-4d40-9820-fcc282c8df37"
/>
After:
<img width="1919" height="470" alt="image"
src="https://github.com/user-attachments/assets/bc7cf763-fba9-4e95-a4b5-f65f6d1c5332"
/> 
							
						 
						
							2025-09-30 14:58:05 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Sébastien Han 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f31bcc11bc 
								
							 
						 
						
							
							
								
								feat: add Azure OpenAI inference provider support (#3396)
							
							
							
							
							# What does this PR do?
Llama-stack now supports a new OpenAI compatible endpoint with Azure
OpenAI. The starter distro has been updated to add the new remote
inference provider.
A few tests have been modified and improved.
## Test Plan
Deploy a model in the Azure portal, then:
```
$ AZURE_API_KEY=... AZURE_API_BASE=... uv run llama stack build --image-type venv --providers inference=remote::azure --run
...
$ LLAMA_STACK_CONFIG=http://localhost:8321 uv run --group test pytest -v -ra --text-model azure/gpt-4.1 tests/integration/inference/test_openai_completion.py
...
```
Results:
```
============================================= test session starts ==============================================
platform darwin -- Python 3.12.8, pytest-8.4.1, pluggy-1.6.0 -- /Users/leseb/Documents/AI/llama-stack/.venv/bin/python3
cachedir: .pytest_cache
metadata: {'Python': '3.12.8', 'Platform': 'macOS-15.6.1-arm64-arm-64bit', 'Packages': {'pytest': '8.4.1', 'pluggy': '1.6.0'}, 'Plugins': {'anyio': '4.9.0', 'html': '4.1.1', 'socket': '0.7.0', 'asyncio': '1.1.0', 'json-report': '1.5.0', 'timeout': '2.4.0', 'metadata': '3.1.1', 'cov': '6.2.1', 'nbval': '0.11.0', 'hydra-core': '1.3.2'}}
rootdir: /Users/leseb/Documents/AI/llama-stack
configfile: pyproject.toml
plugins: anyio-4.9.0, html-4.1.1, socket-0.7.0, asyncio-1.1.0, json-report-1.5.0, timeout-2.4.0, metadata-3.1.1, cov-6.2.1, nbval-0.11.0, hydra-core-1.3.2
asyncio: mode=Mode.AUTO, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 27 items
tests/integration/inference/test_openai_completion.py::test_openai_completion_non_streaming[txt=azure/gpt-5-mini-inference:completion:sanity]
SKIPPED [ 3%]
tests/integration/inference/test_openai_completion.py::test_openai_completion_non_streaming_suffix[txt=azure/gpt-5-mini-inference:completion:suffix]
SKIPPED [ 7%]
tests/integration/inference/test_openai_completion.py::test_openai_completion_streaming[txt=azure/gpt-5-mini-inference:completion:sanity]
SKIPPED [ 11%]
tests/integration/inference/test_openai_completion.py::test_openai_completion_prompt_logprobs[txt=azure/gpt-5-mini-1]
SKIPPED [ 14%]
tests/integration/inference/test_openai_completion.py::test_openai_completion_guided_choice[txt=azure/gpt-5-mini]
SKIPPED [ 18%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[openai_client-txt=azure/gpt-5-mini-inference:chat_completion:non_streaming_01]
PASSED [ 22%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[openai_client-txt=azure/gpt-5-mini-inference:chat_completion:streaming_01]
PASSED [ 25%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[openai_client-txt=azure/gpt-5-mini-inference:chat_completion:streaming_01]
PASSED [ 29%]
tests/integration/inference/test_openai_completion.py::test_inference_store[openai_client-txt=azure/gpt-5-mini-True]
PASSED [ 33%]
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=azure/gpt-5-mini-True]
PASSED [ 37%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming_with_file[txt=azure/gpt-5-mini]
SKIPPED [ 40%]
tests/integration/inference/test_openai_completion.py::test_openai_completion_prompt_logprobs[txt=azure/gpt-5-mini-0]
SKIPPED [ 44%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[openai_client-txt=azure/gpt-5-mini-inference:chat_completion:non_streaming_02]
PASSED [ 48%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[openai_client-txt=azure/gpt-5-mini-inference:chat_completion:streaming_02]
PASSED [ 51%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[openai_client-txt=azure/gpt-5-mini-inference:chat_completion:streaming_02]
PASSED [ 55%]
tests/integration/inference/test_openai_completion.py::test_inference_store[openai_client-txt=azure/gpt-5-mini-False]
PASSED [ 59%]
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=azure/gpt-5-mini-False]
PASSED [ 62%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[client_with_models-txt=azure/gpt-5-mini-inference:chat_completion:non_streaming_01]
PASSED [ 66%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[client_with_models-txt=azure/gpt-5-mini-inference:chat_completion:streaming_01]
PASSED [ 70%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[client_with_models-txt=azure/gpt-5-mini-inference:chat_completion:streaming_01]
PASSED [ 74%]
tests/integration/inference/test_openai_completion.py::test_inference_store[client_with_models-txt=azure/gpt-5-mini-True]
PASSED [ 77%]
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=azure/gpt-5-mini-True]
PASSED [ 81%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[client_with_models-txt=azure/gpt-5-mini-inference:chat_completion:non_streaming_02]
PASSED [ 85%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[client_with_models-txt=azure/gpt-5-mini-inference:chat_completion:streaming_02]
PASSED [ 88%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[client_with_models-txt=azure/gpt-5-mini-inference:chat_completion:streaming_02]
PASSED [ 92%]
tests/integration/inference/test_openai_completion.py::test_inference_store[client_with_models-txt=azure/gpt-5-mini-False]
PASSED [ 96%]
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=azure/gpt-5-mini-False]
PASSED [100%]
=========================================== short test summary info ============================================
SKIPPED [3] tests/integration/inference/test_openai_completion.py:63: Model azure/gpt-5-mini hosted by remote::azure doesn't support OpenAI completions.
SKIPPED [3] tests/integration/inference/test_openai_completion.py:118: Model azure/gpt-5-mini hosted by remote::azure doesn't support vllm extra_body parameters.
SKIPPED [1] tests/integration/inference/test_openai_completion.py:124: Model azure/gpt-5-mini hosted by remote::azure doesn't support chat completion calls with base64 encoded files.
================================== 20 passed, 7 skipped, 2 warnings in 51.77s ==================================
```
Signed-off-by: Sébastien Han <seb@redhat.com> 
							
						 
						
							2025-09-11 13:48:38 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Derek Higgins 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								64b2977162 
								
							 
						 
						
							
							
								
								fix: Fix locations of distribution runtime directories (#3336)
							
							
							
							
							The defaults were mixed up
Signed-off-by: Derek Higgins <derekh@redhat.com> 
							
						 
						
							2025-09-05 14:09:36 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9fa69b0337 
								
							 
						 
						
							
							
								
								feat(distro): no huggingface provider for starter (#3258)
							
							
							
							
							The `trl` dependency brings in `accelerate` which brings in nvidia
dependencies for torch. We cannot have that in the starter distro. As
such, no CPU-only post-training for the huggingface provider. 
							
						 
						
							2025-08-26 14:06:36 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7519b73fcc 
								
							 
						 
						
							
							
								
								feat(distro): fork off a starter-gpu distribution (#3240)
							
							
							
							
							The starter distribution added post-training which added torch
dependencies which pulls in all the nvidia CUDA libraries. This made our
starter container very big. We have worked hard to keep the starter
container small so it serves its purpose as a starter. This PR tries to
get it back to its size by forking off duplicate "-gpu" providers for
post-training. These forked providers are then used for a new
`starter-gpu` distribution which can pull in all dependencies. 
							
						 
						
							2025-08-22 15:47:15 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									slekkala1 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7519ab4024 
								
							 
						 
						
							
							
								
								feat: Code scanner Provider impl for moderations api (#3100)
							
							
							
							
							# What does this PR do?
Add CodeScanner implementations
## Test Plan
`SAFETY_MODEL=CodeScanner LLAMA_STACK_CONFIG=starter uv run pytest -v
tests/integration/safety/test_safety.py
--text-model=llama3.2:3b-instruct-fp16
--embedding-model=all-MiniLM-L6-v2 --safety-shield=ollama`
This PR needs to land after https://github.com/meta-llama/llama-stack/pull/3098
							
						 
						
							2025-08-18 14:15:40 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Matthew Farrellee 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								914c7be288 
								
							 
						 
						
							
							
								
								feat: add batches API with OpenAI compatibility (with inference replay) (#3162)
							
							
							
							
							Add complete batches API implementation with protocol, providers, and
tests:
Core Infrastructure:
- Add batches API protocol using OpenAI Batch types directly
- Add Api.batches enum value and protocol mapping in resolver
- Add OpenAI "batch" file purpose support
- Include proper error handling (ConflictError, ResourceNotFoundError)
Reference Provider:
- Add ReferenceBatchesImpl with full CRUD operations (create, retrieve,
cancel, list)
- Implement background batch processing with configurable concurrency
- Add SQLite KVStore backend for persistence
- Support /v1/chat/completions endpoint with request validation
Comprehensive Test Suite:
- Add unit tests for provider implementation with validation
- Add integration tests for end-to-end batch processing workflows
- Add error handling tests for validation, malformed inputs, and edge
cases
Configuration:
- Add max_concurrent_batches and max_concurrent_requests_per_batch
options
- Add provider documentation with sample configurations
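The two concurrency options listed above could bound work roughly as follows. This is a minimal sketch using `asyncio.Semaphore`, not the reference provider's actual implementation: only the option names `max_concurrent_batches` and `max_concurrent_requests_per_batch` come from the commit message, while `run_request`, `process_batch`, and `process_all` are hypothetical helpers and the default values are arbitrary.

```python
import asyncio


async def run_request(request: str) -> str:
    # Stand-in for one inference call (hypothetical).
    await asyncio.sleep(0)
    return f"done:{request}"


async def process_batch(requests, batch_slots, max_concurrent_requests_per_batch):
    # Each batch gets its own limit on in-flight requests.
    request_slots = asyncio.Semaphore(max_concurrent_requests_per_batch)

    async def guarded(request):
        async with request_slots:
            return await run_request(request)

    # A batch only starts once one of the shared batch slots is free.
    async with batch_slots:
        return await asyncio.gather(*(guarded(r) for r in requests))


async def process_all(batches, max_concurrent_batches=2,
                      max_concurrent_requests_per_batch=3):
    batch_slots = asyncio.Semaphore(max_concurrent_batches)
    return await asyncio.gather(
        *(process_batch(b, batch_slots, max_concurrent_requests_per_batch)
          for b in batches)
    )


results = asyncio.run(process_all([["a", "b"], ["c"], ["d", "e", "f"]]))
print(results)
```

`asyncio.gather` returns results in submission order, so the output lines up with the input batches even though execution is interleaved.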
Test with -
```
$ uv run llama stack build --image-type venv --providers inference=YOU_PICK,files=inline::localfs,batches=inline::reference --run &
$ LLAMA_STACK_CONFIG=http://localhost:8321 uv run pytest tests/unit/providers/batches tests/integration/batches --text-model YOU_PICK
```
addresses #3066 
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com> 
							
						 
						
							2025-08-15 15:34:15 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Eran Cohen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a4bad6c0b4 
								
							 
						 
						
							
							
								
								feat: Add Google Vertex AI inference provider support (#2841)
							
							# What does this PR do?
- Add new Vertex AI remote inference provider with litellm integration
- Support for Gemini models through Google Cloud Vertex AI platform
- Uses Google Cloud Application Default Credentials (ADC) for
authentication
- Added VertexAI models: gemini-2.5-flash, gemini-2.5-pro,
gemini-2.0-flash.
- Updated provider registry to include vertexai provider
- Updated starter template to support Vertex AI configuration
- Added comprehensive documentation and sample configuration
relates to https://github.com/meta-llama/llama-stack/issues/2747 
Signed-off-by: Eran Cohen <eranco@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com> 
							
						 
						
							2025-08-11 08:22:04 -04:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7f834339ba 
								
							 
						 
						
							
							
								
								chore(misc): make tests and starter faster (#3042)
							
							A bunch of miscellaneous cleanup focusing on tests, but ended up
speeding up starter distro substantially.
- Pulled llama stack client init for tests into `pytest_sessionstart` so
it does not clobber output
- Profiling of that told me where we were doing lots of heavy imports
for starter, so lazied them
- starter now starts 20+ seconds faster on my Mac
- A few other smallish refactors for `compat_client` 
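The "lazied" imports mentioned above follow a common pattern: defer an expensive import until the code path that needs it actually runs. A minimal sketch, assuming nothing about the project's real modules: `lazy_module` and `render_report` are hypothetical helpers, and the standard-library `csv` and `io` modules stand in for a heavy dependency.

```python
import importlib
from functools import lru_cache


@lru_cache(maxsize=None)
def lazy_module(name: str):
    # Deferred import: the cost is paid on first call, not at startup.
    return importlib.import_module(name)


def render_report(rows):
    # "csv" and "io" stand in for a heavy dependency (hypothetical choice).
    csv = lazy_module("csv")
    io = lazy_module("io")
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


output = render_report([["name", "value"], ["starter", 1]])
print(output)
```

One way to locate slow imports in the first place is CPython's `python -X importtime` option, which prints per-module import timings to stderr.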
							
						 
						
							2025-08-05 14:55:05 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								cc87995e2b 
								
							 
						 
						
							
							
								
								chore: rename templates to distributions (#3035)
							
							
							
							
							As the title says. Distributions is in, Templates is out.
`llama stack build --template` --> `llama stack build --distro`. For
backward compatibility, the previous option is kept but results in a
warning.
Updated `server.py` to remove the "config_or_template" backward
compatibility since it has been a couple releases since that change. 
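Keeping the old `--template` option while warning about it can be sketched with `argparse` as below. This is an illustration under assumptions, not the CLI's actual code: the two flag names come from the commit message, while `build_parser`, `resolve_distro`, the `build-sketch` program name, and the warning text are hypothetical.

```python
import argparse
import warnings


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="build-sketch")
    parser.add_argument("--distro", help="distribution to build")
    # Old spelling kept so existing scripts keep working.
    parser.add_argument("--template", help="deprecated alias for --distro")
    return parser


def resolve_distro(argv):
    args = build_parser().parse_args(argv)
    if args.template is not None:
        warnings.warn(
            "--template is deprecated; use --distro",
            DeprecationWarning,
            stacklevel=2,
        )
        # Prefer the new flag if both were given.
        return args.distro or args.template
    return args.distro


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = resolve_distro(["--template", "starter"])

new_style = resolve_distro(["--distro", "starter"])
print(old_style, new_style, len(caught))
```

Both spellings resolve to the same value here; only the old one emits a warning.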
							
						 
						
							2025-08-04 11:34:17 -07:00