mirror of https://github.com/meta-llama/llama-stack.git
synced 2025-10-25 09:05:37 +00:00

10 commits

	| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|  | de6919ecdd | refactor: install external providers from module (#2637) # What does this PR do?
Today, external providers are installed via the `external_providers_dir`
in the config. This requires users to understand the `ProviderSpec`
and set up their directories accordingly. This process splits up the
config for the stack across multiple files, directories, and formats.
Most (if not all) external providers today have a
[get_provider_spec]( | ||
|  | cd8715d327 | chore: Added openai compatible vector io endpoints for chromadb (#2489) # What does this PR do?
This PR implements the openai compatible endpoints for chromadb
Closes #2462
## Test Plan
Ran ollama llama stack server and ran the command
`pytest -sv --stack-config=http://localhost:8321 tests/integration/vector_io/test_openai_vector_stores.py --embedding-model all-MiniLM-L6-v2`
8 failed, 27 passed, 8 skipped, 1 xfailed
The failed ones are regarding files api
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Co-authored-by: sarthakdeshpande <sarthak.deshpande@engati.com>
Co-authored-by: Francisco Javier Arceo <farceo@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com> | ||
|  | 3c43a2f529 | fix: store configs (#2593) # What does this PR do?
https://github.com/meta-llama/llama-stack/pull/2490 broke postgres_demo, as the config expected a str but the value was converted to int.
This PR:
1. Updates the type of port in sqlstore to be int
2. Template generation uses `dict` instead of `StackRunConfig` so as to avoid failing pydantic typechecks.
3. Adds `replace_env_vars` to StackRunConfig instantiation in `configure.py` (not sure why this wasn't needed before).
## Test Plan
`llama stack build --template postgres_demo --image-type conda --run` | ||
|  | 25268854bc | fix: allow default empty vars for conditionals (#2570) # What does this PR do?
We were not using conditionals correctly. Conditionals can only be used
when the env variable is set, so `${env.ENVIRONMENT:+}` would return
None if ENVIRONMENT is not set.
To create a conditional value, use `${env.ENVIRONMENT:=}`, which picks
the value of ENVIRONMENT if set and otherwise returns None.
Closes: https://github.com/meta-llama/llama-stack/issues/2564
Signed-off-by: Sébastien Han <seb@redhat.com> | ||
|  | 0883944bc3 | fix: Some missed env variable changes from PR 2490 (#2538) # What does this PR do?
Some templates were still using the old environment variable substitution
syntax instead of the new one and were not getting substituted properly.
Also, some places didn't handle the new None vs old empty string ("")
values that come from the conditional environment variable substitution.
This gets the starter and remote-vllm distributions starting again, and
I tested various permutations of the starter as chroma and pgvector
needed some adjustments to their config classes to handle the new
possible `None` values. And, I had to tweak our `Provider` class to also
handle `None` values, for cases where we disable providers in the
starter config via environment variables.
This may not have caught everything that was missed, but I did grep
around quite a bit to try and find anything lingering.
## Test Plan
The following permutations now all run (or attempt to run to the point
of complaining that they can't connect to chroma, vllm, etc) when before
they failed immediately on startup because of bad environment variable
substitutions:
```
uv run llama stack run llama_stack/templates/starter/run.yaml
ENABLE_SQLITE_VEC=true uv run llama stack run llama_stack/templates/starter/run.yaml
ENABLE_PGVECTOR=true uv run llama stack run llama_stack/templates/starter/run.yaml
ENABLE_CHROMADB=true uv run llama stack run llama_stack/templates/starter/run.yaml
uv run llama stack run llama_stack/templates/remote-vllm/run.yaml
```
Signed-off-by: Ben Browning <bbrownin@redhat.com>
Co-authored-by: raghotham <rsm@meta.com> | ||
|  | 43c1f39bd6 | refactor(env)!: enhanced environment variable substitution (#2490) # What does this PR do?
This commit significantly improves the environment variable substitution
functionality in Llama Stack configuration files:
* The version field in configuration files has been changed from string
to integer type for better type consistency across build and run
configurations.
* The environment variable substitution system for ${env.FOO:} was fixed
and now properly returns an error
* The environment variable substitution system for ${env.FOO+} returns
None instead of an empty string, which better matches type annotations in
config fields
* The system includes automatic type conversion for boolean, integer,
and float values.
* The error messages have been enhanced to provide clearer guidance when
environment variables are missing, including suggestions for using
default values or conditional syntax.
* Comprehensive documentation has been added to the configuration guide
explaining all supported syntax patterns, best practices, and runtime
override capabilities.
* Multiple provider configurations have been updated to use the new
conditional syntax for optional API keys, making the system more
flexible for different deployment scenarios. The telemetry configuration
has been improved to properly handle optional endpoints with appropriate
validation, ensuring that required endpoints are specified when their
corresponding sinks are enabled.
* There were many instances of ${env.NVIDIA_API_KEY:} that should have
caused the code to fail. However, due to a bug, the distro server was
still being started, and early validation wasn’t triggered. As a result,
failures were likely being handled downstream by the providers. I’ve
maintained similar behavior by using ${env.NVIDIA_API_KEY:+}, though I
believe this is incorrect for many configurations. I’ll leave it to each
provider to correct it as needed.
* Environment variable substitution now uses the same syntax as Bash
parameter expansion.
Signed-off-by: Sébastien Han <seb@redhat.com> | ||
|  | ac5fd57387 | chore: remove nested imports (#2515) # What does this PR do?
* Given that our API packages use "import *" in `__init__.py`, we don't need to do `from llama_stack.apis.models.models` but simply `from llama_stack.apis.models`. The decision to use `import *` is debatable and should probably be revisited at some point.
* Remove unneeded Ruff F401 rule
* Consolidate Ruff F403 rule in the pyproject
Signed-off-by: Sébastien Han <seb@redhat.com> | ||
|  | a58c0639d5 | chore: update postgres_demo distro config (#2396) # What does this PR do? ## Test Plan | ||
|  | b380cb463f | feat: add postgres deps to starter distro (#2360) Once we have this, we can use the starter distro for the Kubernetes cluster demos. | ||
|  | 2603f10f95 | feat: support postgresql inference store (#2310) # What does this PR do?
* Added support for a postgresql inference store
* Added 'oracle' template that demos how to config postgresql stores (except for telemetry, which is not supported currently)
## Test Plan
`llama stack build --template oracle --image-type conda --run`
`LLAMA_STACK_CONFIG=http://localhost:8321 pytest -s -v tests/integration/ --text-model accounts/fireworks/models/llama-v3p3-70b-instruct -k 'inference_store'` |
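
Several of the commits above (43c1f39bd6, 0883944bc3, 25268854bc) revolve around the `${env.VAR}` substitution syntax in run configs. The following is a minimal sketch of the behavior those messages describe, not the Llama Stack implementation: the names `substitute`, `MissingEnvVar`, and `_convert` are hypothetical, and the exact rules (for example, treating an empty variable the same as an unset one) are assumptions borrowed from Bash parameter expansion.

```python
import os
import re
from typing import Mapping, Optional, Union

Scalar = Union[str, int, float, bool, None]

# Matches ${env.NAME}, ${env.NAME:=default} and ${env.NAME:+value}.
_ENV_REF = re.compile(
    r"\$\{env\.(?P<name>[A-Za-z_][A-Za-z0-9_]*)"
    r"(?::(?P<op>[=+])(?P<arg>[^}]*))?\}"
)


class MissingEnvVar(ValueError):
    """Hypothetical error: a required variable is unset and has no default."""


def _convert(value: str) -> Scalar:
    """Turn 'true'/'false', integers and floats into typed values."""
    lowered = value.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value


def substitute(text: str, env: Optional[Mapping[str, str]] = None) -> Scalar:
    """Expand ${env.*} references in text; an empty result becomes None."""
    env = os.environ if env is None else env

    def expand(match: "re.Match[str]") -> str:
        name, op, arg = match.group("name", "op", "arg")
        current = env.get(name)
        if op == "=":  # default: use the variable if set, else the default
            return current if current else arg
        if op == "+":  # conditional: use arg only when the variable is set
            return arg if current else ""
        if current is None:
            raise MissingEnvVar(
                f"{name} is not set; consider ${{env.{name}:=default}} "
                f"or ${{env.{name}:+value}}"
            )
        return current

    result = _ENV_REF.sub(expand, text)
    if "${env." in result:  # e.g. ${env.FOO:} with no operator after ':'
        raise ValueError(f"malformed environment reference in {text!r}")
    return _convert(result) if result else None
```

Under these assumptions, `substitute("${env.PORT:=5432}", {})` yields the integer `5432` rather than a string, which is the kind of type change that the `fix: store configs` commit had to accommodate, and both `${env.ENVIRONMENT:=}` and `${env.ENVIRONMENT:+}` yield `None` when the variable is unset.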