llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-06-28 02:53:30 +00:00

Author	SHA1	Message	Date
Ben Browning	0883944bc3	fix: Some missed env variable changes from PR 2490 (#2538)
# What does this PR do?
Some templates were still using the old environment variable substitution syntax instead of the new one and were not getting substituted properly. Also, some places didn't handle the new None vs old empty string ("") values that come from the conditional environment variable substitution.
This gets the starter and remote-vllm distributions starting again. I tested various permutations of the starter, as chroma and pgvector needed some adjustments to their config classes to handle the new possible `None` values. I also had to tweak our `Provider` class to handle `None` values, for cases where we disable providers in the starter config via environment variables.
This may not have caught everything that was missed, but I did grep around quite a bit to try and find anything lingering.
## Test Plan
The following permutations now all run (or attempt to run to the point of complaining that they can't connect to chroma, vllm, etc.), whereas before they failed immediately on startup because of bad environment variable substitutions:
```
uv run llama stack run llama_stack/templates/starter/run.yaml
ENABLE_SQLITE_VEC=true uv run llama stack run llama_stack/templates/starter/run.yaml
ENABLE_PGVECTOR=true uv run llama stack run llama_stack/templates/starter/run.yaml
ENABLE_CHROMADB=true uv run llama stack run llama_stack/templates/starter/run.yaml
uv run llama stack run llama_stack/templates/remote-vllm/run.yaml
```
Signed-off-by: Ben Browning <bbrownin@redhat.com>
Co-authored-by: raghotham <rsm@meta.com>	2025-06-26 17:59:15 -07:00
Charlie Doern	d12f195f56	feat: drop python 3.10 support (#2469)
# What does this PR do?
Dropped Python 3.10, updated pyproject and dependencies, and removed some blocks of code with special handling for enum.StrEnum.
Closes #2458
Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-06-19 12:07:14 +05:30
Ihar Hrachyshka	db21eab713	fix: catch TimeoutError in place of asyncio.TimeoutError (#2131)
# What does this PR do?
As per docs [1], since Python 3.11 wait_for() raises TimeoutError. Since we currently support Python 3.10+, we have to catch both.
[1]: https://docs.python.org/3.12/library/asyncio-task.html#asyncio.wait_for
## Test Plan
No explicit testing; just code hardening to reflect docs.
Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>	2025-05-12 11:49:59 +02:00
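A minimal sketch of the pattern this commit describes: catching both exception names so the handler works on Python 3.10 (where `asyncio.TimeoutError` is a distinct class) and on 3.11+ (where it is an alias of the builtin `TimeoutError`). The coroutine and function names are hypothetical and not taken from llama-stack.

```python
import asyncio


async def slow_operation():
    # Stand-in for any awaitable that may take too long.
    await asyncio.sleep(0.2)
    return "done"


async def call_with_timeout(timeout):
    try:
        return await asyncio.wait_for(slow_operation(), timeout=timeout)
    except (TimeoutError, asyncio.TimeoutError):
        # On 3.11+ both names refer to the same class; on 3.10 only
        # asyncio.TimeoutError is raised by wait_for().
        return "timed out"
```

Once Python 3.10 support is dropped, catching the builtin `TimeoutError` alone is sufficient.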
Ihar Hrachyshka	9e6561a1ec	chore: enable pyupgrade fixes (#1806)
# What does this PR do?
The goal of this PR is code base modernization.
Schema reflection code needed a minor adjustment to handle UnionTypes and collections.abc.AsyncIterator. (Both are preferred for latest Python releases.)
Note to reviewers: almost all changes here are automatically generated by pyupgrade. Some additional unused imports were cleaned up. The only changes worth noting can be found under `docs/openapi_generator` and `llama_stack/strong_typing/schema.py`, where reflection code was updated to deal with "newer" types.
Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>	2025-05-01 14:23:50 -07:00
Sébastien Han	69554158fa	feat: add health to all providers through providers endpoint (#1418)
The `/v1/providers` endpoint now reports the health status of each provider when implemented.
```
curl -L http://127.0.0.1:8321/v1/providers | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  4072  100  4072    0     0   246k      0 --:--:-- --:--:-- --:--:--  248k
{
  "data": [
    {
      "api": "inference",
      "provider_id": "ollama",
      "provider_type": "remote::ollama",
      "config": { "url": "http://localhost:11434" },
      "health": { "status": "OK" }
    },
    {
      "api": "vector_io",
      "provider_id": "faiss",
      "provider_type": "inline::faiss",
      "config": { "kvstore": { "type": "sqlite", "namespace": null, "db_path": "/Users/leseb/.llama/distributions/ollama/faiss_store.db" } },
      "health": { "status": "Not Implemented", "message": "Provider does not implement health check" }
    },
    {
      "api": "safety",
      "provider_id": "llama-guard",
      "provider_type": "inline::llama-guard",
      "config": { "excluded_categories": [] },
      "health": { "status": "Not Implemented", "message": "Provider does not implement health check" }
    },
    {
      "api": "agents",
      "provider_id": "meta-reference",
      "provider_type": "inline::meta-reference",
      "config": { "persistence_store": { "type": "sqlite", "namespace": null, "db_path": "/Users/leseb/.llama/distributions/ollama/agents_store.db" } },
      "health": { "status": "Not Implemented", "message": "Provider does not implement health check" }
    },
    {
      "api": "telemetry",
      "provider_id": "meta-reference",
      "provider_type": "inline::meta-reference",
      "config": { "service_name": "llama-stack", "sinks": "console,sqlite", "sqlite_db_path": "/Users/leseb/.llama/distributions/ollama/trace_store.db" },
      "health": { "status": "Not Implemented", "message": "Provider does not implement health check" }
    },
    {
      "api": "eval",
      "provider_id": "meta-reference",
      "provider_type": "inline::meta-reference",
      "config": { "kvstore": { "type": "sqlite", "namespace": null, "db_path": "/Users/leseb/.llama/distributions/ollama/meta_reference_eval.db" } },
      "health": { "status": "Not Implemented", "message": "Provider does not implement health check" }
    },
    {
      "api": "datasetio",
      "provider_id": "huggingface",
      "provider_type": "remote::huggingface",
      "config": { "kvstore": { "type": "sqlite", "namespace": null, "db_path": "/Users/leseb/.llama/distributions/ollama/huggingface_datasetio.db" } },
      "health": { "status": "Not Implemented", "message": "Provider does not implement health check" }
    },
    {
      "api": "datasetio",
      "provider_id": "localfs",
      "provider_type": "inline::localfs",
      "config": { "kvstore": { "type": "sqlite", "namespace": null, "db_path": "/Users/leseb/.llama/distributions/ollama/localfs_datasetio.db" } },
      "health": { "status": "Not Implemented", "message": "Provider does not implement health check" }
    },
    {
      "api": "scoring",
      "provider_id": "basic",
      "provider_type": "inline::basic",
      "config": {},
      "health": { "status": "Not Implemented", "message": "Provider does not implement health check" }
    },
    {
      "api": "scoring",
      "provider_id": "llm-as-judge",
      "provider_type": "inline::llm-as-judge",
      "config": {},
      "health": { "status": "Not Implemented", "message": "Provider does not implement health check" }
    },
    {
      "api": "scoring",
      "provider_id": "braintrust",
      "provider_type": "inline::braintrust",
      "config": { "openai_api_key": "******" },
      "health": { "status": "Not Implemented", "message": "Provider does not implement health check" }
    },
    {
      "api": "tool_runtime",
      "provider_id": "brave-search",
      "provider_type": "remote::brave-search",
      "config": { "api_key": "****", "max_results": 3 },
      "health": { "status": "Not Implemented", "message": "Provider does not implement health check" }
    },
    {
      "api": "tool_runtime",
      "provider_id": "tavily-search",
      "provider_type": "remote::tavily-search",
      "config": { "api_key": "****", "max_results": 3 },
      "health": { "status": "Not Implemented", "message": "Provider does not implement health check" }
    },
    {
      "api": "tool_runtime",
      "provider_id": "code-interpreter",
      "provider_type": "inline::code-interpreter",
      "config": {},
      "health": { "status": "Not Implemented", "message": "Provider does not implement health check" }
    },
    {
      "api": "tool_runtime",
      "provider_id": "rag-runtime",
      "provider_type": "inline::rag-runtime",
      "config": {},
      "health": { "status": "Not Implemented", "message": "Provider does not implement health check" }
    },
    {
      "api": "tool_runtime",
      "provider_id": "model-context-protocol",
      "provider_type": "remote::model-context-protocol",
      "config": {},
      "health": { "status": "Not Implemented", "message": "Provider does not implement health check" }
    },
    {
      "api": "tool_runtime",
      "provider_id": "wolfram-alpha",
      "provider_type": "remote::wolfram-alpha",
      "config": { "api_key": "******" },
      "health": { "status": "Not Implemented", "message": "Provider does not implement health check" }
    }
  ]
}
```
Health is also reported per provider:
```
curl -L http://127.0.0.1:8321/v1/providers/ollama
{"api":"inference","provider_id":"ollama","provider_type":"remote::ollama","config":{"url":"http://localhost:11434"},"health":{"status":"OK"}}
```
Signed-off-by: Sébastien Han <seb@redhat.com>	2025-04-14 11:59:36 +02:00
Nathan Weinberg	e48af78b76	fix: add shutdown method for ProviderImpl (#1670)
# What does this PR do?
Currently there is no shutdown method implemented for the `ProviderImpl` class. This leads to the following warning:
```shell
INFO: Waiting for application shutdown.
INFO 2025-03-17 17:25:13,280 __main__:145 server: Shutting down
INFO 2025-03-17 17:25:13,282 __main__:129 server: Shutting down ModelsRoutingTable
INFO 2025-03-17 17:25:13,284 __main__:129 server: Shutting down DatasetsRoutingTable
INFO 2025-03-17 17:25:13,286 __main__:129 server: Shutting down DatasetIORouter
INFO 2025-03-17 17:25:13,287 __main__:129 server: Shutting down TelemetryAdapter
INFO 2025-03-17 17:25:13,288 __main__:129 server: Shutting down InferenceRouter
INFO 2025-03-17 17:25:13,290 __main__:129 server: Shutting down ShieldsRoutingTable
INFO 2025-03-17 17:25:13,291 __main__:129 server: Shutting down SafetyRouter
INFO 2025-03-17 17:25:13,292 __main__:129 server: Shutting down VectorDBsRoutingTable
INFO 2025-03-17 17:25:13,293 __main__:129 server: Shutting down VectorIORouter
INFO 2025-03-17 17:25:13,294 __main__:129 server: Shutting down ToolGroupsRoutingTable
INFO 2025-03-17 17:25:13,295 __main__:129 server: Shutting down ToolRuntimeRouter
INFO 2025-03-17 17:25:13,296 __main__:129 server: Shutting down MetaReferenceAgentsImpl
INFO 2025-03-17 17:25:13,297 __main__:129 server: Shutting down ScoringFunctionsRoutingTable
INFO 2025-03-17 17:25:13,298 __main__:129 server: Shutting down ScoringRouter
INFO 2025-03-17 17:25:13,299 __main__:129 server: Shutting down BenchmarksRoutingTable
INFO 2025-03-17 17:25:13,300 __main__:129 server: Shutting down EvalRouter
INFO 2025-03-17 17:25:13,301 __main__:129 server: Shutting down DistributionInspectImpl
INFO 2025-03-17 17:25:13,303 __main__:129 server: Shutting down ProviderImpl
WARNING 2025-03-17 17:25:13,304 __main__:134 server: No shutdown method for ProviderImpl
INFO: Application shutdown complete.
INFO: Finished server process [1]
```
## Test Plan
Start a server and shut it down.
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-03-17 14:55:40 -07:00
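The warning in the log above is what a shutdown loop typically emits when an implementation lacks a `shutdown` method. The following is a rough sketch of that pattern under stated assumptions: the function `shutdown_all` and the two example classes are hypothetical and are not the llama-stack server code.

```python
import asyncio
import logging

logger = logging.getLogger("server")


class ProviderImplWithoutShutdown:
    """Hypothetical implementation that triggers the warning."""


class ProviderImplWithShutdown:
    """Hypothetical implementation after the fix: a no-op async shutdown."""

    async def shutdown(self):
        # Nothing to release; defining the method is enough to avoid the warning.
        return None


async def shutdown_all(impls):
    """Shut down every implementation; return names of those lacking shutdown()."""
    missing = []
    for impl in impls:
        name = type(impl).__name__
        logger.info("Shutting down %s", name)
        shutdown = getattr(impl, "shutdown", None)
        if shutdown is None:
            logger.warning("No shutdown method for %s", name)
            missing.append(name)
            continue
        await shutdown()
    return missing
```

Even a no-op `shutdown` keeps the shutdown log free of warnings, which makes real problems easier to spot.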
Xi Yan	33b096cc21	fix: OpenAPI with provider get (#1627)
# What does this PR do?
- https://github.com/meta-llama/llama-stack/pull/1429 introduces GetProviderResponse in OpenAPI, which is not needed, and not correctly defined.
cc @cdoern
## Test Plan
```
llama-stack-client providers list
```
<img width="610" alt="image" src="https://github.com/user-attachments/assets/2f7b62a5-daf2-4bf9-9505-69755c7025fc" />	2025-03-13 19:56:32 -07:00
Charlie Doern	a062723d03	feat: add provider API for listing and inspecting provider info (#1429)
# What does this PR do?
Currently the `inspect` API for providers is really a `list` API. Create a new `providers` API which has a GET `providers/{provider_id}` inspect API which returns "user friendly" configuration to the end user. Also add a GET `/providers` endpoint which returns the list of providers as `inspect/providers` does today.
This API follows CRUD and is more intuitive/RESTful.
This work is part of the RFC at https://github.com/meta-llama/llama-stack/pull/1359
Sensitive fields are redacted using `redact_sensetive_fields` on the server side before returning a response:
<img width="456" alt="Screenshot 2025-03-13 at 4 40 21 PM" src="https://github.com/user-attachments/assets/9465c221-2a26-42f8-a08a-6ac4a9fecce8" />
## Test Plan
Using https://github.com/meta-llama/llama-stack-client-python/pull/181 a user is able to run the following:
`llama stack build --template ollama --image-type venv`
`llama stack run --image-type venv ~/.llama/distributions/ollama/ollama-run.yaml`
`llama-stack-client providers inspect ollama`
<img width="378" alt="Screenshot 2025-03-13 at 4 39 35 PM" src="https://github.com/user-attachments/assets/8273d05d-8bc3-44c6-9e4b-ef95e48d5466" />
Also, I was able to run the new test_list integration test locally with ollama:
<img width="1509" alt="Screenshot 2025-03-13 at 11 03 40 AM" src="https://github.com/user-attachments/assets/9b9db166-f02f-45b0-86a4-306d85149bc8" />
Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-03-13 15:07:21 -07:00
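Server-side redaction of sensitive fields, as mentioned in this commit, could look roughly like the sketch below. The helper `redact_config`, the marker list, and the mask string are assumptions for illustration only; this is not the implementation referenced in the commit message.

```python
# Substrings that mark a config key as sensitive (assumed list, for illustration).
SENSITIVE_MARKERS = ("api_key", "password", "token", "secret")
MASK = "********"


def redact_config(config):
    """Return a copy of `config` with sensitive values masked, recursing into dicts."""
    redacted = {}
    for key, value in config.items():
        if isinstance(value, dict):
            redacted[key] = redact_config(value)
        elif any(marker in key.lower() for marker in SENSITIVE_MARKERS):
            redacted[key] = MASK
        else:
            redacted[key] = value
    return redacted
```

Because the function builds a new dictionary, the provider's real configuration is left untouched and only the response payload is masked.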

8 commits