llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-17 07:32:36 +00:00

Author	SHA1	Message	Date
Charlie Doern	661985e240	feat: remove usage of build yaml (#4192) # What does this PR do? the build.yaml is only used in the following ways: 1. list-deps 2. distribution code-gen since `llama stack build` no longer exists, I found myself asking "why do we need two different files for list-deps and run"? Removing the BuildConfig and altering the usage of the DistributionTemplate in llama stack list-deps is the first step in removing the build yaml entirely. Removing the BuildConfig and build.yaml cuts the files users need to maintain in half, and allows us to focus on the stability of _just_ the run.yaml This PR removes the build.yaml, BuildConfig datatype, and its usage throughout the codebase. Users are now expected to point to run.yaml files when running list-deps, and our codebase automatically uses these types now for things like `get_provider_registry`. Additionally, two renames: `StackRunConfig` -> `StackConfig` and `run.yaml` -> `config.yaml`. The build.yaml made sense for when we were managing the build process for the user and actually _producing_ a run.yaml _from_ the build.yaml, but now that we are simply just getting the provider registry and listing the deps, switching to config.yaml simplifies the scope here greatly. ## Test Plan existing list-deps usage should work in the tests. --------- Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-12-10 10:12:12 +01:00
paulengineer	e5a55f3677	docs: use 'uv pip' to avoid pitfalls of using 'pip' in virtual environment (#4122) # What does this PR do? In the Detailed Tutorial, at Step 3, the Install with venv option creates a new virtual environment `client`, activates it, then attempts to install the llama-stack-client using pip. ``` uv venv client --python 3.12 source client/bin/activate pip install llama-stack-client <- this is the problematic line ``` However, the pip command will likely fail because `uv venv` does not, by default, add pip to the virtual environment it creates. The pip command will error either because pip doesn't exist at all, or, if the pip command does exist outside of the virtual environment, return a different error message. 
In the latter case, it may be unclear to the user why the command is failing. This PR changes 'pip' to 'uv pip', allowing the install action to function in the virtual environment as intended, and without the need for pip to be installed. ## Test Plan 1. Use Linux or WSL (virtual environments on Windows use a `Scripts` folder instead of `bin` [virtualenv #993ba13](`993ba1316a`), which doesn't align with the tutorial) 2. Clone the `llama-stack` repo 3. Run the following and verify success: ``` uv venv client --python 3.12 source client/bin/activate ``` 4. Run the updated command: ``` uv pip install llama-stack-client ``` 5. Observe that the console output confirms that the virtual environment `client` was used: > Using Python 3.12.3 environment at: client	2025-11-11 07:49:03 -05:00
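The failure mode described in #4122 can be detected before running an install. The following is a minimal sketch, assuming the usual `bin/` (Linux, macOS, WSL) or `Scripts/` (Windows) layout of a virtual environment; the helper name `venv_has_pip` is hypothetical and is not part of uv or llama-stack.

```python
from pathlib import Path


def venv_has_pip(venv_dir: str) -> bool:
    """Return True if the virtual environment ships its own pip executable."""
    root = Path(venv_dir)
    candidates = [
        root / "bin" / "pip",          # Linux, macOS, WSL
        root / "Scripts" / "pip.exe",  # Windows
    ]
    return any(path.is_file() for path in candidates)


if __name__ == "__main__":
    # "client" is the environment name used in the tutorial.
    if venv_has_pip("client"):
        print("pip is available inside the environment")
    else:
        print("no pip in the environment; use `uv pip install ...` instead")
```

A check like this only looks at the filesystem; it does not tell you which pip a bare `pip` command would resolve to on `PATH`, which is the confusing case the PR describes.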
Ashwin Bharambe	cb40da210f	fix: update tests for OpenAI-style models endpoint (#4053) The llama-stack-client now uses `/v1/openai/v1/models`, which returns OpenAI-compatible model objects with 'id' and 'custom_metadata' fields instead of the Resource-style 'identifier' field. Updated api_recorder to handle the new endpoint and modified tests to access model metadata appropriately. Deleted stale model recordings for re-recording. NOTE: CI will be red on this one since it is dependent on https://github.com/llamastack/llama-stack-client-python/pull/291/files landing. I verified locally that it is green.	2025-11-03 17:30:08 -08:00
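To illustrate the field change described in #4053, here is a minimal sketch of test helpers that accept either response shape. It works on plain dictionaries rather than the real client types; the helper names and the `metadata` key for the older Resource-style shape are assumptions made for illustration, while `id`, `custom_metadata` and `identifier` come from the commit message.

```python
from typing import Any, Mapping


def model_id(model: Mapping[str, Any]) -> str:
    """Return the model identifier from either response shape."""
    if "id" in model:
        # OpenAI-compatible object returned by the new endpoint.
        return model["id"]
    # Older Resource-style object.
    return model["identifier"]


def model_metadata(model: Mapping[str, Any]) -> Mapping[str, Any]:
    """Return model metadata, or an empty dict if none is present."""
    if "custom_metadata" in model:
        return model["custom_metadata"] or {}
    # Assumption: the Resource-style shape keeps metadata under "metadata".
    return model.get("metadata") or {}


if __name__ == "__main__":
    # Both sample objects are made up for illustration.
    new_style = {"id": "example-model", "custom_metadata": {"model_type": "llm"}}
    old_style = {"identifier": "example-model", "metadata": {"model_type": "llm"}}
    print(model_id(new_style) == model_id(old_style))
    print(model_metadata(new_style) == model_metadata(old_style))
```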
Charlie Doern	573e783ff0	docs: fix sidebar of `Detailed Tutorial` (#3856) # What does this PR do? The sidebar currently has an extra `ii. Run the Script` entry because it is incorrectly put into the doc as an H3 rather than an H4 (like the other ones). Fixing this updates the sidebar. Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-10-20 13:10:50 -04:00
Charlie Doern	b11bcfde11	refactor(build): rework CLI commands and build process (1/2) (#2974) # What does this PR do? 
This PR does a few things outlined in #2878, namely: 1. adds `llama stack list-deps`, a command which takes the build logic and, instead of executing one of the `build_...` scripts, displays all of the providers' dependencies using the `module` and `uv`. 2. deprecates `llama stack build` in favor of `llama stack list-deps`. 3. updates all tests to use `list-deps` alongside `build`. PR 2/2 will migrate `llama stack run`'s default behavior to be `llama stack build --run` and use the new `list-deps` command under the hood before running the server. Example of `llama stack list-deps starter`: ``` llama stack list-deps starter --format json { "name": "starter", "description": "Quick start template for running Llama Stack with several popular providers. This distribution is intended for CPU-only environments.", "apis": [ { "api": "inference", "provider": "remote::cerebras" }, { "api": "inference", "provider": "remote::ollama" }, { "api": "inference", "provider": "remote::vllm" }, { "api": "inference", "provider": "remote::tgi" }, { "api": "inference", "provider": "remote::fireworks" }, { "api": "inference", "provider": "remote::together" }, { "api": "inference", "provider": "remote::bedrock" }, { "api": "inference", "provider": "remote::nvidia" }, { "api": "inference", "provider": "remote::openai" }, { "api": "inference", "provider": "remote::anthropic" }, { "api": "inference", "provider": "remote::gemini" }, { "api": "inference", "provider": "remote::vertexai" }, { "api": "inference", "provider": "remote::groq" }, { "api": "inference", "provider": "remote::sambanova" }, { "api": "inference", "provider": "remote::azure" }, { "api": "inference", "provider": "inline::sentence-transformers" }, { "api": "vector_io", "provider": "inline::faiss" }, { "api": "vector_io", "provider": "inline::sqlite-vec" }, { "api": "vector_io", "provider": "inline::milvus" }, { "api": "vector_io", "provider": "remote::chromadb" }, { "api": "vector_io", "provider": "remote::pgvector" }, { 
"api": "files", "provider": "inline::localfs" }, { "api": "safety", "provider": "inline::llama-guard" }, { "api": "safety", "provider": "inline::code-scanner" }, { "api": "agents", "provider": "inline::meta-reference" }, { "api": "telemetry", "provider": "inline::meta-reference" }, { "api": "post_training", "provider": "inline::torchtune-cpu" }, { "api": "eval", "provider": "inline::meta-reference" }, { "api": "datasetio", "provider": "remote::huggingface" }, { "api": "datasetio", "provider": "inline::localfs" }, { "api": "scoring", "provider": "inline::basic" }, { "api": "scoring", "provider": "inline::llm-as-judge" }, { "api": "scoring", "provider": "inline::braintrust" }, { "api": "tool_runtime", "provider": "remote::brave-search" }, { "api": "tool_runtime", "provider": "remote::tavily-search" }, { "api": "tool_runtime", "provider": "inline::rag-runtime" }, { "api": "tool_runtime", "provider": "remote::model-context-protocol" }, { "api": "batches", "provider": "inline::reference" } ], "pip_dependencies": [ "pandas", "opentelemetry-exporter-otlp-proto-http", "matplotlib", "opentelemetry-sdk", "sentence-transformers", "datasets", "pymilvus[milvus-lite]>=2.4.10", "codeshield", "scipy", "torchvision", "tree_sitter", "h11>=0.16.0", "aiohttp", "pymongo", "tqdm", "pythainlp", "pillow", "torch", "emoji", "grpcio>=1.67.1,<1.71.0", "fireworks-ai", "langdetect", "psycopg2-binary", "asyncpg", "redis", "together", "torchao>=0.12.0", "openai", "sentencepiece", "aiosqlite", "google-cloud-aiplatform", "faiss-cpu", "numpy", "sqlite-vec", "nltk", "scikit-learn", "mcp>=1.8.1", "transformers", "boto3", "huggingface_hub", "ollama", "autoevals", "sqlalchemy[asyncio]", "torchtune>=0.5.0", "chromadb-client", "pypdf", "requests", "anthropic", "chardet", "aiosqlite", "fastapi", "fire", "httpx", "uvicorn", "opentelemetry-sdk", "opentelemetry-exporter-otlp-proto-http" ] } ``` --------- Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-10-17 19:52:14 -07:00
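The JSON printed by `llama stack list-deps starter --format json` in #2974 can be post-processed with the standard library. The sketch below assumes only the keys visible in that output (`apis`, `pip_dependencies`); the `SAMPLE` string is an abridged copy for illustration and the function name `summarize` is hypothetical. It groups providers by API and removes repeated dependencies (the output above lists `aiosqlite` twice, for example) while keeping their order.

```python
import json
from collections import defaultdict

# Abridged from the list-deps output shown in the commit message.
SAMPLE = """
{
  "name": "starter",
  "apis": [
    {"api": "inference", "provider": "remote::ollama"},
    {"api": "inference", "provider": "remote::vllm"},
    {"api": "vector_io", "provider": "inline::faiss"}
  ],
  "pip_dependencies": ["aiosqlite", "faiss-cpu", "ollama", "aiosqlite"]
}
"""


def summarize(report_json: str) -> tuple[dict[str, list[str]], list[str]]:
    """Group providers by API and de-duplicate the dependency list."""
    report = json.loads(report_json)
    providers: dict[str, list[str]] = defaultdict(list)
    for entry in report["apis"]:
        providers[entry["api"]].append(entry["provider"])
    # dict.fromkeys keeps the first occurrence of each item, in order.
    deps = list(dict.fromkeys(report["pip_dependencies"]))
    return dict(providers), deps


if __name__ == "__main__":
    providers, deps = summarize(SAMPLE)
    for api, names in providers.items():
        print(f"{api}: {', '.join(names)}")
    print("dependencies:", " ".join(deps))
```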
IAN MILLER	007efa6eb5	refactor: replace default all-MiniLM-L6-v2 embedding model by nomic-embed-text-v1.5 in Llama Stack (#3183) # What does this PR do? The purpose of this PR is to replace Llama Stack's default embedding model with nomic-embed-text-v1.5. These are the key reasons why the Llama Stack community decided to switch from all-MiniLM-L6-v2 to nomic-embed-text-v1.5: 1. The training data for [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2#training-data) includes a lot of data sets with various licensing terms, so it is tricky to know when/whether it is appropriate to use this model for commercial applications. 2. The model is not particularly competitive on major benchmarks. For example, if you look at the [MTEB Leaderboard](https://huggingface.co/spaces/mteb/leaderboard) and click on Miscellaneous/BEIR to see English information retrieval accuracy, you see that the top of the leaderboard is dominated by enormous models but also that there are many, many models of relatively modest size with much higher Retrieval scores. If you want to look closely at the data, I recommend clicking "Download Table" because it is easier to browse that way. More discussion can be found [here](https://github.com/llamastack/llama-stack/issues/2418). Closes #2418 ## Test Plan 1. Run `./scripts/unit-tests.sh` 2. 
Integration tests via CI workflow --------- Signed-off-by: Sébastien Han <seb@redhat.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com> Co-authored-by: Sébastien Han <seb@redhat.com>	2025-10-14 10:44:20 -04:00
ehhuang	a3f5072776	chore!: remove --env from `llama stack run` (#3711) # What does this PR do? 
Users can simply set env vars at the beginning of the command: `FOO=BAR llama stack run ...` ## Test Plan Run TELEMETRY_SINKS=coneol uv run --with llama-stack llama stack build --distro=starter --image-type=venv --run	2025-10-07 20:58:15 -07:00
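The shell form `FOO=BAR llama stack run ...` from #3711 sets a variable for a single command only. When a server or test is launched from Python, the same effect can be had by passing an extended environment to the child process. This is a minimal sketch using only the standard library; `run_with_env` is a hypothetical helper, and a small Python child process stands in for the real `llama stack run` invocation.

```python
import os
import subprocess
import sys


def run_with_env(cmd: list[str], **overrides: str) -> subprocess.CompletedProcess:
    """Run cmd with extra environment variables, leaving our own env untouched."""
    env = {**os.environ, **overrides}
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)


if __name__ == "__main__":
    # Stand-in child process that just echoes the variable it received.
    child = [sys.executable, "-c", "import os; print(os.environ['FOO'])"]
    result = run_with_env(child, FOO="BAR")
    print(result.stdout.strip())  # prints "BAR"
```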
Alexey Rybak	6101c8e015	docs: fix broken links (#3540) # What does this PR do? - Fixes broken links and Docusaurus search Closes #3518 ## Test Plan The following should produce a clean build with no warnings and search enabled: ``` npm install npm run gen-api-docs all npm run build npm run serve ```	2025-09-24 14:16:31 -07:00
Alexey Rybak	8537ada11b	docs: MDX leftover fixes (#3536) # What does this PR do? - Fixes Docusaurus build errors ## Test Plan - `npm run build` compiles the build properly - Broken links expected and will be fixed in a follow-on PR	2025-09-24 14:14:32 -07:00
Alexey Rybak	c71ce8df61	docs: concepts and building_applications migration (#3534) # What does this PR do? - Migrates the remaining documentation sections to the new documentation format ## Test Plan - Partial migration	2025-09-24 14:05:30 -07:00

10 commits