llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-10-18 15:27:16 +00:00

Author	SHA1	Message	Date
Ashwin Bharambe	07fc8013eb	fix(tests): reduce some test noise (#3825 ) a bunch of logger.info()s are good for server code to help debug in production, but we don't want them killing our unit test output :) --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-10-16 09:52:16 -07:00
Sébastien Han	1136daf310	fix: replace python-jose with PyJWT for JWT handling (#3756 ) # What does this PR do? This commit migrates the authentication system from python-jose to PyJWT to eliminate the dependency on the archived rsa package. The migration includes: - Refactored OAuth2TokenAuthProvider to use PyJWT's PyJWKClient for clean JWKS handling - Removed manual JWKS fetching, caching and key extraction logic in favor of PyJWT's built-in functionality The new implementation is cleaner, more maintainable, and follows PyJWT best practices while maintaining full backward compatibility. ## Test Plan Unit tests. Auth CI. --------- Signed-off-by: Sébastien Han <seb@redhat.com>	2025-10-14 09:35:48 +02:00
Sébastien Han	7ee0ee7843	chore!: remove model mgmt from CLI for Hugging Face CLI (#3700 ) This change removes the `llama model` and `llama download` subcommands from the CLI, replacing them with recommendations to use the Hugging Face CLI instead. Rationale for this change: - The model management functionality was largely duplicating what Hugging Face CLI already provides, leading to unnecessary maintenance overhead (except the download source from Meta?) - Maintaining our own implementation required fixing bugs and keeping up with changes in model repositories and download mechanisms - The Hugging Face CLI is more mature, widely adopted, and better maintained - This allows us to focus on the core Llama Stack functionality rather than reimplementing model management tools Changes made: - Removed all model-related CLI commands and their implementations - Updated documentation to recommend using `huggingface-cli` for model downloads - Removed Meta-specific download logic and statements - Simplified the CLI to focus solely on stack management operations Users should now use: - `huggingface-cli download` for downloading models - `huggingface-cli scan-cache` for listing downloaded models This is a breaking change as it removes previously available CLI commands. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-10-09 16:50:33 -07:00
Ashwin Bharambe	a055a32ee4	fix(tests): remove chroma and qdrant from vector io unit tests (#3759 ) These vector databases are already thoroughly tested in integration tests. Unit tests now focus on sqlite_vec, faiss, and pgvector with mocked dependencies, removing the need for external service dependencies. ## Changes: - Deleted test_qdrant.py unit test file - Removed chroma/qdrant fixtures and parametrization from conftest.py - Fixed SqliteKVStoreConfig import to use correct location - Removed chromadb, qdrant-client, pymilvus, milvus-lite, and weaviate-client from unit test dependencies in pyproject.toml	2025-10-09 14:36:34 -07:00
Matthew Farrellee	d23ed26238	chore: turn OpenAIMixin into a pydantic.BaseModel (#3671 ) # What does this PR do? - implement get_api_key instead of relying on LiteLLMOpenAIMixin.get_api_key - remove use of LiteLLMOpenAIMixin - add default initialize/shutdown methods to OpenAIMixin - remove __init__s to allow proper pydantic construction - remove dead code from vllm adapter and associated / duplicate unit tests - update vllm adapter to use openaimixin for model registration - remove ModelRegistryHelper from fireworks & together adapters - remove Inference from nvidia adapter - complete type hints on embedding_model_metadata - allow extra fields on OpenAIMixin, for model_store, __provider_id__, etc - new recordings for ollama - enhance the list models error handling - update cerebras (remove cerebras-cloud-sdk) and anthropic (custom model listing) inference adapters - parametrized test_inference_client_caching - remove cerebras, databricks, fireworks, together from blanket mypy exclude - removed unnecessary litellm deps ## Test Plan ci	2025-10-06 11:33:19 -04:00
Francisco Arceo	7ec7e0c1ac	chore: Add weaviate client to unit group in pyproject.toml and uv.lock (#3675 ) # What does this PR do? `uv add "weaviate-client>=4.16.4" --group unit` ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-10-03 14:02:20 -07:00
Francisco Arceo	a20e8eac8c	feat: Add OpenAI Conversations API (#3429 ) # What does this PR do? Initial implementation for `Conversations` and `ConversationItems` using `AuthorizedSqlStore` with endpoints to: - CREATE - UPDATE - GET/RETRIEVE/LIST - DELETE Set `level=LLAMA_STACK_API_V1`. NOTE: This does not currently incorporate changes for Responses, that'll be done in a subsequent PR. Closes https://github.com/llamastack/llama-stack/issues/3235 ## Test Plan - Unit tests - Integration tests Also comparison of [OpenAPI spec for OpenAI API](https://github.com/openai/openai-openapi/tree/manual_spec) ```bash oasdiff breaking --fail-on ERR docs/static/llama-stack-spec.yaml https://raw.githubusercontent.com/openai/openai-openapi/refs/heads/manual_spec/openapi.yaml --strip-prefix-base "/v1/openai/v1" \ --match-path '(^/v1/openai/v1/conversations.\|^/conversations.)' ``` Note I still have some uncertainty about this, I borrowed this info from @cdoern on https://github.com/llamastack/llama-stack/pull/3514 but need to spend more time to confirm it's working, at the moment it suggests it does. UPDATE on `oasdiff`, I investigated the OpenAI spec further and it looks like currently the spec does not list Conversations, so that analysis is useless. Noting for future reference. --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-10-03 08:47:18 -07:00
Alexey Rybak	382eb25398	docs: fix more broken links (#3649 ) # What does this PR do? * Fixes some more documentation links ## Test Plan * Manual testing	2025-10-02 10:43:49 +02:00
github-actions[bot]	4487b88ffe	build: Bump version to 0.2.23	2025-09-26 21:11:51 +00:00
Matthew Farrellee	926c3ada41	chore: prune mypy exclude list (#3561 ) # What does this PR do? prune the mypy exclude list, build a stronger foundation for quality code ## Test Plan ci	2025-09-26 11:44:43 -04:00
ehhuang	48a551ecbc	chore(perf): run guidellm benchmarks (#3421 ) Some checks failed Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 3s Details Unit Tests / unit-tests (3.13) (push) Failing after 3s Details Update ReadTheDocs / update-readthedocs (push) Failing after 3s Details Test Llama Stack Build / build (push) Failing after 3s Details Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s Details SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 1s Details SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s Details Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped Details Test Llama Stack Build / generate-matrix (push) Successful in 3s Details Python Package Build Test / build (3.12) (push) Failing after 1s Details Python Package Build Test / build (3.13) (push) Failing after 2s Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 3s Details Test Llama Stack Build / build-single-provider (push) Failing after 3s Details Vector IO Integration Tests / test-matrix (push) Failing after 5s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 4s Details API Conformance Tests / check-schema-compatibility (push) Successful in 8s Details Test External API and Providers / test-external (venv) (push) Failing after 3s Details Unit Tests / unit-tests (3.12) (push) Failing after 4s Details UI Tests / ui-tests (22) (push) Successful in 40s Details Pre-commit / pre-commit (push) Successful in 1m9s Details # What does this PR do? - Mostly AI-generated scripts to run guidellm (https://github.com/vllm-project/guidellm) benchmarks on k8s setup - Stack is using image built from main on 9/11 ## Test Plan See updated README.md	2025-09-24 10:18:33 -07:00
github-actions[bot]	ececc323d3	build: Bump version to 0.2.22 Some checks failed Pre-commit / pre-commit (push) Successful in 1m14s Details SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s Details SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s Details Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 2s Details Test Llama Stack Build / generate-matrix (push) Successful in 2s Details Test Llama Stack Build / build-single-provider (push) Failing after 3s Details Python Package Build Test / build (3.13) (push) Failing after 1s Details Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 3s Details Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 3s Details Python Package Build Test / build (3.12) (push) Failing after 3s Details UI Tests / ui-tests (22) (push) Successful in 31s Details Vector IO Integration Tests / test-matrix (push) Failing after 4s Details API Conformance Tests / check-schema-compatibility (push) Successful in 7s Details Unit Tests / unit-tests (3.13) (push) Failing after 3s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 3s Details Test External API and Providers / test-external (venv) (push) Failing after 3s Details Update ReadTheDocs / update-readthedocs (push) Failing after 3s Details Unit Tests / unit-tests (3.12) (push) Failing after 4s Details Test Llama Stack Build / build (push) Failing after 4s Details	2025-09-16 19:44:03 +00:00
slekkala1	3defdf7d3a	fix: docker failing to start container[pydantic] (#3460 ) # What does this PR do? Pinning to latest pydantic version 2.11.9 as sometime we are picking older version and failing to start container in github actions : `1775026312` Closes https://github.com/llamastack/llama-stack/issues/3461 ## Test Plan Tested locally with the following commands to start a container Build container `llama stack build --distro starter --image-type container` start container `docker run -d -p 8321:8321 --name llama-stack-test distribution-starter:0.2.21` check health http://localhost:8321/v1/health Couldnt repro with older version(`2.8.2`), but `2.11.9` pydantic is able to start the container https://pypi.org/project/pydantic/#history , 2.11.9 is the latest version	2025-09-16 11:33:43 -07:00
Matthew Farrellee	8cf2128b40	chore(tests): always show slowest tests (#3431 ) # What does this PR do? help developers identify slow tests by always passing --duration to pytest ## Test Plan n/a	2025-09-13 09:28:04 -07:00
Sébastien Han	8e05c68d15	chore: remove openai dependency from providers (#3398 ) # What does this PR do? The openai package is already a dependency of the llama-stack project itself, so let's the project dictate which openai version we need and avoid potential breakage with unsatisfiable dependency resolution. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-09-11 10:19:59 +02:00
github-actions[bot]	28696c3f30	build: Bump version to 0.2.21 Some checks failed Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 3s Details Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 4s Details Test Llama Stack Build / generate-matrix (push) Successful in 4s Details SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 7s Details API Conformance Tests / check-schema-compatibility (push) Successful in 8s Details Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped Details Python Package Build Test / build (3.12) (push) Failing after 2s Details Python Package Build Test / build (3.13) (push) Failing after 2s Details SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 8s Details Test Llama Stack Build / build-single-provider (push) Failing after 5s Details Vector IO Integration Tests / test-matrix (push) Failing after 7s Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 6s Details Unit Tests / unit-tests (3.12) (push) Failing after 3s Details Update ReadTheDocs / update-readthedocs (push) Failing after 2s Details Unit Tests / unit-tests (3.13) (push) Failing after 3s Details Test Llama Stack Build / build (push) Failing after 4s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 41s Details UI Tests / ui-tests (22) (push) Successful in 37s Details Test External API and Providers / test-external (venv) (push) Failing after 41s Details Pre-commit / pre-commit (push) Successful in 2m0s Details	2025-09-08 22:30:03 +00:00
Swapna Lekkala	09141361fb	fix: use dataset version 4.0.0 or above	2025-09-08 13:22:43 -07:00
dependabot[bot]	5c873d53db	chore(python-deps): bump pymilvus from 2.6.0 to 2.6.1 (#3285 ) Some checks failed Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s Details Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 1s Details SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 2s Details SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 1s Details Vector IO Integration Tests / test-matrix (push) Failing after 0s Details Pre-commit / pre-commit (push) Failing after 1s Details Test Llama Stack Build / generate-matrix (push) Failing after 1s Details Test Llama Stack Build / build-single-provider (push) Failing after 1s Details Test Llama Stack Build / build (push) Has been skipped Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 1s Details Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 1s Details Python Package Build Test / build (3.12) (push) Failing after 0s Details Python Package Build Test / build (3.13) (push) Failing after 0s Details Test External API and Providers / test-external (venv) (push) Failing after 1s Details Unit Tests / unit-tests (3.13) (push) Failing after 0s Details Update ReadTheDocs / update-readthedocs (push) Failing after 0s Details UI Tests / ui-tests (22) (push) Failing after 1s Details Unit Tests / unit-tests (3.12) (push) Failing after 1s Details Bumps [pymilvus](https://github.com/milvus-io/pymilvus) from 2.6.0 to 2.6.1. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/milvus-io/pymilvus/releases">pymilvus's releases</a>.</em></p> <blockquote> <h2>PyMilvus v2.6.1 Release Notes</h2> <h2>What's Changed</h2> <ul> <li>Avoid describe_collection when query by ids by <a href="https://github.com/yhmo"><code>@yhmo</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2930">milvus-io/pymilvus#2930</a></li> <li>bulkImport add objectUrls/token paramster & add example use by <a href="https://github.com/lentitude2tk"><code>@lentitude2tk</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2934">milvus-io/pymilvus#2934</a></li> <li>support stageManager & stageFileManager by <a href="https://github.com/lentitude2tk"><code>@lentitude2tk</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2935">milvus-io/pymilvus#2935</a></li> <li>fix: Fix the existing version fmt by <a href="https://github.com/XuanYang-cn"><code>@XuanYang-cn</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2960">milvus-io/pymilvus#2960</a></li> <li>enhance: Add unixmsec in every RPC call by <a href="https://github.com/XuanYang-cn"><code>@XuanYang-cn</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2961">milvus-io/pymilvus#2961</a></li> <li>enhance: Multiple cherry picks from master branch by <a href="https://github.com/XuanYang-cn"><code>@XuanYang-cn</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2962">milvus-io/pymilvus#2962</a></li> <li>fix: Passing unknown req.is_refresh to wait by <a href="https://github.com/XuanYang-cn"><code>@XuanYang-cn</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2964">milvus-io/pymilvus#2964</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/milvus-io/pymilvus/compare/v2.6.0...v2.6.1">https://github.com/milvus-io/pymilvus/compare/v2.6.0...v2.6.1</a></p> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="`0237c9f1bd`"><code>0237c9f</code></a> fix: [2.6]Passing unknown req.is_refresh to wait (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2964">#2964</a>)</li> <li><a href="`a083622d8f`"><code>a083622</code></a> enhance: Multiple cherry picks from master branch (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2962">#2962</a>)</li> <li><a href="`87e3c5acc1`"><code>87e3c5a</code></a> enhance: Add unixmsec in every RPC call (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2961">#2961</a>)</li> <li><a href="`98077a27c9`"><code>98077a2</code></a> fix: [2.6]Fix the existing version fmt (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2960">#2960</a>)</li> <li><a href="`80e2e09323`"><code>80e2e09</code></a> feat: Add partial update support for upsert operations (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2938">#2938</a>) (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2940">#2940</a>)</li> <li><a href="`0210ee92e6`"><code>0210ee9</code></a> [cherry-pick] support stageManager & stageFileManager (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2935">#2935</a>)</li> <li><a href="`00fb8e6f23`"><code>00fb8e6</code></a> [cherry-pick] bulkImport add objectUrls/token paramster & add example use (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2">#2</a>...</li> <li><a href="`442ef15806`"><code>442ef15</code></a> Avoid describe_collection when query by ids (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2930">#2930</a>)</li> <li><a href="`e704dd29b5`"><code>e704dd2</code></a> fix: Correct github actions on branch 2.6 (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2926">#2926</a>)</li> <li>See full diff in <a href="https://github.com/milvus-io/pymilvus/compare/v2.6.0...v2.6.1">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=pymilvus&package-manager=uv&previous-version=2.6.0&new-version=2.6.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-09-01 20:24:22 -04:00
IAN MILLER	4a59961a6c	refactor: remove lama-api-client from pyproject.toml (#3299 ) Some checks failed Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 0s Details Vector IO Integration Tests / test-matrix (push) Failing after 1s Details Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 1s Details Pre-commit / pre-commit (push) Failing after 1s Details Test Llama Stack Build / generate-matrix (push) Failing after 0s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 1s Details Test Llama Stack Build / build (push) Has been skipped Details Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped Details Python Package Build Test / build (3.12) (push) Failing after 1s Details Python Package Build Test / build (3.13) (push) Failing after 1s Details Unit Tests / unit-tests (3.12) (push) Failing after 1s Details Test External API and Providers / test-external (venv) (push) Failing after 1s Details Unit Tests / unit-tests (3.13) (push) Failing after 1s Details Update ReadTheDocs / update-readthedocs (push) Failing after 1s Details UI Tests / ui-tests (22) (push) Failing after 2s Details Test Llama Stack Build / build-custom-container-distribution (push) Has started running Details Test Llama Stack Build / build-single-provider (push) Has started running Details SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 7s Details SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 8s Details # What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> This PR is eliminating `lama-api-client` dependency at `pyproject.toml` because it's not used in Llama Stack codebase <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> ` ./scripts/unit-tests.sh`	2025-09-01 16:50:50 +02:00
dependabot[bot]	9625ac6d02	chore(python-deps): bump locust from 2.39.0 to 2.39.1 (#3284 ) Bumps [locust](https://github.com/locustio/locust) from 2.39.0 to 2.39.1. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/locustio/locust/releases">locust's releases</a>.</em></p> <blockquote> <h2>2.39.1</h2> <h2>What's Changed</h2> <ul> <li>Avoid broken gevent version for now by <a href="https://github.com/cyberw"><code>@cyberw</code></a> in <a href="https://redirect.github.com/locustio/locust/pull/3196">locustio/locust#3196</a></li> </ul> <h2>New Contributors</h2> <ul> <li><a href="https://github.com/JumboBear"><code>@JumboBear</code></a> made their first contribution in <a href="https://redirect.github.com/locustio/locust/pull/3195">locustio/locust#3195</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/locustio/locust/compare/2.39.0...2.39.1">https://github.com/locustio/locust/compare/2.39.0...2.39.1</a></p> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/locustio/locust/blob/master/CHANGELOG.md">locust's changelog</a>.</em></p> <blockquote> <h1>Detailed changelog</h1> <p>The most important changes can also be found in <a href="https://docs.locust.io/en/latest/changelog.html">the documentation</a>.</p> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="`934c5c33e4`"><code>934c5c3</code></a> changelog</li> <li><a href="`9350084ec0`"><code>9350084</code></a> disable macos build for now</li> <li><a href="`705e2f658b`"><code>705e2f6</code></a> Disable another unit test on macos because of annoying behavior on GH (really...</li> <li><a href="`d888b9db2b`"><code>d888b9d</code></a> Disable another unit test on macos because of annoying behavior on GH</li> <li><a href="`45bc4d84fd`"><code>45bc4d8</code></a> Disable annoying test case on macos for now. Only has issues on GH. <a href="https://github.com/amadeupp"><code>@amadeupp</code></a>...</li> <li><a href="`9d7710a2da`"><code>9d7710a</code></a> unit tests: give extra time for testing on macOS</li> <li><a href="`fcbc740e04`"><code>fcbc740</code></a> Avoid broken gevent version for now (<a href="https://redirect.github.com/locustio/locust/issues/3196">#3196</a>)</li> <li><a href="`cd1f600d44`"><code>cd1f600</code></a> mypy</li> <li><a href="`0cf52dc990`"><code>0cf52dc</code></a> Autogen changelog for 2.39.0</li> <li><a href="`094395e024`"><code>094395e</code></a> Merge pull request <a href="https://redirect.github.com/locustio/locust/issues/3195">#3195</a> from JumboBear/pyproject</li> <li>Additional commits viewable in <a href="https://github.com/locustio/locust/compare/2.39.0...2.39.1">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=locust&package-manager=uv&previous-version=2.39.0&new-version=2.39.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-09-01 16:49:09 +02:00
Matthew Farrellee	3370d8e557	feat(files, s3, expiration): add expires_after support to S3 files provider (#3283 )	2025-08-29 16:17:24 -07:00
github-actions[bot]	78a78264a7	build: Bump version to 0.2.20	2025-08-29 21:17:47 +00:00
IAN MILLER	3130ca0a78	feat: implement keyword, vector and hybrid search inside vector stores for PGVector provider (#3064 ) # What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> The purpose of this task is to implement `openai/v1/vector_stores/{vector_store_id}/search` for PGVector provider. It involves implementing vector similarity search, keyword search and hybrid search for `PGVectorIndex`. <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> Closes #3006 ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> Run unit tests: ` ./scripts/unit-tests.sh ` Run integration tests for openai vector stores: 1. Export env vars: ``` export ENABLE_PGVECTOR=true export PGVECTOR_HOST=localhost export PGVECTOR_PORT=5432 export PGVECTOR_DB=llamastack export PGVECTOR_USER=llamastack export PGVECTOR_PASSWORD=llamastack ``` 2. Create DB: ``` psql -h localhost -U postgres -c "CREATE ROLE llamastack LOGIN PASSWORD 'llamastack';" psql -h localhost -U postgres -c "CREATE DATABASE llamastack OWNER llamastack;" psql -h localhost -U llamastack -d llamastack -c "CREATE EXTENSION IF NOT EXISTS vector;" ``` 3. Install sentence-transformers: ` uv pip install sentence-transformers ` 4. Run: ``` uv run --group test pytest -s -v --stack-config="inference=inline::sentence-transformers,vector_io=remote::pgvector" --embedding-model sentence-transformers/all-MiniLM-L6-v2 tests/integration/vector_io/test_openai_vector_stores.py ``` Inspect PGVector vector stores (optional): ``` psql llamastack psql (14.18 (Homebrew)) Type "help" for help. llamastack=# \z Access privileges Schema \| Name \| Type \| Access privileges \| Column privileges \| Policies --------+------------------------------------------------------+-------+-------------------+-------------------+---------- public \| llamastack_kvstore \| table \| \| \| public \| metadata_store \| table \| \| \| public \| vector_store_pgvector_main \| table \| \| \| public \| vector_store_vs_1dfbc061_1f4d_4497_9165_ecba2622ba3a \| table \| \| \| public \| vector_store_vs_2085a9fb_1822_4e42_a277_c6a685843fa7 \| table \| \| \| public \| vector_store_vs_2b3dae46_38be_462a_afd6_37ee5fe661b1 \| table \| \| \| public \| vector_store_vs_2f438de6_f606_4561_9d50_ef9160eb9060 \| table \| \| \| public \| vector_store_vs_3eeca564_2580_4c68_bfea_83dc57e31214 \| table \| \| \| public \| vector_store_vs_53942163_05f3_40e0_83c0_0997c64613da \| table \| \| \| public \| vector_store_vs_545bac75_8950_4ff1_b084_e221192d4709 \| table \| \| \| public \| vector_store_vs_688a37d8_35b2_4298_a035_bfedf5b21f86 \| table \| \| \| public \| vector_store_vs_70624d9a_f6ac_4c42_b8ab_0649473c6600 \| table \| \| \| public \| vector_store_vs_73fc1dd2_e942_4972_afb1_1e177b591ac2 \| table \| \| \| public \| vector_store_vs_9d464949_d51f_49db_9f87_e033b8b84ac9 \| table \| \| \| public \| vector_store_vs_a1e4d724_5162_4d6d_a6c0_bdafaf6b76ec \| table \| \| \| public \| vector_store_vs_a328fb1b_1a21_480f_9624_ffaa60fb6672 \| table \| \| \| public \| vector_store_vs_a8981bf0_2e66_4445_a267_a8fff442db53 \| table \| \| \| public \| vector_store_vs_ccd4b6a4_1efd_4984_ad03_e7ff8eadb296 \| table \| \| \| public \| vector_store_vs_cd6420a4_a1fc_4cec_948c_1413a26281c9 \| table \| \| \| public \| vector_store_vs_cd709284_e5cf_4a88_aba5_dc76a35364bd \| table \| \| \| public \| vector_store_vs_d7a4548e_fbc1_44d7_b2ec_b664417f2a46 \| table \| \| \| public \| vector_store_vs_e7f73231_414c_4523_886c_d1174eee836e \| table \| \| \| public \| vector_store_vs_ffd53588_819f_47e8_bb9d_954af6f7833d \| table \| \| \| (23 rows) llamastack=# ``` Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>	2025-08-29 16:30:12 +02:00
github-actions[bot]	963305c84d	build: Bump version to 0.2.19	2025-08-26 22:02:47 +00:00
Matthew Farrellee	f520e244d9	feat: Add S3 Files Provider (#3202 ) Implements a complete S3-based file storage provider for Llama Stack with: Core Implementation: - S3FilesImpl class with full OpenAI Files API compatibility - Support for file upload, download, listing, deletion operations - Sqlite-based metadata storage for fast queries and API compliance - Configurable S3 endpoints (AWS, MinIO, LocalStack support) Key Features: - Automatic S3 bucket creation and management - Metadata persistence - Proper error handling for S3 connectivity and permissions Dependencies: - Adds boto3 for AWS S3 integration - Adds moto[s3] for testing infrastructure Testing: Unit: `./scripts/unit-tests.sh tests/unit/files tests/unit/providers/files` Integration: Start MinIO: `podman run --rm -it -p 9000:9000 minio/minio server /data` Start stack w/ S3 provider: `S3_ENDPOINT_URL=http://localhost:9000 AWS_ACCESS_KEY_ID=minioadmin AWS_SECRET_ACCESS_KEY=minioadmin S3_BUCKET_NAME=llama-stack-files uv run llama stack build --image-type venv --providers files=remote::s3 --run` Run integration tests: `./scripts/integration-tests.sh --stack-config http://localhost:8321 --provider ollama --test-subdirs files`	2025-08-22 10:38:59 -04:00
github-actions[bot]	7f0b2a8764	build: Bump version to 0.2.18	2025-08-19 22:38:23 +00:00
Varsha	8cc4925f7d	chore: Enable keyword search for Milvus inline (#3073 ) # What does this PR do? With https://github.com/milvus-io/milvus-lite/pull/294 - Milvus Lite supports keyword search using BM25. While introducing keyword search we had explicitly disabled it for inline milvus. This PR removes the need for the check, and enables `inline::milvus` for tests. <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan Run llama stack with `inline::milvus` enabled: ``` pytest tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes --stack-config=http://localhost:8321 --embedding-model=all-MiniLM-L6-v2 -v ``` ``` INFO 2025-08-07 17:06:20,932 tests.integration.conftest:64 tests: Setting DISABLE_CODE_SANDBOX=1 for macOS =========================================================================================== test session starts ============================================================================================ platform darwin -- Python 3.12.11, pytest-7.4.4, pluggy-1.5.0 -- /Users/vnarsing/miniconda3/envs/stack-client/bin/python cachedir: .pytest_cache metadata: {'Python': '3.12.11', 'Platform': 'macOS-14.7.6-arm64-arm-64bit', 'Packages': {'pytest': '7.4.4', 'pluggy': '1.5.0'}, 'Plugins': {'asyncio': '0.23.8', 'cov': '6.0.0', 'timeout': '2.2.0', 'socket': '0.7.0', 'html': '3.1.1', 'langsmith': '0.3.39', 'anyio': '4.8.0', 'metadata': '3.0.0'}} rootdir: /Users/vnarsing/go/src/github/meta-llama/llama-stack configfile: pyproject.toml plugins: asyncio-0.23.8, cov-6.0.0, timeout-2.2.0, socket-0.7.0, html-3.1.1, langsmith-0.3.39, anyio-4.8.0, metadata-3.0.0 asyncio: mode=Mode.AUTO collected 3 items tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-vector] PASSED [ 33%] tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-keyword] PASSED [ 66%] tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-hybrid] PASSED [100%] ============================================================================================ 3 passed in 4.75s ============================================================================================= ``` Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com> Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>	2025-08-19 13:01:23 -04:00
Ashwin Bharambe	27d6becfd0	fix(misc): pin openai dependency to < 1.100.0 (#3192 ) This OpenAI client release `0843a11164` ends up breaking litellm `169a17400f/litellm/types/llms/openai.py (L40)` Update the dependency pin. Also make the imports a bit more defensive anyhow if something else during `llama stack build` ends up moving openai to a previous version. ## Test Plan Run pre-release script integration tests.	2025-08-18 12:20:50 -07:00
ehhuang	d6ae54723d	chore: setup for performance benchmarking (#3096 ) # What does this PR do? 1. Added a simple mock openai-compat server that serves chat/completion 2. Add a benchmark server in EKS that includes mock inference server 3. Add locust (https://locust.io/) file for load testing ## Test Plan bash apply.sh kubectl port-forward service/locust-web-ui 8089:8089 Go to localhost:8089 to start a load test <img width="1392" height="334" alt="image" src="https://github.com/user-attachments/assets/d6aa3deb-583a-42ed-889b-751262b8e91c" /> <img width="1362" height="881" alt="image" src="https://github.com/user-attachments/assets/6a28b9b4-05e6-44e2-b504-07e60c12d35e" />	2025-08-13 10:58:22 -07:00
Matthew Farrellee	b70e2f1f09	fix(dep): update to openai >= 1.99.6 and use new Function location (#3087 ) # What does this PR do? closes #3072 ## Test Plan ci	2025-08-12 08:40:32 -07:00
Mustafa Elbehery	b5b5f5b9ae	chore: add `mypy` prompt guard (#2678 ) # What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> This PR adds static type coverage to `llama-stack` Part of https://github.com/meta-llama/llama-stack/issues/2647 <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>	2025-08-11 08:40:40 -07:00
Mohamed Rebai	7eff1bb3ec	ci(pre-commit): enforce presence of 'upload-time' field in uv.lock (#2920 ) # What does this PR do? This PR adds a minimum version `0.7.0` to the project. The diff issue happens because an `upload-time` field in the `uv.lock` file did not exist in older uv versions (pre `0.6.15`). This effectively prevents large diffs in PRs from devs that use older versions of uv. Closes #2887 --------- Co-authored-by: Charlie Doern <charlie@doern.me>	2025-08-06 07:46:59 -07:00
github-actions[bot]	e565b91182	build: Bump version to 0.2.17 Some checks failed Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped Details Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 8s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 8s Details Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 7s Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 7s Details SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 13s Details Test Llama Stack Build / generate-matrix (push) Successful in 8s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 9s Details Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 9s Details Python Package Build Test / build (3.12) (push) Failing after 8s Details Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 14s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 13s Details Test Llama Stack Build / build-single-provider (push) Failing after 5s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 7s Details Test External API and Providers / test-external (venv) (push) Failing after 7s Details SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 17s Details Unit Tests / unit-tests (3.12) (push) Failing after 7s Details Python Package Build Test / build (3.13) (push) Failing after 9s Details Update ReadTheDocs / update-readthedocs (push) Failing after 7s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 15s Details Unit Tests / unit-tests (3.13) (push) Failing after 11s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 13s Details Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 12s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 11s Details Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 11s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 14s Details Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 8s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 14s Details Test Llama Stack Build / build (push) Failing after 12s Details Pre-commit / pre-commit (push) Successful in 1m38s Details	2025-08-05 01:43:30 +00:00
Ashwin Bharambe	cc87995e2b	chore: rename templates to distributions (#3035 ) As the title says. Distributions is in, Templates is out. `llama stack build --template` --> `llama stack build --distro`. For backward compatibility, the previous option is kept but results in a warning. Updated `server.py` to remove the "config_or_template" backward compatibility since it has been a couple releases since that change.	2025-08-04 11:34:17 -07:00
Matthew Farrellee	4411e6e362	chore(ci): remove reportlab dep (#3033 ) # What does this PR do? remove reportlab dep. change dynamic pdf generation into a pre-computed pdf. ## Test Plan ci	2025-08-04 09:36:13 -07:00
Francisco Arceo	33cca26154	chore: Enabling Integration tests for Weaviate (#2882 ) # What does this PR do? This PR (1) enables the files API for Weaviate and (2) enables integration tests for Weaviate, which adds a docker container to the github action. This PR also handles a couple of edge cases for in creating the collection and ensuring the tests all pass. ## Test Plan CI enabled --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-07-31 20:29:50 -04:00
Ashwin Bharambe	2665f00102	chore(rename): move llama_stack.distribution to llama_stack.core (#2975 ) We would like to rename the term `template` to `distribution`. To prepare for that, this is a precursor. cc @leseb	2025-07-30 23:30:53 -07:00
IAN MILLER	25d3dfa30f	fix: fix No module named 'ollama' in test_inference_recordings.py (#2967 ) # What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> This PR fixes the following error in unit test that was running on up to date main branch: ``` FAILED tests/unit/distribution/test_inference_recordings.py::TestInferenceRecording::test_recording_mode - ModuleNotFoundError: No module named 'ollama' FAILED tests/unit/distribution/test_inference_recordings.py::TestInferenceRecording::test_replay_mode - ModuleNotFoundError: No module named 'ollama' FAILED tests/unit/distribution/test_inference_recordings.py::TestInferenceRecording::test_replay_missing_recording - ModuleNotFoundError: No module named 'ollama' FAILED tests/unit/distribution/test_inference_recordings.py::TestInferenceRecording::test_embeddings_recording - ModuleNotFoundError: No module named 'ollama' =============================== 4 failed, 499 passed, 198 warnings in 34.50s ================================ ``` <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> Run `./scripts/unit-tests.sh`	2025-07-30 16:33:33 -04:00
Nathan Weinberg	870a37ff4b	feat: add base64 encoded PDF support for OpenAI Chat Completions (#2881 ) Some checks failed Coverage Badge / unit-tests (push) Failing after 1s Details Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s Details Integration Tests / discover-tests (push) Successful in 3s Details Test Llama Stack Build / generate-matrix (push) Successful in 6s Details Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 12s Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 7s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 13s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 9s Details Unit Tests / unit-tests (3.12) (push) Failing after 8s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 14s Details Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Failing after 10s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 13s Details Unit Tests / unit-tests (3.13) (push) Failing after 10s Details Test Llama Stack Build / build-single-provider (push) Failing after 15s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 14s Details Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 17s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 19s Details Test External API and Providers / test-external (venv) (push) Failing after 16s Details Test Llama Stack Build / build (push) Failing after 9s Details Python Package Build Test / build (3.12) (push) Failing after 23s Details Update ReadTheDocs / update-readthedocs (push) Failing after 21s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 27s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 29s Details SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 31s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 58s Details Python Package Build Test / build (3.13) (push) Failing after 54s Details Integration Tests / test-matrix (push) Failing after 56s Details SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 1m4s Details Pre-commit / pre-commit (push) Successful in 2m15s Details # What does this PR do? OpenAI Chat Completions supports passing a base64 encoded PDF file to a model, but Llama Stack currently does not allow for this behavior. This PR extends our implementation of the OpenAI API spec to change that. Closes #2129 ## Test Plan A new functional test has been added to test the validity of such a request Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-07-29 06:23:41 -04:00
github-actions[bot]	cf8722079c	build: Bump version to 0.2.16 Some checks failed Coverage Badge / unit-tests (push) Failing after 1s Details Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 3s Details Integration Tests / discover-tests (push) Successful in 8s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 8s Details SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 12s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 10s Details Python Package Build Test / build (3.12) (push) Failing after 1s Details Test Llama Stack Build / generate-matrix (push) Successful in 6s Details Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 15s Details Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 11s Details Unit Tests / unit-tests (3.12) (push) Failing after 4s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 7s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 16s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 14s Details SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 20s Details Python Package Build Test / build (3.13) (push) Failing after 8s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 13s Details Test External API and Providers / test-external (venv) (push) Failing after 8s Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 11s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 14s Details Test Llama Stack Build / build (push) Failing after 7s Details Update ReadTheDocs / update-readthedocs (push) Failing after 9s Details Unit Tests / unit-tests (3.13) (push) Failing after 9s Details Integration Tests / test-matrix (push) Failing after 8s Details Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Failing after 12s Details Test Llama Stack Build / build-single-provider (push) Failing after 35s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 42s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 44s Details Pre-commit / pre-commit (push) Successful in 1m23s Details	2025-07-28 23:13:50 +00:00
Ashwin Bharambe	9583f468f8	feat(starter)!: simplify starter distro; litellm model registry changes (#2916 )	2025-07-25 15:02:04 -07:00
Sarthak Deshpande	cd8715d327	chore: Added openai compatible vector io endpoints for chromadb (#2489 ) Some checks failed Integration Tests / discover-tests (push) Successful in 3s Details Coverage Badge / unit-tests (push) Failing after 6s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 4s Details Test Llama Stack Build / generate-matrix (push) Successful in 3s Details Python Package Build Test / build (3.13) (push) Failing after 2s Details Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 10s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 11s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 16s Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 12s Details Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 16s Details Python Package Build Test / build (3.12) (push) Failing after 12s Details Test External Providers / test-external-providers (venv) (push) Failing after 12s Details Update ReadTheDocs / update-readthedocs (push) Failing after 10s Details Test Llama Stack Build / build-single-provider (push) Failing after 15s Details SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 23s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 20s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 21s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 20s Details Unit Tests / unit-tests (3.13) (push) Failing after 14s Details Test Llama Stack Build / build (push) Failing after 9s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 18s Details Unit Tests / unit-tests (3.12) (push) Failing after 14s Details Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 19s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 18s Details SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 51s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 49s Details Integration Tests / test-matrix (push) Failing after 53s Details Pre-commit / pre-commit (push) Successful in 1m42s Details # What does this PR do? This PR implements the openai compatible endpoints for chromadb Closes #2462 ## Test Plan Ran ollama llama stack server and ran the command `pytest -sv --stack-config=http://localhost:8321 tests/integration/vector_io/test_openai_vector_stores.py --embedding-model all-MiniLM-L6-v2` 8 failed, 27 passed, 8 skipped, 1 xfailed The failed ones are regarding files api --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> Co-authored-by: sarthakdeshpande <sarthak.deshpande@engati.com> Co-authored-by: Francisco Javier Arceo <farceo@redhat.com> Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>	2025-07-23 13:51:58 -07:00
Ashwin Bharambe	ade075152e	chore: kill inline::vllm (#2824 ) Inline _inference_ providers haven't proved to be very useful -- they are rarely used. And for good reason -- it is almost never a good idea to include a complex (distributed) inference engine bundled into a distributed stateful front-end server serving many other things. Responsibility should be split properly. See Discord discussion: `1395849853`	2025-07-18 15:52:18 -07:00
Christian Zaccaria	e45543f7f3	test: Measure and track code coverage (#2636 ) # What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> - Added coverage badge to README. - [See my fork](https://github.com/ChristianZaccaria/llama-stack) - Added a GitHub Actions workflow that runs the tests and updates the coverage badge. - [See run](`4574811323`) - Documented steps in `testing.md` for running the tests locally, and viewing the `html` report. - Excluded non-essential files from coverage reporting to provide a more accurate measurement. Automatically created PR to update coverage badge: https://github.com/ChristianZaccaria/llama-stack/pull/9 # Note for reviewers 1. Currently the coverage report shows a 45% coverage. Wondering if there are other files or directories that should also be excluded from the report to increase the percentage. The directories with the least test coverage are `llama_stack/cli`, `llama_stack/models`, and `llama_stack/ui`. - Should we exclude these? 2. [Required] The `GITHUB_TOKEN` should have write permissions to open a PR to update the coverage badge. # GitHub Issue <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> Closes #2355 ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> The `testing.md` file describes how to run the unit tests locally.	2025-07-18 18:08:36 +02:00
Mustafa Elbehery	b78b8e1486	chore: add `mypy` inference parallel utils (#2670 ) # What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> This PR adds static type coverage to `llama-stack` Part of https://github.com/meta-llama/llama-stack/issues/2647 <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>	2025-07-18 12:01:10 +02:00
Mustafa Elbehery	ca7edcd6a4	chore(api): add `mypy` coverage to `chat_format` (#2654 ) # What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> This PR adds static type coverage to `llama-stack` Part of https://github.com/meta-llama/llama-stack/issues/2647 <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>	2025-07-18 11:56:53 +02:00
IAN MILLER	b57db11bed	feat: create dynamic model registration for OpenAI and Llama compat remote inference providers (#2745 ) Some checks failed Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 5s Details Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 6s Details Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 4s Details Python Package Build Test / build (3.13) (push) Failing after 2s Details Test Llama Stack Build / generate-matrix (push) Successful in 6s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 7s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 9s Details Update ReadTheDocs / update-readthedocs (push) Failing after 3s Details Test Llama Stack Build / build-single-provider (push) Failing after 7s Details Integration Tests / discover-tests (push) Successful in 13s Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 13s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 12s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 17s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 15s Details Integration Tests / test-matrix (push) Failing after 5s Details Unit Tests / unit-tests (3.12) (push) Failing after 12s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 19s Details Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 19s Details SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 22s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 17s Details Test External Providers / test-external-providers (venv) (push) Failing after 17s Details Test Llama Stack Build / build (push) Failing after 14s Details SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 35s Details Python Package Build Test / build (3.12) (push) Failing after 51s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 57s Details Unit Tests / unit-tests (3.13) (push) Failing after 53s Details Pre-commit / pre-commit (push) Successful in 1m42s Details # What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> The purpose of this task is to create a solution that can automatically detect when new models are added, deprecated, or removed by OpenAI and Llama API providers, and automatically update the list of supported models in LLamaStack. This feature is vitally important in order to avoid missing new models and editing the entries manually hence I created automation allowing users to dynamically register: - any models from OpenAI provider available at [https://api.openai.com/v1/models](https://api.openai.com/v1/models) that are not in [https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/remote/inference/openai/models.py](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/remote/inference/openai/models.py) - any models from Llama API provider available at [https://api.llama.com/v1/models](https://api.llama.com/v1/models) that are not in [https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/remote/inference/llama_openai_compat/models.py](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/remote/inference/llama_openai_compat/models.py) <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> Closes #2504 this PR is dependant on #2710 ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> 1. Create venv at root llamastack directory: `uv venv .venv --python 3.12 --seed` 2. Activate venv: `source .venv/bin/activate` 3. `uv pip install -e .` 4. Create OpenAI distro modifying run.yaml 5. Build distro: `llama stack build --template starter --image-type venv` 6. Then run LlamaStack, but before navigate to templates/starter folder: `llama stack run run.yaml --image-type venv OPENAI_API_KEY=<YOUR_KEY> ENABLE_OPENAI=openai` 7. Then try to register dummy llm that doesn't exist in OpenAI provider: ` llama-stack-client models register ianm/ianllm --provider-model-id=ianllm --provider-id=openai ` You should receive this output - combined list of static config + fetched available models from OpenAI: <img width="1380" height="474" alt="Screenshot 2025-07-14 at 12 48 50" src="https://github.com/user-attachments/assets/d26aad18-6b15-49ee-9c49-b01b2d33f883" /> 8. Then register real llm from OpenAI: llama-stack-client models register openai/gpt-4-turbo-preview --provider-model-id=gpt-4-turbo-preview --provider-id=openai <img width="1253" height="613" alt="Screenshot 2025-07-14 at 13 43 02" src="https://github.com/user-attachments/assets/60a5c9b1-3468-4eb9-9e92-cd7d21de3ca0" /> <img width="1288" height="655" alt="Screenshot 2025-07-14 at 13 43 11" src="https://github.com/user-attachments/assets/c1e48871-0e24-4bd9-a0b8-8c95552a51ee" /> We correctly fetched all available models from OpenAI As for Llama API, as a non-US person I don't have access to Llama API Key but I joined wait list. The implementation for Llama is the same as for OpenAI since Llama is openai compatible. So, the response from GET endpoint has the same structure as OpenAI https://llama.developer.meta.com/docs/api/models	2025-07-16 12:49:38 -04:00
Ashwin Bharambe	95fdc8ea94	build: Bump version to 0.2.15	2025-07-15 20:29:08 -07:00
Kelly Brown	b096794959	docs: Reorganize documentation on the webpage (#2651 ) Some checks failed SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 2s Details Integration Tests / discover-tests (push) Successful in 2s Details Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 17s Details Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 19s Details Python Package Build Test / build (3.12) (push) Failing after 14s Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 14s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 15s Details SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 20s Details Unit Tests / unit-tests (3.13) (push) Failing after 15s Details Test Llama Stack Build / generate-matrix (push) Successful in 16s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 20s Details Test External Providers / test-external-providers (venv) (push) Failing after 17s Details Update ReadTheDocs / update-readthedocs (push) Failing after 15s Details Test Llama Stack Build / build-single-provider (push) Failing after 21s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 18s Details Unit Tests / unit-tests (3.12) (push) Failing after 22s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 25s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 23s Details Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 26s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 19s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 28s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 21s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 23s Details Python Package Build Test / build (3.13) (push) Failing after 44s Details Test Llama Stack Build / build (push) Failing after 25s Details Integration Tests / test-matrix (push) Failing after 46s Details Pre-commit / pre-commit (push) Successful in 2m24s Details # What does this PR do? Reorganizes the Llama stack webpage into more concise index pages, introduce more of a workflow, and reduce repetition of content. New nav structure so far based on #2637 Further discussions in https://github.com/meta-llama/llama-stack/discussions/2585 Preview: ![Screenshot 2025-07-09 at 2 31 53 PM](https://github.com/user-attachments/assets/4c1f3845-b328-4f12-9f20-3f09375007af) You can also build a full local preview locally Feedback Looking for feedback on page titles and general feedback on the new structure Follow up documentation I plan on reducing some sections and standardizing some terminology in a follow up PR. More discussions on that in https://github.com/meta-llama/llama-stack/discussions/2585	2025-07-15 14:19:35 -07:00
Matthew Farrellee	68e7978c88	chore: block network access from unit tests (#2732 ) Some checks failed Python Package Build Test / build (3.12) (push) Failing after 6s Details Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 10s Details Unit Tests / unit-tests (3.12) (push) Failing after 6s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 11s Details Python Package Build Test / build (3.13) (push) Failing after 8s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 12s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 10s Details SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 16s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 11s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 15s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 13s Details Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 16s Details Update ReadTheDocs / update-readthedocs (push) Failing after 10s Details Integration Tests / test-matrix (push) Failing after 10s Details SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 18s Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 14s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 16s Details Test Llama Stack Build / build (push) Failing after 8s Details Unit Tests / unit-tests (3.13) (push) Failing after 14s Details Pre-commit / pre-commit (push) Successful in 1m0s Details Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 2s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 4s Details Integration Tests / discover-tests (push) Successful in 5s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 4s Details Test Llama Stack Build / generate-matrix (push) Successful in 5s Details Test External Providers / test-external-providers (venv) (push) Failing after 4s Details Test Llama Stack Build / build-single-provider (push) Failing after 7s Details # What does this PR do? this blocks network access for all `tests/unit/` tests. `tests/integration/` are untouched. it also introduces an `allow_network` marker to explicitly allow network access. ## Test Plan `./scripts/unit-tests.sh`	2025-07-12 16:53:54 -07:00

1 2 3 4

171 commits