mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-08-15 14:08:00 +00:00
103 commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
|
d6ae54723d
|
chore: setup for performance benchmarking (#3096)
# What does this PR do? 1. Added a simple mock openai-compat server that serves chat/completion 2. Add a benchmark server in EKS that includes mock inference server 3. Add locust (https://locust.io/) file for load testing ## Test Plan bash apply.sh kubectl port-forward service/locust-web-ui 8089:8089 Go to localhost:8089 to start a load test <img width="1392" height="334" alt="image" src="https://github.com/user-attachments/assets/d6aa3deb-583a-42ed-889b-751262b8e91c" /> <img width="1362" height="881" alt="image" src="https://github.com/user-attachments/assets/6a28b9b4-05e6-44e2-b504-07e60c12d35e" /> |
||
|
88c4fdc5d7
|
chore(python-deps): bump chromadb from 1.0.15 to 1.0.16 (#3083)
Bumps [chromadb](https://github.com/chroma-core/chroma) from 1.0.15 to 1.0.16. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/chroma-core/chroma/releases">chromadb's releases</a>.</em></p> <blockquote> <h2>1.0.16</h2> <p>Version: <code>1.0.16</code> Git ref: <code>refs/tags/1.0.16</code> Build Date: <code>2025-08-08T00:26</code> PIP Package: <code>chroma-1.0.16.tar.gz</code> Github Container Registry Image: <code>:1.0.16</code> DockerHub Image: <code>:1.0.16</code></p> <h2>What's Changed</h2> <ul> <li>[ENH]: add cache mount & tolerations to garbage collector template in Helm chart by <a href="https://github.com/codetheweb"><code>@codetheweb</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5016">chroma-core/chroma#5016</a></li> <li>[DOC] Fix docs typo by <a href="https://github.com/itaismith"><code>@itaismith</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5018">chroma-core/chroma#5018</a></li> <li>[CLN] Change GenericQuotaError from 429 to 422 by <a href="https://github.com/drewkim"><code>@drewkim</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5022">chroma-core/chroma#5022</a></li> <li>[CHORE] Fix type error in batch_utils by <a href="https://github.com/jairad26"><code>@jairad26</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5024">chroma-core/chroma#5024</a></li> <li>[ENH] Add block-level metrics by <a href="https://github.com/tanujnay112"><code>@tanujnay112</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/4801">chroma-core/chroma#4801</a></li> <li>[ENH]: return error on /add if embeddings are not provided by <a href="https://github.com/codetheweb"><code>@codetheweb</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5033">chroma-core/chroma#5033</a></li> <li>[DOC] Docs Polish 07/2025 by <a href="https://github.com/itaismith"><code>@itaismith</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5032">chroma-core/chroma#5032</a></li> <li>[DOC] Flatten public txt files by <a href="https://github.com/itaismith"><code>@itaismith</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5040">chroma-core/chroma#5040</a></li> <li>[ENH]: require embeddings & require min embedding dimension on /add by <a href="https://github.com/codetheweb"><code>@codetheweb</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5037">chroma-core/chroma#5037</a></li> <li>[ENH] - Adds in dark mode support for hero image by <a href="https://github.com/tjkrusinskichroma"><code>@tjkrusinskichroma</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5042">chroma-core/chroma#5042</a></li> <li>[BLD] Use 8core runners for all our windows jobs by <a href="https://github.com/eculver"><code>@eculver</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5027">chroma-core/chroma#5027</a></li> <li>[TST] More benchmark queries for regex by <a href="https://github.com/Sicheng-Pan"><code>@Sicheng-Pan</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/4910">chroma-core/chroma#4910</a></li> <li>[BUG]: refactor otel/tracing initialization in the frontend to be independent of hosted entry point by <a href="https://github.com/c-gamble"><code>@c-gamble</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5028">chroma-core/chroma#5028</a></li> <li>[BUG] js client: handle 422 billing errors as QuotaExceeded instead of ChromaConnectionError by <a href="https://github.com/philipithomas"><code>@philipithomas</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5049">chroma-core/chroma#5049</a></li> <li>[BUG] RLS should use 32MB GRPC payload size limit by <a href="https://github.com/Sicheng-Pan"><code>@Sicheng-Pan</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5044">chroma-core/chroma#5044</a></li> <li>[BUG] Sync protoc arch and version in dockerfile by <a href="https://github.com/Sicheng-Pan"><code>@Sicheng-Pan</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5045">chroma-core/chroma#5045</a></li> <li>[BLD] Fix windows runner label by <a href="https://github.com/eculver"><code>@eculver</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5052">chroma-core/chroma#5052</a></li> <li>[PERF]: Prefetch segments in get and query by <a href="https://github.com/sanketkedia"><code>@sanketkedia</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5053">chroma-core/chroma#5053</a></li> <li>[PERF]: Parallelize fetching blocks for brute force regex by <a href="https://github.com/sanketkedia"><code>@sanketkedia</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5051">chroma-core/chroma#5051</a></li> <li>[RELEASE] JS 3.0.7 by <a href="https://github.com/itaismith"><code>@itaismith</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5059">chroma-core/chroma#5059</a></li> <li>[ENH] Add a delete_many call to the storage API. by <a href="https://github.com/rescrv"><code>@rescrv</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5020">chroma-core/chroma#5020</a></li> <li>[ENH] Consume delete_many from the wal3 garbage collector. by <a href="https://github.com/rescrv"><code>@rescrv</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5021">chroma-core/chroma#5021</a></li> <li>[ENH]: limit number of concurrent get_all_block_ids() when using buffer_unordered() by <a href="https://github.com/codetheweb"><code>@codetheweb</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5062">chroma-core/chroma#5062</a></li> <li>[ENH]: use new <code>delete_many()</code> storage method in DeleteUnusedFiles operator by <a href="https://github.com/codetheweb"><code>@codetheweb</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5061">chroma-core/chroma#5061</a></li> <li>[BUG]: Disable aws stalled stream protection by <a href="https://github.com/tanujnay112"><code>@tanujnay112</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5063">chroma-core/chroma#5063</a></li> <li>[DOC] Update manage collections docs with correct delete collection info by <a href="https://github.com/jairad26"><code>@jairad26</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5066">chroma-core/chroma#5066</a></li> <li>[BUG] Improve wal3 robustness with better shutdown handling and error recovery by <a href="https://github.com/rescrv"><code>@rescrv</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5046">chroma-core/chroma#5046</a></li> <li>[ENH] Do not do any mutations of the manifest from within GC. by <a href="https://github.com/rescrv"><code>@rescrv</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5050">chroma-core/chroma#5050</a></li> <li>[CHORE]: enable change notifier otel/tracing by <a href="https://github.com/c-gamble"><code>@c-gamble</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5073">chroma-core/chroma#5073</a></li> <li>[CHORE] Add pprof server to query service by <a href="https://github.com/eculver"><code>@eculver</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5072">chroma-core/chroma#5072</a></li> <li>[ENH]: Dedup inserts to the same key in foyer by <a href="https://github.com/sanketkedia"><code>@sanketkedia</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5074">chroma-core/chroma#5074</a></li> <li>[ENH] "Failed to fetch: status: NotFound" be gone. by <a href="https://github.com/rescrv"><code>@rescrv</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5064">chroma-core/chroma#5064</a></li> <li>[CLN] Remove the the top most spammy log lines from rls/wal3. by <a href="https://github.com/rescrv"><code>@rescrv</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5071">chroma-core/chroma#5071</a></li> <li>[DOC] Fix badge in readme by <a href="https://github.com/kylediaz"><code>@kylediaz</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5025">chroma-core/chroma#5025</a></li> <li>[ENH] A tool for patching logs that were deleted before a new manifest was installed. by <a href="https://github.com/rescrv"><code>@rescrv</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5083">chroma-core/chroma#5083</a></li> <li>[BUG] Add billing errors to JS client by <a href="https://github.com/itaismith"><code>@itaismith</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5084">chroma-core/chroma#5084</a></li> <li>[CHORE]: Add s3 get metrics and pod name to tracing spans by <a href="https://github.com/tanujnay112"><code>@tanujnay112</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5086">chroma-core/chroma#5086</a></li> <li>[RELEASE] JS 3.0.8 by <a href="https://github.com/itaismith"><code>@itaismith</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5087">chroma-core/chroma#5087</a></li> <li>[ENH] A tool to purge the cache. by <a href="https://github.com/rescrv"><code>@rescrv</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5085">chroma-core/chroma#5085</a></li> <li>[DOC] Update PR template for migration and observability by <a href="https://github.com/HammadB"><code>@HammadB</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5089">chroma-core/chroma#5089</a></li> <li>[CHORE]: Fix s3 get metric name by <a href="https://github.com/tanujnay112"><code>@tanujnay112</code></a> in <a href="https://redirect.github.com/chroma-core/chroma/pull/5091">chroma-core/chroma#5091</a></li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href=" |
||
|
393f3714b0
|
chore(python-deps): bump torch from 2.7.1 to 2.8.0 (#3082)
Bumps [torch](https://github.com/pytorch/pytorch) from 2.7.1 to 2.8.0. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/pytorch/pytorch/releases">torch's releases</a>.</em></p> <blockquote> <h1>PyTorch 2.8.0 Release Notes</h1> <ul> <li><a href="https://github.com/pytorch/pytorch/blob/HEAD/#highlights">Highlights</a></li> <li><a href="https://github.com/pytorch/pytorch/blob/HEAD/#backwards-incompatible-changes">Backwards Incompatible Changes</a></li> <li><a href="https://github.com/pytorch/pytorch/blob/HEAD/#deprecations">Deprecations</a></li> <li><a href="https://github.com/pytorch/pytorch/blob/HEAD/#new-features">New Features</a></li> <li><a href="https://github.com/pytorch/pytorch/blob/HEAD/#improvements">Improvements</a></li> <li><a href="https://github.com/pytorch/pytorch/blob/HEAD/#bug-fixes">Bug fixes</a></li> <li><a href="https://github.com/pytorch/pytorch/blob/HEAD/#performance">Performance</a></li> <li><a href="https://github.com/pytorch/pytorch/blob/HEAD/#documentation">Documentation</a></li> <li><a href="https://github.com/pytorch/pytorch/blob/HEAD/#developers">Developers</a></li> </ul> <h1>Highlights</h1> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href=" |
||
|
b70e2f1f09
|
fix(dep): update to openai >= 1.99.6 and use new Function location (#3087)
# What does this PR do? closes #3072 ## Test Plan ci |
||
|
e565b91182 |
build: Bump version to 0.2.17
Some checks failed
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 7s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 7s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 13s
Test Llama Stack Build / generate-matrix (push) Successful in 8s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 9s
Python Package Build Test / build (3.12) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 13s
Test Llama Stack Build / build-single-provider (push) Failing after 5s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 7s
Test External API and Providers / test-external (venv) (push) Failing after 7s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 17s
Unit Tests / unit-tests (3.12) (push) Failing after 7s
Python Package Build Test / build (3.13) (push) Failing after 9s
Update ReadTheDocs / update-readthedocs (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 15s
Unit Tests / unit-tests (3.13) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 14s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 14s
Test Llama Stack Build / build (push) Failing after 12s
Pre-commit / pre-commit (push) Successful in 1m38s
|
||
|
55a2694c80
|
chore(python-deps): bump openai from 1.97.1 to 1.98.0 (#3025)
Bumps [openai](https://github.com/openai/openai-python) from 1.97.1 to 1.98.0. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/openai/openai-python/releases">openai's releases</a>.</em></p> <blockquote> <h2>v1.98.0</h2> <h2>1.98.0 (2025-07-30)</h2> <p>Full Changelog: <a href="https://github.com/openai/openai-python/compare/v1.97.2...v1.98.0">v1.97.2...v1.98.0</a></p> <h3>Features</h3> <ul> <li><strong>api:</strong> manual updates (<a href=" |
||
|
12f964437a
|
chore(python-deps): bump opentelemetry-exporter-otlp-proto-http from 1.35.0 to 1.36.0 (#3027)
Some checks failed
Test Llama Stack Build / generate-matrix (push) Successful in 8s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 19s
Python Package Build Test / build (3.13) (push) Failing after 1s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 6s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 25s
Python Package Build Test / build (3.12) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 25s
Test Llama Stack Build / build-single-provider (push) Failing after 19s
Update ReadTheDocs / update-readthedocs (push) Failing after 7s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 30s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 28s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 11s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 34s
Unit Tests / unit-tests (3.12) (push) Failing after 13s
Test External API and Providers / test-external (venv) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 25s
Unit Tests / unit-tests (3.13) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 30s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 26s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 24s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 30s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 29s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 31s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 27s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Has started running
Test Llama Stack Build / build (push) Failing after 12s
Pre-commit / pre-commit (push) Successful in 1m46s
Bumps [opentelemetry-exporter-otlp-proto-http](https://github.com/open-telemetry/opentelemetry-python) from 1.35.0 to 1.36.0. <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/open-telemetry/opentelemetry-python/blob/main/CHANGELOG.md">opentelemetry-exporter-otlp-proto-http's changelog</a>.</em></p> <blockquote> <h2>Version 1.36.0/0.57b0 (2025-07-29)</h2> <ul> <li> <p>Add missing Prometheus exporter documentation (<a href="https://redirect.github.com/open-telemetry/opentelemetry-python/pull/4485">#4485</a>)</p> </li> <li> <p>Overwrite logging.config.fileConfig and logging.config.dictConfig to ensure the OTLP <code>LogHandler</code> remains attached to the root logger. Fix a bug that can cause a deadlock to occur over <code>logging._lock</code> in some cases (<a href="https://redirect.github.com/open-telemetry/opentelemetry-python/pull/4636">#4636</a>).</p> </li> <li> <p>otlp-http-exporter: set default value for param <code>timeout_sec</code> in <code>_export</code> method (<a href="https://redirect.github.com/open-telemetry/opentelemetry-python/pull/4691">#4691</a>)</p> </li> <li> <p>Update OTLP gRPC/HTTP exporters: calling shutdown will now interrupt exporters that are sleeping before a retry attempt, and cause them to return failure immediately. Update BatchSpan/LogRecordProcessors: shutdown will now complete after 30 seconds of trying to finish exporting any buffered telemetry, instead of continuing to export until all telemetry was exported. (<a href="https://redirect.github.com/open-telemetry/opentelemetry-python/pull/4638">#4638</a>).</p> </li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href=" |
||
|
48b49e318f
|
chore(python-deps): bump weaviate-client from 4.16.4 to 4.16.5 (#3026)
[//]: # (dependabot-start) ⚠️ **Dependabot is rebasing this PR** ⚠️ Rebasing might not happen immediately, so don't worry if this takes some time. Note: if you make any changes to this PR yourself, they will take precedence over the rebase. --- [//]: # (dependabot-end) Bumps [weaviate-client](https://github.com/weaviate/weaviate-python-client) from 4.16.4 to 4.16.5. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/weaviate/weaviate-python-client/releases">weaviate-client's releases</a>.</em></p> <blockquote> <h2>v3.13.0 - Support for Weaviate v1.18</h2> <h2>What's Changed</h2> <ul> <li>Extend CRUD operations for single data objects and reference with consistency level by <a href="https://github.com/redouan-rhazouani"><code>@redouan-rhazouani</code></a> in <a href="https://redirect.github.com/weaviate/weaviate-python-client/pull/234">weaviate/weaviate-python-client#234</a></li> <li>Extend batch operations with consistency level by <a href="https://github.com/redouan-rhazouani"><code>@redouan-rhazouani</code></a> in <a href="https://redirect.github.com/weaviate/weaviate-python-client/pull/240">weaviate/weaviate-python-client#240</a></li> <li>Add Cursor api by <a href="https://github.com/dirkkul"><code>@dirkkul</code></a> in <a href="https://redirect.github.com/weaviate/weaviate-python-client/pull/241">weaviate/weaviate-python-client#241</a></li> <li>Add support for backup Azure module by <a href="https://github.com/antas-marcin"><code>@antas-marcin</code></a> in <a href="https://redirect.github.com/weaviate/weaviate-python-client/pull/246">weaviate/weaviate-python-client#246</a></li> </ul> <h2>New Contributors</h2> <ul> <li><a href="https://github.com/redouan-rhazouani"><code>@redouan-rhazouani</code></a> made their first contribution in <a href="https://redirect.github.com/weaviate/weaviate-python-client/pull/234">weaviate/weaviate-python-client#234</a></li> <li><a href="https://github.com/antas-marcin"><code>@antas-marcin</code></a> made their first contribution in <a href="https://redirect.github.com/weaviate/weaviate-python-client/pull/246">weaviate/weaviate-python-client#246</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/weaviate/weaviate-python-client/compare/v3.12.0...v3.13.0">https://github.com/weaviate/weaviate-python-client/compare/v3.12.0...v3.13.0</a></p> <h2>v3.12.1b - Support for weaviate v1.18</h2> <h2>What's Changed</h2> <ul> <li>Extend CRUD operations for single data objects and reference with consistency level by <a href="https://github.com/redouan-rhazouani"><code>@redouan-rhazouani</code></a> in <a href="https://redirect.github.com/weaviate/weaviate-python-client/pull/234">weaviate/weaviate-python-client#234</a></li> <li>Extend batch operations with consistency level by <a href="https://github.com/redouan-rhazouani"><code>@redouan-rhazouani</code></a> in <a href="https://redirect.github.com/weaviate/weaviate-python-client/pull/240">weaviate/weaviate-python-client#240</a></li> <li>Add Cursor api by <a href="https://github.com/dirkkul"><code>@dirkkul</code></a> in <a href="https://redirect.github.com/weaviate/weaviate-python-client/pull/241">weaviate/weaviate-python-client#241</a></li> </ul> <h2>New Contributors</h2> <ul> <li><a href="https://github.com/redouan-rhazouani"><code>@redouan-rhazouani</code></a> made their first contribution in <a href="https://redirect.github.com/weaviate/weaviate-python-client/pull/234">weaviate/weaviate-python-client#234</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/weaviate/weaviate-python-client/compare/v3.12.0...v3.12.1b">https://github.com/weaviate/weaviate-python-client/compare/v3.12.0...v3.12.1b</a></p> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/weaviate/weaviate-python-client/blob/main/docs/changelog.rst">weaviate-client's changelog</a>.</em></p> <blockquote> <h2>Version 4.16.5</h2> <p>This patch version includes: - Add <code>dimensions</code> property to Google vectorizers in <code>Configure.Vectors</code></p> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href=" |
||
|
4411e6e362
|
chore(ci): remove reportlab dep (#3033)
# What does this PR do? remove reportlab dep. change dynamic pdf generation into a pre-computed pdf. ## Test Plan ci |
||
|
edc19698fb
|
chore(python-deps): bump huggingface-hub from 0.34.2 to 0.34.3 (#3028)
Bumps [huggingface-hub](https://github.com/huggingface/huggingface_hub) from 0.34.2 to 0.34.3. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/huggingface/huggingface_hub/releases">huggingface-hub's releases</a>.</em></p> <blockquote> <h2>[v0.34.3] Jobs improvements and <code>whoami</code> user prefix</h2> <ul> <li>[Jobs] Update uv image <a href="https://redirect.github.com/huggingface/huggingface_hub/issues/3270">#3270</a> by <a href="https://github.com/lhoestq"><code>@lhoestq</code></a></li> <li>[Update] HF Jobs Documentation <a href="https://redirect.github.com/huggingface/huggingface_hub/issues/3268">#3268</a> by <a href="https://github.com/ariG23498"><code>@ariG23498</code></a></li> <li>Add 'user:' prefix to whoami command output <a href="https://redirect.github.com/huggingface/huggingface_hub/issues/3267">#3267</a> by <a href="https://github.com/gary149"><code>@gary149</code></a></li> </ul> <p>Full Changelog: <a href="https://github.com/huggingface/huggingface_hub/compare/v0.34.2...v0.34.3">https://github.com/huggingface/huggingface_hub/compare/v0.34.2...v0.34.3</a></p> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href=" |
||
|
33cca26154
|
chore: Enabling Integration tests for Weaviate (#2882)
# What does this PR do? This PR (1) enables the files API for Weaviate and (2) enables integration tests for Weaviate, which adds a docker container to the github action. This PR also handles a couple of edge cases for in creating the collection and ensuring the tests all pass. ## Test Plan CI enabled --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> |
||
|
25d3dfa30f
|
fix: fix No module named 'ollama' in test_inference_recordings.py (#2967)
# What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> This PR fixes the following error in unit test that was running on up to date main branch: ``` FAILED tests/unit/distribution/test_inference_recordings.py::TestInferenceRecording::test_recording_mode - ModuleNotFoundError: No module named 'ollama' FAILED tests/unit/distribution/test_inference_recordings.py::TestInferenceRecording::test_replay_mode - ModuleNotFoundError: No module named 'ollama' FAILED tests/unit/distribution/test_inference_recordings.py::TestInferenceRecording::test_replay_missing_recording - ModuleNotFoundError: No module named 'ollama' FAILED tests/unit/distribution/test_inference_recordings.py::TestInferenceRecording::test_embeddings_recording - ModuleNotFoundError: No module named 'ollama' =============================== 4 failed, 499 passed, 198 warnings in 34.50s ================================ ``` <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* --> Run `./scripts/unit-tests.sh` |
||
|
870a37ff4b
|
feat: add base64 encoded PDF support for OpenAI Chat Completions (#2881)
Some checks failed
Coverage Badge / unit-tests (push) Failing after 1s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Integration Tests / discover-tests (push) Successful in 3s
Test Llama Stack Build / generate-matrix (push) Successful in 6s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 12s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 13s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 9s
Unit Tests / unit-tests (3.12) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 14s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 13s
Unit Tests / unit-tests (3.13) (push) Failing after 10s
Test Llama Stack Build / build-single-provider (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 19s
Test External API and Providers / test-external (venv) (push) Failing after 16s
Test Llama Stack Build / build (push) Failing after 9s
Python Package Build Test / build (3.12) (push) Failing after 23s
Update ReadTheDocs / update-readthedocs (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 27s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 29s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 31s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 58s
Python Package Build Test / build (3.13) (push) Failing after 54s
Integration Tests / test-matrix (push) Failing after 56s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 1m4s
Pre-commit / pre-commit (push) Successful in 2m15s
# What does this PR do? OpenAI Chat Completions supports passing a base64 encoded PDF file to a model, but Llama Stack currently does not allow for this behavior. This PR extends our implementation of the OpenAI API spec to change that. Closes #2129 ## Test Plan A new functional test has been added to test the validity of such a request Signed-off-by: Nathan Weinberg <nweinber@redhat.com> |
||
|
cf8722079c |
build: Bump version to 0.2.16
Some checks failed
Coverage Badge / unit-tests (push) Failing after 1s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 3s
Integration Tests / discover-tests (push) Successful in 8s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 8s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 10s
Python Package Build Test / build (3.12) (push) Failing after 1s
Test Llama Stack Build / generate-matrix (push) Successful in 6s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 11s
Unit Tests / unit-tests (3.12) (push) Failing after 4s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 14s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 20s
Python Package Build Test / build (3.13) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 13s
Test External API and Providers / test-external (venv) (push) Failing after 8s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 14s
Test Llama Stack Build / build (push) Failing after 7s
Update ReadTheDocs / update-readthedocs (push) Failing after 9s
Unit Tests / unit-tests (3.13) (push) Failing after 9s
Integration Tests / test-matrix (push) Failing after 8s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Failing after 12s
Test Llama Stack Build / build-single-provider (push) Failing after 35s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 42s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 44s
Pre-commit / pre-commit (push) Successful in 1m23s
|
||
|
e189f65548
|
chore(python-deps): bump pydantic from 2.10.6 to 2.11.7 (#2925)
Bumps [pydantic](https://github.com/pydantic/pydantic) from 2.10.6 to 2.11.7. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/pydantic/pydantic/releases">pydantic's releases</a>.</em></p> <blockquote> <h2>v2.11.7 2025-06-14</h2> <!-- raw HTML omitted --> <h2>What's Changed</h2> <h3>Fixes</h3> <ul> <li>Copy <code>FieldInfo</code> instance if necessary during <code>FieldInfo</code> build by <a href="https://github.com/Viicos"><code>@Viicos</code></a> in <a href="https://redirect.github.com/pydantic/pydantic/pull/11980">pydantic/pydantic#11980</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/pydantic/pydantic/compare/v2.11.6...v2.11.7">https://github.com/pydantic/pydantic/compare/v2.11.6...v2.11.7</a></p> <h2>v2.11.6 2025-06-13</h2> <h2>v2.11.6 (2025-06-13)</h2> <h3>What's Changed</h3> <h4>Fixes</h4> <ul> <li>Rebuild dataclass fields before schema generation by <a href="https://github.com/Viicos"><code>@Viicos</code></a> in <a href="https://redirect.github.com/pydantic/pydantic/pull/11949">#11949</a></li> <li>Always store the original field assignment on <code>FieldInfo</code> by <a href="https://github.com/Viicos"><code>@Viicos</code></a> in <a href="https://redirect.github.com/pydantic/pydantic/pull/11946">#11946</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/pydantic/pydantic/compare/v2.11.5...v2.11.6">https://github.com/pydantic/pydantic/compare/v2.11.5...v2.11.6</a></p> <h2>v2.11.5 2025-05-22</h2> <!-- raw HTML omitted --> <h2>What's Changed</h2> <h3>Fixes</h3> <ul> <li>Check if <code>FieldInfo</code> is complete after applying type variable map by <a href="https://github.com/Viicos"><code>@Viicos</code></a> in <a href="https://redirect.github.com/pydantic/pydantic/pull/11855">#11855</a></li> <li>Do not delete mock validator/serializer in <code>model_rebuild()</code> by <a href="https://github.com/Viicos"><code>@Viicos</code></a> in <a href="https://redirect.github.com/pydantic/pydantic/pull/11890">#11890</a></li> <li>Do not duplicate metadata on model rebuild by <a href="https://github.com/Viicos"><code>@Viicos</code></a> in <a href="https://redirect.github.com/pydantic/pydantic/pull/11902">#11902</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/pydantic/pydantic/compare/v2.11.4...v2.11.5">https://github.com/pydantic/pydantic/compare/v2.11.4...v2.11.5</a></p> <h2>v2.11.4 2025-04-29</h2> <h3>What's Changed</h3> <h4>Packaging</h4> <ul> <li>Bump <code>mkdocs-llmstxt</code> to v0.2.0 by <a href="https://github.com/Viicos"><code>@Viicos</code></a> in <a href="https://redirect.github.com/pydantic/pydantic/pull/11725">#11725</a></li> </ul> <h4>Changes</h4> <ul> <li>Allow config and bases to be specified together in <code>create_model()</code> by <a href="https://github.com/Viicos"><code>@Viicos</code></a> in <a href="https://redirect.github.com/pydantic/pydantic/pull/11714">#11714</a>. This change was backported as it was previously possible (although not meant to be supported) to provide <code>model_config</code> as a field, which would make it possible to provide both configuration and bases.</li> </ul> <h4>Fixes</h4> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/pydantic/pydantic/blob/main/HISTORY.md">pydantic's changelog</a>.</em></p> <blockquote> <h2>v2.11.7 (2025-06-14)</h2> <p><a href="https://github.com/pydantic/pydantic/releases/tag/v2.11.7">GitHub release</a></p> <h3>What's Changed</h3> <h4>Fixes</h4> <ul> <li>Copy <code>FieldInfo</code> instance if necessary during <code>FieldInfo</code> build by <a href="https://github.com/Viicos"><code>@Viicos</code></a> in <a href="https://redirect.github.com/pydantic/pydantic/pull/11898">#11898</a></li> </ul> <h2>v2.11.6 (2025-06-13)</h2> <p><a href="https://github.com/pydantic/pydantic/releases/tag/v2.11.6">GitHub release</a></p> <h3>What's Changed</h3> <h4>Fixes</h4> <ul> <li>Rebuild dataclass fields before schema generation by <a href="https://github.com/Viicos"><code>@Viicos</code></a> in <a href="https://redirect.github.com/pydantic/pydantic/pull/11949">#11949</a></li> <li>Always store the original field assignment on <code>FieldInfo</code> by <a href="https://github.com/Viicos"><code>@Viicos</code></a> in <a href="https://redirect.github.com/pydantic/pydantic/pull/11946">#11946</a></li> </ul> <h2>v2.11.5 (2025-05-22)</h2> <p><a href="https://github.com/pydantic/pydantic/releases/tag/v2.11.5">GitHub release</a></p> <h3>What's Changed</h3> <h4>Fixes</h4> <ul> <li>Check if <code>FieldInfo</code> is complete after applying type variable map by <a href="https://github.com/Viicos"><code>@Viicos</code></a> in <a href="https://redirect.github.com/pydantic/pydantic/pull/11855">#11855</a></li> <li>Do not delete mock validator/serializer in <code>model_rebuild()</code> by <a href="https://github.com/Viicos"><code>@Viicos</code></a> in <a href="https://redirect.github.com/pydantic/pydantic/pull/11890">#11890</a></li> <li>Do not duplicate metadata on model rebuild by <a href="https://github.com/Viicos"><code>@Viicos</code></a> in <a href="https://redirect.github.com/pydantic/pydantic/pull/11902">#11902</a></li> </ul> <h2>v2.11.4 (2025-04-29)</h2> <p><a href="https://github.com/pydantic/pydantic/releases/tag/v2.11.4">GitHub release</a></p> <h3>What's Changed</h3> <h4>Packaging</h4> <ul> <li>Bump <code>mkdocs-llmstxt</code> to v0.2.0 by <a href="https://github.com/Viicos"><code>@Viicos</code></a> in <a href="https://redirect.github.com/pydantic/pydantic/pull/11725">#11725</a></li> </ul> <h4>Changes</h4> <ul> <li>Allow config and bases to be specified together in <code>create_model()</code> by <a href="https://github.com/Viicos"><code>@Viicos</code></a> in <a href="https://redirect.github.com/pydantic/pydantic/pull/11714">#11714</a>. This change was backported as it was previously possible (although not meant to be supported) to provide <code>model_config</code> as a field, which would make it possible to provide both configuration and bases.</li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href=" |
||
|
9583f468f8
|
feat(starter)!: simplify starter distro; litellm model registry changes (#2916) | ||
|
cd8715d327
|
chore: Added openai compatible vector io endpoints for chromadb (#2489)
Some checks failed
Integration Tests / discover-tests (push) Successful in 3s
Coverage Badge / unit-tests (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 4s
Test Llama Stack Build / generate-matrix (push) Successful in 3s
Python Package Build Test / build (3.13) (push) Failing after 2s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 10s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 16s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 16s
Python Package Build Test / build (3.12) (push) Failing after 12s
Test External Providers / test-external-providers (venv) (push) Failing after 12s
Update ReadTheDocs / update-readthedocs (push) Failing after 10s
Test Llama Stack Build / build-single-provider (push) Failing after 15s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 20s
Unit Tests / unit-tests (3.13) (push) Failing after 14s
Test Llama Stack Build / build (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 18s
Unit Tests / unit-tests (3.12) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 18s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 51s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 49s
Integration Tests / test-matrix (push) Failing after 53s
Pre-commit / pre-commit (push) Successful in 1m42s
# What does this PR do? This PR implements the openai compatible endpoints for chromadb Closes #2462 ## Test Plan Ran ollama llama stack server and ran the command `pytest -sv --stack-config=http://localhost:8321 tests/integration/vector_io/test_openai_vector_stores.py --embedding-model all-MiniLM-L6-v2` 8 failed, 27 passed, 8 skipped, 1 xfailed The failed ones are regarding files api --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> Co-authored-by: sarthakdeshpande <sarthak.deshpande@engati.com> Co-authored-by: Francisco Javier Arceo <farceo@redhat.com> Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com> |
||
|
e45543f7f3
|
test: Measure and track code coverage (#2636)
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
- Added coverage badge to README. - [See my
fork](https://github.com/ChristianZaccaria/llama-stack)
- Added a GitHub Actions workflow that runs the tests and updates the
coverage badge. - [See
run](
|
||
|
b57db11bed
|
feat: create dynamic model registration for OpenAI and Llama compat remote inference providers (#2745)
Some checks failed
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 5s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 4s
Python Package Build Test / build (3.13) (push) Failing after 2s
Test Llama Stack Build / generate-matrix (push) Successful in 6s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 9s
Update ReadTheDocs / update-readthedocs (push) Failing after 3s
Test Llama Stack Build / build-single-provider (push) Failing after 7s
Integration Tests / discover-tests (push) Successful in 13s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 13s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 15s
Integration Tests / test-matrix (push) Failing after 5s
Unit Tests / unit-tests (3.12) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 19s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 22s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 17s
Test External Providers / test-external-providers (venv) (push) Failing after 17s
Test Llama Stack Build / build (push) Failing after 14s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 35s
Python Package Build Test / build (3.12) (push) Failing after 51s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 57s
Unit Tests / unit-tests (3.13) (push) Failing after 53s
Pre-commit / pre-commit (push) Successful in 1m42s
# What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> The purpose of this task is to create a solution that can automatically detect when new models are added, deprecated, or removed by OpenAI and Llama API providers, and automatically update the list of supported models in LLamaStack. This feature is vitally important in order to avoid missing new models and editing the entries manually hence I created automation allowing users to dynamically register: - any models from OpenAI provider available at [https://api.openai.com/v1/models](https://api.openai.com/v1/models) that are not in [https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/remote/inference/openai/models.py](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/remote/inference/openai/models.py) - any models from Llama API provider available at [https://api.llama.com/v1/models](https://api.llama.com/v1/models) that are not in [https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/remote/inference/llama_openai_compat/models.py](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/remote/inference/llama_openai_compat/models.py) <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> Closes #2504 this PR is dependant on #2710 ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* --> 1. Create venv at root llamastack directory: `uv venv .venv --python 3.12 --seed` 2. Activate venv: `source .venv/bin/activate` 3. `uv pip install -e .` 4. Create OpenAI distro modifying run.yaml 5. Build distro: `llama stack build --template starter --image-type venv` 6. Then run LlamaStack, but before navigate to templates/starter folder: `llama stack run run.yaml --image-type venv OPENAI_API_KEY=<YOUR_KEY> ENABLE_OPENAI=openai` 7. Then try to register dummy llm that doesn't exist in OpenAI provider: ` llama-stack-client models register ianm/ianllm --provider-model-id=ianllm --provider-id=openai ` You should receive this output - combined list of static config + fetched available models from OpenAI: <img width="1380" height="474" alt="Screenshot 2025-07-14 at 12 48 50" src="https://github.com/user-attachments/assets/d26aad18-6b15-49ee-9c49-b01b2d33f883" /> 8. Then register real llm from OpenAI: llama-stack-client models register openai/gpt-4-turbo-preview --provider-model-id=gpt-4-turbo-preview --provider-id=openai <img width="1253" height="613" alt="Screenshot 2025-07-14 at 13 43 02" src="https://github.com/user-attachments/assets/60a5c9b1-3468-4eb9-9e92-cd7d21de3ca0" /> <img width="1288" height="655" alt="Screenshot 2025-07-14 at 13 43 11" src="https://github.com/user-attachments/assets/c1e48871-0e24-4bd9-a0b8-8c95552a51ee" /> We correctly fetched all available models from OpenAI As for Llama API, as a non-US person I don't have access to Llama API Key but I joined wait list. The implementation for Llama is the same as for OpenAI since Llama is openai compatible. So, the response from GET endpoint has the same structure as OpenAI https://llama.developer.meta.com/docs/api/models |
||
|
95fdc8ea94 | build: Bump version to 0.2.15 | ||
|
b096794959
|
docs: Reorganize documentation on the webpage (#2651)
Some checks failed
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 2s
Integration Tests / discover-tests (push) Successful in 2s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 17s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 19s
Python Package Build Test / build (3.12) (push) Failing after 14s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 15s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 20s
Unit Tests / unit-tests (3.13) (push) Failing after 15s
Test Llama Stack Build / generate-matrix (push) Successful in 16s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 20s
Test External Providers / test-external-providers (venv) (push) Failing after 17s
Update ReadTheDocs / update-readthedocs (push) Failing after 15s
Test Llama Stack Build / build-single-provider (push) Failing after 21s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 18s
Unit Tests / unit-tests (3.12) (push) Failing after 22s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 25s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 26s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 28s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 23s
Python Package Build Test / build (3.13) (push) Failing after 44s
Test Llama Stack Build / build (push) Failing after 25s
Integration Tests / test-matrix (push) Failing after 46s
Pre-commit / pre-commit (push) Successful in 2m24s
# What does this PR do? Reorganizes the Llama stack webpage into more concise index pages, introduce more of a workflow, and reduce repetition of content. New nav structure so far based on #2637 Further discussions in https://github.com/meta-llama/llama-stack/discussions/2585 **Preview:**  You can also build a full local preview locally **Feedback** Looking for feedback on page titles and general feedback on the new structure **Follow up documentation** I plan on reducing some sections and standardizing some terminology in a follow up PR. More discussions on that in https://github.com/meta-llama/llama-stack/discussions/2585 |
||
|
68e7978c88
|
chore: block network access from unit tests (#2732)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 2s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 4s
Integration Tests / discover-tests (push) Successful in 5s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 4s
Test Llama Stack Build / generate-matrix (push) Successful in 5s
Test External Providers / test-external-providers (venv) (push) Failing after 4s
Test Llama Stack Build / build-single-provider (push) Failing after 7s
Python Package Build Test / build (3.12) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 10s
Unit Tests / unit-tests (3.12) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 11s
Python Package Build Test / build (3.13) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 10s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 16s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 16s
Update ReadTheDocs / update-readthedocs (push) Failing after 10s
Integration Tests / test-matrix (push) Failing after 10s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 18s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 16s
Test Llama Stack Build / build (push) Failing after 8s
Unit Tests / unit-tests (3.13) (push) Failing after 14s
Pre-commit / pre-commit (push) Successful in 1m0s
# What does this PR do? this blocks network access for all `tests/unit/` tests. `tests/integration/` are untouched. it also introduces an `allow_network` marker to explicitly allow network access. ## Test Plan `./scripts/unit-tests.sh` |
||
|
51d9fd4808
|
fix: Don't cache clients for passthrough auth providers (#2728)
Some checks failed
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 43s
Unit Tests / unit-tests (3.12) (push) Failing after 45s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 2s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 4s
Integration Tests / discover-tests (push) Successful in 6s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 7s
Pre-commit / pre-commit (push) Successful in 2m8s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 4s
Test Llama Stack Build / generate-matrix (push) Successful in 5s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 11s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 12s
Test Llama Stack Build / build-single-provider (push) Failing after 7s
Python Package Build Test / build (3.13) (push) Failing after 5s
Python Package Build Test / build (3.12) (push) Failing after 7s
Unit Tests / unit-tests (3.13) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 13s
Test External Providers / test-external-providers (venv) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 12s
Update ReadTheDocs / update-readthedocs (push) Failing after 6s
Integration Tests / test-matrix (push) Failing after 6s
Test Llama Stack Build / build (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 12s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 16s
# What does this PR do? Some of our inference providers support passthrough authentication via `x-llamastack-provider-data` header values. This fixes the providers that support passthrough auth to not cache their clients to the backend providers (mostly OpenAI client instances) so that the client connecting to Llama Stack has to provide those auth values on each and every request. ## Test Plan I added some unit tests to ensure we're not caching clients across requests for all the fixed providers in this PR. ``` uv run pytest -sv tests/unit/providers/inference/test_inference_client_caching.py ``` I also ran some of our OpenAI compatible API integration tests for each of the changed providers, just to ensure they still work. Note that these providers don't actually pass all these tests (for unrelated reasons due to quirks of the Groq and Together SaaS services), but enough of the tests passed to confirm the clients are still working as intended. ### Together ``` ENABLE_TOGETHER="together" \ uv run llama stack run llama_stack/templates/starter/run.yaml LLAMA_STACK_CONFIG=http://localhost:8321 \ uv run pytest -sv \ tests/integration/inference/test_openai_completion.py \ --text-model "together/meta-llama/Llama-3.1-8B-Instruct" ``` ### OpenAI ``` ENABLE_OPENAI="openai" \ uv run llama stack run llama_stack/templates/starter/run.yaml LLAMA_STACK_CONFIG=http://localhost:8321 \ uv run pytest -sv \ tests/integration/inference/test_openai_completion.py \ --text-model "openai/gpt-4o-mini" ``` ### Groq ``` ENABLE_GROQ="groq" \ uv run llama stack run llama_stack/templates/starter/run.yaml LLAMA_STACK_CONFIG=http://localhost:8321 \ uv run pytest -sv \ tests/integration/inference/test_openai_completion.py \ --text-model "groq/meta-llama/Llama-3.1-8B-Instruct" ``` --------- Signed-off-by: Ben Browning <bbrownin@redhat.com> |
||
|
30b2e6a495
|
chore: default to pytest asyncio-mode=auto (#2730)
# What does this PR do? previously, developers who ran `./scripts/unit-tests.sh` would get `asyncio-mode=auto`, which meant `@pytest.mark.asyncio` and `@pytest_asyncio.fixture` were redundent. developers who ran `pytest` directly would get pytest's default (strict mode), would run into errors leading them to add `@pytest.mark.asyncio` / `@pytest_asyncio.fixture` to their code. with this change - - `asyncio_mode=auto` is included in `pyproject.toml` making behavior consistent for all invocations of pytest - removes all redundant `@pytest_asyncio.fixture` and `@pytest.mark.asyncio` - for good measure, requires `pytest>=8.4` and `pytest-asyncio>=1.0` ## Test Plan - `./scripts/unit-tests.sh` - `uv run pytest tests/unit` |
||
|
2ebc172f33
|
fix: pin opentelemtry version (#2722)
Some checks failed
Integration Tests / test-matrix (push) Failing after 12s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 21s
Python Package Build Test / build (3.13) (push) Failing after 44s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 54s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 56s
Pre-commit / pre-commit (push) Successful in 2m9s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 5s
Integration Tests / discover-tests (push) Successful in 4s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 5s
Test Llama Stack Build / generate-matrix (push) Successful in 4s
Test External Providers / test-external-providers (venv) (push) Failing after 3s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 13s
Unit Tests / unit-tests (3.13) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 11s
Test Llama Stack Build / build-single-provider (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 13s
Unit Tests / unit-tests (3.12) (push) Failing after 9s
Test Llama Stack Build / build (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 15s
Python Package Build Test / build (3.12) (push) Failing after 10s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 12s
Update ReadTheDocs / update-readthedocs (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 16s
# What does this PR do? Otherwise we can get old versions like 1.11 and experience this error: ``` ModuleNotFoundError: No module named 'opentelemetry.exporter.otlp.proto.http.metric_exporter' ``` Signed-off-by: Sébastien Han <seb@redhat.com> |
||
|
7915551eee
|
build: replace "python-jose" with "python-jose[cryptography]" (#2695)
Some checks failed
Integration Tests / test-matrix (server, 3.13, inference) (push) Failing after 6s
Integration Tests / test-matrix (server, 3.13, inspect) (push) Failing after 6s
Integration Tests / test-matrix (server, 3.13, post_training) (push) Failing after 7s
Integration Tests / test-matrix (server, 3.13, providers) (push) Failing after 6s
Integration Tests / test-matrix (server, 3.13, safety) (push) Failing after 6s
Integration Tests / test-matrix (server, 3.13, scoring) (push) Failing after 6s
Integration Tests / test-matrix (server, 3.13, tool_runtime) (push) Failing after 5s
Integration Tests / test-matrix (server, 3.13, vector_io) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 9s
Test Llama Stack Build / generate-matrix (push) Successful in 42s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 46s
Test Llama Stack Build / build-single-provider (push) Failing after 43s
Python Package Build Test / build (3.12) (push) Failing after 1s
Update ReadTheDocs / update-readthedocs (push) Failing after 3s
Test External Providers / test-external-providers (venv) (push) Failing after 6s
Unit Tests / unit-tests (3.12) (push) Failing after 5s
Unit Tests / unit-tests (3.13) (push) Failing after 6s
Test Llama Stack Build / build (push) Failing after 5s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 54s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 17s
Python Package Build Test / build (3.13) (push) Failing after 15s
Pre-commit / pre-commit (push) Successful in 1m43s
# What does this PR do? `python-jose` recommends using the `cryptography` backend in their installation docs: https://github.com/mpdavis/python-jose?tab=readme-ov-file#cryptographic-backends This PR modifies the LLS dependencies to use this instead of the current `native-python` Signed-off-by: Nathan Weinberg <nweinber@redhat.com> |
||
|
83c89265e0
|
chore: Adding unit tests for Milvus and OpenAI compatibility (#2640)
Some checks failed
Integration Tests / test-matrix (server, 3.13, agents) (push) Failing after 13s
Integration Tests / test-matrix (server, 3.13, inference) (push) Failing after 9s
Integration Tests / test-matrix (server, 3.13, datasets) (push) Failing after 11s
Integration Tests / test-matrix (server, 3.13, post_training) (push) Failing after 7s
Integration Tests / test-matrix (server, 3.13, providers) (push) Failing after 5s
Integration Tests / test-matrix (server, 3.13, scoring) (push) Failing after 5s
Integration Tests / test-matrix (server, 3.13, tool_runtime) (push) Failing after 4s
Integration Tests / test-matrix (server, 3.13, vector_io) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 5s
Test Llama Stack Build / generate-matrix (push) Successful in 36s
Test Llama Stack Build / build-single-provider (push) Failing after 36s
Python Package Build Test / build (3.13) (push) Failing after 2s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 36s
Test External Providers / test-external-providers (venv) (push) Failing after 4s
Test Llama Stack Build / build (push) Failing after 3s
Update ReadTheDocs / update-readthedocs (push) Failing after 5s
Unit Tests / unit-tests (3.12) (push) Failing after 8s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 45s
Python Package Build Test / build (3.12) (push) Failing after 17s
Unit Tests / unit-tests (3.13) (push) Failing after 18s
Pre-commit / pre-commit (push) Successful in 1m35s
# What does this PR do? - Enabling Unit tests for Milvus to start to test OpenAI compatibility and fixing a few bugs. - Also fixed an inconsistency in the Milvus config between remote and inline. - Added pymilvus to extras for testing in CI I'm going to refactor this later to include the other inline providers so that we can catch issues sooner. I have another PR where I've been testing to find other bugs in the implementation (and required changes drafted here: https://github.com/meta-llama/llama-stack/pull/2617). ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* --> --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> |
||
|
f1c62e0af0 | build: Bump version to 0.2.14 | ||
|
21669b14e7
|
fix(docs): add setuptools explicitly (#2547)
Some checks failed
Integration Tests / test-matrix (http, 3.13, datasets) (push) Failing after 31s
Integration Tests / test-matrix (http, 3.12, datasets) (push) Failing after 35s
Integration Tests / test-matrix (library, 3.13, agents) (push) Failing after 11s
Integration Tests / test-matrix (library, 3.12, scoring) (push) Failing after 13s
Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 11s
Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 6s
Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 7s
Test Llama Stack Build / generate-matrix (push) Successful in 3s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 5s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 6s
Test Llama Stack Build / build-single-provider (push) Failing after 6s
Unit Tests / unit-tests (3.12) (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 7s
Python Package Build Test / build (3.12) (push) Failing after 6s
Update ReadTheDocs / update-readthedocs (push) Failing after 6s
Test External Providers / test-external-providers (venv) (push) Failing after 8s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 11s
Unit Tests / unit-tests (3.13) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 15s
Test Llama Stack Build / build (push) Failing after 10s
Python Package Build Test / build (3.13) (push) Failing after 30s
Pre-commit / pre-commit (push) Successful in 1m23s
Given the shift to python3.12, we need to explicitly depend on `setuptools` for the pkg_resources import ## Test Plan Run ``` cd local/llama-stack UV_PROJECT_ENVIRONMENT=/tmp/docs uv sync --frozen --group docs cd /tmp/docs uv run python -m sphinx -T -b html -d _build/doctrees -D language=en \ ~/local/llama-stack/docs/source/ \ /tmp/docs/html ``` |
||
|
709eb7da33 | build: Bump version to 0.2.13 | ||
|
9baa16e498
|
fix(security): Upgrade protobuf and aiohttp. Fixes CVE-2025-4565 (#2541)
# What does this PR do? Fixes CVE-2025-4565 and the following warning: ``` warning: `aiohttp==3.11.13` is yanked (reason: "Regression: https://github.com/aio-libs/aiohttp/issues/10617") ``` Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> |
||
|
40fdce79b3
|
fix(security): Upgrade urllib3 to v2.5.0. Fixes CVE-2025-50181 and CVE-2025-50182 (#2534)
Some checks failed
Integration Tests / test-matrix (http, 3.13, tool_runtime) (push) Failing after 16s
Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 7s
Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 15s
Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 11s
Integration Tests / test-matrix (library, 3.13, agents) (push) Failing after 11s
Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 7s
Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 9s
Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 8s
Integration Tests / test-matrix (http, 3.12, scoring) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 7s
Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 9s
Test Llama Stack Build / generate-matrix (push) Successful in 3s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 2s
Python Package Build Test / build (3.13) (push) Failing after 3s
Python Package Build Test / build (3.12) (push) Failing after 4s
Unit Tests / unit-tests (3.12) (push) Failing after 4s
Test Llama Stack Build / build (push) Failing after 4s
Unit Tests / unit-tests (3.13) (push) Failing after 4s
Test Llama Stack Build / build-single-provider (push) Failing after 36s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 34s
Test External Providers / test-external-providers (venv) (push) Failing after 32s
Pre-commit / pre-commit (push) Successful in 1m21s
This fixes CVE-2025-50181 and CVE-2025-50182. Changes via: ``` uv sync --upgrade-package urllib3 uv export --frozen --no-hashes --no-emit-project --no-default-groups --output-file=requirements.txt ``` Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> |
||
|
dbdc811d16
|
chore: isolate bare minimum project dependencies (#2282)
Some checks failed
Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 12s
Integration Tests / test-matrix (http, 3.12, datasets) (push) Failing after 20s
Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 14s
Integration Tests / test-matrix (library, 3.13, datasets) (push) Failing after 7s
Test Llama Stack Build / generate-matrix (push) Successful in 7s
Integration Tests / test-matrix (http, 3.13, scoring) (push) Failing after 16s
Integration Tests / test-matrix (library, 3.12, providers) (push) Failing after 16s
Integration Tests / test-matrix (http, 3.12, tool_runtime) (push) Failing after 18s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 8s
Python Package Build Test / build (3.12) (push) Failing after 5s
Integration Tests / test-matrix (library, 3.12, vector_io) (push) Failing after 17s
Python Package Build Test / build (3.13) (push) Failing after 4s
Test Llama Stack Build / build-single-provider (push) Failing after 8s
Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 11s
Integration Tests / test-matrix (http, 3.12, inference) (push) Failing after 26s
Integration Tests / test-matrix (http, 3.12, scoring) (push) Failing after 19s
Integration Tests / test-matrix (http, 3.13, vector_io) (push) Failing after 15s
Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 11s
Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 8s
Test External Providers / test-external-providers (venv) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 10s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 12s
Unit Tests / unit-tests (3.12) (push) Failing after 7s
Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 10s
Unit Tests / unit-tests (3.13) (push) Failing after 6s
Update ReadTheDocs / update-readthedocs (push) Failing after 4s
Test Llama Stack Build / build (push) Failing after 7s
Pre-commit / pre-commit (push) Successful in 48s
# What does this PR do? The goal is to promote the minimal set of dependencies the project needs to run, this includes: * dependencies needed to work with the CLI * dependencies needed for the server to run with no providers This also: * Relocate redundant dependencies out of the core project and into the individual providers that actually require them. * Include all necessary server dependencies so the project can run standalone, even without any providers. <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan Build and run distro a server. Signed-off-by: Sébastien Han <seb@redhat.com> |
||
|
9c8be89fb6
|
chore: bump python supported version to 3.12 (#2475)
Some checks failed
Integration Tests / test-matrix (library, 3.12, providers) (push) Failing after 7s
Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 12s
Integration Tests / test-matrix (library, 3.12, datasets) (push) Failing after 16s
Test Llama Stack Build / build-single-provider (push) Failing after 9s
Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.12, agents) (push) Failing after 7s
Python Package Build Test / build (3.13) (push) Failing after 5s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 7s
Integration Tests / test-matrix (http, 3.13, datasets) (push) Failing after 14s
Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 15s
Integration Tests / test-matrix (library, 3.13, agents) (push) Failing after 14s
Integration Tests / test-matrix (library, 3.13, datasets) (push) Failing after 11s
Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 11s
Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 12s
Integration Tests / test-matrix (library, 3.12, inference) (push) Failing after 12s
Integration Tests / test-matrix (http, 3.13, providers) (push) Failing after 13s
Integration Tests / test-matrix (library, 3.12, vector_io) (push) Failing after 14s
Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 7s
Integration Tests / test-matrix (library, 3.12, post_training) (push) Failing after 11s
Unit Tests / unit-tests (3.12) (push) Failing after 7s
Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 6s
Update ReadTheDocs / update-readthedocs (push) Failing after 5s
Unit Tests / unit-tests (3.13) (push) Failing after 8s
Test Llama Stack Build / build (push) Failing after 6s
Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 41s
Python Package Build Test / build (3.12) (push) Failing after 33s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 36s
Test External Providers / test-external-providers (venv) (push) Failing after 31s
Pre-commit / pre-commit (push) Successful in 1m54s
# What does this PR do? The project now supports Python >= 3.12 Signed-off-by: Sébastien Han <seb@redhat.com> |
||
|
6fde601765
|
chore: upgrade hf hub dependency (#2487)
Some checks failed
Integration Tests / test-matrix (library, 3.11, datasets) (push) Failing after 11s
Integration Tests / test-matrix (library, 3.11, post_training) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.11, vector_io) (push) Failing after 6s
Integration Tests / test-matrix (library, 3.11, tool_runtime) (push) Failing after 7s
Integration Tests / test-matrix (library, 3.11, providers) (push) Failing after 9s
Integration Tests / test-matrix (library, 3.12, inference) (push) Failing after 7s
Integration Tests / test-matrix (library, 3.12, agents) (push) Failing after 9s
Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 8s
Integration Tests / test-matrix (http, 3.12, tool_runtime) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.12, post_training) (push) Failing after 8s
Test Llama Stack Build / generate-matrix (push) Successful in 7s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 6s
Integration Tests / test-matrix (library, 3.11, scoring) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.12, vector_io) (push) Failing after 9s
Python Package Build Test / build (3.11) (push) Failing after 2s
Integration Tests / test-matrix (library, 3.12, scoring) (push) Failing after 10s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 4s
Integration Tests / test-matrix (library, 3.12, datasets) (push) Failing after 9s
Python Package Build Test / build (3.13) (push) Failing after 2s
Integration Tests / test-matrix (library, 3.12, providers) (push) Failing after 8s
Python Package Build Test / build (3.12) (push) Failing after 4s
Unit Tests / unit-tests (3.12) (push) Failing after 5s
Test External Providers / test-external-providers (venv) (push) Failing after 8s
Unit Tests / unit-tests (3.13) (push) Failing after 6s
Update ReadTheDocs / update-readthedocs (push) Failing after 11s
Unit Tests / unit-tests (3.11) (push) Failing after 13s
Test Llama Stack Build / build (push) Failing after 8s
Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 33s
Test Llama Stack Build / build-single-provider (push) Failing after 31s
Pre-commit / pre-commit (push) Successful in 1m12s
# What does this PR do? CI tests have been failing with .venv/lib/python3.12/site-packages/peft/auto.py:21: in <module> from transformers import ( .venv/lib/python3.12/site-packages/transformers/__init__.py:27: in <module> from . import dependency_versions_check .venv/lib/python3.12/site-packages/transformers/dependency_versions_check.py:57: in <module> require_version_core(deps[pkg]) .venv/lib/python3.12/site-packages/transformers/utils/versions.py:117: in require_version_core return require_version(requirement, hint) .venv/lib/python3.12/site-packages/transformers/utils/versions.py:111: in require_version _compare_versions(op, got_ver, want_ver, requirement, pkg, hint) .venv/lib/python3.12/site-packages/transformers/utils/versions.py:44: in _compare_versions raise ImportError( E ImportError: huggingface-hub>=0.30.0,<1.0 is required for a normal functioning of this module, but found huggingface-hub==0.29.0. E Try: `pip install transformers -U` or `pip install -e '.[dev]'` if you're working with git main ------------------------------ Captured log setup ------------------------------ INFO llama_stack.providers.remote.inference.ollama.ollama:ollama.py:106 checking connectivity to Ollama at `http://0.0.0.0:11434`.../ =========================== short test summary info ============================ ERROR tests/integration/providers/test_providers.py::TestProviders::test_providers - ImportError: huggingface-hub>=0.30.0,<1.0 is required for a normal functioning of this module, but found huggingface-hub==0.29.0. Try: `pip install transformers -U` or `pip install -e '.[dev]'` if you're working with git main =================== 1 skipped, 4 warnings, 1 error in 9.52s ==================== ## Test Plan CI |
||
|
d70573bd47 | build: Bump version to 0.2.12 | ||
|
d12f195f56
|
feat: drop python 3.10 support (#2469)
# What does this PR do? dropped python3.10, updated pyproject and dependencies, and also removed some blocks of code with special handling for enum.StrEnum Closes #2458 Signed-off-by: Charlie Doern <cdoern@redhat.com> |
||
|
7d812e3bf0 |
build: Bump version to 0.2.11
Some checks failed
Integration Tests / test-matrix (library, 3.10, post_training) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.10, providers) (push) Failing after 12s
Integration Tests / test-matrix (library, 3.10, scoring) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.10, vector_io) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.10, tool_runtime) (push) Failing after 9s
Integration Tests / test-matrix (library, 3.11, inspect) (push) Failing after 4s
Integration Tests / test-matrix (library, 3.11, agents) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.11, scoring) (push) Failing after 4s
Integration Tests / test-matrix (library, 3.11, providers) (push) Failing after 5s
Integration Tests / test-matrix (library, 3.11, inference) (push) Failing after 7s
Integration Tests / test-matrix (library, 3.11, datasets) (push) Failing after 7s
Integration Tests / test-matrix (library, 3.11, tool_runtime) (push) Failing after 6s
Integration Tests / test-matrix (library, 3.11, vector_io) (push) Failing after 7s
Integration Tests / test-matrix (library, 3.12, agents) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.12, post_training) (push) Failing after 6s
Integration Tests / test-matrix (library, 3.11, post_training) (push) Failing after 9s
Integration Tests / test-matrix (library, 3.12, vector_io) (push) Failing after 6s
Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.12, providers) (push) Failing after 8s
Unit Tests / unit-tests (3.12) (push) Failing after 5s
Integration Tests / test-matrix (library, 3.12, datasets) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.12, scoring) (push) Failing after 8s
Test External Providers / test-external-providers (venv) (push) Failing after 7s
Integration Tests / test-matrix (library, 3.12, inference) (push) Failing after 10s
Unit Tests / unit-tests (3.10) (push) Failing after 7s
Update ReadTheDocs / update-readthedocs (push) Failing after 7s
Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 10s
Unit Tests / unit-tests (3.11) (push) Failing after 9s
Unit Tests / unit-tests (3.13) (push) Failing after 17s
Pre-commit / pre-commit (push) Successful in 55s
|
||
|
f6718b2408
|
fix(security): Upgrade requests to 2.32.4. Fixes CVE-2024-47081 (#2425)
# What does this PR do? This address https://github.com/advisories/GHSA-9hjg-9r4m-mvj7. Diff was generated via: ``` uv sync --upgrade-package requests uv export --frozen --no-hashes --no-emit-project --no-default-groups --output-file=requirements.txt ``` Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> |
||
|
a34cef925b
|
fix(faiss): handle case where distance is 0 by setting d to minimum positive… (#2387)
# What does this PR do? Adds try-catch to faiss `query_vector` function for when the distance between the query embedding and an embedding within the vector db is 0 (identical vectors). Catches `ZeroDivisionError` and then appends `(1.0 / sys.float_info.min)` to `scores` to represent maximum similarity. <!-- If resolving an issue, uncomment and update the line below --> Closes [#2381] ## Test Plan Checkout this PR Execute this code and there will no longer be a `ZeroDivisionError` exception ``` from llama_stack_client import LlamaStackClient base_url = "http://localhost:8321" client = LlamaStackClient(base_url=base_url) models = client.models.list() embedding_model = ( em := next(m for m in models if m.model_type == "embedding") ).identifier embedding_dimension = 384 _ = client.vector_dbs.register( vector_db_id="foo_db", embedding_model=embedding_model, embedding_dimension=embedding_dimension, provider_id="faiss", ) chunk = { "content": "foo", "mime_type": "text/plain", "metadata": { "document_id": "foo-id" } } client.vector_io.insert(vector_db_id="foo_db", chunks=[chunk]) client.vector_io.query(vector_db_id="foo_db", query="foo") ``` ### Running unit tests `uv run pytest tests/unit/rag/test_rag_query.py -v` --------- Signed-off-by: Ben Browning <bbrownin@redhat.com> Co-authored-by: Ben Browning <bbrownin@redhat.com> |
||
|
692709cd45 |
build: Bump version to 0.2.10
Some checks failed
Integration Tests / test-matrix (library, 3.10, tool_runtime) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.10, scoring) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.11, agents) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.11, datasets) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.11, inspect) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.11, post_training) (push) Failing after 9s
Integration Tests / test-matrix (library, 3.11, inference) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.11, providers) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.11, scoring) (push) Failing after 9s
Integration Tests / test-matrix (library, 3.12, agents) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.11, tool_runtime) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.12, datasets) (push) Failing after 9s
Integration Tests / test-matrix (library, 3.12, inference) (push) Failing after 9s
Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.12, post_training) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.12, providers) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.12, scoring) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 9s
Test Llama Stack Build / generate-matrix (push) Successful in 6s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 7s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 6s
Test External Providers / test-external-providers (venv) (push) Failing after 7s
Unit Tests / unit-tests (3.10) (push) Failing after 8s
Unit Tests / unit-tests (3.11) (push) Failing after 7s
Unit Tests / unit-tests (3.12) (push) Failing after 7s
Update ReadTheDocs / update-readthedocs (push) Failing after 6s
Unit Tests / unit-tests (3.13) (push) Failing after 9s
Test Llama Stack Build / build-single-provider (push) Failing after 27s
Test Llama Stack Build / build (push) Failing after 7s
Pre-commit / pre-commit (push) Failing after 1m16s
|
||
|
102516f33c
|
fix: Pin fastapi to avoid picking up spurious versions in test pypi (#2409)
as titled |
||
|
04592b9590
|
fix: update pyproject to include recursive LS deps (#2404)
trying to run `llama` cli after installing wheel fails with this error ``` Traceback (most recent call last): File "/tmp/tmp.wdZath9U6j/.venv/bin/llama", line 4, in <module> from llama_stack.cli.llama import main File "/tmp/tmp.wdZath9U6j/.venv/lib/python3.10/site-packages/llama_stack/__init__.py", line 7, in <module> from llama_stack.distribution.library_client import ( # noqa: F401 ModuleNotFoundError: No module named 'llama_stack.distribution.library_client' ``` This PR fixes it by ensurring that all sub-directories of `llama_stack` are also included. Also, fixes the missing `fastapi` dependency issue. |
||
|
3c9a10d2fe
|
feat: reference implementation for files API (#2330)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 2s
Integration Tests / test-matrix (http, post_training) (push) Failing after 9s
Integration Tests / test-matrix (http, agents) (push) Failing after 10s
Integration Tests / test-matrix (http, providers) (push) Failing after 8s
Integration Tests / test-matrix (http, inference) (push) Failing after 11s
Integration Tests / test-matrix (http, inspect) (push) Failing after 10s
Integration Tests / test-matrix (http, datasets) (push) Failing after 11s
Integration Tests / test-matrix (library, datasets) (push) Failing after 8s
Integration Tests / test-matrix (http, scoring) (push) Failing after 10s
Integration Tests / test-matrix (library, inference) (push) Failing after 8s
Integration Tests / test-matrix (library, agents) (push) Failing after 10s
Integration Tests / test-matrix (http, tool_runtime) (push) Failing after 11s
Integration Tests / test-matrix (library, inspect) (push) Failing after 8s
Test External Providers / test-external-providers (venv) (push) Failing after 7s
Integration Tests / test-matrix (library, post_training) (push) Failing after 9s
Integration Tests / test-matrix (library, scoring) (push) Failing after 8s
Integration Tests / test-matrix (library, tool_runtime) (push) Failing after 8s
Integration Tests / test-matrix (library, providers) (push) Failing after 9s
Unit Tests / unit-tests (3.11) (push) Failing after 7s
Unit Tests / unit-tests (3.10) (push) Failing after 7s
Unit Tests / unit-tests (3.12) (push) Failing after 8s
Unit Tests / unit-tests (3.13) (push) Failing after 8s
Update ReadTheDocs / update-readthedocs (push) Failing after 6s
Pre-commit / pre-commit (push) Successful in 53s
# What does this PR do? TSIA Added Files provider to the fireworks template. Might want to add to all templates as a follow-up. ## Test Plan llama-stack pytest tests/unit/files/test_files.py llama-stack llama stack build --template fireworks --image-type conda --run LLAMA_STACK_CONFIG=http://localhost:8321 pytest -s -v tests/integration/files/ |
||
|
ad15276da1 |
build: Bump version to 0.2.9
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 4s
Integration Tests / test-matrix (http, inspect) (push) Failing after 9s
Integration Tests / test-matrix (http, providers) (push) Failing after 9s
Integration Tests / test-matrix (http, agents) (push) Failing after 10s
Integration Tests / test-matrix (library, agents) (push) Failing after 8s
Integration Tests / test-matrix (http, scoring) (push) Failing after 9s
Integration Tests / test-matrix (http, datasets) (push) Failing after 10s
Integration Tests / test-matrix (http, post_training) (push) Failing after 10s
Integration Tests / test-matrix (http, inference) (push) Failing after 11s
Integration Tests / test-matrix (library, inference) (push) Failing after 8s
Integration Tests / test-matrix (http, tool_runtime) (push) Failing after 10s
Integration Tests / test-matrix (library, datasets) (push) Failing after 10s
Test External Providers / test-external-providers (venv) (push) Failing after 5s
Integration Tests / test-matrix (library, inspect) (push) Failing after 7s
Integration Tests / test-matrix (library, post_training) (push) Failing after 8s
Unit Tests / unit-tests (3.10) (push) Failing after 7s
Integration Tests / test-matrix (library, providers) (push) Failing after 9s
Integration Tests / test-matrix (library, scoring) (push) Failing after 9s
Unit Tests / unit-tests (3.11) (push) Failing after 8s
Integration Tests / test-matrix (library, tool_runtime) (push) Failing after 9s
Unit Tests / unit-tests (3.12) (push) Failing after 8s
Update ReadTheDocs / update-readthedocs (push) Failing after 6s
Unit Tests / unit-tests (3.13) (push) Failing after 10s
Pre-commit / pre-commit (push) Failing after 1m34s
|
||
|
63a9f08c9e
|
chore: use starlette built-in Route class (#2267)
# What does this PR do? Use a more common pattern and known terminology from the ecosystem, where Route is more approved than Endpoint. Signed-off-by: Sébastien Han <seb@redhat.com> |
||
|
4f3f28f718
|
chore: use dependency-groups for dev (#2287)
# What does this PR do? The previous `[project.optional-dependencies]` was misrepresenting what the packages were. They were NOT optional dependencies to the project but development dependencies. Unlike optional dependencies, development dependencies are local-only and will not be included in the project requirements when published to PyPI or other indexes. As such, development dependencies are not included in the [project] table. Additionally, the dev group is synced by default. Source: https://docs.astral.sh/uv/concepts/projects/dependencies/#development-dependencies Signed-off-by: Sébastien Han <seb@redhat.com> |
||
|
7105a25b0f | build: Bump version to 0.2.8 | ||
|
448f00903d
|
chore: mark blobpath as optional (#2271)
# What does this PR do? This is not a core dependency of the distro server. It's only necessary when using `inline::rag-runtime` or `inline::meta-reference` providers. Signed-off-by: Sébastien Han <seb@redhat.com> |
||
|
3faf1e4a79
|
feat: enable MCP execution in Responses impl (#2240)
## Test Plan ``` pytest -s -v 'tests/verifications/openai_api/test_responses.py' \ --provider=stack:together --model meta-llama/Llama-4-Scout-17B-16E-Instruct ``` |