llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-18 03:27:19 +00:00

Author	SHA1	Message	Date
Ashwin Bharambe	c01cada741	disable flaky test	2025-08-18 16:57:01 -07:00
Ashwin Bharambe	b5af8ac901	try once more	2025-08-18 16:52:40 -07:00
Ashwin Bharambe	707ff21c84	cleanup	2025-08-18 16:52:40 -07:00
Ashwin Bharambe	6b9d14042e	yes4	2025-08-18 16:52:40 -07:00
Ashwin Bharambe	3114e188a8	yes3	2025-08-18 16:52:40 -07:00
Ashwin Bharambe	be40df8a1b	yes3	2025-08-18 16:52:40 -07:00
Ashwin Bharambe	f550d921be	yes2	2025-08-18 16:52:40 -07:00
Ashwin Bharambe	0d25c62ab2	yes	2025-08-18 16:52:40 -07:00
Ashwin Bharambe	0dd4a72bbc	fix	2025-08-18 16:52:40 -07:00
Ashwin Bharambe	780f1c41f2	.	2025-08-18 16:52:40 -07:00
Ashwin Bharambe	8ce9b3b53d	cant make git editable doofus	2025-08-18 16:52:40 -07:00
Ashwin Bharambe	bc0ec3ef25	fix	2025-08-18 16:52:40 -07:00
Ashwin Bharambe	7c893f85b0	we must run llama stack build properly	2025-08-18 16:52:40 -07:00
Ashwin Bharambe	9517eef70c	.	2025-08-18 16:52:40 -07:00
Ashwin Bharambe	a4e53cc4f3	stop auto-activating environment	2025-08-18 16:52:40 -07:00
Ashwin Bharambe	9c9f83540f	make a separate venv	2025-08-18 16:52:40 -07:00
Ashwin Bharambe	aeb4d0ee62	more debug	2025-08-18 16:52:40 -07:00
Ashwin Bharambe	1a01ce6a2c	debug and fix	2025-08-18 16:52:40 -07:00
Ashwin Bharambe	3bfbc212fa	add debug	2025-08-18 16:52:40 -07:00
Ashwin Bharambe	a01c2ae583	fix(test): fix an age old test and record	2025-08-18 16:52:40 -07:00
Francisco Arceo	ac78e9f66a	chore: Adding UI unit tests in CI (#3191 )	2025-08-18 16:48:21 -06:00
Ashwin Bharambe	89661b984c	revert: "feat(cli): make venv the default image type" (#3196 ) Reverts llamastack/llama-stack#3187	2025-08-18 15:31:01 -07:00
Ashwin Bharambe	2e7ca07423	feat(cli): make venv the default image type (#3187 ) We have removed conda now so we can make `venv` the default. Just doing `llama stack build --distro starter` is now enough for the most part.	2025-08-18 14:58:23 -07:00
slekkala1	7519ab4024	feat: Code scanner Provider impl for moderations api (#3100 ) # What does this PR do? Add CodeScanner implementations ## Test Plan `SAFETY_MODEL=CodeScanner LLAMA_STACK_CONFIG=starter uv run pytest -v tests/integration/safety/test_safety.py --text-model=llama3.2:3b-instruct-fp16 --embedding-model=all-MiniLM-L6-v2 --safety-shield=ollama` This PR needs to land after https://github.com/meta-llama/llama-stack/pull/3098	2025-08-18 14:15:40 -07:00
Ashwin Bharambe	27d6becfd0	fix(misc): pin openai dependency to < 1.100.0 (#3192 ) This OpenAI client release `0843a11164` ends up breaking litellm `169a17400f/litellm/types/llms/openai.py (L40)` Update the dependency pin. Also make the imports a bit more defensive anyhow if something else during `llama stack build` ends up moving openai to a previous version. ## Test Plan Run pre-release script integration tests.	2025-08-18 12:20:50 -07:00
IAN MILLER	f8398d25ff	fix: kill build_conda_env.sh (#3190 ) # What does this PR do? I noticed somehow [build_conda_env.sh](https://github.com/llamastack/llama-stack/blob/main/llama_stack/core/build_conda_env.sh) exists in main branch. We need to kill it to be consistent with [#2969](https://github.com/llamastack/llama-stack/pull/2969)	2025-08-18 12:17:44 -07:00
Maor Friedman	739b18edf8	feat: add support for postgres ssl mode and root cert (#3182 ) this PR adds support for configuring `sslmode` and `sslrootcert` when initiating the psycopg2 connection. closes #3181	2025-08-18 10:24:24 -07:00
Francisco Arceo	fa431e15e0	chore: Update TRIAGERS.md (#3186 ) # What does this PR do? Update triagers to current state	2025-08-18 10:23:51 -07:00
Charlie Doern	4ae39b94ff	fix: remove category prints (#3189 ) # What does this PR do? commands where the output is important like `llama stack build --print-deps-only` (soon to be `llama stack show`) print some log.py `cprint`'s on _every_ execution of the CLI for example: [image: Screenshot 2025-08-18 at 1 16 30 PM] the yellow text is likely unnecessary. Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-08-18 10:23:23 -07:00
Ashwin Bharambe	f4cecaade9	chore(ci): dont run llama stack server always (#3188 ) Sometimes the server has already been started (e.g., via docker). Just a convenience here so we can reuse this script more.	2025-08-18 10:11:55 -07:00
Francisco Arceo	a8091d0c6a	chore: Update benchmarking location in contributing docs (#3180 ) # What does this PR do? Small docs change as requested in https://github.com/llamastack/llama-stack/pull/3160#pullrequestreview-3125038932	2025-08-18 08:04:21 -04:00
Ashwin Bharambe	5e7c2250be	test(recording): add a script to schedule recording workflow (#3170 ) See comment here: https://github.com/llamastack/llama-stack/pull/3162#issuecomment-3192859097 -- TL;DR it is quite complex to invoke the recording workflow correctly for an end developer writing tests. This script simplifies the work. No more manual GitHub UI navigation! ## Script Functionality - Auto-detects your current branch and associated PR - Finds the right repository context (works from forks!) 
- Runs the workflow where it can actually commit back - Validates prerequisites and provides helpful error messages ## How to Use First ensure you are on the branch which introduced a new test and want it recorded. Make sure you have pushed this branch remotely, easiest is to create a PR. ``` # Record tests for current branch ./scripts/github/schedule-record-workflow.sh # Record specific test subdirectories ./scripts/github/schedule-record-workflow.sh --test-subdirs "agents,inference" # Record with vision tests enabled ./scripts/github/schedule-record-workflow.sh --run-vision-tests # Record tests matching a pattern ./scripts/github/schedule-record-workflow.sh --test-pattern "test_streaming" ``` ## Test Plan Ran `./scripts/github/schedule-record-workflow.sh -s inference -k tool_choice` which started `4820409329` which successfully committed recorded outputs.	2025-08-15 16:54:34 -07:00
Matthew Farrellee	914c7be288	feat: add batches API with OpenAI compatibility (with inference replay) (#3162 ) Add complete batches API implementation with protocol, providers, and tests: Core Infrastructure: - Add batches API protocol using OpenAI Batch types directly - Add Api.batches enum value and protocol mapping in resolver - Add OpenAI "batch" file purpose support - Include proper error handling (ConflictError, ResourceNotFoundError) Reference Provider: - Add ReferenceBatchesImpl with full CRUD operations (create, retrieve, cancel, list) - Implement background batch processing with configurable concurrency - Add SQLite KVStore backend for persistence - Support /v1/chat/completions endpoint with request validation Comprehensive Test Suite: - Add unit tests for provider implementation with validation - Add integration tests for end-to-end batch processing workflows - Add error handling tests for validation, malformed inputs, and edge cases Configuration: - Add max_concurrent_batches and max_concurrent_requests_per_batch options - Add provider documentation with sample configurations Test with - ``` $ uv run llama stack build --image-type venv --providers inference=YOU_PICK,files=inline::localfs,batches=inline::reference --run & $ LLAMA_STACK_CONFIG=http://localhost:8321 uv run pytest tests/unit/providers/batches tests/integration/batches --text-model YOU_PICK ``` addresses #3066 --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-08-15 15:34:15 -07:00
Ashwin Bharambe	f4ccdee200	fix(ci): skip batches directory for library client testing	2025-08-15 15:30:03 -07:00
Ashwin Bharambe	0e8bb94bf3	feat(ci): make recording workflow simpler, more parameterizable (#3169 ) # What does this PR do? Recording tests has become a nightmare. This is the first part of making that process simpler by making it _less_ automatic. I tried to be too clever earlier. It simplifies the record-integration-tests workflow to use workflow dispatch inputs instead of PR labels. No more opaque stuff. Just go to the GitHub UI and run the workflow with inputs. I will soon add a helper script for this also. Other things to aid re-running just the small set of things you need to re-record: - Replaces the `test-types` JSON array parameter with a more intuitive `test-subdirs` comma-separated list. The whole JSON array crap was for matrix. - Adds a new `test-pattern` parameter to allow filtering tests using pytest's `-k` option ## Test Plan Note that this PR is in a fork not the source repository. 
- Replay tests on this PR are green - Manually [ran](`1699856292`) the replay workflow with a test-subdir and test-pattern filter, worked - Manually [ran](`4819508034`) the record workflow with a simple pattern, it has worked and updated _this_ PR. --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-08-15 14:47:20 -07:00
Ashwin Bharambe	a6e2c18909	Revert "refactor(agents): migrate to OpenAI chat completions API" (#3167 ) Reverts llamastack/llama-stack#3097 It has broken agents tests.	2025-08-15 12:01:07 -07:00
ehhuang	2c06b24c77	test: benchmark scripts (#3160 ) # What does this PR do? 1. Add our own benchmark script instead of locust (doesn't support measuring streaming latency well) 2. Simplify k8s deployment 3. Add a simple profile script for locally running server ## Test Plan ❮ ./run-benchmark.sh --target stack --duration 180 --concurrent 10 ============================================================ BENCHMARK RESULTS ============================================================ Total time: 180.00s Concurrent users: 10 Total requests: 1636 Successful requests: 1636 Failed requests: 0 Success rate: 100.0% Requests per second: 9.09 Response Time Statistics: Mean: 1.095s Median: 1.721s Min: 0.136s Max: 3.218s Std Dev: 0.762s Percentiles: P50: 1.721s P90: 1.751s P95: 1.756s P99: 1.796s Time to First Token (TTFT) Statistics: Mean: 0.037s Median: 0.037s Min: 0.023s Max: 0.211s Std Dev: 0.011s TTFT Percentiles: P50: 0.037s P90: 0.040s P95: 0.044s P99: 0.055s Streaming Statistics: Mean chunks per response: 64.0 Total chunks received: 104775	2025-08-15 11:24:29 -07:00
dependabot[bot]	2114214fe3	chore(python-deps): bump huggingface-hub from 0.34.3 to 0.34.4 (#3084 ) Bumps [huggingface-hub](https://github.com/huggingface/huggingface_hub) from 0.34.3 to 0.34.4. Release notes (sourced from huggingface-hub's releases at https://github.com/huggingface/huggingface_hub/releases): [v0.34.4] Support Image to Video inference + QoL in jobs API, auth and utilities. Biggest update is the support of Image-To-Video task with inference provider Fal AI - [Inference] Support image to video task #3289 by @hanouticelina ``` >>> from huggingface_hub import InferenceClient >>> client = InferenceClient() >>> video = client.image_to_video("cat.jpg", model="Wan-AI/Wan2.2-I2V-A14B", prompt="turn the cat into a tiger") >>> with open("tiger.mp4", "wb") as f: ... f.write(video) ``` And some quality of life improvements: - Add type to job owner #3291 by @drbh - Include HF_HUB_DISABLE_XET in the environment dump #3290 by @hanouticelina - Whoami: custom message only on unauthorized #3288 by @Wauplin - Add validation warnings for repository limits in upload_large_folder #3280 by @davanstrien - Add timeout info to Jobs guide docs #3281 by @davanstrien - [Jobs] Use current or stored token in a Job secrets #3272 by @lhoestq - Fix bash history expansion in hf jobs example #3277 by @nyuuzyou Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.34.3...v0.34.4 Commits: - 84a92a9 Release: v0.34.4 - 6196ac2 Add type to job owner (#3291) - 4f6975f Include `HF_HUB_DISABLE_XET` in the environment dump (#3290) - 3720a50 [Inference] Support image to video task (#3289) - bb5e4c7 Whoami: custom message only on unauthorized (#3288) - a725256 Add validation warnings for repository limits in upload_large_folder (#3280) - a181b0f Add timeout info to Jobs guide docs (#3281) - 4d38925 [Jobs] Use current or stored token in a Job secrets (#3272) - 1580ce1 Fix bash history expansion in hf jobs example (#3277) - See full diff in compare view: https://github.com/huggingface/huggingface_hub/compare/v0.34.3...v0.34.4 Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-08-15 10:55:43 -07:00
dependabot[bot]	a275282685	chore(python-deps): bump pymilvus from 2.5.14 to 2.6.0 (#3086 ) Bumps [pymilvus](https://github.com/milvus-io/pymilvus) from 2.5.14 to 2.6.0. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/milvus-io/pymilvus/releases">pymilvus's releases</a>.</em></p> <blockquote> <h2>PyMilvus v2.6.0 Release Notes</h2> <h2>New Features</h2> <ol> <li>Add APIs in MilvusClient</li> </ol> <ul> <li>enhance: add describe and alter database in MilvusClient by <a href="https://github.com/smellthemoon"><code>@smellthemoon</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2433">milvus-io/pymilvus#2433</a></li> <li>enhance: support milvus-client iterator by <a href="https://github.com/MrPresent-Han"><code>@MrPresent-Han</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2461">milvus-io/pymilvus#2461</a></li> <li>enhance: Enable resource group api in milvus client by <a href="https://github.com/weiliu1031"><code>@weiliu1031</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2513">milvus-io/pymilvus#2513</a></li> <li>enhance: add release_collection, drop_index, create_partition, drop_partition, load_partition and release_partition by <a href="https://github.com/brcarry"><code>@brcarry</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2525">milvus-io/pymilvus#2525</a></li> <li>enhance: enable describe_replica api in milvus client by <a href="https://github.com/weiliu1031"><code>@weiliu1031</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2541">milvus-io/pymilvus#2541</a></li> <li>enhance: support recalls for milvus_client by <a href="https://github.com/chasingegg"><code>@chasingegg</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2552">milvus-io/pymilvus#2552</a></li> <li>enhance: add use_database by <a href="https://github.com/czs007"><code>@czs007</code></a> in <a 
href="https://redirect.github.com/milvus-io/pymilvus/pull/2491">milvus-io/pymilvus#2491</a></li> </ul> <ol start="2"> <li>Add AsyncMilvusClient</li> </ol> <ul> <li>[FEAT] Asyncio support by <a href="https://github.com/brcarry"><code>@brcarry</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2411">milvus-io/pymilvus#2411</a></li> <li>Add async DDL funcs & DDL examples by <a href="https://github.com/Shawnzheng011019"><code>@Shawnzheng011019</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2852">milvus-io/pymilvus#2852</a></li> </ul> <ol start="3"> <li>Other features</li> </ol> <ul> <li>enhance: support Int8Vector by <a href="https://github.com/cydrain"><code>@cydrain</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2611">milvus-io/pymilvus#2611</a></li> <li>feat: support recalls field in SearchResult by <a href="https://github.com/chasingegg"><code>@chasingegg</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2390">milvus-io/pymilvus#2390</a></li> <li>enhance: Support Python3.13 and upgrade grpcio range by <a href="https://github.com/XuanYang-cn"><code>@XuanYang-cn</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2684">milvus-io/pymilvus#2684</a></li> <li>enhance: support run analyzer return detail token by <a href="https://github.com/aoiasd"><code>@aoiasd</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2679">milvus-io/pymilvus#2679</a></li> <li>enhance: Add force_drop parameter to drop_role method for role deletion by <a href="https://github.com/SimFG"><code>@SimFG</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2705">milvus-io/pymilvus#2705</a></li> <li>enhance: add property func for AnalyzeToken by <a href="https://github.com/aoiasd"><code>@aoiasd</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2704">milvus-io/pymilvus#2704</a></li> <li>enhance: grant/revoke 
v2 optional db and collection params by <a href="https://github.com/shaoting-huang"><code>@shaoting-huang</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2386">milvus-io/pymilvus#2386</a></li> <li>extend unlimted offset for query iterator(<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2418">#2418</a>) by <a href="https://github.com/MrPresent-Han"><code>@MrPresent-Han</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2419">milvus-io/pymilvus#2419</a></li> <li>enhance: alterindex & altercollection supports altering properties by <a href="https://github.com/JsDove"><code>@JsDove</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2406">milvus-io/pymilvus#2406</a></li> <li>enhance: alterdatabase support delete property by <a href="https://github.com/JsDove"><code>@JsDove</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2435">milvus-io/pymilvus#2435</a></li> <li>enhance: support hints param by <a href="https://github.com/chasingegg"><code>@chasingegg</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2408">milvus-io/pymilvus#2408</a></li> <li>enhance: create database support properties by <a href="https://github.com/JsDove"><code>@JsDove</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2448">milvus-io/pymilvus#2448</a></li> <li>enhance: Add <code>db_name</code> parameter at <code>bulk_import</code> by <a href="https://github.com/counter2015"><code>@counter2015</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2446">milvus-io/pymilvus#2446</a></li> <li>enhance: add search iterator v2 by <a href="https://github.com/PwzXxm"><code>@PwzXxm</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2395">milvus-io/pymilvus#2395</a></li> <li>enhance: simplify the structure of search_params by <a href="https://github.com/smellthemoon"><code>@smellthemoon</code></a> in <a 
href="https://redirect.github.com/milvus-io/pymilvus/pull/2507">milvus-io/pymilvus#2507</a></li> <li>enhance: Remove long deprecated Milvus class by <a href="https://github.com/XuanYang-cn"><code>@XuanYang-cn</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2544">milvus-io/pymilvus#2544</a></li> <li>enhance: Use new model pkg by <a href="https://github.com/junjiejiangjjj"><code>@junjiejiangjjj</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2595">milvus-io/pymilvus#2595</a></li> <li>enhance: Add schema update time verification to insert and upsert to use cache by <a href="https://github.com/JsDove"><code>@JsDove</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2551">milvus-io/pymilvus#2551</a></li> <li>enhance: describecollection output add created_timestamp by <a href="https://github.com/JsDove"><code>@JsDove</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2618">milvus-io/pymilvus#2618</a></li> <li>feat: add external filter func for search iterator v2 by <a href="https://github.com/PwzXxm"><code>@PwzXxm</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2639">milvus-io/pymilvus#2639</a></li> <li>enhance: support run analyzer by <a href="https://github.com/aoiasd"><code>@aoiasd</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2622">milvus-io/pymilvus#2622</a></li> <li>weighted reranker to allow skip score normalization by <a href="https://github.com/zhengbuqian"><code>@zhengbuqian</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2708">milvus-io/pymilvus#2708</a></li> <li>enhance: Support AddCollectionField API by <a href="https://github.com/congqixia"><code>@congqixia</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2722">milvus-io/pymilvus#2722</a></li> <li>Add 1-Way and 2-Way TLS Support to Bulk Import Functions by <a 
href="https://github.com/abd-770"><code>@abd-770</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2672">milvus-io/pymilvus#2672</a></li> <li>enhance: Use SearchResult in MilvusClient by <a href="https://github.com/XuanYang-cn"><code>@XuanYang-cn</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2735">milvus-io/pymilvus#2735</a></li> <li>Support rerank by <a href="https://github.com/junjiejiangjjj"><code>@junjiejiangjjj</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2729">milvus-io/pymilvus#2729</a></li> <li>feat: suppoprt multi analyzer params by <a href="https://github.com/aoiasd"><code>@aoiasd</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2747">milvus-io/pymilvus#2747</a></li> <li>Add funciton checker by <a href="https://github.com/junjiejiangjjj"><code>@junjiejiangjjj</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2760">milvus-io/pymilvus#2760</a></li> <li>enhance: Support run analyzer by collection and field by <a href="https://github.com/aoiasd"><code>@aoiasd</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2822">milvus-io/pymilvus#2822</a></li> <li>feat: support load collection/partition with priority(<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2835">#2835</a>) by <a href="https://github.com/MrPresent-Han"><code>@MrPresent-Han</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2836">milvus-io/pymilvus#2836</a></li> <li>enhance: optimize perf for large topk(<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2848">#2848</a>) by <a href="https://github.com/MrPresent-Han"><code>@MrPresent-Han</code></a> in <a href="https://redirect.github.com/milvus-io/pymilvus/pull/2849">milvus-io/pymilvus#2849</a></li> <li>enhance: Add usage guide to manage MilvusClient by <a href="https://github.com/XuanYang-cn"><code>@XuanYang-cn</code></a> in <a 
href="https://redirect.github.com/milvus-io/pymilvus/pull/2907">milvus-io/pymilvus#2907</a></li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="`1e56ce7d31`"><code>1e56ce7</code></a> enhance: Update milvus-proto and readme (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2921">#2921</a>)</li> <li><a href="`75052b1b7c`"><code>75052b1</code></a> enhance: Add usage guide to manage MilvusClient (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2907">#2907</a>)</li> <li><a href="`9f44053086`"><code>9f44053</code></a> add example code for language identifier and multi analyzer (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2919">#2919</a>)</li> <li><a href="`058836de26`"><code>058836d</code></a> fix: Return new pk value for upsert when autoid=true (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2914">#2914</a>)</li> <li><a href="`bbc6777565`"><code>bbc6777</code></a> [cherry-pick] Compatible with the default behavior of free on the cloud (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2913">#2913</a>)</li> <li><a href="`45080c39c5`"><code>45080c3</code></a> fix: Aviod coping functions when init CollectionSchema (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2902">#2902</a>)</li> <li><a href="`52b8461c5b`"><code>52b8461</code></a> [cherry-pick] bulk_import add stageName/dataPaths parameter (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2905">#2905</a>)</li> <li><a href="`a8c3120622`"><code>a8c3120</code></a> [cherry-pick] support stage (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2895">#2895</a>)</li> <li><a href="`3653effa88`"><code>3653eff</code></a> fix: Tidy alias configs when connect fails (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2900">#2900</a>)</li> <li><a href="`728791a7de`"><code>728791a</code></a> enhance: Store alias before 
wait for ready (<a href="https://redirect.github.com/milvus-io/pymilvus/issues/2894">#2894</a>)</li> <li>Additional commits viewable in <a href="https://github.com/milvus-io/pymilvus/compare/v2.5.14...v2.6.0">compare view</a></li> </ul> </details> <br /> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-15 10:54:09 -07:00
Aakanksha Duggal	e743d3fdf6	refactor(agents): migrate to OpenAI chat completions API (#3097 ) Replace chat_completion calls with openai_chat_completion to eliminate dependency on legacy inference APIs. Closes #3067	2025-08-15 10:51:41 -07:00
ashwinb	f66ae3b3b1	docs(tests): Add a bunch of documentation for our testing systems (#3139 ) # What does this PR do? Creates a structured testing documentation section with multiple detailed pages: - Testing overview explaining the record-replay architecture - Integration testing guide with practical usage examples - Record-replay system technical documentation - Guide for writing effective tests - Troubleshooting guide for common testing issues Hopefully this makes things a bit easier.	2025-08-15 17:45:30 +00:00
Ashwin Bharambe	81ecaf6221	fix(ci): make the Vector IO CI follow the same pattern as others (#3164 ) # What does this PR do? Updates the integration-vector-io-tests workflow to run daily tests on Python 3.13 while limiting regular PR tests to Python 3.12 only. The PR also improves the concurrency configuration to prevent workflow conflicts between main branch runs and PR runs. ## Test Plan [![testinprod](https://graphite-user-uploaded-assets-prod.s3.amazonaws.com/WjlTemxb6oA4PgZFmj08/2645295d-f421-49ae-8f3f-f4672d8204e2/testinprod.jpeg)](https://app.graphite.dev/settings/meme-library?org=llamastack)	2025-08-14 21:06:08 -07:00
ashwinb	01b2afd4b5	fix(tests): record missing tests for test_responses_store (#3163 ) # What does this PR do? Updates test recordings. ## Test Plan Started ollama serving the 3.2:3b model. Then ran the server: ``` LLAMA_STACK_TEST_INFERENCE_MODE=record \ LLAMA_STACK_TEST_RECORDING_DIR=tests/integration/recordings/ \ SQLITE_STORE_DIR=$(mktemp -d) \ OLLAMA_URL=http://localhost:11434 \ llama stack build --template starter --image-type venv --run ``` Then ran the tests which needed recording: ``` pytest -sv tests/integration/agents/test_openai_responses.py \ --stack-config=server:starter \ --text-model ollama/llama3.2:3b-instruct-fp16 -k test_responses_store ``` Then, restarted the server with `LLAMA_STACK_TEST_INFERENCE_MODE=replay`, re-ran the tests and verified they passed.	2025-08-15 03:52:45 +00:00
ashwinb	8ed69978f9	refactor(tests): make the responses tests nicer (#3161 ) # What does this PR do? A _bunch_ of cleanup for the Responses tests. - Got rid of YAML test cases, moved them to just use simple pydantic models - Split the large monolithic test file into multiple focused test files: - `test_basic_responses.py` for basic and image response tests - `test_tool_responses.py` for tool-related tests - `test_file_search.py` for file search specific tests - Added a `StreamingValidator` helper class to standardize streaming response validation ## Test Plan Run the tests: ``` pytest -s -v tests/integration/non_ci/responses/ \ --stack-config=starter \ --text-model openai/gpt-4o \ --embedding-model=sentence-transformers/all-MiniLM-L6-v2 \ -k "client_with_models" ```	2025-08-15 00:05:36 +00:00
ashwinb	ba664474de	feat(responses): add mcp list tool streaming event (#3159 ) # What does this PR do? Adds proper streaming events for MCP tool listing (`mcp_list_tools.in_progress` and `mcp_list_tools.completed`). Also refactors things a bit more. ## Test Plan Verified existing integration tests pass with the refactored code. The test `test_response_streaming_multi_turn_tool_execution` has been updated to check for the new MCP list tools streaming events	2025-08-15 00:05:36 +00:00
ashwinb	9324e902f1	refactor(responses): move stuff into some utils and add unit tests (#3158 ) # What does this PR do? Refactors the OpenAI response conversion utilities by moving helper functions from `openai_responses.py` to `utils.py`. Adds unit tests.	2025-08-15 00:05:36 +00:00
ashwinb	47d5af703c	chore(responses): Refactor Responses Impl to be civilized (#3138 ) # What does this PR do? Refactors the OpenAI responses implementation by extracting streaming and tool execution logic into separate modules. This improves code organization by: 1. Creating a new `StreamingResponseOrchestrator` class in `streaming.py` to handle the streaming response generation logic 2. Moving tool execution functionality to a dedicated `ToolExecutor` class in `tool_executor.py` ## Test Plan Existing tests	2025-08-15 00:05:35 +00:00
Francisco Arceo	e69acbafbf	feat(UI): Adding linter and prettier for UI (#3156 )	2025-08-14 15:58:43 -06:00
Ashwin Bharambe	61582f327c	fix(ci): update triggers for the workflows (#3152 )	2025-08-14 10:27:25 -07:00
Derek Higgins	c15cc7ed77	fix: use ChatCompletionMessageFunctionToolCall (#3142 ) The OpenAI compatibility layer was incorrectly importing ChatCompletionMessageToolCallParam instead of the ChatCompletionMessageFunctionToolCall class. This caused "Cannot instantiate typing.Union" errors when processing agent requests with tool calls. Closes: #3141 Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-08-14 10:27:00 -07:00
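The "Cannot instantiate typing.Union" failure described in c15cc7ed77 can be reproduced in isolation. The sketch below is only an illustration under stated assumptions: `FunctionToolCall`, `CustomToolCall`, and `ToolCallParam` are hypothetical stand-ins, not the actual OpenAI or llama-stack classes. It shows why calling a name that resolves to a `Union` alias fails, and why constructing the concrete class works.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical stand-ins for two concrete tool-call classes.
@dataclass
class FunctionToolCall:
    id: str
    name: str


@dataclass
class CustomToolCall:
    id: str
    input: str


# Hypothetical alias: a Union of the concrete classes. It is useful in type
# annotations, but it is not a class and cannot be called like one.
ToolCallParam = Union[FunctionToolCall, CustomToolCall]


def build_tool_call_wrong():
    # Calling the Union alias raises TypeError at runtime.
    return ToolCallParam(id="call_1", name="get_weather")


def build_tool_call_right():
    # Instantiate the concrete class instead of the alias.
    return FunctionToolCall(id="call_1", name="get_weather")


if __name__ == "__main__":
    try:
        build_tool_call_wrong()
    except TypeError as exc:
        print(f"wrong import fails: {exc}")
    print(build_tool_call_right())
```

The exact error text depends on the Python version, so the sketch only relies on a `TypeError` being raised.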


2518 commits