Ashwin Bharambe
c01cada741
disable flaky test
2025-08-18 16:57:01 -07:00
Ashwin Bharambe
b5af8ac901
try once more
2025-08-18 16:52:40 -07:00
Ashwin Bharambe
707ff21c84
cleanup
2025-08-18 16:52:40 -07:00
Ashwin Bharambe
6b9d14042e
yes4
2025-08-18 16:52:40 -07:00
Ashwin Bharambe
3114e188a8
yes3
2025-08-18 16:52:40 -07:00
Ashwin Bharambe
be40df8a1b
yes3
2025-08-18 16:52:40 -07:00
Ashwin Bharambe
f550d921be
yes2
2025-08-18 16:52:40 -07:00
Ashwin Bharambe
0d25c62ab2
yes
2025-08-18 16:52:40 -07:00
Ashwin Bharambe
0dd4a72bbc
fix
2025-08-18 16:52:40 -07:00
Ashwin Bharambe
780f1c41f2
.
2025-08-18 16:52:40 -07:00
Ashwin Bharambe
8ce9b3b53d
cant make git editable doofus
2025-08-18 16:52:40 -07:00
Ashwin Bharambe
bc0ec3ef25
fix
2025-08-18 16:52:40 -07:00
Ashwin Bharambe
7c893f85b0
we must run llama stack build properly
2025-08-18 16:52:40 -07:00
Ashwin Bharambe
9517eef70c
.
2025-08-18 16:52:40 -07:00
Ashwin Bharambe
a4e53cc4f3
stop auto-activating environment
2025-08-18 16:52:40 -07:00
Ashwin Bharambe
9c9f83540f
make a separate venv
2025-08-18 16:52:40 -07:00
Ashwin Bharambe
aeb4d0ee62
more debug
2025-08-18 16:52:40 -07:00
Ashwin Bharambe
1a01ce6a2c
debug and fix
2025-08-18 16:52:40 -07:00
Ashwin Bharambe
3bfbc212fa
add debug
2025-08-18 16:52:40 -07:00
Ashwin Bharambe
a01c2ae583
fix(test): fix an age old test and record
2025-08-18 16:52:40 -07:00
Francisco Arceo
ac78e9f66a
chore: Adding UI unit tests in CI ( #3191 )
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Test Llama Stack Build / generate-matrix (push) Successful in 6s
Python Package Build Test / build (3.12) (push) Failing after 9s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 12s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 14s
Unit Tests / unit-tests (3.12) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (push) Failing after 16s
Test Llama Stack Build / build-single-provider (push) Failing after 15s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 16s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 14s
Test External API and Providers / test-external (venv) (push) Failing after 14s
Test Llama Stack Build / build (push) Failing after 9s
Unit Tests / unit-tests (3.13) (push) Failing after 14s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 21s
Update ReadTheDocs / update-readthedocs (push) Failing after 1m2s
Python Package Build Test / build (3.13) (push) Failing after 1m4s
UI Tests / ui-tests (22) (push) Successful in 1m33s
Pre-commit / pre-commit (push) Successful in 2m38s
2025-08-18 16:48:21 -06:00
Ashwin Bharambe
89661b984c
revert: "feat(cli): make venv the default image type" ( #3196 )
...
Reverts llamastack/llama-stack#3187
2025-08-18 15:31:01 -07:00
Ashwin Bharambe
2e7ca07423
feat(cli): make venv the default image type ( #3187 )
...
We have removed conda now so we can make `venv` the default. Just doing
`llama stack build --distro starter` is now enough for the most part.
2025-08-18 14:58:23 -07:00
slekkala1
7519ab4024
feat: Code scanner Provider impl for moderations api ( #3100 )
...
# What does this PR do?
Add CodeScanner implementations
## Test Plan
`SAFETY_MODEL=CodeScanner LLAMA_STACK_CONFIG=starter uv run pytest -v
tests/integration/safety/test_safety.py
--text-model=llama3.2:3b-instruct-fp16
--embedding-model=all-MiniLM-L6-v2 --safety-shield=ollama`
This PR need to land after this
https://github.com/meta-llama/llama-stack/pull/3098
2025-08-18 14:15:40 -07:00
Ashwin Bharambe
27d6becfd0
fix(misc): pin openai dependency to < 1.100.0 ( #3192 )
...
This OpenAI client release
0843a11164
ends up breaking litellm
169a17400f/litellm/types/llms/openai.py (L40)
Update the dependency pin. Also make the imports a bit more defensive
anyhow if something else during `llama stack build` ends up moving
openai to a previous version.
## Test Plan
Run pre-release script integration tests.
2025-08-18 12:20:50 -07:00
IAN MILLER
f8398d25ff
fix: kill build_conda_env.sh ( #3190 )
...
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
I noticed somehow
[build_conda_env.sh](https://github.com/llamastack/llama-stack/blob/main/llama_stack/core/build_conda_env.sh )
exists in main branch. We need to kill it to be consistent with
[#2969 ](https://github.com/llamastack/llama-stack/pull/2969 )
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
2025-08-18 12:17:44 -07:00
Maor Friedman
739b18edf8
feat: add support for postgres ssl mode and root cert ( #3182 )
...
this PR adds support for configuring `sslmode` and `sslrootcert` when
initiating the psycopg2 connection.
closes #3181
2025-08-18 10:24:24 -07:00
Francisco Arceo
fa431e15e0
chore: Update TRIAGERS.md ( #3186 )
...
# What does this PR do?
Update triagers to current state
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
2025-08-18 10:23:51 -07:00
Charlie Doern
4ae39b94ff
fix: remove category prints ( #3189 )
...
# What does this PR do?
commands where the output is important like `llama stack build
--print-deps-only` (soon to be `llama stack show`) print some log.py
`cprint`'s on _every_ execution of the CLI
for example:
<img width="912" height="331" alt="Screenshot 2025-08-18 at 1 16 30 PM"
src="https://github.com/user-attachments/assets/e5bf18fb-74a1-438c-861a-8a26eea7d014 "
/>
the yellow text is likely unnecessary.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-08-18 10:23:23 -07:00
Ashwin Bharambe
f4cecaade9
chore(ci): dont run llama stack server always ( #3188 )
...
Sometimes the server has already been started (e.g., via docker). Just a
convenience here so we can reuse this script more.
2025-08-18 10:11:55 -07:00
Francisco Arceo
a8091d0c6a
chore: Update benchmarking location in contributing docs ( #3180 )
...
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 6s
Python Package Build Test / build (3.13) (push) Failing after 10s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 14s
Update ReadTheDocs / update-readthedocs (push) Failing after 10s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 15s
Test External API and Providers / test-external (venv) (push) Failing after 18s
Unit Tests / unit-tests (3.12) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (push) Failing after 19s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 24s
Python Package Build Test / build (3.12) (push) Failing after 22s
Unit Tests / unit-tests (3.13) (push) Failing after 57s
Pre-commit / pre-commit (push) Successful in 2m11s
# What does this PR do?
Small docs change as requested in
https://github.com/llamastack/llama-stack/pull/3160#pullrequestreview-3125038932
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
2025-08-18 08:04:21 -04:00
Ashwin Bharambe
5e7c2250be
test(recording): add a script to schedule recording workflow ( #3170 )
...
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 1s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 3s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 3s
Test Llama Stack Build / generate-matrix (push) Successful in 5s
Python Package Build Test / build (3.13) (push) Failing after 5s
Python Package Build Test / build (3.12) (push) Failing after 9s
Test Llama Stack Build / build-single-provider (push) Failing after 10s
Update ReadTheDocs / update-readthedocs (push) Failing after 10s
Vector IO Integration Tests / test-matrix (push) Failing after 14s
Unit Tests / unit-tests (3.13) (push) Failing after 10s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 14s
Test External API and Providers / test-external (venv) (push) Failing after 13s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 17s
Test Llama Stack Build / build (push) Failing after 9s
Unit Tests / unit-tests (3.12) (push) Failing after 14s
Pre-commit / pre-commit (push) Successful in 1m19s
See comment here:
https://github.com/llamastack/llama-stack/pull/3162#issuecomment-3192859097
-- TL;DR it is quite complex to invoke the recording workflow correctly
for an end developer writing tests. This script simplifies the work.
No more manual GitHub UI navigation!
## Script Functionality
- Auto-detects your current branch and associated PR
- Finds the right repository context (works from forks!)
- Runs the workflow where it can actually commit back
- Validates prerequisites and provides helpful error messages
## How to Use
First ensure you are on the branch which introduced a new test and want
it recorded. **Make sure you have pushed this branch remotely, easiest
is to create a PR.**
```
# Record tests for current branch
./scripts/github/schedule-record-workflow.sh
# Record specific test subdirectories
./scripts/github/schedule-record-workflow.sh --test-subdirs "agents,inference"
# Record with vision tests enabled
./scripts/github/schedule-record-workflow.sh --run-vision-tests
# Record tests matching a pattern
./scripts/github/schedule-record-workflow.sh --test-pattern "test_streaming"
```
## Test Plan
Ran `./scripts/github/schedule-record-workflow.sh -s inference -k
tool_choice` which started
4820409329
which successfully committed recorded outputs.
2025-08-15 16:54:34 -07:00
Matthew Farrellee
914c7be288
feat: add batches API with OpenAI compatibility (with inference replay) ( #3162 )
...
Add complete batches API implementation with protocol, providers, and
tests:
Core Infrastructure:
- Add batches API protocol using OpenAI Batch types directly
- Add Api.batches enum value and protocol mapping in resolver
- Add OpenAI "batch" file purpose support
- Include proper error handling (ConflictError, ResourceNotFoundError)
Reference Provider:
- Add ReferenceBatchesImpl with full CRUD operations (create, retrieve,
cancel, list)
- Implement background batch processing with configurable concurrency
- Add SQLite KVStore backend for persistence
- Support /v1/chat/completions endpoint with request validation
Comprehensive Test Suite:
- Add unit tests for provider implementation with validation
- Add integration tests for end-to-end batch processing workflows
- Add error handling tests for validation, malformed inputs, and edge
cases
Configuration:
- Add max_concurrent_batches and max_concurrent_requests_per_batch
options
- Add provider documentation with sample configurations
Test with -
```
$ uv run llama stack build --image-type venv --providers inference=YOU_PICK,files=inline::localfs,batches=inline::reference --run &
$ LLAMA_STACK_CONFIG=http://localhost:8321 uv run pytest tests/unit/providers/batches tests/integration/batches --text-model YOU_PICK
```
addresses #3066
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-08-15 15:34:15 -07:00
Ashwin Bharambe
f4ccdee200
fix(ci): skip batches directory for library client testing
2025-08-15 15:30:03 -07:00
Ashwin Bharambe
0e8bb94bf3
feat(ci): make recording workflow simpler, more parameterizable ( #3169 )
...
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Python Package Build Test / build (3.13) (push) Failing after 4s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 7s
Python Package Build Test / build (3.12) (push) Failing after 12s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 14s
Update ReadTheDocs / update-readthedocs (push) Failing after 12s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 17s
Test External API and Providers / test-external (venv) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (push) Failing after 28s
Unit Tests / unit-tests (3.12) (push) Failing after 27s
Unit Tests / unit-tests (3.13) (push) Failing after 51s
Pre-commit / pre-commit (push) Successful in 2m6s
# What does this PR do?
Recording tests has become a nightmare. This is the first part of making
that process simpler by making it _less_ automatic. I tried to be too
clever earlier.
It simplifies the record-integration-tests workflow to use workflow
dispatch inputs instead of PR labels. No more opaque stuff. Just go to
the GitHub UI and run the workflow with inputs. I will soon add a helper
script for this also.
Other things to aid re-running just the small set of things you need to
re-record:
- Replaces the `test-types` JSON array parameter with a more intuitive
`test-subdirs` comma-separated list. The whole JSON array crap was for
matrix.
- Adds a new `test-pattern` parameter to allow filtering tests using
pytest's `-k` option
## Test Plan
Note that this PR is in a fork not the source repository.
- Replay tests on this PR are green
- Manually
[ran](1699856292 )
the replay workflow with a test-subdir and test-pattern filter, worked
- Manually
[ran](4819508034 )
the **record** workflow with a simple pattern, it has worked and updated
_this_ PR.
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-08-15 14:47:20 -07:00
Ashwin Bharambe
a6e2c18909
Revert "refactor(agents): migrate to OpenAI chat completions API" ( #3167 )
...
Reverts llamastack/llama-stack#3097
It has broken agents tests.
2025-08-15 12:01:07 -07:00
ehhuang
2c06b24c77
test: benchmark scripts ( #3160 )
...
# What does this PR do?
1. Add our own benchmark script instead of locust (doesn't support
measuring streaming latency well)
2. Simplify k8s deployment
3. Add a simple profile script for locally running server
## Test Plan
❮ ./run-benchmark.sh --target stack --duration 180 --concurrent 10
============================================================
BENCHMARK RESULTS
============================================================
Total time: 180.00s
Concurrent users: 10
Total requests: 1636
Successful requests: 1636
Failed requests: 0
Success rate: 100.0%
Requests per second: 9.09
Response Time Statistics:
Mean: 1.095s
Median: 1.721s
Min: 0.136s
Max: 3.218s
Std Dev: 0.762s
Percentiles:
P50: 1.721s
P90: 1.751s
P95: 1.756s
P99: 1.796s
Time to First Token (TTFT) Statistics:
Mean: 0.037s
Median: 0.037s
Min: 0.023s
Max: 0.211s
Std Dev: 0.011s
TTFT Percentiles:
P50: 0.037s
P90: 0.040s
P95: 0.044s
P99: 0.055s
Streaming Statistics:
Mean chunks per response: 64.0
Total chunks received: 104775
2025-08-15 11:24:29 -07:00
dependabot[bot]
2114214fe3
chore(python-deps): bump huggingface-hub from 0.34.3 to 0.34.4 ( #3084 )
...
Bumps [huggingface-hub](https://github.com/huggingface/huggingface_hub )
from 0.34.3 to 0.34.4.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/huggingface/huggingface_hub/releases ">huggingface-hub's
releases</a>.</em></p>
<blockquote>
<h2>[v0.34.4] Support Image to Video inference + QoL in jobs API, auth
and utilities</h2>
<p>Biggest update is the support of Image-To-Video task with inference
provider Fal AI</p>
<ul>
<li>[Inference] Support image to video task <a
href="https://redirect.github.com/huggingface/huggingface_hub/issues/3289 ">#3289</a>
by <a
href="https://github.com/hanouticelina "><code>@hanouticelina</code></a></li>
</ul>
<pre lang="py"><code>>>> from huggingface_hub import
InferenceClient
>>> client = InferenceClient()
>>> video = client.image_to_video("cat.jpg",
model="Wan-AI/Wan2.2-I2V-A14B", prompt="turn the cat into
a tiger")
>>> with open("tiger.mp4", "wb") as f:
... f.write(video)
</code></pre>
<p>And some quality of life improvements:</p>
<ul>
<li>Add type to job owner <a
href="https://redirect.github.com/huggingface/huggingface_hub/issues/3291 ">#3291</a>
by <a href="https://github.com/drbh "><code>@drbh</code></a></li>
<li>Include HF_HUB_DISABLE_XET in the environment dump <a
href="https://redirect.github.com/huggingface/huggingface_hub/issues/3290 ">#3290</a>
by <a
href="https://github.com/hanouticelina "><code>@hanouticelina</code></a></li>
<li>Whoami: custom message only on unauthorized <a
href="https://redirect.github.com/huggingface/huggingface_hub/issues/3288 ">#3288</a>
by <a href="https://github.com/Wauplin "><code>@Wauplin</code></a></li>
<li>Add validation warnings for repository limits in upload_large_folder
<a
href="https://redirect.github.com/huggingface/huggingface_hub/issues/3280 ">#3280</a>
by <a
href="https://github.com/davanstrien "><code>@davanstrien</code></a></li>
<li>Add timeout info to Jobs guide docs <a
href="https://redirect.github.com/huggingface/huggingface_hub/issues/3281 ">#3281</a>
by <a
href="https://github.com/davanstrien "><code>@davanstrien</code></a></li>
<li>[Jobs] Use current or stored token in a Job secrets <a
href="https://redirect.github.com/huggingface/huggingface_hub/issues/3272 ">#3272</a>
by <a href="https://github.com/lhoestq "><code>@lhoestq</code></a></li>
<li>Fix bash history expansion in hf jobs example <a
href="https://redirect.github.com/huggingface/huggingface_hub/issues/3277 ">#3277</a>
by <a
href="https://github.com/nyuuzyou "><code>@nyuuzyou</code></a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/huggingface/huggingface_hub/compare/v0.34.3...v0.34.4 ">https://github.com/huggingface/huggingface_hub/compare/v0.34.3...v0.34.4 </a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="84a92a92c2 "><code>84a92a9</code></a>
Release: v0.34.4</li>
<li><a
href="6196ac2cbc "><code>6196ac2</code></a>
Add type to job owner (<a
href="https://redirect.github.com/huggingface/huggingface_hub/issues/3291 ">#3291</a>)</li>
<li><a
href="4f6975f697 "><code>4f6975f</code></a>
Include <code>HF_HUB_DISABLE_XET</code> in the environment dump (<a
href="https://redirect.github.com/huggingface/huggingface_hub/issues/3290 ">#3290</a>)</li>
<li><a
href="3720a5096f "><code>3720a50</code></a>
[Inference] Support image to video task (<a
href="https://redirect.github.com/huggingface/huggingface_hub/issues/3289 ">#3289</a>)</li>
<li><a
href="bb5e4c7a2c "><code>bb5e4c7</code></a>
Whoami: custom message only on unauthorized (<a
href="https://redirect.github.com/huggingface/huggingface_hub/issues/3288 ">#3288</a>)</li>
<li><a
href="a725256f31 "><code>a725256</code></a>
Add validation warnings for repository limits in upload_large_folder (<a
href="https://redirect.github.com/huggingface/huggingface_hub/issues/3280 ">#3280</a>)</li>
<li><a
href="a181b0f088 "><code>a181b0f</code></a>
Add timeout info to Jobs guide docs (<a
href="https://redirect.github.com/huggingface/huggingface_hub/issues/3281 ">#3281</a>)</li>
<li><a
href="4d38925c8d "><code>4d38925</code></a>
[Jobs] Use current or stored token in a Job secrets (<a
href="https://redirect.github.com/huggingface/huggingface_hub/issues/3272 ">#3272</a>)</li>
<li><a
href="1580ce18c7 "><code>1580ce1</code></a>
Fix bash history expansion in hf jobs example (<a
href="https://redirect.github.com/huggingface/huggingface_hub/issues/3277 ">#3277</a>)</li>
<li>See full diff in <a
href="https://github.com/huggingface/huggingface_hub/compare/v0.34.3...v0.34.4 ">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores )
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-08-15 10:55:43 -07:00
dependabot[bot]
a275282685
chore(python-deps): bump pymilvus from 2.5.14 to 2.6.0 ( #3086 )
...
Bumps [pymilvus](https://github.com/milvus-io/pymilvus ) from 2.5.14 to
2.6.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/milvus-io/pymilvus/releases ">pymilvus's
releases</a>.</em></p>
<blockquote>
<h2>PyMilvus v2.6.0 Release Notes</h2>
<h2>New Features</h2>
<ol>
<li>Add APIs in MilvusClient</li>
</ol>
<ul>
<li>enhance: add describe and alter database in MilvusClient by <a
href="https://github.com/smellthemoon "><code>@smellthemoon</code></a>
in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2433 ">milvus-io/pymilvus#2433</a></li>
<li>enhance: support milvus-client iterator by <a
href="https://github.com/MrPresent-Han "><code>@MrPresent-Han</code></a>
in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2461 ">milvus-io/pymilvus#2461</a></li>
<li>enhance: Enable resource group api in milvus client by <a
href="https://github.com/weiliu1031 "><code>@weiliu1031</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2513 ">milvus-io/pymilvus#2513</a></li>
<li>enhance: add release_collection, drop_index, create_partition,
drop_partition, load_partition and release_partition by <a
href="https://github.com/brcarry "><code>@brcarry</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2525 ">milvus-io/pymilvus#2525</a></li>
<li>enhance: enable describe_replica api in milvus client by <a
href="https://github.com/weiliu1031 "><code>@weiliu1031</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2541 ">milvus-io/pymilvus#2541</a></li>
<li>enhance: support recalls for milvus_client by <a
href="https://github.com/chasingegg "><code>@chasingegg</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2552 ">milvus-io/pymilvus#2552</a></li>
<li>enhance: add use_database by <a
href="https://github.com/czs007 "><code>@czs007</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2491 ">milvus-io/pymilvus#2491</a></li>
</ul>
<ol start="2">
<li>Add AsyncMilvusClient</li>
</ol>
<ul>
<li>[FEAT] Asyncio support by <a
href="https://github.com/brcarry "><code>@brcarry</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2411 ">milvus-io/pymilvus#2411</a></li>
<li>Add async DDL funcs & DDL examples by <a
href="https://github.com/Shawnzheng011019 "><code>@Shawnzheng011019</code></a>
in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2852 ">milvus-io/pymilvus#2852</a></li>
</ul>
<ol start="3">
<li>Other features</li>
</ol>
<ul>
<li>enhance: support Int8Vector by <a
href="https://github.com/cydrain "><code>@cydrain</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2611 ">milvus-io/pymilvus#2611</a></li>
<li>feat: support recalls field in SearchResult by <a
href="https://github.com/chasingegg "><code>@chasingegg</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2390 ">milvus-io/pymilvus#2390</a></li>
<li>enhance: Support Python3.13 and upgrade grpcio range by <a
href="https://github.com/XuanYang-cn "><code>@XuanYang-cn</code></a> in
<a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2684 ">milvus-io/pymilvus#2684</a></li>
<li>enhance: support run analyzer return detail token by <a
href="https://github.com/aoiasd "><code>@aoiasd</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2679 ">milvus-io/pymilvus#2679</a></li>
<li>enhance: Add force_drop parameter to drop_role method for role
deletion by <a href="https://github.com/SimFG "><code>@SimFG</code></a>
in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2705 ">milvus-io/pymilvus#2705</a></li>
<li>enhance: add property func for AnalyzeToken by <a
href="https://github.com/aoiasd "><code>@aoiasd</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2704 ">milvus-io/pymilvus#2704</a></li>
<li>enhance: grant/revoke v2 optional db and collection params by <a
href="https://github.com/shaoting-huang "><code>@shaoting-huang</code></a>
in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2386 ">milvus-io/pymilvus#2386</a></li>
<li>extend unlimted offset for query iterator(<a
href="https://redirect.github.com/milvus-io/pymilvus/issues/2418 ">#2418</a>)
by <a
href="https://github.com/MrPresent-Han "><code>@MrPresent-Han</code></a>
in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2419 ">milvus-io/pymilvus#2419</a></li>
<li>enhance: alterindex & altercollection supports altering
properties by <a
href="https://github.com/JsDove "><code>@JsDove</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2406 ">milvus-io/pymilvus#2406</a></li>
<li>enhance: alterdatabase support delete property by <a
href="https://github.com/JsDove "><code>@JsDove</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2435 ">milvus-io/pymilvus#2435</a></li>
<li>enhance: support hints param by <a
href="https://github.com/chasingegg "><code>@chasingegg</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2408 ">milvus-io/pymilvus#2408</a></li>
<li>enhance: create database support properties by <a
href="https://github.com/JsDove "><code>@JsDove</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2448 ">milvus-io/pymilvus#2448</a></li>
<li>enhance: Add <code>db_name</code> parameter at
<code>bulk_import</code> by <a
href="https://github.com/counter2015 "><code>@counter2015</code></a> in
<a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2446 ">milvus-io/pymilvus#2446</a></li>
<li>enhance: add search iterator v2 by <a
href="https://github.com/PwzXxm "><code>@PwzXxm</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2395 ">milvus-io/pymilvus#2395</a></li>
<li>enhance: simplify the structure of search_params by <a
href="https://github.com/smellthemoon "><code>@smellthemoon</code></a>
in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2507 ">milvus-io/pymilvus#2507</a></li>
<li>enhance: Remove long deprecated Milvus class by <a
href="https://github.com/XuanYang-cn "><code>@XuanYang-cn</code></a> in
<a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2544 ">milvus-io/pymilvus#2544</a></li>
<li>enhance: Use new model pkg by <a
href="https://github.com/junjiejiangjjj "><code>@junjiejiangjjj</code></a>
in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2595 ">milvus-io/pymilvus#2595</a></li>
<li>enhance: Add schema update time verification to insert and upsert to
use cache by <a
href="https://github.com/JsDove "><code>@JsDove</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2551 ">milvus-io/pymilvus#2551</a></li>
<li>enhance: describecollection output add created_timestamp by <a
href="https://github.com/JsDove "><code>@JsDove</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2618 ">milvus-io/pymilvus#2618</a></li>
<li>feat: add external filter func for search iterator v2 by <a
href="https://github.com/PwzXxm "><code>@PwzXxm</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2639 ">milvus-io/pymilvus#2639</a></li>
<li>enhance: support run analyzer by <a
href="https://github.com/aoiasd "><code>@aoiasd</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2622 ">milvus-io/pymilvus#2622</a></li>
<li>weighted reranker to allow skip score normalization by <a
href="https://github.com/zhengbuqian "><code>@zhengbuqian</code></a> in
<a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2708 ">milvus-io/pymilvus#2708</a></li>
<li>enhance: Support AddCollectionField API by <a
href="https://github.com/congqixia "><code>@congqixia</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2722 ">milvus-io/pymilvus#2722</a></li>
<li>Add 1-Way and 2-Way TLS Support to Bulk Import Functions by <a
href="https://github.com/abd-770 "><code>@abd-770</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2672 ">milvus-io/pymilvus#2672</a></li>
<li>enhance: Use SearchResult in MilvusClient by <a
href="https://github.com/XuanYang-cn "><code>@XuanYang-cn</code></a> in
<a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2735 ">milvus-io/pymilvus#2735</a></li>
<li>Support rerank by <a
href="https://github.com/junjiejiangjjj "><code>@junjiejiangjjj</code></a>
in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2729 ">milvus-io/pymilvus#2729</a></li>
<li>feat: suppoprt multi analyzer params by <a
href="https://github.com/aoiasd "><code>@aoiasd</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2747 ">milvus-io/pymilvus#2747</a></li>
<li>Add funciton checker by <a
href="https://github.com/junjiejiangjjj "><code>@junjiejiangjjj</code></a>
in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2760 ">milvus-io/pymilvus#2760</a></li>
<li>enhance: Support run analyzer by collection and field by <a
href="https://github.com/aoiasd "><code>@aoiasd</code></a> in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2822 ">milvus-io/pymilvus#2822</a></li>
<li>feat: support load collection/partition with priority(<a
href="https://redirect.github.com/milvus-io/pymilvus/issues/2835 ">#2835</a>)
by <a
href="https://github.com/MrPresent-Han "><code>@MrPresent-Han</code></a>
in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2836 ">milvus-io/pymilvus#2836</a></li>
<li>enhance: optimize perf for large topk(<a
href="https://redirect.github.com/milvus-io/pymilvus/issues/2848 ">#2848</a>)
by <a
href="https://github.com/MrPresent-Han "><code>@MrPresent-Han</code></a>
in <a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2849 ">milvus-io/pymilvus#2849</a></li>
<li>enhance: Add usage guide to manage MilvusClient by <a
href="https://github.com/XuanYang-cn "><code>@XuanYang-cn</code></a> in
<a
href="https://redirect.github.com/milvus-io/pymilvus/pull/2907 ">milvus-io/pymilvus#2907</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="1e56ce7d31 "><code>1e56ce7</code></a>
enhance: Update milvus-proto and readme (<a
href="https://redirect.github.com/milvus-io/pymilvus/issues/2921 ">#2921</a>)</li>
<li><a
href="75052b1b7c "><code>75052b1</code></a>
enhance: Add usage guide to manage MilvusClient (<a
href="https://redirect.github.com/milvus-io/pymilvus/issues/2907 ">#2907</a>)</li>
<li><a
href="9f44053086 "><code>9f44053</code></a>
add example code for language identifier and multi analyzer (<a
href="https://redirect.github.com/milvus-io/pymilvus/issues/2919 ">#2919</a>)</li>
<li><a
href="058836de26 "><code>058836d</code></a>
fix: Return new pk value for upsert when autoid=true (<a
href="https://redirect.github.com/milvus-io/pymilvus/issues/2914 ">#2914</a>)</li>
<li><a
href="bbc6777565 "><code>bbc6777</code></a>
[cherry-pick] Compatible with the default behavior of free on the cloud
(<a
href="https://redirect.github.com/milvus-io/pymilvus/issues/2913 ">#2913</a>)</li>
<li><a
href="45080c39c5 "><code>45080c3</code></a>
fix: Aviod coping functions when init CollectionSchema (<a
href="https://redirect.github.com/milvus-io/pymilvus/issues/2902 ">#2902</a>)</li>
<li><a
href="52b8461c5b "><code>52b8461</code></a>
[cherry-pick] bulk_import add stageName/dataPaths parameter (<a
href="https://redirect.github.com/milvus-io/pymilvus/issues/2905 ">#2905</a>)</li>
<li><a
href="a8c3120622 "><code>a8c3120</code></a>
[cherry-pick] support stage (<a
href="https://redirect.github.com/milvus-io/pymilvus/issues/2895 ">#2895</a>)</li>
<li><a
href="3653effa88 "><code>3653eff</code></a>
fix: Tidy alias configs when connect fails (<a
href="https://redirect.github.com/milvus-io/pymilvus/issues/2900 ">#2900</a>)</li>
<li><a
href="728791a7de "><code>728791a</code></a>
enhance: Store alias before wait for ready (<a
href="https://redirect.github.com/milvus-io/pymilvus/issues/2894 ">#2894</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/milvus-io/pymilvus/compare/v2.5.14...v2.6.0 ">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores )
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-15 10:54:09 -07:00
Aakanksha Duggal
e743d3fdf6
refactor(agents): migrate to OpenAI chat completions API ( #3097 )
...
Replace chat_completion calls with openai_chat_completion to eliminate
dependency on legacy inference APIs.
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
Closes #3067
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
2025-08-15 10:51:41 -07:00
ashwinb
f66ae3b3b1
docs(tests): Add a bunch of documentation for our testing systems ( #3139 )
...
# What does this PR do?
Creates a structured testing documentation section with multiple detailed pages:
- Testing overview explaining the record-replay architecture
- Integration testing guide with practical usage examples
- Record-replay system technical documentation
- Guide for writing effective tests
- Troubleshooting guide for common testing issues
Hopefully this makes things a bit easier.
2025-08-15 17:45:30 +00:00
Ashwin Bharambe
81ecaf6221
fix(ci): make the Vector IO CI follow the same pattern as others ( #3164 )
...
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / discover-tests (push) Successful in 3s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 6s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 8s
Python Package Build Test / build (3.12) (push) Failing after 6s
Test External API and Providers / test-external (venv) (push) Failing after 6s
Update ReadTheDocs / update-readthedocs (push) Failing after 6s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 5s
Unit Tests / unit-tests (3.13) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (push) Failing after 11s
Unit Tests / unit-tests (3.12) (push) Failing after 10s
Python Package Build Test / build (3.13) (push) Failing after 13s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 19s
Pre-commit / pre-commit (push) Successful in 1m19s
# What does this PR do?
Updates the integration-vector-io-tests workflow to run daily tests on
Python 3.13 while limiting regular PR tests to Python 3.12 only.
The PR also improves the concurrency configuration to prevent workflow
conflicts between main branch runs and PR runs.
## Test Plan
[](https://app.graphite.dev/settings/meme-library?org=llamastack )
2025-08-14 21:06:08 -07:00
ashwinb
01b2afd4b5
fix(tests): record missing tests for test_responses_store ( #3163 )
...
# What does this PR do?
Updates test recordings.
## Test Plan
Started ollama serving the 3.2:3b model. Then ran the server:
```
LLAMA_STACK_TEST_INFERENCE_MODE=record \
LLAMA_STACK_TEST_RECORDING_DIR=tests/integration/recordings/ \
SQLITE_STORE_DIR=$(mktemp -d) \
OLLAMA_URL=http://localhost:11434 \
llama stack build --template starter --image-type venv --run
```
Then ran the tests which needed recording:
```
pytest -sv tests/integration/agents/test_openai_responses.py \
--stack-config=server:starter \
--text-model ollama/llama3.2:3b-instruct-fp16 -k test_responses_store
```
Then, restarted the server with `LLAMA_STACK_TEST_INFERENCE_MODE=replay`, re-ran the tests and verified they passed.
2025-08-15 03:52:45 +00:00
ashwinb
8ed69978f9
refactor(tests): make the responses tests nicer ( #3161 )
...
# What does this PR do?
A _bunch_ on cleanup for the Responses tests.
- Got rid of YAML test cases, moved them to just use simple pydantic models
- Splitting the large monolithic test file into multiple focused test files:
- `test_basic_responses.py` for basic and image response tests
- `test_tool_responses.py` for tool-related tests
- `test_file_search.py` for file search specific tests
- Adding a `StreamingValidator` helper class to standardize streaming response validation
## Test Plan
Run the tests:
```
pytest -s -v tests/integration/non_ci/responses/ \
--stack-config=starter \
--text-model openai/gpt-4o \
--embedding-model=sentence-transformers/all-MiniLM-L6-v2 \
-k "client_with_models"
```
2025-08-15 00:05:36 +00:00
ashwinb
ba664474de
feat(responses): add mcp list tool streaming event ( #3159 )
...
# What does this PR do?
Adds proper streaming events for MCP tool listing (`mcp_list_tools.in_progress` and `mcp_list_tools.completed`). Also refactors things a bit more.
## Test Plan
Verified existing integration tests pass with the refactored code. The test `test_response_streaming_multi_turn_tool_execution` has been updated to check for the new MCP list tools streaming events
2025-08-15 00:05:36 +00:00
ashwinb
9324e902f1
refactor(responses): move stuff into some utils and add unit tests ( #3158 )
...
# What does this PR do?
Refactors the OpenAI response conversion utilities by moving helper functions from `openai_responses.py` to `utils.py`. Adds unit tests.
2025-08-15 00:05:36 +00:00
ashwinb
47d5af703c
chore(responses): Refactor Responses Impl to be civilized ( #3138 )
...
# What does this PR do?
Refactors the OpenAI responses implementation by extracting streaming and tool execution logic into separate modules. This improves code organization by:
1. Creating a new `StreamingResponseOrchestrator` class in `streaming.py` to handle the streaming response generation logic
2. Moving tool execution functionality to a dedicated `ToolExecutor` class in `tool_executor.py`
## Test Plan
Existing tests
2025-08-15 00:05:35 +00:00
Francisco Arceo
e69acbafbf
feat(UI): Adding linter and prettier for UI ( #3156 )
2025-08-14 15:58:43 -06:00
Ashwin Bharambe
61582f327c
fix(ci): update triggers for the workflows ( #3152 )
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / discover-tests (push) Successful in 8s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 10s
Python Package Build Test / build (3.12) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 14s
Unit Tests / unit-tests (3.12) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 17s
Python Package Build Test / build (3.13) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 20s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 12s
Unit Tests / unit-tests (3.13) (push) Failing after 12s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 23s
Update ReadTheDocs / update-readthedocs (push) Failing after 13s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 21s
Test External API and Providers / test-external (venv) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 26s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 25s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 19s
Pre-commit / pre-commit (push) Successful in 1m39s
2025-08-14 10:27:25 -07:00
Derek Higgins
c15cc7ed77
fix: use ChatCompletionMessageFunctionToolCall ( #3142 )
...
The OpenAI compatibility layer was incorrectly importing
ChatCompletionMessageToolCallParam instead of the
ChatCompletionMessageFunctionToolCall class. This caused "Cannot
instantiate typing.Union" errors when processing agent requests with
tool calls.
Closes : #3141
Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-08-14 10:27:00 -07:00