docs: Add recent releases to CHANGELOG.md (#2533)
Update changelog.
---------
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
build: update temp. created Containerfile (#2492)
- conditionally create folder /.llama/providers.d only if
external_providers_dir is set
- do not create the /.cache folder, which is not used anywhere
- combine chmod and copy into one command
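The conditional directory creation described in the list above can be sketched roughly as follows. This is a minimal illustration only, not the project's actual build script: the helper name `render_setup_lines` and the exact shape of the emitted `RUN` line are assumptions made for the example.

```python
from __future__ import annotations


def render_setup_lines(external_providers_dir: str | None) -> list[str]:
    """Hypothetical helper: build the Containerfile setup instruction.

    Assumption for illustration: directory creation and the permission
    change are emitted as a single RUN instruction.
    """
    # Create providers.d only when an external providers dir is configured.
    target = "/.llama/providers.d" if external_providers_dir else "/.llama"
    # /.cache is intentionally never created, since nothing uses it.
    return [f"RUN mkdir -p {target} && chmod -R g+rw /app /.llama"]
```

With no external providers configured, the sketch emits only the `/.llama` directory; in neither case does `/.cache` appear.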
updated test:
```
export CONTAINER_BINARY=podman
LLAMA_STACK_DIR=. uv run llama stack build --template remote-vllm --image-type container --image-name <name>
```
log:
```
Containerfile created successfully in /tmp/tmp.rPMunE39Aw/Containerfile
FROM python:3.11-slim
WORKDIR /app
RUN apt-get update && apt-get install -y iputils-ping net-tools iproute2 dnsutils telnet curl wget telnet git procps psmisc lsof traceroute bubblewrap gcc && rm -rf /var/lib/apt/lists/*
ENV UV_SYSTEM_PYTHON=1
RUN pip install uv
RUN uv pip install --no-cache sentencepiece pillow pypdf transformers pythainlp faiss-cpu opentelemetry-sdk requests datasets chardet scipy nltk numpy matplotlib psycopg2-binary aiosqlite langdetect autoevals tree_sitter tqdm pandas chromadb-client opentelemetry-exporter-otlp-proto-http redis scikit-learn openai pymongo emoji sqlalchemy[asyncio] mcp aiosqlite fastapi fire httpx uvicorn opentelemetry-sdk opentelemetry-exporter-otlp-proto-http
RUN uv pip install --no-cache sentence-transformers --no-deps
RUN uv pip install --no-cache torch torchvision --index-url https://download.pytorch.org/whl/cpu
RUN mkdir -p /.llama/providers.d /.cache
RUN uv pip install --no-cache llama-stack
RUN pip uninstall -y uv
ENTRYPOINT ["python", "-m", "llama_stack.distribution.server.server", "--template", "remote-vllm"]
RUN chmod -R g+rw /app /.llama /.cache
PWD: /tmp/llama-stack
Containerfile: /tmp/tmp.rPMunE39Aw/Containerfile
+ podman build --progress=plain --security-opt label=disable --platform linux/amd64 -t distribution-remote-vllm:0.2.12 -f /tmp/tmp.rPMunE39Aw/Containerfile /tmp/llama-stack
....
Success!
Build Successful!
You can find the newly-built template here: /tmp/llama-stack/llama_stack/templates/remote-vllm/run.yaml
You can run the new Llama Stack distro via: llama stack run /tmp/llama-stack/llama_stack/templates/remote-vllm/run.yaml --image-type container
```
```
podman tag localhost/distribution-remote-vllm:dev quay.io/wenzhou/distribution-remote-vllm:2492_2
podman push quay.io/wenzhou/distribution-remote-vllm:2492_2
docker run --rm -p 8321:8321 -e INFERENCE_MODEL="meta-llama/Llama-2-7b-chat-hf" -e VLLM_URL="http://localhost:8000/v1" quay.io/wenzhou/distribution-remote-vllm:2492_2 --port 8321
INFO 2025-06-26 13:47:31,813 __main__:436 server: Using template remote-vllm config file:
/app/llama-stack-source/llama_stack/templates/remote-vllm/run.yaml
INFO 2025-06-26 13:47:31,818 __main__:438 server: Run configuration:
INFO 2025-06-26 13:47:31,826 __main__:440 server: apis:
- agents
- datasetio
- eval
- inference
- safety
- scoring
- telemetry
- tool_runtime
- vector_io
benchmarks: []
container_image: null
....
```
-----
previous test:
local run: `llama stack build --template remote-vllm --image-type container`
image stored in `quay.io/wenzhou/distribution-remote-vllm:2492`
---------
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
fix(security): Upgrade urllib3 to v2.5.0. Fixes CVE-2025-50181 and CVE-2025-50182 (#2534)
This fixes CVE-2025-50181 and CVE-2025-50182.
Changes via:
```
uv sync --upgrade-package urllib3
uv export --frozen --no-hashes --no-emit-project --no-default-groups --output-file=requirements.txt
```
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
fix: dataset metadata without provider_id (#2527)
Fixes an error when inferring dataset provider_id with metadata
Closes #[2506](https://github.com/meta-llama/llama-stack/issues/2506)
Signed-off-by: Juanma Barea <juanmabareamartinez@gmail.com>
fix(security): Upgrade protobuf and aiohttp. Fixes CVE-2025-4565 (#2541)
Fixes CVE-2025-4565 and the following warning:
```
warning: `aiohttp==3.11.13` is yanked (reason: "Regression: https://github.com/aio-libs/aiohttp/issues/10617")
```
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
adding milvus prefix
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
updating CI
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
removing CI tests for now
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
think I got the config correct for CI
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
updated build and run files
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
adding marshmallow constraint
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
removing CI changes
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Update starter.py
updated starter
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
The goal of this PR is code base modernization.
Schema reflection code needed a minor adjustment to handle UnionTypes
and collections.abc.AsyncIterator. (Both are preferred in the latest
Python releases.)
Note to reviewers: almost all changes here are automatically generated
by pyupgrade. Some additional unused imports were cleaned up. The only
change worthy of note can be found under `docs/openapi_generator` and
`llama_stack/strong_typing/schema.py`, where reflection code was updated
to deal with "newer" types.
Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
# What does this PR do?
It's a dict that may contain different types, as per
resolver:instantiate_provider implementation. (AFAIU it also never
contains ProviderSpecs, but *instances* of provider implementations.)
## Test Plan
mypy passes when checks are enabled for these modules. (See #1543)
Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
# What does this PR do?
See https://github.com/meta-llama/llama-stack/pull/1171 which is the
original PR. Author: @zc277584121
feat: add [Milvus](https://milvus.io/) vectorDB
note: I use MilvusClient to implement it instead of
AsyncMilvusClient, because when I tested AsyncMilvusClient it
raised event loop errors; I think the AsyncMilvusClient SDK is not yet
robust enough to be compatible with the llama_stack framework.
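As general background, one common way to call a blocking client from async code is `asyncio.to_thread`. The sketch below shows that pattern only and is not necessarily what this PR does: `BlockingVectorClient` and `AsyncAdapter` are hypothetical stand-ins and do not reflect the pymilvus API.

```python
import asyncio


class BlockingVectorClient:
    """Hypothetical synchronous client; a stand-in, not the pymilvus API."""

    def __init__(self) -> None:
        self._rows: list[dict] = []

    def insert(self, rows: list[dict]) -> int:
        self._rows.extend(rows)
        return len(rows)

    def count(self) -> int:
        return len(self._rows)


class AsyncAdapter:
    """Expose a blocking client through coroutine methods."""

    def __init__(self, client: BlockingVectorClient) -> None:
        self._client = client

    async def insert(self, rows: list[dict]) -> int:
        # Run the blocking call in a worker thread so the event loop stays free.
        return await asyncio.to_thread(self._client.insert, rows)

    async def count(self) -> int:
        return await asyncio.to_thread(self._client.count)
```

The adapter keeps the server's async interface while leaving event loop management entirely to asyncio, which avoids depending on an async SDK.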
## Test Plan
The unit tests and end-to-end tests pass.
Here are my end-to-end test logs, including the client code, client log,
and server logs from the inline and remote settings:
[test_end2end_logs.zip](https://github.com/user-attachments/files/18964391/test_end2end_logs.zip)
---------
Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Cheney Zhang <chen.zhang@zilliz.com>