Commit graph

5 commits

Author SHA1 Message Date
Francisco Javier Arceo
7273a125cd chore: Updating Milvus to use OpenAIVectorStoreMixin
docs: Add recent releases to CHANGELOG.md (#2533)

<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->

Update changelog.

---------

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

build: update temp. created Containerfile (#2492)

<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
- conditionally created folder /.llama/providers.d if
external_providers_dir is set
- do not create /.cache folder, not in use anywhere
- combine chmod and copy to one command

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
updated test:

```
export CONTAINER_BINARY=podman
LLAMA_STACK_DIR=. uv run llama stack build --template remote-vllm --image-type container --image-name  <name>
```
log:
```
Containerfile created successfully in /tmp/tmp.rPMunE39Aw/Containerfile

FROM python:3.11-slim
WORKDIR /app

RUN apt-get update && apt-get install -y        iputils-ping net-tools iproute2 dnsutils telnet        curl wget telnet git       procps psmisc lsof        traceroute        bubblewrap        gcc        && rm -rf /var/lib/apt/lists/*

ENV UV_SYSTEM_PYTHON=1
RUN pip install uv
RUN uv pip install --no-cache sentencepiece pillow pypdf transformers pythainlp faiss-cpu opentelemetry-sdk requests datasets chardet scipy nltk numpy matplotlib psycopg2-binary aiosqlite langdetect autoevals tree_sitter tqdm pandas chromadb-client opentelemetry-exporter-otlp-proto-http redis scikit-learn openai pymongo emoji sqlalchemy[asyncio] mcp aiosqlite fastapi fire httpx uvicorn opentelemetry-sdk opentelemetry-exporter-otlp-proto-http
RUN uv pip install --no-cache sentence-transformers --no-deps
RUN uv pip install --no-cache torch torchvision --index-url https://download.pytorch.org/whl/cpu
RUN mkdir -p /.llama/providers.d /.cache
RUN uv pip install --no-cache llama-stack
RUN pip uninstall -y uv
ENTRYPOINT ["python", "-m", "llama_stack.distribution.server.server", "--template", "remote-vllm"]

RUN chmod -R g+rw /app /.llama /.cache

PWD: /tmp/llama-stack
Containerfile: /tmp/tmp.rPMunE39Aw/Containerfile
+ podman build --progress=plain --security-opt label=disable --platform linux/amd64 -t distribution-remote-vllm:0.2.12 -f /tmp/tmp.rPMunE39Aw/Containerfile /tmp/llama-stack
....
Success!
Build Successful!
You can find the newly-built template here: /tmp/llama-stack/llama_stack/templates/remote-vllm/run.yaml
You can run the new Llama Stack distro via: llama stack run /tmp/llama-stack/llama_stack/templates/remote-vllm/run.yaml --image-type container
```

```
podman tag localhost/distribution-remote-vllm:dev quay.io/wenzhou/distribution-remote-vllm:2492_2
podman push quay.io/wenzhou/distribution-remote-vllm:2492_2

docker run --rm -p 8321:8321 -e INFERENCE_MODEL="meta-llama/Llama-2-7b-chat-hf" -e VLLM_URL="http://localhost:8000/v1" quay.io/wenzhou/distribution-remote-vllm:2492_2 --port 8321

INFO     2025-06-26 13:47:31,813 __main__:436 server: Using template remote-vllm config file:
         /app/llama-stack-source/llama_stack/templates/remote-vllm/run.yaml
INFO     2025-06-26 13:47:31,818 __main__:438 server: Run configuration:
INFO     2025-06-26 13:47:31,826 __main__:440 server: apis:
         - agents
         - datasetio
         - eval
         - inference
         - safety
         - scoring
         - telemetry
         - tool_runtime
         - vector_io
         benchmarks: []
         container_image: null
....
```
-----
previous test:
local run` >llama stack build --template remote-vllm --image-type
container`
image stored in  `quay.io/wenzhou/distribution-remote-vllm:2492`

---------

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

fix(security): Upgrade urllib3 to v2.5.0. Fixes CVE-2025-50181 and CVE-2025-50182 (#2534)

This fixes CVE-2025-50181 and CVE-2025-50182.

Changes via:
```
uv sync --upgrade-package urllib3
uv export --frozen --no-hashes --no-emit-project --no-default-groups --output-file=requirements.txt
```

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

fix: dataset metadata without provider_id (#2527)

Fixes an error when inferring dataset provider_id with metadata

Closes #[2506](https://github.com/meta-llama/llama-stack/issues/2506)

Signed-off-by: Juanma Barea <juanmabareamartinez@gmail.com>

fix(security): Upgrade protobuf and aiohttp. Fixes CVE-2025-4565 (#2541)

Fixes CVE-2025-4565 and the following warning:

```
warning: `aiohttp==3.11.13` is yanked (reason: "Regression: https://github.com/aio-libs/aiohttp/issues/10617")
```

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

adding milvus prefix

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

updating CI

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

removing CI tests for now

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

think I got the config correct for CI

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

updated build and run files

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

adding marshmallow constraint

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

removing CI changes

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

Update starter.py

updated starter

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-06-27 18:50:09 -04:00
grs
7c1998db25
feat: fine grained access control policy (#2264)
This allows a set of rules to be defined for determining access to
resources. The rules are (loosely) based on the cedar policy format.

A rule defines a list of action either to permit or to forbid. It may
specify a principal or a resource that must match for the rule to take
effect. It may also specify a condition, either a 'when' or an 'unless',
with additional constraints as to where the rule applies.

A list of rules is held for each type to be protected and tried in order
to find a match. If a match is found, the request is permitted or
forbidden depening on the type of rule. If no match is found, the
request is denied. If no rules are specified for a given type, a rule
that allows any action as long as the resource attributes match the user
attributes is added (i.e. the previous behaviour is the default.

Some examples in yaml:

```
    model:
    - permit:
      principal: user-1
      actions: [create, read, delete]
      comment: user-1 has full access to all models
    - permit:
      principal: user-2
      actions: [read]
      resource: model-1
      comment: user-2 has read access to model-1 only
    - permit:
      actions: [read]
      when:
        user_in: resource.namespaces
      comment: any user has read access to models with matching attributes
    vector_db:
    - forbid:
      actions: [create, read, delete]
      unless:
        user_in: role::admin
      comment: only user with admin role can use vector_db resources
```

---------

Signed-off-by: Gordon Sim <gsim@redhat.com>
2025-06-03 14:51:12 -07:00
Ashwin Bharambe
51e6f529f3
fix: index non-MCP toolgroups at registration time (#2272)
Two somewhat annoying fixes: 

- we are going to index tools for non-MCP toolgroups always (like we
used to do). because there are just random assumptions in our tests,
etc. and I don't want to fix them right now
- we need to handle the funny case of toolgroups like
`builtin::rag/knowledge_search` where we added the tool name to use in
the toolgroup itself.
2025-05-26 20:33:36 -07:00
Ashwin Bharambe
ce33d02443
fix(tools): do not index tools, only index toolgroups (#2261)
When registering a MCP endpoint, we cannot list tools (like we used to)
since the MCP endpoint may be behind an auth wall. Registration can
happen much sooner (via run.yaml).

Instead, we do listing only when the _user_ actually calls listing.
Furthermore, we cache the list in-memory in the server. Currently, the
cache is not invalidated -- we may want to periodically re-list for MCP
servers. Note that they must call `list_tools` before calling
`invoke_tool` -- we use this critically.

This will enable us to list MCP servers in run.yaml

## Test Plan

Existing tests, updated tests accordingly.
2025-05-25 13:27:52 -07:00
Ashwin Bharambe
298721c238
chore: split routing_tables into individual files (#2259) 2025-05-24 23:15:05 -07:00