Commit graph

10 commits

Author SHA1 Message Date
Ashwin Bharambe
cb40da210f
fix: update tests for OpenAI-style models endpoint (#4053)
The llama-stack-client now uses `/v1/openai/v1/models`, which returns
OpenAI-compatible model objects with 'id' and 'custom_metadata' fields
instead of the Resource-style 'identifier' field. Updated api_recorder
to handle the new endpoint and modified tests to access model metadata
appropriately. Deleted stale model recordings for re-recording.
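
The shape change described above can be sketched as follows. This is a minimal illustration, not the actual `api_recorder` or test code: the helper names `model_id` and `model_metadata` are hypothetical and the sample dictionaries are made up. Only the field names `id`, `custom_metadata`, and `identifier` come from this commit message.

```python
from typing import Any


def _get(obj: Any, name: str, default: Any = None) -> Any:
    """Read a field from either a dict or an attribute-style object."""
    if isinstance(obj, dict):
        return obj.get(name, default)
    return getattr(obj, name, default)


def model_id(model: Any) -> str:
    """Return the model ID, preferring the OpenAI-style 'id' field and
    falling back to the Resource-style 'identifier' field."""
    value = _get(model, "id")
    if value is None:
        value = _get(model, "identifier")
    if value is None:
        raise KeyError("model has neither 'id' nor 'identifier'")
    return value


def model_metadata(model: Any) -> dict:
    """Return 'custom_metadata' if present, otherwise an empty dict."""
    return _get(model, "custom_metadata") or {}


# Made-up sample objects, for illustration only.
openai_style = {"id": "example/model", "custom_metadata": {"model_type": "llm"}}
resource_style = {"identifier": "example/model"}

print(model_id(openai_style), model_metadata(openai_style))
print(model_id(resource_style), model_metadata(resource_style))
```

A helper like this lets a test read model information without caring which of the two shapes the client returned.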

**NOTE: CI will be red on this one since it is dependent on
https://github.com/llamastack/llama-stack-client-python/pull/291/files
landing. I verified locally that it is green.**
2025-11-03 17:30:08 -08:00
Francisco Arceo
4566eebe05
feat: Add static file import system for docs (#3882)
# What does this PR do?

Add static file import system for docs

- Use `remark-code-import` plugin to embed code at build time
- Support importing Python code with syntax highlighting using
`raw-loader` + `ReactMarkdown`

One caveat: syntax highlighting currently doesn't work when embedding
markdown that contains code. I'll investigate that in a follow-up.

## Test Plan

Python Example:
<img width="1372" height="995" alt="Screenshot 2025-10-23 at 9 22 18 PM"
src="https://github.com/user-attachments/assets/656d2c78-4d9b-45a4-bd5e-3f8490352b85"
/>

Markdown example:
<img width="1496" height="1070" alt="Screenshot 2025-10-23 at 9 22 38 PM"
src="https://github.com/user-attachments/assets/6c0a07ec-ff7c-45aa-b05f-8c46acd4445c"
/>

---------

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-10-24 14:01:33 -04:00
Francisco Arceo
53c20f6113
feat: Adding Demo script (#3870)
# What does this PR do?
Updated the quickstart `demo_script.py` to use the OpenAI APIs. The script is now simply:

```python
import io, requests
from openai import OpenAI

url="https://www.paulgraham.com/greatwork.html"
client = OpenAI(base_url="http://localhost:8321/v1/", api_key="none")

vs = client.vector_stores.create()
response = requests.get(url)
# response.content is already bytes; wrapping it in str() would embed the b'...' repr
pseudo_file = io.BytesIO(response.content)
uploaded_file = client.files.create(file=(url, pseudo_file, "text/html"), purpose="assistants")
client.vector_stores.files.create(vector_store_id=vs.id, file_id=uploaded_file.id)

resp = client.responses.create(
    model="openai/gpt-4o",
    input="How do you do great work? Use the existing knowledge_search tool.",
    tools=[{"type": "file_search", "vector_store_ids": [vs.id]}],
    include=["file_search_call.results"],
)

print(resp)
```




## Test Plan

---------

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-10-21 21:31:21 -04:00
Charlie Doern
573e783ff0
docs: fix sidebar of Detailed Tutorial (#3856)
# What does this PR do?

The sidebar currently has an extra `ii. Run the Script` entry because the
heading was incorrectly added to the doc as an H3 rather than an H4 (like the other ones).


<img width="239" height="218" alt="Screenshot 2025-10-20 at 1 04 54 PM"
src="https://github.com/user-attachments/assets/eb8cb26e-7ea9-4b61-9101-d64965b39647"
/>

This PR fixes the heading level, which updates the sidebar.
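
Why a single heading level changes the sidebar can be sketched as follows. This is a rough illustration under an assumption: that the sidebar lists headings from H2 through H3 only. That range, the function name `toc_entries`, and the sample page are all assumed for the example and are not taken from the PR.

```python
import re

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def toc_entries(markdown: str, min_level: int = 2, max_level: int = 3) -> list[str]:
    """Collect heading titles whose level falls within [min_level, max_level]."""
    entries = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # ignore heading look-alikes inside code fences
            continue
        if in_fence:
            continue
        match = HEADING_RE.match(line)
        if match and min_level <= len(match.group(1)) <= max_level:
            entries.append(match.group(2))
    return entries


# Made-up page: "ii. Run the Script" is an H3 by mistake.
broken = """\
## Step 3
### Run inference
#### i. Create the Script
### ii. Run the Script
"""
# Same page with the heading demoted to H4.
fixed = broken.replace("### ii.", "#### ii.")

print(toc_entries(broken))
print(toc_entries(fixed))
```

With the H3, the stray entry shows up alongside the section titles. Once it is an H4, it falls outside the assumed range and disappears from the list.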

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-10-20 13:10:50 -04:00
Charlie Doern
b11bcfde11
refactor(build): rework CLI commands and build process (1/2) (#2974)
# What does this PR do?

This PR does a few things outlined in #2878, namely:
1. Adds `llama stack list-deps`, a command that takes the build
logic and, instead of executing one of the `build_...` scripts,
displays all of the providers' dependencies using the `module` and `uv`.
2. Deprecates `llama stack build` in favor of `llama stack list-deps`.
3. Updates all tests to use `list-deps` alongside `build`.

PR 2/2 will migrate `llama stack run`'s default behavior to be `llama
stack build --run` and use the new `list-deps` command under the hood
before running the server.

Example of `llama stack list-deps starter`:

```
llama stack list-deps starter --format json
{
  "name": "starter",
  "description": "Quick start template for running Llama Stack with several popular providers. This distribution is intended for CPU-only environments.",
  "apis": [
    {
      "api": "inference",
      "provider": "remote::cerebras"
    },
    {
      "api": "inference",
      "provider": "remote::ollama"
    },
    {
      "api": "inference",
      "provider": "remote::vllm"
    },
    {
      "api": "inference",
      "provider": "remote::tgi"
    },
    {
      "api": "inference",
      "provider": "remote::fireworks"
    },
    {
      "api": "inference",
      "provider": "remote::together"
    },
    {
      "api": "inference",
      "provider": "remote::bedrock"
    },
    {
      "api": "inference",
      "provider": "remote::nvidia"
    },
    {
      "api": "inference",
      "provider": "remote::openai"
    },
    {
      "api": "inference",
      "provider": "remote::anthropic"
    },
    {
      "api": "inference",
      "provider": "remote::gemini"
    },
    {
      "api": "inference",
      "provider": "remote::vertexai"
    },
    {
      "api": "inference",
      "provider": "remote::groq"
    },
    {
      "api": "inference",
      "provider": "remote::sambanova"
    },
    {
      "api": "inference",
      "provider": "remote::azure"
    },
    {
      "api": "inference",
      "provider": "inline::sentence-transformers"
    },
    {
      "api": "vector_io",
      "provider": "inline::faiss"
    },
    {
      "api": "vector_io",
      "provider": "inline::sqlite-vec"
    },
    {
      "api": "vector_io",
      "provider": "inline::milvus"
    },
    {
      "api": "vector_io",
      "provider": "remote::chromadb"
    },
    {
      "api": "vector_io",
      "provider": "remote::pgvector"
    },
    {
      "api": "files",
      "provider": "inline::localfs"
    },
    {
      "api": "safety",
      "provider": "inline::llama-guard"
    },
    {
      "api": "safety",
      "provider": "inline::code-scanner"
    },
    {
      "api": "agents",
      "provider": "inline::meta-reference"
    },
    {
      "api": "telemetry",
      "provider": "inline::meta-reference"
    },
    {
      "api": "post_training",
      "provider": "inline::torchtune-cpu"
    },
    {
      "api": "eval",
      "provider": "inline::meta-reference"
    },
    {
      "api": "datasetio",
      "provider": "remote::huggingface"
    },
    {
      "api": "datasetio",
      "provider": "inline::localfs"
    },
    {
      "api": "scoring",
      "provider": "inline::basic"
    },
    {
      "api": "scoring",
      "provider": "inline::llm-as-judge"
    },
    {
      "api": "scoring",
      "provider": "inline::braintrust"
    },
    {
      "api": "tool_runtime",
      "provider": "remote::brave-search"
    },
    {
      "api": "tool_runtime",
      "provider": "remote::tavily-search"
    },
    {
      "api": "tool_runtime",
      "provider": "inline::rag-runtime"
    },
    {
      "api": "tool_runtime",
      "provider": "remote::model-context-protocol"
    },
    {
      "api": "batches",
      "provider": "inline::reference"
    }
  ],
  "pip_dependencies": [
    "pandas",
    "opentelemetry-exporter-otlp-proto-http",
    "matplotlib",
    "opentelemetry-sdk",
    "sentence-transformers",
    "datasets",
    "pymilvus[milvus-lite]>=2.4.10",
    "codeshield",
    "scipy",
    "torchvision",
    "tree_sitter",
    "h11>=0.16.0",
    "aiohttp",
    "pymongo",
    "tqdm",
    "pythainlp",
    "pillow",
    "torch",
    "emoji",
    "grpcio>=1.67.1,<1.71.0",
    "fireworks-ai",
    "langdetect",
    "psycopg2-binary",
    "asyncpg",
    "redis",
    "together",
    "torchao>=0.12.0",
    "openai",
    "sentencepiece",
    "aiosqlite",
    "google-cloud-aiplatform",
    "faiss-cpu",
    "numpy",
    "sqlite-vec",
    "nltk",
    "scikit-learn",
    "mcp>=1.8.1",
    "transformers",
    "boto3",
    "huggingface_hub",
    "ollama",
    "autoevals",
    "sqlalchemy[asyncio]",
    "torchtune>=0.5.0",
    "chromadb-client",
    "pypdf",
    "requests",
    "anthropic",
    "chardet",
    "aiosqlite",
    "fastapi",
    "fire",
    "httpx",
    "uvicorn",
    "opentelemetry-sdk",
    "opentelemetry-exporter-otlp-proto-http"
  ]
}
```
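
The JSON report above lends itself to post-processing. The following is a minimal sketch, not part of the PR: the function name `summarize` is hypothetical, and the sample is a trimmed excerpt of the output shown above. It groups providers by API and drops repeated entries from `pip_dependencies` (the output above lists `aiosqlite`, `opentelemetry-sdk`, and `opentelemetry-exporter-otlp-proto-http` twice).

```python
import json
from collections import defaultdict


def summarize(report_json: str) -> tuple[dict[str, list[str]], list[str]]:
    """Group providers by API and de-duplicate pip dependencies."""
    report = json.loads(report_json)
    providers_by_api = defaultdict(list)
    for entry in report["apis"]:
        providers_by_api[entry["api"]].append(entry["provider"])
    # dict.fromkeys keeps first-seen order while dropping repeats.
    unique_deps = list(dict.fromkeys(report["pip_dependencies"]))
    return dict(providers_by_api), unique_deps


# Trimmed excerpt of the report, for illustration only.
sample = """
{
  "name": "starter",
  "apis": [
    {"api": "inference", "provider": "remote::ollama"},
    {"api": "inference", "provider": "remote::vllm"},
    {"api": "vector_io", "provider": "inline::faiss"}
  ],
  "pip_dependencies": ["aiosqlite", "faiss-cpu", "aiosqlite", "ollama"]
}
"""

providers, deps = summarize(sample)
print(providers)
print(deps)
```

In practice the input string would be whatever `llama stack list-deps starter --format json` prints, captured from the command line.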

<img width="1500" height="420" alt="Screenshot 2025-10-16 at 5 53 03 PM"
src="https://github.com/user-attachments/assets/765929fb-93e2-44d7-9c3d-8918b70fc721"
/>

---------

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-10-17 19:52:14 -07:00
IAN MILLER
007efa6eb5
refactor: replace default all-MiniLM-L6-v2 embedding model by nomic-embed-text-v1.5 in Llama Stack (#3183)
# What does this PR do?
The purpose of this PR is to replace Llama Stack's default embedding
model with nomic-embed-text-v1.5.

These are the key reasons why the Llama Stack community decided to switch
from all-MiniLM-L6-v2 to nomic-embed-text-v1.5:
1. The training data for
[all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2#training-data)
includes a lot of data sets with various licensing terms, so it is
tricky to know when/whether it is appropriate to use this model for
commercial applications.
2. all-MiniLM-L6-v2 is not particularly competitive on major benchmarks.
For example, if you look at the [MTEB
Leaderboard](https://huggingface.co/spaces/mteb/leaderboard) and click
on Miscellaneous/BEIR to see English information retrieval accuracy, you
see that the top of the leaderboard is dominated by enormous models but
also that there are many models of relatively modest size with
much higher Retrieval scores. If you want to look closely at the data, I
recommend clicking "Download Table" because it is easier to browse that
way.

More discussion can be found
[here](https://github.com/llamastack/llama-stack/issues/2418).

Closes #2418 

## Test Plan
1. Run `./scripts/unit-tests.sh`
2. Integration tests via CI workflow

---------

Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
2025-10-14 10:44:20 -04:00
ehhuang
a3f5072776
chore!: remove --env from llama stack run (#3711)
# What does this PR do?
Users can simply set environment variables at the beginning of the
command: `FOO=BAR llama stack run ...`
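
When the command is launched from Python rather than a shell, the same effect can be had by passing an environment to the child process. This is a minimal sketch, not part of the PR: the helper name `run_with_env` is hypothetical, and a stand-in child process that prints `FOO` is used so the example runs without Llama Stack installed.

```python
import os
import subprocess
import sys


def run_with_env(command: list[str], overrides: dict[str, str]) -> subprocess.CompletedProcess:
    """Run a command with extra environment variables layered on the current ones."""
    env = {**os.environ, **overrides}
    return subprocess.run(command, env=env, capture_output=True, text=True, check=True)


# Stand-in child process that prints FOO; substitute the real
# `llama stack run ...` arguments when using this for real.
child = [sys.executable, "-c", "import os; print(os.environ['FOO'])"]
result = run_with_env(child, {"FOO": "BAR"})
print(result.stdout.strip())
```

Merging over `os.environ` mirrors what the shell does with a `FOO=BAR` prefix: the child sees the parent's variables plus the override, and the parent's own environment is left unchanged.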

## Test Plan
Run `TELEMETRY_SINKS=console uv run --with llama-stack llama stack build
--distro=starter --image-type=venv --run`




---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with
[ReviewStack](https://reviewstack.dev/llamastack/llama-stack/pull/3711).
* #3714
* __->__ #3711
2025-10-07 20:58:15 -07:00
Alexey Rybak
6101c8e015
docs: fix broken links (#3540)
# What does this PR do?


- Fixes broken links and Docusaurus search

Closes #3518

## Test Plan

The following should produce a clean build with no warnings and search enabled:

```
npm install
npm run gen-api-docs all
npm run build
npm run serve
```

2025-09-24 14:16:31 -07:00
Alexey Rybak
8537ada11b
docs: MDX leftover fixes (#3536)
# What does this PR do?

- Fixes Docusaurus build errors


## Test Plan

- `npm run build` compiles the build properly
- Broken links are expected and will be fixed in a follow-on PR

2025-09-24 14:14:32 -07:00
Alexey Rybak
c71ce8df61
docs: concepts and building_applications migration (#3534)
# What does this PR do?

- Migrates the remaining documentation sections to the new documentation format


## Test Plan

- Partial migration

2025-09-24 14:05:30 -07:00