Commit graph

15 commits

Author SHA1 Message Date
Sébastien Han
6371bb1b33
chore(refact)!: simplify config management (#1105)
# What does this PR do?

We are dropping configuration via CLI flag almost entirely. If any
server configuration has to be tweak it must be done through the server
section in the run.yaml.

This is unfortunately a breaking change for whover was using:

* `--tls-*`
* `--disable_ipv6`

`--port` stays around and get a special treatment since we believe, it's
common for user dev to change port for quick experimentations.

Closes: https://github.com/meta-llama/llama-stack/issues/1076

## Test Plan

Simply do `llama stack run <config>` nothing should break :)

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-05-07 09:18:12 -07:00
Ashwin Bharambe
272d3359ee
fix: remove code interpeter implementation (#2087)
# What does this PR do?

The builtin implementation of code interpreter is not robust and has a
really weak sandboxing shell (the `bubblewrap` container). Given the
availability of better MCP code interpreter servers coming up, we should
use them instead of baking an implementation into the Stack and
expanding the vulnerability surface to the rest of the Stack.

This PR only does the removal. We will add examples with how to
integrate with MCPs in subsequent ones.

## Test Plan

Existing tests.
2025-05-01 14:35:08 -07:00
Sébastien Han
4412694018
chore: Remove zero-width space characters from OTEL service name env var defaults (#2060)
# What does this PR do?

Replaced `${env.OTEL_SERVICE_NAME:\u200B}` and similar variants with
properly formatted `${env.OTEL_SERVICE_NAME:}` across all YAML templates
and TelemetryConfig. This prevents silent parsing issues and ensures
consistent environment variable resolution.
Slipped in https://github.com/meta-llama/llama-stack/pull/2058

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-04-30 17:56:46 +02:00
Roland Huß
5a2bfd6ad5
refactor: Replace SQLITE_DB_PATH by SQLITE_STORE_DIR env in templates (#2055)
# What does this PR do?

The telemetry provider configs is the only one who leverages the env var
`SQLITE_DB_PATH` for pointing to persistent data in the respective
templates, whereas usually `SQLITE_STORE_DIR` is used.

This PR modifies the `sqlite_db_path` in various telemetry configuration
files to use the environment variable `SQLITE_STORE_DIR` instead of
`SQLITE_DB_PATH`. This change ensures that _only_ the SQLITE_STORE_DIR
needs to be set to point to a different persistence location for
providers.

All references to `SQLITE_DB_PATH` have been removed.

Another improvement could be to move `sqlite_db_path` to `db_path` in
the telemetry provider config, to align with the other provider
configurations. That could be done by another PR (if wanted).
2025-04-29 15:28:10 -07:00
ehhuang
7b4eb0967e
test: verification on provider's OAI endpoints (#1893)
# What does this PR do?


## Test Plan
export MODEL=accounts/fireworks/models/llama4-scout-instruct-basic;
LLAMA_STACK_CONFIG=verification pytest -s -v tests/integration/inference
--vision-model $MODEL --text-model $MODEL
2025-04-07 23:06:28 -07:00
ehhuang
2f38851751
chore: Revert "chore(telemetry): remove service_name entirely" (#1785)
Reverts meta-llama/llama-stack#1755 closes #1781
2025-03-25 14:42:05 -07:00
ehhuang
b9fbfed216
chore(telemetry): remove service_name entirely (#1755)
# What does this PR do?


## Test Plan

LLAMA_STACK_CONFIG=dev pytest -s -v
tests/integration/agents/test_agents.py::test_custom_tool
--safety-shield meta-llama/Llama-Guard-3-8B --text-model
accounts/fireworks/models/llama-v3p1-8b-instruct

and verify trace in jaeger UI
https://llama-stack.readthedocs.io/en/latest/building_applications/telemetry.html#
2025-03-21 15:11:56 -07:00
ehhuang
34f89bfbd6
feat(telemetry): use zero-width space to avoid clutter (#1754)
# What does this PR do?
Before 
<img width="858" alt="image"
src="https://github.com/user-attachments/assets/6cefb1ae-5603-4818-85ea-a0c337b986bc"
/>

Note the redundant 'llama-stack' in front of every span

## Test Plan
<img width="1171" alt="image"
src="https://github.com/user-attachments/assets/bdc5fd5b-ff1f-4f10-8b40-cff2ea93dd1f"
/>
2025-03-21 12:02:10 -07:00
Hardik Shah
127bac6869
fix: Default to port 8321 everywhere (#1734)
As titled, moved all instances of 5001 to 8321
2025-03-20 15:50:41 -07:00
Ashwin Bharambe
d072b5fa0c
test: add unit test to ensure all config types are instantiable (#1601) 2025-03-12 22:29:58 -07:00
Dinesh Yeduguru
85501ed875
fix: remove Llama-3.2-1B-Instruct for fireworks (#1558)
# What does this PR do?
remove Llama-3.2-1B-Instruct for fireworks as its no longer appears to
be hosted on website.


## Test Plan

python distro_codegen.py
2025-03-11 11:19:29 -07:00
Ashwin Bharambe
dd0db8038b
refactor(test): unify vector_io tests and make them configurable (#1398)
## Test Plan


`LLAMA_STACK_CONFIG=inference=sentence-transformers,vector_io=sqlite-vec
pytest -s -v test_vector_io.py --embedding-model all-miniLM-L6-V2
--inference-model='' --vision-inference-model=''`

```
test_vector_io.py::test_vector_db_retrieve[txt=:vis=:emb=all-miniLM-L6-V2] PASSED
test_vector_io.py::test_vector_db_register[txt=:vis=:emb=all-miniLM-L6-V2] PASSED
test_vector_io.py::test_insert_chunks[txt=:vis=:emb=all-miniLM-L6-V2-test_case0] PASSED
test_vector_io.py::test_insert_chunks[txt=:vis=:emb=all-miniLM-L6-V2-test_case1] PASSED
test_vector_io.py::test_insert_chunks[txt=:vis=:emb=all-miniLM-L6-V2-test_case2] PASSED
test_vector_io.py::test_insert_chunks[txt=:vis=:emb=all-miniLM-L6-V2-test_case3] PASSED
test_vector_io.py::test_insert_chunks[txt=:vis=:emb=all-miniLM-L6-V2-test_case4] PASSED
```

Same thing with:
- LLAMA_STACK_CONFIG=inference=sentence-transformers,vector_io=faiss
- LLAMA_STACK_CONFIG=fireworks

(Note that ergonomics will soon be improved re: cmd-line options and env
variables)
2025-03-04 13:37:45 -08:00
Ashwin Bharambe
04de2f84e9
fix: register provider model name and HF alias in run.yaml (#1304)
Each model known to the system has two identifiers: 

- the `provider_resource_id` (what the provider calls it) -- e.g.,
`accounts/fireworks/models/llama-v3p1-8b-instruct`
- the `identifier` (`model_id`) under which it is registered and gets
routed to the appropriate provider.

We have so far used the HuggingFace repo alias as the standardized
identifier you can use to refer to the model. So in the above example,
we'd use `meta-llama/Llama-3.1-8B-Instruct` as the name under which it
gets registered. This makes it convenient for users to refer to these
models across providers.

However, we forgot to register the _actual_ provider model ID also. You
should be able to route via `provider_resource_id` also, of course.

This change fixes this (somewhat grave) omission.

*Note*: this change is additive -- more aliases work now compared to
before.

## Test Plan

Run the following for distro=(ollama fireworks together)
```
LLAMA_STACK_CONFIG=$distro \
   pytest -s -v tests/client-sdk/inference/test_text_inference.py \
   --inference-model=meta-llama/Llama-3.1-8B-Instruct --vision-inference-model=""
```
2025-02-27 16:39:23 -08:00
Ashwin Bharambe
63e6acd0c3
feat: add (openai, anthropic, gemini) providers via litellm (#1267)
# What does this PR do?

This PR introduces more non-llama model support to llama stack.
Providers introduced: openai, anthropic and gemini. All of these
providers use essentially the same piece of code -- the implementation
works via the `litellm` library.

We will expose only specific models for providers we enable making sure
they all work well and pass tests. This setup (instead of automatically
enabling _all_ providers and models allowed by LiteLLM) ensures we can
also perform any needed prompt tuning on a per-model basis as needed
(just like we do it for llama models.)

## Test Plan

```bash
#!/bin/bash

args=("$@")
for model in openai/gpt-4o anthropic/claude-3-5-sonnet-latest gemini/gemini-1.5-flash; do
    LLAMA_STACK_CONFIG=dev pytest -s -v tests/client-sdk/inference/test_text_inference.py \
        --embedding-model=all-MiniLM-L6-v2 \
        --vision-inference-model="" \
        --inference-model=$model "${args[@]}"
done
```
2025-02-25 22:07:33 -08:00
Ashwin Bharambe
9b0f783e54
test: add a ci-tests distro template for running e2e tests (#1237) 2025-02-24 14:43:21 -08:00