Commit graph

26 commits

Author SHA1 Message Date
Xi Yan
a54d757ade merge 2025-03-23 15:48:14 -07:00
ehhuang
ea6a4a14ce
feat(api): simplify client imports (#1687)
# What does this PR do?
closes #1554 

## Test Plan
test_agents.py
2025-03-20 10:15:49 -07:00
Xi Yan
c1d18283d2
feat(eval api): (2.2/n) delete eval / scoring / scoring_fn apis (#1700)
# What does this PR do?
- To make it easier, delete existing `eval/scoring/scoring_function`
apis. There will be a bunch of broken impls here. The sequence is:
1. migrate benchmark graders
2. clean up existing scoring functions

- Add a skeleton evaluation impl to make tests pass. 

## Test Plan
tested in following PRs

[//]: # (## Documentation)
2025-03-19 11:04:23 -07:00
Sarthak Deshpande
9c8e88ea9c
fix: Fixed import errors for UI and playground (#1666)
# What does this PR do?
Fixed import errors for playground and ui

---------

Co-authored-by: sarthakdeshpande <sarthak.deshpande@engati.com>
2025-03-18 15:00:48 -07:00
Jamie Land
f4dc290705
feat: Created Playground Containerfile and Image Workflow (#1256)
# What does this PR do?
Adds a container file that can be used to build the playground UI.

This file will be built by this PR in the stack-ops repo:
https://github.com/meta-llama/llama-stack-ops/pull/9

Docker command in the docs will need to change once I know the address
of the official repository.

## Test Plan

Tested image on my local Openshift Instance using this helm chart:
https://github.com/Jaland/llama-stack-helm/tree/main/llama-stack

[//]: # (## Documentation)

---------

Co-authored-by: Jamie Land <hokie10@gmail.com>
2025-03-18 09:26:49 -07:00
Xi Yan
5287b437ae
feat(api): (1/n) datasets api clean up (#1573)
## PR Stack
- https://github.com/meta-llama/llama-stack/pull/1573
- https://github.com/meta-llama/llama-stack/pull/1625
- https://github.com/meta-llama/llama-stack/pull/1656
- https://github.com/meta-llama/llama-stack/pull/1657
- https://github.com/meta-llama/llama-stack/pull/1658
- https://github.com/meta-llama/llama-stack/pull/1659
- https://github.com/meta-llama/llama-stack/pull/1660

**Client SDK**
- https://github.com/meta-llama/llama-stack-client-python/pull/203

**CI**
- 1391130488
<img width="1042" alt="image"
src="https://github.com/user-attachments/assets/69636067-376d-436b-9204-896e2dd490ca"
/>
-- the test_rag_agent_with_attachments is flaky and not related to this
PR

## Doc
<img width="789" alt="image"
src="https://github.com/user-attachments/assets/b88390f3-73d6-4483-b09a-a192064e32d9"
/>


## Client Usage
```python
client.datasets.register(
    source={
        "type": "uri",
        "uri": "lsfs://mydata.jsonl",
    },
    schema="jsonl_messages",
    # optional 
    dataset_id="my_first_train_data"
)

# quick prototype debugging
client.datasets.register(
    data_reference={
        "type": "rows",
        "rows": [
                "messages": [...],
        ],
    },
    schema="jsonl_messages",
)
```

## Test Plan
- CI:
1387805545

```
LLAMA_STACK_CONFIG=fireworks pytest -v tests/integration/datasets/test_datasets.py
```

```
LLAMA_STACK_CONFIG=fireworks pytest -v tests/integration/scoring/test_scoring.py
```

```
pytest -v -s --nbval-lax ./docs/notebooks/Llama_Stack_Benchmark_Evals.ipynb
```
2025-03-17 16:55:45 -07:00
ehhuang
ca2910d27a
docs: update test_agents to use new Agent SDK API (#1402)
# Summary:
new Agent SDK API is added in
https://github.com/meta-llama/llama-stack-client-python/pull/178

Update docs and test to reflect this.

Closes https://github.com/meta-llama/llama-stack/issues/1365

# Test Plan:
```bash
py.test -v -s --nbval-lax ./docs/getting_started.ipynb

LLAMA_STACK_CONFIG=fireworks \
   pytest -s -v tests/integration/agents/test_agents.py \
  --safety-shield meta-llama/Llama-Guard-3-8B --text-model meta-llama/Llama-3.1-8B-Instruct
```
2025-03-06 15:21:12 -08:00
Ellis Tarn
24a27baf7c
chore: Make README code blocks more easily copy pastable (#1420)
# What does this PR do?
When going through READMEs, I found that I had to keep editing the code
blocks since they were prefixed with `$ `. A common pattern is to triple
click (highlight all) a block and then copy paste. This minor change
will make this easier for folks to follow the READMEs.

[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])

## Test Plan
N/A

[//]: # (## Documentation)
2025-03-05 09:11:01 -08:00
Xi Yan
e9a37bad63
chore: rename task_config to benchmark_config (#1397)
# What does this PR do?

- This was missed from previous deprecation:
https://github.com/meta-llama/llama-stack/pull/1186
- Part of https://github.com/meta-llama/llama-stack/issues/1396

[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])

## Test Plan
```
pytest -v -s --nbval-lax ./llama-stack/docs/notebooks/Llama_Stack_Benchmark_Evals.ipynb 
```

[//]: # (## Documentation)
2025-03-04 12:44:04 -08:00
Sébastien Han
6fa257b475
chore(lint): update Ruff ignores for project conventions and maintainability (#1184)
- Added new ignores from flake8-bugbear (`B007`, `B008`)
- Ignored `C901` (high function complexity) for now, pending review
- Maintained PyTorch conventions (`N812`, `N817`)
- Allowed `E731` (lambda assignments) for flexibility
- Consolidated existing ignores (`E402`, `E501`, `F405`, `C408`, `N812`)
- Documented rationale for each ignored rule

This keeps our linting aligned with project needs while tracking
potential fixes.

Signed-off-by: Sébastien Han <seb@redhat.com>

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-02-28 09:36:49 -08:00
ehhuang
c8a20b8ed0
feat: allow specifying specific tool within toolgroup (#1239)
Summary:

E.g. `builtin::rag::knowledge_search`

Test Plan:
```
LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/agents/ --safety-shield meta-llama/Llama-Guard-3-8B
```
2025-02-26 14:07:05 -08:00
Xi Yan
8b655e3cd2
fix!: update eval-tasks -> benchmarks (#1032)
# What does this PR do?

- Update `/eval-tasks` to `/benchmarks`
- ⚠️ Remove differentiation between `app` v.s. `benchmark` eval task
config. Now we only have `BenchmarkConfig`. The overloaded `benchmark`
is confusing and do not add any value. Backward compatibility is being
kept as the "type" is not being used anywhere.

[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])

## Test Plan
- This change is backward compatible 
- Run notebook test with

```
pytest -v -s --nbval-lax ./docs/getting_started.ipynb
pytest -v -s --nbval-lax ./docs/notebooks/Llama_Stack_Benchmark_Evals.ipynb
```

<img width="846" alt="image"
src="https://github.com/user-attachments/assets/d2fc06a7-593a-444f-bc1f-10ab9b0c843d"
/>



[//]: # (## Documentation)
[//]: # (- [ ] Added a Changelog entry if the change is significant)

---------

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
Signed-off-by: Ben Browning <bbrownin@redhat.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
Co-authored-by: Ben Browning <ben324@gmail.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Reid <61492567+reidliu41@users.noreply.github.com>
Co-authored-by: reidliu <reid201711@gmail.com>
Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>
2025-02-13 16:40:58 -08:00
Sébastien Han
e4a1579e63
build: format codebase imports using ruff linter (#1028)
# What does this PR do?

- Configured ruff linter to automatically fix import sorting issues.
- Set --exit-non-zero-on-fix to ensure non-zero exit code when fixes are
applied.
- Enabled the 'I' selection to focus on import-related linting rules.
- Ran the linter, and formatted all codebase imports accordingly.
- Removed the black dep from the "dev" group since we use ruff

Signed-off-by: Sébastien Han <seb@redhat.com>

[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])

## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]

[//]: # (## Documentation)
[//]: # (- [ ] Added a Changelog entry if the change is significant)

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-02-13 10:06:21 -08:00
Yuan Tang
34ab7a3b6c
Fix precommit check after moving to ruff (#927)
Lint check in main branch is failing. This fixes the lint check after we
moved to ruff in https://github.com/meta-llama/llama-stack/pull/921. We
need to move to a `ruff.toml` file as well as fixing and ignoring some
additional checks.

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-02-02 06:46:45 -08:00
snova-edwardm
22dc684da6
Sambanova inference provider (#555)
# What does this PR do?

This PR adds SambaNova as one of the Provider

- Add SambaNova as a provider

## Test Plan
Test the functional command
```
pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_embeddings.py llama_stack/providers/tests/inference/test_prompt_adapter.py llama_stack/providers/tests/inference/test_text_inference.py llama_stack/providers/tests/inference/test_vision_inference.py --env SAMBANOVA_API_KEY=<sambanova-api-key>
```

Test the distribution template:
```
# Docker
LLAMA_STACK_PORT=5001
docker run -it -p $LLAMA_STACK_PORT:$LLAMA_STACK_PORT \
  llamastack/distribution-sambanova \
  --port $LLAMA_STACK_PORT \
  --env SAMBANOVA_API_KEY=$SAMBANOVA_API_KEY

# Conda
llama stack build --template sambanova --image-type conda
llama stack run ./run.yaml \
  --port $LLAMA_STACK_PORT \
  --env SAMBANOVA_API_KEY=$SAMBANOVA_API_KEY
```

## Source
[SambaNova API Documentation](https://cloud.sambanova.ai/apis)

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [Y] Ran pre-commit to handle lint / formatting issues.
- [Y] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [Y] Updated relevant documentation.
- [Y ] Wrote necessary unit or integration tests.

---------

Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-01-23 12:20:28 -08:00
Ashwin Bharambe
f3d8864c36 Rename builtin::memory -> builtin::rag 2025-01-22 20:22:51 -08:00
Ashwin Bharambe
c9e5578151
[memory refactor][5/n] Migrate all vector_io providers (#835)
See https://github.com/meta-llama/llama-stack/issues/827 for the broader
design.

This PR finishes off all the stragglers and migrates everything to the
new naming.
2025-01-22 10:17:59 -08:00
Xi Yan
9d574f4aee
fix playground for v1 (#799)
# What does this PR do?

- update playground callsites for v1 api changes

## Test Plan

```
cd llama_stack/distribution/ui
streamlit run app.py
```


https://github.com/user-attachments/assets/eace11c6-600a-42dc-b4e7-6948a706509f




## Sources

Please link relevant resources if necessary.


## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-16 19:32:07 -08:00
Ashwin Bharambe
03ac84a829 Update default port from 5000 -> 8321 2025-01-16 15:26:48 -08:00
Hardik Shah
a51c8b4efc
Convert SamplingParams.strategy to a union (#767)
# What does this PR do?

Cleans up how we provide sampling params. Earlier, strategy was an enum
and all params (top_p, temperature, top_k) across all strategies were
grouped. We now have a strategy union object with each strategy (greedy,
top_p, top_k) having its corresponding params.
Earlier, 
```
class SamplingParams: 
    strategy: enum ()
    top_p, temperature, top_k and other params
```
However, the `strategy` field was not being used in any providers making
it confusing to know the exact sampling behavior purely based on the
params since you could pass temperature, top_p, top_k and how the
provider would interpret those would not be clear.

Hence we introduced -- a union where the strategy and relevant params
are all clubbed together to avoid this confusion.

Have updated all providers, tests, notebooks, readme and otehr places
where sampling params was being used to use the new format.
   

## Test Plan
`pytest llama_stack/providers/tests/inference/groq/test_groq_utils.py`
// inference on ollama, fireworks and together 
`with-proxy pytest -v -s -k "ollama"
--inference-model="meta-llama/Llama-3.1-8B-Instruct"
llama_stack/providers/tests/inference/test_text_inference.py `
// agents on fireworks 
`pytest -v -s -k 'fireworks and create_agent'
--inference-model="meta-llama/Llama-3.1-8B-Instruct"
llama_stack/providers/tests/agents/test_agents.py
--safety-shield="meta-llama/Llama-Guard-3-8B"`

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [X] Ran pre-commit to handle lint / formatting issues.
- [X] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [X] Updated relevant documentation.
- [X] Wrote necessary unit or integration tests.

---------

Co-authored-by: Hardik Shah <hjshah@fb.com>
2025-01-15 05:38:51 -08:00
Yuan Tang
c1987d6143
Fix failing flake8 E226 check (#701)
This fixes the pre-commit check when running locally (not sure why this
was not caught on CI check):

```
> pre-commit run --show-diff-on-failure --color=always --all-files
trim trailing whitespace.................................................Passed
check python ast.........................................................Passed
check for merge conflicts................................................Passed
check for added large files..............................................Passed
fix end of files.........................................................Passed
Insert license in comments...............................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

llama_stack/distribution/ui/page/evaluations/app_eval.py:132:65: E226 missing whitespace around arithmetic operator
llama_stack/distribution/ui/page/evaluations/native_eval.py:235:61: E226 missing whitespace around arithmetic operator
llama_stack/providers/utils/telemetry/trace_protocol.py:56:78: E226 missing whitespace around arithmetic operator


```

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-01-02 09:04:07 -08:00
Xi Yan
75e72cf2fc model_type=llm for filering available models for playground 2024-12-17 19:42:38 -08:00
Xi Yan
af8f1b3531 model selection playground fix 2024-12-17 18:13:52 -08:00
Xi Yan
7301403ce3
Add eval/scoring/datasetio API providers to distribution templates & UI developer guide (#564)
# What does this PR do?

- add /eval, /scoring, /datasetio API providers to distribution
templates
- regenerate build.yaml / run.yaml files
- fix `template.py` to take in list of providers instead of only first
one
- override memory provider as faiss default for all distro (as only 1
memory provider is needed to start basic flow, chromadb/pgvector need
additional setup step).
```
python llama_stack/scripts/distro_codegen.py
```

- updated README to start UI via conda builds. 

## Test Plan

```
python llama_stack/scripts/distro_codegen.py
```

- Use newly generated `run.yaml` to start server
```
llama stack run ./llama_stack/templates/together/run.yaml
```
<img width="1191" alt="image"
src="https://github.com/user-attachments/assets/62f7d179-0cd0-427c-b6e8-e087d4648f09">


#### Registration
```
❯ llama-stack-client datasets register \
--dataset-id "mmlu" \
--provider-id "huggingface" \
--url "https://huggingface.co/datasets/llamastack/evals" \
--metadata '{"path": "llamastack/evals", "name": "evals__mmlu__details", "split": "train"}' \
--schema '{"input_query": {"type": "string"}, "expected_answer": {"type": "string", "chat_completion_input": {"type": "string"}}}'
❯ llama-stack-client datasets list
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ identifier ┃ provider_id ┃ metadata                                ┃ type    ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ mmlu       │ huggingface │ {'path': 'llamastack/evals', 'name':    │ dataset │
│            │             │ 'evals__mmlu__details', 'split':        │         │
│            │             │ 'train'}                                │         │
└────────────┴─────────────┴─────────────────────────────────────────┴─────────┘
```

```
❯ llama-stack-client datasets register \
--dataset-id "simpleqa" \
--provider-id "huggingface" \
--url "https://huggingface.co/datasets/llamastack/evals" \
--metadata '{"path": "llamastack/evals", "name": "evals__simpleqa", "split": "train"}' \
--schema '{"input_query": {"type": "string"}, "expected_answer": {"type": "string", "chat_completion_input": {"type": "string"}}}'
❯ llama-stack-client datasets list
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ identifier ┃ provider_id ┃ metadata                                                      ┃ type    ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ mmlu       │ huggingface │ {'path': 'llamastack/evals', 'name': 'evals__mmlu__details',  │ dataset │
│            │             │ 'split': 'train'}                                             │         │
│ simpleqa   │ huggingface │ {'path': 'llamastack/evals', 'name': 'evals__simpleqa',       │ dataset │
│            │             │ 'split': 'train'}                                             │         │
└────────────┴─────────────┴───────────────────────────────────────────────────────────────┴─────────┘
```

```
❯ llama-stack-client eval_tasks register \
> --eval-task-id meta-reference-mmlu \
> --provider-id meta-reference \
> --dataset-id mmlu \
> --scoring-functions basic::regex_parser_multiple_choice_answer
❯ llama-stack-client eval_tasks register \
--eval-task-id meta-reference-simpleqa \
--provider-id meta-reference \
--dataset-id simpleqa \
--scoring-functions llm-as-judge::405b-simpleqa
❯ llama-stack-client eval_tasks list
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ dataset_id ┃ identifier       ┃ metadata ┃ provider_id    ┃ provider_resour… ┃ scoring_functio… ┃ type      ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ mmlu       │ meta-reference-… │ {}       │ meta-reference │ meta-reference-… │ ['basic::regex_… │ eval_task │
│ simpleqa   │ meta-reference-… │ {}       │ meta-reference │ meta-reference-… │ ['llm-as-judge:… │ eval_task │
└────────────┴──────────────────┴──────────┴────────────────┴──────────────────┴──────────────────┴───────────┘
```

#### Test with UI
```
streamlit run app.py
```

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2024-12-05 16:29:32 -08:00
Xi Yan
16769256b7
[llama stack ui] add native eval & inspect distro & playground pages (#541)
# What does this PR do?

New Pages Added: 

- (1) Inspect Distro
- (2) Evaluations: 
  - (a) native evaluations (including generation)
  - (b) application evaluations (no generation, scoring only)
- (3) Playground: 
  - (a) chat
  - (b) RAG  

## Test Plan

```
streamlit run app.py
```

#### Playground

https://github.com/user-attachments/assets/6ca617e8-32ca-49b2-9774-185020ff5204

#### Inspect

https://github.com/user-attachments/assets/01d52b2d-92af-4e3a-b623-a9b8ba22ba99


#### Evaluations (Generation + Scoring)

https://github.com/user-attachments/assets/345845c7-2a2b-4095-960a-9ae40f6a93cf

#### Evaluations (Scoring)

https://github.com/user-attachments/assets/6cc1659f-eba4-49ca-a0a5-7c243557b4f5


## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2024-12-04 09:47:09 -08:00
Xi Yan
b1a63df8cd
move playground ui to llama-stack repo (#536)
# What does this PR do?

- Move Llama Stack Playground UI to llama-stack repo under
llama_stack/distribution
- Original PR in llama-stack-apps:
https://github.com/meta-llama/llama-stack-apps/pull/127

## Test Plan
```
cd llama-stack/llama_stack/distribution/ui
streamlit run app.py
```


## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2024-11-26 22:04:21 -08:00