Commit graph

678 commits

Author SHA1 Message Date
Xi Yan
26f578cc1d run-with-safety memory 2024-12-03 20:54:59 -08:00
Xi Yan
bc2452d2e9 fix 2024-12-03 20:52:20 -08:00
Xi Yan
a0d79c68f9 all distros 2024-12-03 20:51:21 -08:00
Xi Yan
7103892f54 all distros 2024-12-03 20:49:30 -08:00
Xi Yan
a097bfa761 override faiss memory provider only in run.yaml 2024-12-03 20:41:44 -08:00
Xi Yan
eeb914fe4d add all providers 2024-12-03 20:37:36 -08:00
Xi Yan
e6ed7eabbb add all providers 2024-12-03 20:31:34 -08:00
Xi Yan
1dc337db8b Merge branch 'playground-ui' into ui-compose 2024-12-03 19:12:16 -08:00
Xi Yan
2da1e742e8 Merge branch 'main' into playground-ui 2024-12-03 19:11:49 -08:00
Xi Yan
6e10d0b23e precommit 2024-12-03 18:52:43 -08:00
Xi Yan
fd19a8a517 add missing __init__ 2024-12-03 18:50:18 -08:00
Xi Yan
3fc6b10d22 autogen build/run 2024-12-03 17:04:35 -08:00
Xi Yan
95187891ca add eval provider to distro 2024-12-03 17:01:33 -08:00
Xi Yan
f32092178e native eval flow refactor 2024-12-03 16:29:43 -08:00
Xi Yan
92f79d4dfb expander refactor 2024-12-03 16:20:31 -08:00
Xi Yan
e245f459bb requirements 2024-12-03 16:05:01 -08:00
Matthew Farrellee
435f34b05e
reduce the accuracy requirements to pass the chat completion structured output test (#522)
i find `test_structured_output` to be flaky. it's both a functionality
and accuracy test -

```python
        answer = AnswerFormat.model_validate_json(response.completion_message.content)
        assert answer.first_name == "Michael"
        assert answer.last_name == "Jordan"
        assert answer.year_of_birth == 1963
        assert answer.num_seasons_in_nba == 15
```

it's an accuracy test because it checks the value of first/last name,
birth year, and num seasons.

i find that -
- llama-3.1-8b-instruct and llama-3.2-3b-instruct pass the functionality
portion
- llama-3.2-3b-instruct consistently fails the accuracy portion
(thinking MJ was in the NBA for 14 seasons)
- llama-3.1-8b-instruct occasionally fails the accuracy portion

suggestions (not mutually exclusive) -
1. turn the test into functionality only, skip the value checks
2. split the test into a functionality version and an xfail accuracy
version
3. add context to the prompt so the llm can answer without accessing
embedded memory

# What does this PR do?

implements option (3) by adding context to the system prompt.
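A minimal sketch of that approach, mirroring the pydantic model the test validates against (field values and the system-prompt wording here are illustrative, not the PR's exact text): put the facts in the system prompt so the model only has to copy them into the schema, then validate the structured response.

```python
from pydantic import BaseModel


class AnswerFormat(BaseModel):
    first_name: str
    last_name: str
    year_of_birth: int
    num_seasons_in_nba: int


# Supplying the facts up front turns the accuracy check into a pure
# functionality check: the model no longer needs to recall them.
system_prompt = (
    "Michael Jordan was born in 1963. "
    "He played basketball for the Chicago Bulls for 15 seasons."
)

# Stand-in for response.completion_message.content returned by the model.
raw_content = (
    '{"first_name": "Michael", "last_name": "Jordan", '
    '"year_of_birth": 1963, "num_seasons_in_nba": 15}'
)

answer = AnswerFormat.model_validate_json(raw_content)
assert answer.year_of_birth == 1963
```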


## Test Plan


`pytest -s -v ... llama_stack/providers/tests/inference/ ... -k
structured_output`


## Before submitting

- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [x] Updated relevant documentation.
- [x] Wrote necessary unit or integration tests.
2024-12-03 02:55:14 -08:00
Xi Yan
114595ce71 navigation 2024-12-02 20:11:32 -08:00
dltn
4c7b1a8fb3 Bump version to 0.0.57 2024-12-02 19:48:46 -08:00
Xi Yan
06b9566eb6 more msg 2024-12-02 16:14:04 -08:00
Dinesh Yeduguru
1e2faa461f
update client cli docs (#560)
Test plan: 
make html
sphinx-autobuild source build/html


![Screenshot 2024-12-02 at 3 32 18 PM](https://github.com/user-attachments/assets/061d5ca6-178f-463a-854c-acb96ca3bb0d)
2024-12-02 16:10:16 -08:00
Xi Yan
0e718b9712 native eval 2024-12-02 15:49:34 -08:00
Xi Yan
b59810cd9a native eval 2024-12-02 15:38:58 -08:00
Xi Yan
de2ab1243a native eval 2024-12-02 14:36:17 -08:00
Xi Yan
2f7e39fb10 fix 2024-12-02 13:20:23 -08:00
Xi Yan
6bdad37372 readme 2024-12-02 13:14:15 -08:00
Xi Yan
3335bcd83d cleanup 2024-12-02 13:12:44 -08:00
Xi Yan
7f2ed9622c cleanup 2024-12-02 13:06:36 -08:00
Xi Yan
9bceb1912e Merge branch 'main' into playground-ui 2024-12-02 12:44:50 -08:00
Aidan Do
6bcd1bd9f1
Fix broken Ollama link (#554)
# What does this PR do?

Fixes a broken Ollama link and formatting on this page:
https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/ollama.html

<img width="714" alt="Screenshot 2024-12-02 at 21 04 17"
src="https://github.com/user-attachments/assets/ada893c3-e1bd-4f04-826f-9ce1a11330a3">

<img width="822" alt="image"
src="https://github.com/user-attachments/assets/ab47cec3-3fcc-4671-92ae-febbc5003e6f">

To:

<img width="714" alt="Screenshot 2024-12-02 at 21 05 07"
src="https://github.com/user-attachments/assets/07a41653-1978-4472-bfa0-5f65dbf5cab5">

<img width="616" alt="image"
src="https://github.com/user-attachments/assets/dd0022e6-3468-4de0-bd55-c4ce2840c7d6">


## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).

Co-authored-by: Aidan Do <aidand@canva.com>
2024-12-02 11:06:20 -08:00
Dinesh Yeduguru
fe48b9fb8c Bump version to 0.0.56 2024-11-30 12:27:31 -08:00
raghotham
8a3887c7eb
Guide readme fix (#552)
# What does this PR do?
Fixes the readme to remove redundant information and add
llama-stack-client cli instructions.

## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [x] Updated relevant documentation.
2024-11-30 12:28:03 -06:00
Jeffrey Lind
2fc1c16d58
Fix Zero to Hero README.md Formatting (#546)
# What does this PR do?
This PR fixes the formatting issue in the Zero to Hero README.md shown
in the before/after screenshots below.

**Before**
<img width="1014" alt="Screenshot 2024-11-28 at 1 47 32 PM"
src="https://github.com/user-attachments/assets/02d2281e-83ae-43eb-a1c7-702bc2365120">

**After**
<img width="1014" alt="Screenshot 2024-11-28 at 1 50 19 PM"
src="https://github.com/user-attachments/assets/03e54f40-c347-4737-8b91-197eee70a52f">
2024-11-29 10:12:53 -06:00
Jeffrey Lind
5fc2ee6f77
Fix URLs to Llama Stack Read the Docs Webpages (#547)
# What does this PR do?

Many of the URLs pointing to Llama Stack's Read the Docs webpages
were broken, presumably due to a recent refactor of the documentation.
This PR fixes all affected URLs throughout the repository.
2024-11-29 10:11:50 -06:00
Sean
9088206eda
fix[documentation]: Update links to point to correct pages (#549)
# What does this PR do?


- [x] Addresses issue (#548)


## Test Plan

No automated tests. Clicked on each link to ensure I was directed to the
right page.

## Sources


## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [x] Updated relevant documentation.
- [ ] ~Wrote necessary unit or integration tests.~
2024-11-29 07:43:56 -06:00
Xi Yan
9bb6c1346b rag page 2024-11-27 16:56:57 -08:00
Xi Yan
2ecbbd92ed rag page 2024-11-27 16:52:07 -08:00
Xi Yan
5d9faca81b distribution inspect 2024-11-27 16:03:58 -08:00
Xi Yan
73335e4aaf playground 2024-11-27 15:31:57 -08:00
Xi Yan
68b70d1b1f playground 2024-11-27 15:27:10 -08:00
Xi Yan
125d98c9bd Merge branch 'main' into playground-ui 2024-11-27 15:12:29 -08:00
Xi Yan
c544e4b015 chat playground 2024-11-27 15:11:27 -08:00
Xi Yan
b1a63df8cd
move playground ui to llama-stack repo (#536)
# What does this PR do?

- Move Llama Stack Playground UI to llama-stack repo under
llama_stack/distribution
- Original PR in llama-stack-apps:
https://github.com/meta-llama/llama-stack-apps/pull/127

## Test Plan
```
cd llama-stack/llama_stack/distribution/ui
streamlit run app.py
```


## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2024-11-26 22:04:21 -08:00
Xi Yan
371259ca5b readme 2024-11-26 22:02:29 -08:00
Xi Yan
8840cf1d9a readme 2024-11-26 20:16:39 -08:00
Xi Yan
2c8a7a972c rename playground-ui -> ui 2024-11-26 20:15:41 -08:00
Xi Yan
d467638f26 move playground ui to llama-stack repo 2024-11-26 19:57:00 -08:00
Xi Yan
c2cfd2261e move playground ui to llama-stack repo 2024-11-26 19:54:24 -08:00
Matthew Farrellee
060b4eb776
allow env NVIDIA_BASE_URL to set NVIDIAConfig.url (#531)
# What does this PR do?

this allows setting an NVIDIA_BASE_URL variable to control the
NVIDIAConfig.url option
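The env-override pattern can be sketched as follows (the field name comes from the PR; the default endpoint and the dataclass shape are assumptions for illustration):

```python
import os
from dataclasses import dataclass, field


def _default_nvidia_url() -> str:
    # NVIDIA_BASE_URL, when set, overrides the built-in default endpoint.
    return os.environ.get("NVIDIA_BASE_URL", "https://integrate.api.nvidia.com")


@dataclass
class NVIDIAConfig:
    url: str = field(default_factory=_default_nvidia_url)


# With the variable set, the config picks it up at construction time.
os.environ["NVIDIA_BASE_URL"] = "http://localhost:8000"
cfg = NVIDIAConfig()
assert cfg.url == "http://localhost:8000"
```

Using `default_factory` (rather than a module-level default) means the environment is read each time a config is created, which matches the `--env NVIDIA_BASE_URL=...` test invocation above.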


## Test Plan

`pytest -s -v --providers inference=nvidia
llama_stack/providers/tests/inference/ --env
NVIDIA_BASE_URL=http://localhost:8000`


## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2024-11-26 17:46:44 -08:00
Xi Yan
50cc165077
fixes tests & move braintrust api_keys to request headers (#535)
# What does this PR do?

- braintrust scoring provider requires OPENAI_API_KEY env variable to be
set
- move this to be able to be set as request headers (e.g. like together
/ fireworks api keys)
- fixes pytest with agents dependency

## Test Plan

**E2E**
```
llama stack run 
```
```yaml
scoring:
  - provider_id: braintrust-0
    provider_type: inline::braintrust
    config: {}
```

**Client**
```python
self.client = LlamaStackClient(
    base_url=os.environ.get("LLAMA_STACK_ENDPOINT", "http://localhost:5000"),
    provider_data={
        "openai_api_key": os.environ.get("OPENAI_API_KEY", ""),
    },
)
```
- run `llama-stack-client eval run_scoring`

**Unit Test**
```
pytest -v -s -m meta_reference_eval_together_inference eval/test_eval.py
```

```
pytest -v -s -m braintrust_scoring_together_inference scoring/test_scoring.py --env OPENAI_API_KEY=$OPENAI_API_KEY
```
<img width="745" alt="image"
src="https://github.com/user-attachments/assets/68f5cdda-f6c8-496d-8b4f-1b3dabeca9c2">

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2024-11-26 13:11:21 -08:00