286 commits

Author SHA1 Message Date
Justin Lee
e4865c3510
adding readme to docs folder for easier discoverability of notebooks … (#857)
as titled

<img width="454" alt="image"
src="https://github.com/user-attachments/assets/7579d1d2-06cd-48e4-9659-79ab1ec6a4c2"
/>
2025-01-28 04:58:46 -08:00
Chris Khanoyan
5b0d778871
Update index.md (#888)
Fixing the bullets

# What does this PR do?

The bullets were not there as intended so I helped fix them. 

- [x] Addresses issue (#issue)

## Test Plan

Please describe:

Ran the test, and the bullets are there now to be consistent with the
page.

## Sources

N/A

## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-28 04:55:41 -08:00
Ashwin Bharambe
e5936a8df8
Update discriminator to have the correct mapping (#881)
See
https://swagger.io/docs/specification/v3_0/data-models/inheritance-and-polymorphism/#discriminator

When specifying discriminators, a mapping must be specified unless the
value of the discriminator is the subtype name itself (which, in our
case, it is not).

The changes in the YAML are self-explanatory.
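
As a rough illustration of the rule cited above, the sketch below models a `oneOf` schema whose discriminator values differ from the subtype schema names, so an explicit `mapping` is needed. The schema names (`ImageContentItem`, `TextContentItem`) and the `resolve_subtype` helper are hypothetical and are not taken from the actual YAML.

```python
# Hypothetical OpenAPI fragment expressed as a Python dict.
# The discriminator values ("image", "text") are not the schema names,
# so `mapping` spells out which value selects which subtype.
schema = {
    "oneOf": [
        {"$ref": "#/components/schemas/ImageContentItem"},
        {"$ref": "#/components/schemas/TextContentItem"},
    ],
    "discriminator": {
        "propertyName": "type",
        "mapping": {
            "image": "#/components/schemas/ImageContentItem",
            "text": "#/components/schemas/TextContentItem",
        },
    },
}


def resolve_subtype(schema: dict, payload: dict) -> str:
    """Return the $ref of the subtype selected by the payload's discriminator value."""
    discriminator = schema["discriminator"]
    value = payload[discriminator["propertyName"]]
    mapping = discriminator.get("mapping", {})
    if value in mapping:
        return mapping[value]
    # Without a mapping entry, the value is treated as the schema name itself.
    return f"#/components/schemas/{value}"
```

With the mapping in place, a payload such as `{"type": "image"}` resolves to the `ImageContentItem` schema; without it, only a payload whose `type` equals the schema name would resolve.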
2025-01-27 09:18:13 -08:00
Bakunga Bronson
7de46e40f9
Fixed multiple typos (#878)
## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-24 14:45:43 -08:00
Bakunga Bronson
33113139e8
Fixed typo (#877)
## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-24 13:16:00 -08:00
Hardik Shah
2cebb24d3a
Update doc templates for running safety on self-hosted templates (#874) 2025-01-24 11:28:20 -08:00
Ashwin Bharambe
eaba6a550a Point to 0.1.0 release notes in docs 2025-01-24 10:00:16 -08:00
Ashwin Bharambe
19521cb22e More doc updates 2025-01-24 09:22:15 -08:00
Ashwin Bharambe
2118f37350 Doc updates 2025-01-23 21:31:18 -08:00
Ashwin Bharambe
9351a4b2d7 Update documentation 2025-01-23 17:10:57 -08:00
ehhuang
2fefe8dacd
Update 'first RAG agent' in gettingstarted doc (#867)
# What does this PR do?

Fix documentation to reflect new API


## Test Plan
Before:

User> What are the top 5 topics that were explained? Only list succinct
bullet points.
inference> I'm ready to help, but we haven't discussed any topics yet!
This is the start of our conversation. What would you like to talk
about? I can summarize our discussion at the end if you'd like.


Run with the change, observe relevant response

<img width="1029" alt="image"
src="https://github.com/user-attachments/assets/a7dece3c-e8b4-4a60-9092-ba544c87dffd"
/>




## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.

Co-authored-by: Eric Huang (AI Platform) <erichuang@fb.com>
2025-01-23 17:02:04 -08:00
Dinesh Yeduguru
ebffa15f40
update python sdk reference (#866)
# What does this PR do?

syncs changes from
https://github.com/stainless-sdks/llama-stack-python/blob/main/api.md
2025-01-23 16:04:06 -08:00
Dinesh Yeduguru
c570a708bf
update the client reference (#864)
# What does this PR do?

Syncs changes from
https://github.com/meta-llama/llama-stack-client-python/pull/96
2025-01-23 15:32:16 -08:00
Hardik Shah
94ffaf468c
More updates to ReadTheDocs (#861)
Improve Contributing section
2025-01-23 12:50:38 -08:00
Dinesh Yeduguru
7df40da5fa
sync readme.md to index.md (#860)
# What does this PR do?

README has some new content that is being synced to index.md
2025-01-23 12:43:09 -08:00
Hardik Shah
a6a4270eef
Updates to ReadTheDocs (#859)
Move the evals section under the AI Agents section, drop it from the
top level, and make other minor fixes
2025-01-23 12:42:15 -08:00
snova-edwardm
22dc684da6
Sambanova inference provider (#555)
# What does this PR do?

This PR adds SambaNova as an inference provider.

- Add SambaNova as a provider

## Test Plan
Run the functional tests:
```
pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_embeddings.py llama_stack/providers/tests/inference/test_prompt_adapter.py llama_stack/providers/tests/inference/test_text_inference.py llama_stack/providers/tests/inference/test_vision_inference.py --env SAMBANOVA_API_KEY=<sambanova-api-key>
```

Test the distribution template:
```
# Docker
LLAMA_STACK_PORT=5001
docker run -it -p $LLAMA_STACK_PORT:$LLAMA_STACK_PORT \
  llamastack/distribution-sambanova \
  --port $LLAMA_STACK_PORT \
  --env SAMBANOVA_API_KEY=$SAMBANOVA_API_KEY

# Conda
llama stack build --template sambanova --image-type conda
llama stack run ./run.yaml \
  --port $LLAMA_STACK_PORT \
  --env SAMBANOVA_API_KEY=$SAMBANOVA_API_KEY
```

## Source
[SambaNova API Documentation](https://cloud.sambanova.ai/apis)

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [x] Updated relevant documentation.
- [x] Wrote necessary unit or integration tests.

---------

Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-01-23 12:20:28 -08:00
Dinesh Yeduguru
86466b71a9
update docs for adding new API providers (#855)
# What does this PR do?

update docs for adding new API providers
![Screenshot 2025-01-23 at 11 21
42 AM](https://github.com/user-attachments/assets/0d4621d4-ef7e-43cd-9c4a-3e8e0b49242f)
2025-01-23 12:05:57 -08:00
Dinesh Yeduguru
d0be9288a3
Llama_Stack_Building_AI_Applications.ipynb -> getting_started.ipynb (#854)
Llama_Stack_Building_AI_Applications.ipynb -> getting_started.ipynb
2025-01-23 12:04:06 -08:00
Hardik Shah
74e933cbfd
More Updates to Read the Docs (#856) 2025-01-23 11:39:33 -08:00
Dinesh Yeduguru
8a686270e9
remove getting started notebook (#853)
# What does this PR do?

This notebook is no longer updated and we should be using
https://github.com/meta-llama/llama-stack/blob/main/docs/notebooks/Llama_Stack_Building_AI_Applications.ipynb
2025-01-23 10:09:09 -08:00
Hardik Shah
25a70ca4dc
Fixed distro documentation (#852)
More docs
2025-01-23 08:19:51 -08:00
raghotham
e44a1a68f1
Delete docs/to_situate directory (#851)
# What does this PR do?

No need for the cookbook now. Removing the folder


## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-23 07:15:47 -08:00
Sixian Yi
82a28f3a24
update doc for client-sdk testing (#849)
As title


## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-23 00:17:16 -08:00
Dinesh Yeduguru
28012c51bb
update docs for tools and telemetry (#846)
# What does this PR do?

Added a new Tools doc describing how to use tools and updated the main
building agents doc to point to the tools doc.
Also updated telemetry doc.


https://llama-stack.readthedocs.io/en/tools-doc/building_applications/tools.html
2025-01-22 22:50:29 -08:00
Ashwin Bharambe
35c71d5bbe
Update OpenAPI generator to output discriminator (#848)
oneOf should have discriminators so Stainless can generate better code

## Test Plan

Going to generate the SDK now and check.
2025-01-22 22:15:23 -08:00
Hardik Shah
65f07c3d63
Update Documentation (#838)
# What does this PR do?

Update README and other documentation


## Before submitting

- [X] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-22 20:38:52 -08:00
Ashwin Bharambe
f3d8864c36 Rename builtin::memory -> builtin::rag 2025-01-22 20:22:51 -08:00
Ashwin Bharambe
494e969f8d add a bunch of NBVAL SKIPs to unblock ugh 2025-01-22 15:28:45 -08:00
Ashwin Bharambe
82d942b501 Foo 2025-01-22 13:58:17 -08:00
Ashwin Bharambe
55d01339c2 Update notebook 2025-01-22 13:31:11 -08:00
Ashwin Bharambe
07b87365ab
[inference api] modify content types so they follow a more standard structure (#841)
Some small updates to the inference types to make them more standard

Specifically:
- image data is now located in an "image" subkey
- similarly, tool call data is located in a "tool_call" subkey

The pattern followed is `dict(type="foo", foo=<...>)`
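
A minimal sketch of that pattern, assuming plain dicts: the value of `type` names the subkey that holds the payload. The helper names and the payload fields (`url`, `tool_name`, `arguments`) are hypothetical; the commit message does not show the real field contents.

```python
from typing import Any


def make_content_item(kind: str, payload: Any) -> dict:
    """Build an item shaped like dict(type=kind, <kind>=payload)."""
    return {"type": kind, kind: payload}


def payload_of(item: dict) -> Any:
    """Read the payload back from the subkey named by the item's type."""
    return item[item["type"]]


# Hypothetical payloads, used only to show the shape of each item.
image_item = make_content_item("image", {"url": "https://example.com/cat.png"})
tool_call_item = make_content_item("tool_call", {"tool_name": "search", "arguments": {}})
```

Because the payload always lives under the key named by `type`, a consumer can dispatch on `type` and read the payload without special-casing each content kind.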
2025-01-22 12:16:18 -08:00
Ashwin Bharambe
a63a43c646
[memory refactor][6/n] Update naming and routes (#839)
Making a few small naming changes as per feedback:

- RAGToolRuntime methods are called `insert` and `query` to keep them
more general
- The tool names are changed to non-namespaced forms
`insert_into_memory` and `query_from_memory`
- The REST endpoints are more RESTful
2025-01-22 10:39:13 -08:00
Ashwin Bharambe
c9e5578151
[memory refactor][5/n] Migrate all vector_io providers (#835)
See https://github.com/meta-llama/llama-stack/issues/827 for the broader
design.

This PR finishes off all the stragglers and migrates everything to the
new naming.
2025-01-22 10:17:59 -08:00
Ashwin Bharambe
1a7490470a
[memory refactor][3/n] Introduce RAGToolRuntime as a specialized sub-protocol (#832)
See https://github.com/meta-llama/llama-stack/issues/827 for the broader
design.

Third part:
- we need to make `tool_runtime.rag_tool.query_context()` and
`tool_runtime.rag_tool.insert_documents()` methods work smoothly with
complete type safety. To that end, we introduce a sub-resource path
`tool-runtime/rag-tool/` and make changes to the resolver to make things
work.
- the PR updates the agents implementation to directly call these typed
APIs for memory accesses rather than going through the complex, untyped
"invoke_tool" API. The code looks much nicer and simpler (as expected).
- there are still a number of hacks in the server resolver
implementation; we will live with some and fix others

Note that we must make sure the client SDKs are able to handle this
subresource complexity also. Stainless has support for subresources, so
this should be possible but beware.
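
The difference between the typed sub-resource and the untyped "invoke_tool" path can be sketched roughly as follows. Only the names `RAGToolRuntime`, `rag_tool`, `insert_documents`, `query_context`, and `invoke_tool` come from the description above; the parameter names, the return types, and the `InMemoryRAGTool` and `ToolRuntime` classes are assumptions made to keep the example runnable, not the actual implementation.

```python
from typing import Any, Protocol


class RAGToolRuntime(Protocol):
    """Typed sub-protocol; parameter names and types here are assumptions."""

    def insert_documents(self, documents: list[str], bank_id: str) -> None: ...

    def query_context(self, query: str, bank_ids: list[str]) -> list[str]: ...


class InMemoryRAGTool:
    """Toy stand-in that satisfies the protocol structurally."""

    def __init__(self) -> None:
        self._banks: dict[str, list[str]] = {}

    def insert_documents(self, documents: list[str], bank_id: str) -> None:
        self._banks.setdefault(bank_id, []).extend(documents)

    def query_context(self, query: str, bank_ids: list[str]) -> list[str]:
        # Naive substring match, only to make the example runnable.
        needle = query.lower()
        return [
            doc
            for bank_id in bank_ids
            for doc in self._banks.get(bank_id, [])
            if needle in doc.lower()
        ]


class ToolRuntime:
    """Parent resource exposing the typed sub-resource as `rag_tool`."""

    def __init__(self, rag_tool: RAGToolRuntime) -> None:
        self.rag_tool = rag_tool

    def invoke_tool(self, tool_name: str, args: dict[str, Any]) -> Any:
        # The untyped path: callers pass a name and a loosely typed dict.
        return getattr(self.rag_tool, tool_name)(**args)


tool_runtime = ToolRuntime(InMemoryRAGTool())
tool_runtime.rag_tool.insert_documents(["Llama Stack docs"], bank_id="demo")
hits = tool_runtime.rag_tool.query_context("llama", bank_ids=["demo"])
```

Calls made through `tool_runtime.rag_tool` can be checked by a type checker, whereas `invoke_tool` only fails at run time if the name or arguments are wrong.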

## Test Plan

Our RAG test is weak (it doesn't actually check the RAG output), but I
verified that the implementation works. I will work on fixing the RAG
test afterwards.

```bash
pytest -s -v tests/agents/test_agents.py -k "rag and together" --safety-shield=meta-llama/Llama-Guard-3-8B
```
2025-01-22 10:04:16 -08:00
Dinesh Yeduguru
7a4b382ae9
add section for mcp tool usage in notebook (#831)
# What does this PR do?

Adds a section to the notebook on how to use tools hosted in an MCP server.


![Screenshot 2025-01-21 at 11 05
39 AM](https://github.com/user-attachments/assets/23e900f1-e2a7-4a46-be9b-13642753dca1)
Notebook:
https://colab.research.google.com/drive/1hBKX01NlG6p2BUrBU0ynwIlWjXQRxc3k?usp=sharing

Rendered notebook on this branch:
https://github.com/meta-llama/llama-stack/blob/mcp-notebook/docs/notebooks/Llama_Stack_Building_AI_Applications.ipynb
2025-01-21 13:10:42 -08:00
Dinesh Yeduguru
3d4c53dfec
add mcp runtime as default to all providers (#816)
# What does this PR do?

This is needed to have the notebook work with MCP
2025-01-17 16:40:58 -08:00
Yuan Tang
6da3053c0e
More generic image type for OCI-compliant container technologies (#802)
It's a more generic term and applies to alternatives to Docker, such
as Podman or other OCI-compliant technologies.

---------

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-01-17 16:37:42 -08:00
Xi Yan
9d005154d7
fix vllm template (#813)
# What does this PR do?

- Fix vLLM template to resolve
https://github.com/meta-llama/llama-stack/issues/805
- Fix agents test with shields

## Test Plan

```
vllm serve meta-llama/Llama-3.1-8B-Instruct
VLLM_URL="http://localhost:8000/v1" INFERENCE_MODEL="meta-llama/Llama-3.1-8B-Instruct" llama stack run ./llama_stack/templates/remote-vllm/run.yaml
```

```
LLAMA_STACK_BASE_URL=http://localhost:8321 pytest -v ./tests/client-sdk/
```

<img width="1245" alt="image"
src="https://github.com/user-attachments/assets/9af27684-5a9c-4187-b338-cbfc5211bd99"
/>


- custom tool flaky due to model outputs
- /completions API not implemented

**Vision Model**
- 11B-Vision-Instruct
<img width="1240" alt="image"
src="https://github.com/user-attachments/assets/1d3b3b17-fa09-43a7-b56c-3f77263825c5"
/>



## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-17 15:34:29 -08:00
Paul McCarthy
e1decaec9d
Fixing small typo in quick start guide (#807)
# What does this PR do?

Fixing small typo in the quick start guide

## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
2025-01-17 11:15:55 -08:00
Dinesh Yeduguru
53b5f6b24a
add json_schema_type to ParamType deps (#808)
# What does this PR do?

Add missing json_schema_type annotation to ParamType deps
2025-01-17 11:02:25 -08:00
Xi Yan
c2a072911d
fix eval notebook & add test to workflow (#803) 2025-01-16 23:11:21 -08:00
Xi Yan
d1f3b032c9
cerebras template update for memory (#792)
# What does this PR do?

- meta-reference is no longer available as a memory provider, so update
the cerebras template accordingly


## Test Plan

```
python llama_stack/scripts/distro_codegen.py
```


## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-16 16:07:53 -08:00
Ashwin Bharambe
03ac84a829 Update default port from 5000 -> 8321 2025-01-16 15:26:48 -08:00
Hardik Shah
f1faa9c924 pop fix 2025-01-16 14:09:59 -08:00
Dinesh Yeduguru
fcd1a57429 update notebook 2025-01-16 14:00:48 -08:00
Xi Yan
a6b9f2cec7
fix cerebras template (#790)
# What does this PR do?

- fix cerebras template

## Test Plan

```
llama stack build --template cerebras --image-type conda
llama stack run cerebras
LLAMA_STACK_BASE_URL="http://localhost:5000" pytest -v tests/client-sdk/ --html=report.html --self-contained-html
```


## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-16 13:53:06 -08:00
Dinesh Yeduguru
12c994b5b2
REST API fixes (#789)
# What does this PR do?

Client SDK fixes

## Test Plan


LLAMA_STACK_CONFIG="/Users/dineshyv/.llama/distributions/llamastack-fireworks/fireworks-run.yaml" pytest -v tests/client-sdk/safety/test_safety.py

LLAMA_STACK_CONFIG="/Users/dineshyv/.llama/distributions/llamastack-fireworks/fireworks-run.yaml" pytest -v tests/client-sdk/memory/test_memory.py
2025-01-16 13:47:08 -08:00
Dinesh Yeduguru
59eeaf7f81
Idiomatic REST API: Telemetry (#786)
# What does this PR do?

Changes Telemetry API to follow more idiomatic REST


- [ ] Addresses issue (#issue)


## Test Plan

TBD, once I get approval for the REST endpoints
2025-01-16 12:08:46 -08:00
Hardik Shah
74e4d520ac un-skip telemetry cells in notebook 2025-01-16 11:54:25 -08:00