Ellis Tarn 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								afca9d92f9 
								
							 
						 
						
							
							
								
								fix: Readthedocs cannot parse comments, resulting in docs bugs ( #1033 )  
							
							
							
						 
						
							2025-02-10 16:35:16 -05:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Hardik Shah 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a84e7669f0 
								
							 
						 
						
							
							
								
								feat: Add a new template for dell ( #978 )  
							
							... 
							
							
							
							- Added new template `dell` and its documentation 
- Update docs 
- [minor] uv fix i came across 
- codegen for all templates 
Tested with 
```bash
export INFERENCE_PORT=8181
export DEH_URL=http://0.0.0.0:$INFERENCE_PORT 
export INFERENCE_MODEL=meta-llama/Llama-3.1-8B-Instruct
export CHROMADB_HOST=localhost
export CHROMADB_PORT=6601
export CHROMA_URL=[http://$CHROMADB_HOST:$CHROMADB_PORT ](about:blank)
export CUDA_VISIBLE_DEVICES=0
export LLAMA_STACK_PORT=8321
# build the stack template 
llama stack build --template=dell 
# start the TGI inference server 
podman run --rm -it --network host -v $HOME/.cache/huggingface:/data -e HF_TOKEN=$HF_TOKEN -p $INFERENCE_PORT:$INFERENCE_PORT --gpus $CUDA_VISIBLE_DEVICES [ghcr.io/huggingface/text-generation-inference](http://ghcr.io/huggingface/text-generation-inference ) --dtype bfloat16 --usage-stats off --sharded false --cuda-memory-fraction 0.7 --model-id $INFERENCE_MODEL --port $INFERENCE_PORT --hostname 0.0.0.0
# start chroma-db for vector-io ( aka RAG )
podman run --rm -it --network host --name chromadb -v .:/chroma/chroma -e IS_PERSISTENT=TRUE chromadb/chroma:latest --port $CHROMADB_PORT --host $(hostname)
# build docker 
llama stack build --template=dell --image-type=container
# run llama stack server ( via docker )
podman run -it \
--network host \
-p $LLAMA_STACK_PORT:$LLAMA_STACK_PORT \
-v ~/.llama:/root/.llama \
# NOTE: mount the llama-stack / llama-model directories if testing local changes 
-v /home/hjshah/git/llama-stack:/app/llama-stack-source -v /home/hjshah/git/llama-models:/app/llama-models-source \ localhost/distribution-dell:dev \
--port $LLAMA_STACK_PORT  \
--env INFERENCE_MODEL=$INFERENCE_MODEL \
--env DEH_URL=$DEH_URL \
--env CHROMA_URL=$CHROMA_URL
# test the server 
cd <PATH_TO_LLAMA_STACK_REPO>
LLAMA_STACK_BASE_URL=http://0.0.0.0:$LLAMA_STACK_PORT  pytest -s -v tests/client-sdk/agents/test_agents.py
```
---------
Co-authored-by: Hardik Shah <hjshah@fb.com> 
							
						 
						
							2025-02-06 14:14:39 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Hardik Shah 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2cebb24d3a 
								
							 
						 
						
							
							
								
								Update doc templates for running safety on self-hosted templates ( #874 )  
							
							
							
						 
						
							2025-01-24 11:28:20 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Hardik Shah 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								25a70ca4dc 
								
							 
						 
						
							
							
								
								Fixed distro documentation ( #852 )  
							
							... 
							
							
							
							More docs 
							
						 
						
							2025-01-23 08:19:51 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								f3d8864c36 
								
							 
						 
						
							
							
								
								Rename builtin::memory -> builtin::rag  
							
							
							
						 
						
							2025-01-22 20:22:51 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c9e5578151 
								
							 
						 
						
							
							
								
								[memory refactor][5/n] Migrate all vector_io providers ( #835 )  
							
							... 
							
							
							
							See https://github.com/meta-llama/llama-stack/issues/827  for the broader
design.
This PR finishes off all the stragglers and migrates everything to the
new naming. 
							
						 
						
							2025-01-22 10:17:59 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Dinesh Yeduguru 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3d4c53dfec 
								
							 
						 
						
							
							
								
								add mcp runtime as default to all providers ( #816 )  
							
							... 
							
							
							
							# What does this PR do?
This is needed to have the notebook work with MCP 
							
						 
						
							2025-01-17 16:40:58 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									raghotham 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ff182ff6de 
								
							 
						 
						
							
							
								
								rename LLAMASTACK_PORT to LLAMA_STACK_PORT for consistency with other env vars ( #744 )  
							
							... 
							
							
							
							# What does this PR do?
Rename environment var for consistency
## Test Plan
No regressions
## Sources
## Before submitting
- [X] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [X] Ran pre-commit to handle lint / formatting issues.
- [X] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md ),
Pull Request section?
- [X] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
---------
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Co-authored-by: Yuan Tang <terrytangyuan@gmail.com> 
							
						 
						
							2025-01-10 11:09:49 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Dinesh Yeduguru 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a5c57cd381 
								
							 
						 
						
							
							
								
								agents to use tools api ( #673 )  
							
							... 
							
							
							
							# What does this PR do?
PR #639  introduced the notion of Tools API and ability to invoke tools
through API just as any resource. This PR changes the Agents to start
using the Tools API to invoke tools. Major changes include:
1) Ability to specify tool groups with AgentConfig
2) Agent gets the corresponding tool definitions for the specified tools
and pass along to the model
3) Attachements are now named as Documents and their behavior is mostly
unchanged from user perspective
4) You can specify args that can be injected to a tool call through
Agent config. This is especially useful in case of memory tool, where
you want the tool to operate on a specific memory bank.
5) You can also register tool groups with args, which lets the agent
inject these as well into the tool call.
6) All tests have been migrated to use new tools API and fixtures
including client SDK tests
7) Telemetry just works with tools API because of our trace protocol
decorator
## Test Plan
```
pytest -s -v -k fireworks llama_stack/providers/tests/agents/test_agents.py  \
   --safety-shield=meta-llama/Llama-Guard-3-8B \
   --inference-model=meta-llama/Llama-3.1-8B-Instruct
pytest -s -v -k together  llama_stack/providers/tests/tools/test_tools.py \
   --safety-shield=meta-llama/Llama-Guard-3-8B \
   --inference-model=meta-llama/Llama-3.1-8B-Instruct
LLAMA_STACK_CONFIG="/Users/dineshyv/.llama/distributions/llamastack-together/together-run.yaml" pytest -v tests/client-sdk/agents/test_agents.py
```
run.yaml:
https://gist.github.com/dineshyv/0365845ad325e1c2cab755788ccc5994 
Notebook:
https://colab.research.google.com/drive/1ck7hXQxRl6UvT-ijNRZ-gMZxH1G3cN2d?usp=sharing  
							
						 
						
							2025-01-08 19:01:00 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Xi Yan 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7301403ce3 
								
							 
						 
						
							
							
								
								Add eval/scoring/datasetio API providers to distribution templates & UI developer guide ( #564 )  
							
							... 
							
							
							
							# What does this PR do?
- add /eval, /scoring, /datasetio API providers to distribution
templates
- regenerate build.yaml / run.yaml files
- fix `template.py` to take in list of providers instead of only first
one
- override memory provider as faiss default for all distro (as only 1
memory provider is needed to start basic flow, chromadb/pgvector need
additional setup step).
```
python llama_stack/scripts/distro_codegen.py
```
- updated README to start UI via conda builds. 
## Test Plan
```
python llama_stack/scripts/distro_codegen.py
```
- Use newly generated `run.yaml` to start server
```
llama stack run ./llama_stack/templates/together/run.yaml
```
<img width="1191" alt="image"
src="https://github.com/user-attachments/assets/62f7d179-0cd0-427c-b6e8-e087d4648f09 ">
#### Registration
```
❯ llama-stack-client datasets register \
--dataset-id "mmlu" \
--provider-id "huggingface" \
--url "https://huggingface.co/datasets/llamastack/evals " \
--metadata '{"path": "llamastack/evals", "name": "evals__mmlu__details", "split": "train"}' \
--schema '{"input_query": {"type": "string"}, "expected_answer": {"type": "string", "chat_completion_input": {"type": "string"}}}'
❯ llama-stack-client datasets list
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ identifier ┃ provider_id ┃ metadata                                ┃ type    ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ mmlu       │ huggingface │ {'path': 'llamastack/evals', 'name':    │ dataset │
│            │             │ 'evals__mmlu__details', 'split':        │         │
│            │             │ 'train'}                                │         │
└────────────┴─────────────┴─────────────────────────────────────────┴─────────┘
```
```
❯ llama-stack-client datasets register \
--dataset-id "simpleqa" \
--provider-id "huggingface" \
--url "https://huggingface.co/datasets/llamastack/evals " \
--metadata '{"path": "llamastack/evals", "name": "evals__simpleqa", "split": "train"}' \
--schema '{"input_query": {"type": "string"}, "expected_answer": {"type": "string", "chat_completion_input": {"type": "string"}}}'
❯ llama-stack-client datasets list
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ identifier ┃ provider_id ┃ metadata                                                      ┃ type    ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ mmlu       │ huggingface │ {'path': 'llamastack/evals', 'name': 'evals__mmlu__details',  │ dataset │
│            │             │ 'split': 'train'}                                             │         │
│ simpleqa   │ huggingface │ {'path': 'llamastack/evals', 'name': 'evals__simpleqa',       │ dataset │
│            │             │ 'split': 'train'}                                             │         │
└────────────┴─────────────┴───────────────────────────────────────────────────────────────┴─────────┘
```
```
❯ llama-stack-client eval_tasks register \
> --eval-task-id meta-reference-mmlu \
> --provider-id meta-reference \
> --dataset-id mmlu \
> --scoring-functions basic::regex_parser_multiple_choice_answer
❯ llama-stack-client eval_tasks register \
--eval-task-id meta-reference-simpleqa \
--provider-id meta-reference \
--dataset-id simpleqa \
--scoring-functions llm-as-judge::405b-simpleqa
❯ llama-stack-client eval_tasks list
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ dataset_id ┃ identifier       ┃ metadata ┃ provider_id    ┃ provider_resour… ┃ scoring_functio… ┃ type      ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ mmlu       │ meta-reference-… │ {}       │ meta-reference │ meta-reference-… │ ['basic::regex_… │ eval_task │
│ simpleqa   │ meta-reference-… │ {}       │ meta-reference │ meta-reference-… │ ['llm-as-judge:… │ eval_task │
└────────────┴──────────────────┴──────────┴────────────────┴──────────────────┴──────────────────┴───────────┘
```
#### Test with UI
```
streamlit run app.py
```
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md ),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests. 
							
						 
						
							2024-12-05 16:29:32 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								2cfc41e13b 
								
							 
						 
						
							
							
								
								Mark some pages as not-in-toctree explicitly  
							
							
							
						 
						
							2024-11-23 15:27:44 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								2411a44833 
								
							 
						 
						
							
							
								
								Update more distribution docs to be simpler and partially codegen'ed  
							
							
							
						 
						
							2024-11-20 22:03:44 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Dinesh Yeduguru 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b3f9e8b2f2 
								
							 
						 
						
							
							
								
								Restructure docs ( #494 )  
							
							... 
							
							
							
							Rendered docs at: https://llama-stack.readthedocs.io/en/doc-simplify/  
							
						 
						
							2024-11-20 15:54:47 -08:00