# What does this PR do?
Partial revert of fa68ded07c
This commit ensures users know where their new templates are generated
and how to run the newly built distro locally.
Discussion on Discord: 1351652390
## Test Plan
Did a local run; let me know if we want any unit tests covering this.

## Documentation
Updated "Zero to Hero" guide with new output
---------
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
Adding the nbformat version fixes this issue. I'm not sure exactly why
this is needed, but the version was written at the bottom of a notebook
file when I renamed it while trying to get to the bottom of this, and
when I reopened the file on GitHub the issue was no longer present.
Closes #1837
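For context, a notebook declares its schema version via top-level
`nbformat`/`nbformat_minor` keys in its JSON; a minimal sketch of
restoring them (the file path is illustrative):
```python
import json

path = "docs/zero_to_hero_guide/quickstart.ipynb"  # illustrative path, not the actual file
with open(path) as f:
    nb = json.load(f)

# GitHub's notebook renderer expects these top-level schema-version keys.
nb.setdefault("nbformat", 4)
nb.setdefault("nbformat_minor", 5)

with open(path, "w") as f:
    json.dump(nb, f, indent=1)
```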
## Test Plan
N/A
# What does this PR do?
It was changed in this PR:
https://github.com/meta-llama/llama-stack/pull/1190/files#diff-53e3f35ced54ee5e57dc8b0d3b04770ed84f2f6434c6f492f42569b3c2810ecd
## Test Plan
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
# What does this PR do?
This PR updates the version in the
[README.md](https://github.com/meta-llama/llama-stack/blob/main/docs/zero_to_hero_guide/README.md)
to reflect the latest changes in Llama Stack setup.
Previously, using **llama-stack==0.1.0** caused an error when running:
```bash
llama stack build --template ollama --image-type conda
```
Upgrading to llama-stack==0.1.3 resolves this issue.
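As a quick sanity check that the upgrade took effect (a sketch,
assuming the `packaging` helper is installed):
```python
# Verify the installed llama-stack is new enough for the build command above.
from importlib.metadata import version
from packaging.version import Version

assert Version(version("llama-stack")) >= Version("0.1.3")
```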
## Test Plan
- Verified that `llama stack build --template ollama --image-type conda`
works correctly.
---------
Signed-off-by: Surya Prakash Pathak <supathak@redhat.com>
# What does this PR do?
It was changed by
8585b95a28,
so clicking it now shows a `404`.
## Test Plan
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
# What does this PR do?
Corrects some typographical errors found in the
`docs/zero_to_hero_guide/README.md` file.
## Test Plan
N/A
Co-authored-by: Maxime Lecanu <mlecanu@fb.com>
# What does this PR do?
Fixes a missing `T` in an import statement.
## Test Plan
N/A (doc update)
## Before submitting
- [X] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
The Zero to Hero guide currently references the older llama-stack
0.0.61. Using the most recent stable release in the documentation helps
users avoid issues present in older llama-stack versions.
## Test Plan
I ran the workflow locally with the proposed version change and was
able to proceed without any issues.
## Before submitting
- [X] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
The lint check in the main branch is failing. This fixes the lint check
after we moved to ruff in
https://github.com/meta-llama/llama-stack/pull/921. We need to move to
a `ruff.toml` file as well as fix and ignore some additional checks.
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
# What does this PR do?
Cleans up how we provide sampling params. Earlier, strategy was an enum
and all params (top_p, temperature, top_k) across all strategies were
grouped. We now have a strategy union object with each strategy (greedy,
top_p, top_k) having its corresponding params.
Earlier,
```
class SamplingParams:
    strategy: SamplingStrategy  # an enum
    top_p: float
    temperature: float
    top_k: int
    # ... other params, shared across all strategies
```
However, the `strategy` field was not being used by any provider,
making it confusing to know the exact sampling behavior purely from the
params: you could pass temperature, top_p, and top_k, and how the
provider would interpret them was not clear.
Hence we introduced a union where each strategy and its relevant params
are grouped together to avoid this confusion.
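Roughly, the new shape looks like this (a sketch; the exact class
names, defaults, and extra fields in `llama_stack` may differ):
```python
from typing import Literal, Union
from pydantic import BaseModel

class GreedySamplingStrategy(BaseModel):
    type: Literal["greedy"] = "greedy"

class TopPSamplingStrategy(BaseModel):
    type: Literal["top_p"] = "top_p"
    temperature: float = 1.0
    top_p: float = 0.95

class TopKSamplingStrategy(BaseModel):
    type: Literal["top_k"] = "top_k"
    top_k: int = 40

# Each strategy now carries only the params that apply to it.
SamplingStrategy = Union[GreedySamplingStrategy, TopPSamplingStrategy, TopKSamplingStrategy]

class SamplingParams(BaseModel):
    strategy: SamplingStrategy = GreedySamplingStrategy()
    max_tokens: int = 0
```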
Have updated all providers, tests, notebooks, READMEs, and other places
where sampling params were being used to use the new format.
## Test Plan
`pytest llama_stack/providers/tests/inference/groq/test_groq_utils.py`
// inference on ollama, fireworks and together
`with-proxy pytest -v -s -k "ollama"
--inference-model="meta-llama/Llama-3.1-8B-Instruct"
llama_stack/providers/tests/inference/test_text_inference.py `
// agents on fireworks
`pytest -v -s -k 'fireworks and create_agent'
--inference-model="meta-llama/Llama-3.1-8B-Instruct"
llama_stack/providers/tests/agents/test_agents.py
--safety-shield="meta-llama/Llama-Guard-3-8B"`
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [X] Ran pre-commit to handle lint / formatting issues.
- [X] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [X] Updated relevant documentation.
- [X] Wrote necessary unit or integration tests.
---------
Co-authored-by: Hardik Shah <hjshah@fb.com>
# What does this PR do?
Rename environment var for consistency
## Test Plan
No regressions
## Before submitting
- [X] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [X] Ran pre-commit to handle lint / formatting issues.
- [X] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [X] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
---------
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>
# What does this PR do?
Pins the Zero to Hero guide to 0.0.61 and updates the README.
## Test Plan
- Ran an end-to-end test on the server and inference for 0.0.61
Server output:
<img width="670" alt="image"
src="https://github.com/user-attachments/assets/66515adf-102d-466d-b0ac-fa91568fcee6"
/>
## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [x] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
The current examples would cause a lot of unnecessary, painful
duplication when a bunch of custom tools are expected in a real use
case.
Also added `pip install -U httpx==0.27.2` to avoid an [httpx proxies
error](https://github.com/meta-llama/llama-stack-apps/issues/131) in
environments where httpx 0.28 or higher is installed by default.
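For context, a minimal illustration of the failure mode (not project
code): httpx 0.28 removed the long-deprecated `proxies` argument, so
code that still passes it breaks.
```python
import httpx

# Works on httpx <= 0.27.x (with a deprecation warning on 0.26/0.27);
# raises TypeError on httpx >= 0.28 because `proxies` was removed.
client = httpx.Client(proxies={"http://": "http://localhost:3128"})
```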
## Test Plan
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
Fixes the error when using the newer (v0.0.55-57) llama-stack-client
library with Together's stack service.
## Test Plan
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
Fixes the README to remove redundant information and adds
llama-stack-client CLI instructions.
## Before submitting
- [X] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [X] Ran pre-commit to handle lint / formatting issues.
- [X] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [X] Updated relevant documentation.
# What does this PR do?
Many of the URLs pointing to Llama Stack's Read the Docs pages were
broken, presumably due to a recent refactor of the documentation. This
PR fixes all affected URLs throughout the repository.
# What does this PR do?
- [x] Addresses issue (#548)
## Test Plan
No automated tests. Clicked on each link to ensure I was directed to the
right page.
## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [x] Updated relevant documentation.
- [ ] ~Wrote necessary unit or integration tests.~
# What does this PR do?
- updated the notebooks to reflect changes up to llama-stack 0.0.53
- updated the README to provide accurate and up-to-date info
- improved the current Zero to Hero guide by integrating an example
using the Together API
## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
---------
Co-authored-by: Sanyam Bhutani <sanyambhutani@meta.com>
# What does this PR do?
Fixes the instructions in the quickstart README so new
developers/users can run it without issues.
## Test Plan
None
## Before submitting
- [X] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [X] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [X] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
Co-authored-by: Henry Tai <henrytai@fb.com>
# What does this PR do?
It adds a complete zero-setup Colab using the Llama Stack server
implemented and powered by together.ai: using the Llama Stack Client
API to run inference and agents with the 3.2 models. Good for a
quick-start guide.
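A minimal sketch of the pattern the Colab follows (the endpoint URL and
model id here are illustrative, and client method names may differ
slightly across llama-stack-client versions):
```python
from llama_stack_client import LlamaStackClient

# Point the client at a hosted Llama Stack server (URL is illustrative).
client = LlamaStackClient(base_url="https://llama-stack.together.ai")

response = client.inference.chat_completion(
    model="Llama3.2-11B-Vision-Instruct",  # illustrative model id
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.completion_message.content)
```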
## Test Plan
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
This PR kills the notion of "ShieldType". The impetus for this is the
realization:
> Why is the keyword llama-guard appearing so many times everywhere,
sometimes with hyphens, sometimes with underscores?
Now that we have a notion of "provider-specific resource identifiers"
and "user-specific aliases" for them, and given that this already works
for models ("Llama3.1-8B-Instruct" <> "fireworks/llama-v3p1-..."), we
can follow the same rules for shields.
So each safety provider can define its own notion of registered
identifiers. This already happens correctly with Bedrock; we just
generalize it for Llama Guard, Prompt Guard, etc.
For Llama Guard, we further simplify by just adopting the underlying
model name itself as the identifier! No confusion necessary.
While doing this, I noticed a bug in our DistributionRegistry where we
weren't scoping identifiers by type. Fixed.
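A sketch of the fix (names are hypothetical, for illustration only):
key the registry by (type, identifier) so resources of different types
that share a name cannot collide.
```python
# Keying only by identifier lets a shield and a model that share a name
# overwrite each other; scoping the key by resource type avoids that.
registry: dict[tuple[str, str], object] = {}

def register(resource_type: str, identifier: str, obj: object) -> None:
    registry[(resource_type, identifier)] = obj

def get(resource_type: str, identifier: str) -> object | None:
    return registry.get((resource_type, identifier))
```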
## Feature/Issue validation/testing/test plan
Ran (inference, safety, memory, agents) tests with ollama and fireworks
providers.