llama-stack

forked from phoenix-oss/llama-stack-mirror

Author	SHA1	Message	Date
Justin Lee	e4865c3510	adding readme to docs folder for easier discoverability of notebooks … (#857 ) as titled	2025-01-28 04:58:46 -08:00
Chris Khanoyan	5b0d778871	Update index.md (#888 ) Fixing the bullets # What does this PR do? The bullets were not there as intended so I helped fix them. - [x] Addresses issue (#issue) ## Test Plan Please describe: Ran the test, and the bullets are there now to be consistent with the page. ## Sources N/A ## Before submitting - [x] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [ ] Ran pre-commit to handle lint / formatting issues. - [ ] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [ ] Updated relevant documentation. - [ ] Wrote necessary unit or integration tests.	2025-01-28 04:55:41 -08:00
Ashwin Bharambe	e5936a8df8	Update discriminator to have the correct `mapping` (#881 ) See https://swagger.io/docs/specification/v3_0/data-models/inheritance-and-polymorphism/#discriminator When specifying discriminators, mapping must be specified unless the value of the discriminator is the subtype itself (which, in our case, it is not). The changes in the YAML are self-explanatory.	2025-01-27 09:18:13 -08:00
Bakunga Bronson	7de46e40f9	Fixed multiple typos (#878 )	2025-01-24 14:45:43 -08:00
Bakunga Bronson	33113139e8	Fixed typo (#877 )	2025-01-24 13:16:00 -08:00
Hardik Shah	2cebb24d3a	Update doc templates for running safety on self-hosted templates (#874 )	2025-01-24 11:28:20 -08:00
Ashwin Bharambe	eaba6a550a	Point to 0.1.0 release notes in docs	2025-01-24 10:00:16 -08:00
Ashwin Bharambe	19521cb22e	More doc updates	2025-01-24 09:22:15 -08:00
Ashwin Bharambe	2118f37350	Doc updates	2025-01-23 21:31:18 -08:00
Ashwin Bharambe	9351a4b2d7	Update documentation	2025-01-23 17:10:57 -08:00
ehhuang	2fefe8dacd	Update 'first RAG agent' in gettingstarted doc (#867 ) # What does this PR do? Fix documentation to reflect new API ## Test Plan Before: User> What are the top 5 topics that were explained? Only list succinct bullet points. inference> I'm ready to help, but we haven't discussed any topics yet! This is the start of our conversation. What would you like to talk about? I can summarize our discussion at the end if you'd like. Run with the change, observe relevant response Co-authored-by: Eric Huang (AI Platform) <erichuang@fb.com>	2025-01-23 17:02:04 -08:00
Dinesh Yeduguru	ebffa15f40	update python sdk reference (#866 ) # What does this PR do? syncs changes from https://github.com/stainless-sdks/llama-stack-python/blob/main/api.md	2025-01-23 16:04:06 -08:00
Dinesh Yeduguru	c570a708bf	update the client reference (#864 ) # What does this PR do? Syncs changes from https://github.com/meta-llama/llama-stack-client-python/pull/96	2025-01-23 15:32:16 -08:00
Hardik Shah	94ffaf468c	More updates to ReadTheDocs (#861 ) Improve Contributing section	2025-01-23 12:50:38 -08:00
Dinesh Yeduguru	7df40da5fa	sync readme.md to index.md (#860 ) # What does this PR do? README has some new content that is being synced to index.md	2025-01-23 12:43:09 -08:00
Hardik Shah	a6a4270eef	Updates to ReadTheDocs (#859 ) Move evals section to AI Agents section drop from top level and other minor fixes	2025-01-23 12:42:15 -08:00
snova-edwardm	22dc684da6	Sambanova inference provider (#555 ) # What does this PR do? This PR adds SambaNova as one of the Provider - Add SambaNova as a provider ## Test Plan Test the functional command ``` pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_embeddings.py llama_stack/providers/tests/inference/test_prompt_adapter.py llama_stack/providers/tests/inference/test_text_inference.py llama_stack/providers/tests/inference/test_vision_inference.py --env SAMBANOVA_API_KEY=<sambanova-api-key> ``` Test the distribution template: ``` # Docker LLAMA_STACK_PORT=5001 docker run -it -p $LLAMA_STACK_PORT:$LLAMA_STACK_PORT \ llamastack/distribution-sambanova \ --port $LLAMA_STACK_PORT \ --env SAMBANOVA_API_KEY=$SAMBANOVA_API_KEY # Conda llama stack build --template sambanova --image-type conda llama stack run ./run.yaml \ --port $LLAMA_STACK_PORT \ --env SAMBANOVA_API_KEY=$SAMBANOVA_API_KEY ``` ## Source [SambaNova API Documentation](https://cloud.sambanova.ai/apis) ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [Y] Ran pre-commit to handle lint / formatting issues. - [Y] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [Y] Updated relevant documentation. - [Y ] Wrote necessary unit or integration tests. --------- Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-01-23 12:20:28 -08:00
Dinesh Yeduguru	86466b71a9	update docs for adding new API providers (#855 ) # What does this PR do? update docs for adding new API providers	2025-01-23 12:05:57 -08:00
Dinesh Yeduguru	d0be9288a3	Llama_Stack_Building_AI_Applications.ipynb -> getting_started.ipynb (#854 ) Llama_Stack_Building_AI_Applications.ipynb -> getting_started.ipynb	2025-01-23 12:04:06 -08:00
Hardik Shah	74e933cbfd	More Updates to Read the Docs (#856 )	2025-01-23 11:39:33 -08:00
Dinesh Yeduguru	8a686270e9	remove getting started notebook (#853 ) # What does this PR do? This notebook is no longer updated and we should be using https://github.com/meta-llama/llama-stack/blob/main/docs/notebooks/Llama_Stack_Building_AI_Applications.ipynb	2025-01-23 10:09:09 -08:00
Hardik Shah	25a70ca4dc	Fixed distro documentation (#852 ) More docs	2025-01-23 08:19:51 -08:00
raghotham	e44a1a68f1	Delete docs/to_situate directory (#851 ) # What does this PR do? No need for the cookbook now. Removing the folder	2025-01-23 07:15:47 -08:00
Sixian Yi	82a28f3a24	update doc for client-sdk testing (#849 ) As title	2025-01-23 00:17:16 -08:00
Dinesh Yeduguru	28012c51bb	update docs for tools and telemetry (#846 ) # What does this PR do? Added a new Tools doc describing how to use tools and updated the main building agents doc to point to the tools doc. Also updated telemetry doc. https://llama-stack.readthedocs.io/en/tools-doc/building_applications/tools.html	2025-01-22 22:50:29 -08:00
Ashwin Bharambe	35c71d5bbe	Update OpenAPI generator to output discriminator (#848 ) oneOf should have discriminators so Stainless can generate better code ## Test Plan Going to generate the SDK now and check.	2025-01-22 22:15:23 -08:00
Hardik Shah	65f07c3d63	Update Documentation (#838 ) # What does this PR do? Update README and other documentation	2025-01-22 20:38:52 -08:00
Ashwin Bharambe	f3d8864c36	Rename builtin::memory -> builtin::rag	2025-01-22 20:22:51 -08:00
Ashwin Bharambe	494e969f8d	add a bunch of NBVAL SKIPs to unblock ugh	2025-01-22 15:28:45 -08:00
Ashwin Bharambe	82d942b501	Foo	2025-01-22 13:58:17 -08:00
Ashwin Bharambe	55d01339c2	Update notebook	2025-01-22 13:31:11 -08:00
Ashwin Bharambe	07b87365ab	[inference api] modify content types so they follow a more standard structure (#841 ) Some small updates to the inference types to make them more standard Specifically: - image data is now located in a "image" subkey - similarly tool call data is located in a "tool_call" subkey The pattern followed is `dict(type="foo", foo=<...>)`	2025-01-22 12:16:18 -08:00
Ashwin Bharambe	a63a43c646	[memory refactor][6/n] Update naming and routes (#839 ) Making a few small naming changes as per feedback: - RAGToolRuntime methods are called `insert` and `query` to keep them more general - The tool names are changed to non-namespaced forms `insert_into_memory` and `query_from_memory` - The REST endpoints are more REST-ful	2025-01-22 10:39:13 -08:00
Ashwin Bharambe	c9e5578151	[memory refactor][5/n] Migrate all vector_io providers (#835 ) See https://github.com/meta-llama/llama-stack/issues/827 for the broader design. This PR finishes off all the stragglers and migrates everything to the new naming.	2025-01-22 10:17:59 -08:00
Ashwin Bharambe	1a7490470a	[memory refactor][3/n] Introduce RAGToolRuntime as a specialized sub-protocol (#832 ) See https://github.com/meta-llama/llama-stack/issues/827 for the broader design. Third part: - we need to make `tool_runtime.rag_tool.query_context()` and `tool_runtime.rag_tool.insert_documents()` methods work smoothly with complete type safety. To that end, we introduce a sub-resource path `tool-runtime/rag-tool/` and make changes to the resolver to make things work. - the PR updates the agents implementation to directly call these typed APIs for memory accesses rather than going through the complex, untyped "invoke_tool" API. the code looks much nicer and simpler (expectedly.) - there are a number of hacks in the server resolver implementation still, we will live with some and fix some Note that we must make sure the client SDKs are able to handle this subresource complexity also. Stainless has support for subresources, so this should be possible but beware. ## Test Plan Our RAG test is sad (doesn't actually test for actual RAG output) but I verified that the implementation works. I will work on fixing the RAG test afterwards. ```bash pytest -s -v tests/agents/test_agents.py -k "rag and together" --safety-shield=meta-llama/Llama-Guard-3-8B ```	2025-01-22 10:04:16 -08:00
Dinesh Yeduguru	7a4b382ae9	add section for mcp tool usage in notebook (#831 ) # What does this PR do? Adds a section to the notebook on how to use tools hosted in MCP server. Notebook: https://colab.research.google.com/drive/1hBKX01NlG6p2BUrBU0ynwIlWjXQRxc3k?usp=sharing Rendered notebook on this branch: https://github.com/meta-llama/llama-stack/blob/mcp-notebook/docs/notebooks/Llama_Stack_Building_AI_Applications.ipynb	2025-01-21 13:10:42 -08:00
Dinesh Yeduguru	3d4c53dfec	add mcp runtime as default to all providers (#816 ) # What does this PR do? This is needed to have the notebook work with MCP	2025-01-17 16:40:58 -08:00
Yuan Tang	6da3053c0e	More generic image type for OCI-compliant container technologies (#802 ) It's a more generic term and applicable to alternatives of Docker, such as Podman or other OCI-compliant technologies. --------- Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2025-01-17 16:37:42 -08:00
Xi Yan	9d005154d7	fix vllm template (#813 ) # What does this PR do? - Fix vLLM template to resolve https://github.com/meta-llama/llama-stack/issues/805 - Fix agents test with shields ## Test Plan ``` vllm serve meta-llama/Llama-3.1-8B-Instruct VLLM_URL="http://localhost:8000/v1" INFERENCE_MODEL="meta-llama/Llama-3.1-8B-Instruct" llama stack run ./llama_stack/templates/remote-vllm/run.yaml ``` ``` LLAMA_STACK_BASE_URL=http://localhost:8321 pytest -v ./tests/client-sdk/ ``` - custom tool flaky due to model outputs - /completions API not implemented Vision Model - 11B-Vision-Instruct	2025-01-17 15:34:29 -08:00
Paul McCarthy	e1decaec9d	Fixing small typo in quick start guide (#807 ) # What does this PR do? Fixing small typo in the quick start guide ## Before submitting - [x] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).	2025-01-17 11:15:55 -08:00
Dinesh Yeduguru	53b5f6b24a	add json_schema_type to ParamType deps (#808 ) # What does this PR do? Add missing json_schema_type annotation to ParamType deps	2025-01-17 11:02:25 -08:00
Xi Yan	c2a072911d	fix eval notebook & add test to workflow (#803 )	2025-01-16 23:11:21 -08:00
Xi Yan	d1f3b032c9	cerebras template update for memory (#792 ) # What does this PR do? - we no longer have meta-reference as memory provider, update cerebras template ## Test Plan ``` python llama_stack/scripts/distro_codegen.py ```	2025-01-16 16:07:53 -08:00
Ashwin Bharambe	03ac84a829	Update default port from 5000 -> 8321	2025-01-16 15:26:48 -08:00
Hardik Shah	f1faa9c924	pop fix	2025-01-16 14:09:59 -08:00
Dinesh Yeduguru	fcd1a57429	update notebook	2025-01-16 14:00:48 -08:00
Xi Yan	a6b9f2cec7	fix cerebras template (#790 ) # What does this PR do? - fix cerebras template ## Test Plan ``` llama stack build --template cerebras --image-type conda llama stack run cerebras LLAMA_STACK_BASE_URL="http://localhost:5000" pytest -v tests/client-sdk/ --html=report.html --self-contained-html ```	2025-01-16 13:53:06 -08:00
Dinesh Yeduguru	12c994b5b2	REST API fixes (#789 ) # What does this PR do? Client SDK fixes ## Test Plan LLAMA_STACK_CONFIG="/Users/dineshyv/.llama/distributions/llamastack-fireworks/fireworks-run.yaml" pytest -v tests/client-sdk/safety/test_safety.py LLAMA_STACK_CONFIG="/Users/dineshyv/.llama/distributions/llamastack-fireworks/fireworks-run.yaml" pytest -v tests/client-sdk/memory/test_memory.py	2025-01-16 13:47:08 -08:00
Dinesh Yeduguru	59eeaf7f81	Idiomatic REST API: Telemetry (#786 ) # What does this PR do? Changes Telemetry API to follow more idiomatic REST - [ ] Addresses issue (#issue) ## Test Plan TBD, once i get an approval for rest endpoints	2025-01-16 12:08:46 -08:00
Hardik Shah	74e4d520ac	un-skip telemetry cells in notebook	2025-01-16 11:54:25 -08:00

286 commits