llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-03 09:53:45 +00:00

Author	SHA1	Message	Date
Matthew Farrellee	1a5c17a92f	align with CompletionResponseStreamChunk.delta as str (instead of TextDelta) (#900 ) # What does this PR do? fix type mismatch in /v1/inference/completion ## Test Plan `llama stack run ./llama_stack/templates/nvidia/run.yaml` `LLAMA_STACK_BASE_URL="http://localhost:8321" pytest -v tests/client-sdk/inference/test_inference.py` ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [x] Ran pre-commit to handle lint / formatting issues. - [x] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [ ] Updated relevant documentation. - [ ] Wrote necessary unit or integration tests.	2025-01-29 09:25:50 -08:00
Ashwin Bharambe	9f709387e2	Kill X-LlamaStack-{Client-Version, Provider-Data} from OpenAPI spec ClientVersion: We don't need each SDK method to support this parameter because you wouldn't be passing a different client version each time you make an API call. ProviderData: although in this case, you _could_ be passing different API keys depending on which SDK call you make, it makes for a confusing experience. It is best to initialize the LlamaStackClient with all the keys which are then passed in each request.	2025-01-28 13:30:23 -08:00
Ashwin Bharambe	f2feb7d15c	Fix Chroma adapter (#893 ) Chroma method had the wrong signature. ## Test Plan Start Chroma: `chroma run --path /tmp/foo/chroma2 --host localhost --port 6001` Modify run.yaml to include Chroma server pointing to localhost:6001 and run `llama stack run` Then: ```bash LLAMA_STACK_BASE_URL=http://localhost:8321 pytest -s -v agents/test_agents.py -k rag ``` passes	2025-01-28 13:19:47 -08:00
Ashwin Bharambe	ec3ebb5bcf	Use ruamel.yaml to format the OpenAPI spec (#892 ) Stainless ends up reformatting the YAML when we paste it in the Studio. We cannot have that happen if we are going to ever partially automate stainless config updates. Try ruamel.yaml, specifically `block_seq_indent` to avoid that.	2025-01-28 11:27:40 -08:00
Ashwin Bharambe	41749944a5	Fix ResponseFormat import	2025-01-28 09:34:05 -08:00
Ashwin Bharambe	aee6237685	Small refactor for run_with_pty	2025-01-28 09:32:33 -08:00
Vladislav Bronzov	8332ea23ad	Add run win command for stack (#890 ) # What does this PR do? Add win platform run command for stack - [x] Addresses issue (#issue) ## Test Plan Please describe: - tests you ran to verify your changes with result summaries. - provide instructions so it can be reproduced. ## Sources Please link relevant resources if necessary. https://github.com/meta-llama/llama-stack/pull/889 ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [x] Ran pre-commit to handle lint / formatting issues. - [x] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [ ] Updated relevant documentation. - [ ] Wrote necessary unit or integration tests.	2025-01-28 08:04:28 -08:00
Vladislav Bronzov	09299e908e	Add windows support for build execution (#889 ) # What does this PR do? This PR implements windows platform support for build_container.sh execution from terminal. Additionally, it resolves "no support for Terminos and PTY for Window PC" issues. - [x] Addresses issue (#issue) Related issues: https://github.com/meta-llama/llama-stack/issues/826, https://github.com/meta-llama/llama-stack/issues/726 ## Test Plan Changes were tested manually by executing standard scripts from the Llama guide: - llama stack build --template ollama --image-type container - llama stack build --list-templates - llama stack build ## Sources Please link relevant resources if necessary. ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [x] Ran pre-commit to handle lint / formatting issues. - [x] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [ ] Updated relevant documentation. - [ ] Wrote necessary unit or integration tests.	2025-01-28 07:41:41 -08:00
Ashwin Bharambe	d123e9d3d7	Update docs for RAG and improve CONTRIBUTING.md	2025-01-28 06:09:48 -08:00
Zhonglin Han	229f0d5f7c	Agent response format (#660 ) # What does this PR do? Add response format for agents structured output. - [ ] Using structured output for agents (interior_design app as an example) (#issue) https://github.com/meta-llama/llama-stack-apps/issues/122 ## Test Plan E2E test plan with llama-stack-apps interior_design Please describe: Test ran: - provide instructions so it can be reproduced. Start your distro: llama stack run llama_stack/templates/fireworks/run.yaml --env FIREWORKS_API_KEY=<API_KEY> Run api test: ```PYTHONPATH=. python examples/interior_design_assistant/api.py localhost 5000 examples/interior_design_assistant/resources/documents/ examples/interior_design_assistant/resources/images/fireplaces``` ## Sources Results: https://github.com/meta-llama/llama-stack-client-python/pull/72 ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [ ] Ran pre-commit to handle lint / formatting issues. - [x] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [ ] Updated relevant documentation. - [ ] Wrote necessary unit or integration tests.	2025-01-28 05:05:38 -08:00
Justin Lee	e4865c3510	adding readme to docs folder for easier discoverability of notebooks … (#857 ) as titled <img width="454" alt="image" src="https://github.com/user-attachments/assets/7579d1d2-06cd-48e4-9659-79ab1ec6a4c2" />	2025-01-28 04:58:46 -08:00
Sixian Yi	ba453c3487	Report generation minor fixes (#884 ) # What does this PR do? fixed report generation: 1) do not initialize a new client in report.py - instead get it from pytest fixture 2) Add "provider" for "safety" and "agents" section 3) add logprobs functionality in "inference" section ## Test Plan See the regenerated report ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [ ] Ran pre-commit to handle lint / formatting issues. - [ ] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [ ] Updated relevant documentation. - [ ] Wrote necessary unit or integration tests.	2025-01-28 04:58:12 -08:00
Chris Khanoyan	5b0d778871	Update index.md (#888 ) Fixing the bullets # What does this PR do? The bullets were not there as intended so I helped fix them. - [x] Addresses issue (#issue) ## Test Plan Please describe: Ran the test, and the bullets are there now to be consistent with the page. ## Sources N/A ## Before submitting - [x] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [ ] Ran pre-commit to handle lint / formatting issues. - [ ] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [ ] Updated relevant documentation. - [ ] Wrote necessary unit or integration tests.	2025-01-28 04:55:41 -08:00
snova-edwardm	aa65610e75	Sambanova - LlamaGuard (#886 ) # What does this PR do? - Fix loading SambaNovaImpl issue - Add LlamaGuard model support for inference ## Test Plan Run the following unit test scripts and results ### Embedding ``` pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_embeddings.py --inference-model meta-llama/Llama-3.2-11B-Vision-Instruct --env SAMBANOVA_API_KEY={SAMBANOVA_API_KEY} ``` ``` llama_stack/providers/tests/inference/test_embeddings.py::TestEmbeddings::test_embeddings[-sambanova] SKIPPED (This test is only applicable for embedding models) llama_stack/providers/tests/inference/test_embeddings.py::TestEmbeddings::test_batch_embeddings[-sambanova] SKIPPED (This test is only applicable for embedding models) =================================================================================================================== 2 skipped, 1 warning in 0.32s =================================================================================================================== ``` ### Vision ``` pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_vision_inference.py --inference-model meta-llama/Llama-3.2-11B-Vision-Instruct --env SAMBANOVA_API_KEY={SAMBANOVA_API_KEY} ``` ``` llama_stack/providers/tests/inference/test_vision_inference.py::TestVisionModelInference::test_vision_chat_completion_non_streaming[-sambanova-image0-expected_strings0] PASSED llama_stack/providers/tests/inference/test_vision_inference.py::TestVisionModelInference::test_vision_chat_completion_non_streaming[-sambanova-image1-expected_strings1] PASSED llama_stack/providers/tests/inference/test_vision_inference.py::TestVisionModelInference::test_vision_chat_completion_streaming[-sambanova] PASSED =================================================================================================================== 3 passed, 1 warning in 2.68s 
==================================================================================================================== ``` ### Text ``` pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_streaming --env SAMBANOVA_API_KEY={SAMBANOVA_API_KEY} ``` ``` llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_streaming[-sambanova] PASSED =================================================================================================================== 1 passed, 1 warning in 0.46s ==================================================================================================================== ``` ``` pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_non_streaming --env SAMBANOVA_API_KEY={SAMBANOVA_API_KEY} ``` ``` llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_non_streaming[-sambanova] PASSED =================================================================================================================== 1 passed, 1 warning in 0.48s ==================================================================================================================== ``` ## Before submitting - [] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [Y] Ran pre-commit to handle lint / formatting issues. - [Y] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [Y] Updated relevant documentation. - [Y] Wrote necessary unit or integration tests.	2025-01-27 15:46:30 -08:00
Dinesh Yeduguru	3c1a2c3d66	Fix telemetry init (#885 ) # What does this PR do? When you re-initialize the library client in a notebook, we were seeing this error: ``` Getting traces for session_id=5c8d1969-0957-49d2-b852-32cbb8ef8caf --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) [<ipython-input-11-d74bb6cdd3ab>](https://localhost:8080/#) in <cell line: 0>() 7 agent_logs = [] 8 ----> 9 for span in client.telemetry.query_spans( 10 attribute_filters=[ 11 {"key": "session_id", "op": "eq", "value": session_id}, 10 frames [/usr/local/lib/python3.11/dist-packages/llama_stack/providers/inline/telemetry/meta_reference/telemetry.py](https://localhost:8080/#) in query_traces(self, attribute_filters, limit, offset, order_by) 246 ) -> QueryTracesResponse: 247 return QueryTracesResponse( --> 248 data=await self.trace_store.query_traces( 249 attribute_filters=attribute_filters, 250 limit=limit, AttributeError: 'TelemetryAdapter' object has no attribute 'trace_store' ``` This happens because we were skipping some required object-state setup as part of the global _TRACE_PROVIDER check. This PR moves the initialization of the object state out of the _TRACE_PROVIDER init.	2025-01-27 11:20:28 -08:00
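The failure mode described in the commit above can be illustrated with a minimal sketch. The names below (`BrokenAdapter`, `FixedAdapter`, `_PROVIDER_READY`) are hypothetical stand-ins, not the actual llama-stack classes; the sketch only assumes what the message states, namely that per-object state was set up inside a one-time global check, so a second instance never received it.

```python
# Hypothetical stand-in for a process-wide "already initialized" flag.
_PROVIDER_READY = False


class BrokenAdapter:
    """Sets per-object state only inside the one-time global setup."""

    def __init__(self) -> None:
        global _PROVIDER_READY
        if not _PROVIDER_READY:
            _PROVIDER_READY = True
            # Bug: this assignment runs for the first instance only.
            self.trace_store = {}


class FixedAdapter:
    """Keeps the global setup guarded but always sets per-object state."""

    def __init__(self) -> None:
        global _PROVIDER_READY
        if not _PROVIDER_READY:
            _PROVIDER_READY = True  # one-time, process-wide setup would go here
        # Per-object state is initialized on every construction.
        self.trace_store = {}


def demo() -> tuple[bool, bool]:
    """Report whether a *second* instance of each class has `trace_store`."""
    global _PROVIDER_READY
    _PROVIDER_READY = False
    BrokenAdapter()
    broken_second = BrokenAdapter()
    _PROVIDER_READY = False
    FixedAdapter()
    fixed_second = FixedAdapter()
    return hasattr(broken_second, "trace_store"), hasattr(fixed_second, "trace_store")
```

Re-creating the client in a notebook corresponds to constructing the second instance here: the process-wide flag is already set, so only the fixed variant ends up with a usable `trace_store`.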
Ashwin Bharambe	e5936a8df8	Update discriminator to have the correct `mapping` (#881 ) See https://swagger.io/docs/specification/v3_0/data-models/inheritance-and-polymorphism/#discriminator When specifying discriminators, mapping must be specified unless the value of the discriminator is the subtype itself (which in our case is not.) The changes in the YAML are self-explanatory.	2025-01-27 09:18:13 -08:00
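As a rough illustration of the rule cited in the commit above, the sketch below resolves a discriminator value to a schema reference. The schema fragment, the `type` property, and the schema names are illustrative assumptions, not copied from the llama-stack OpenAPI spec; `resolve_subtype` is a hypothetical helper.

```python
from typing import Any

# Illustrative schema fragment; names and values are assumptions.
DELTA_SCHEMA: dict[str, Any] = {
    "oneOf": [
        {"$ref": "#/components/schemas/TextDelta"},
        {"$ref": "#/components/schemas/ImageDelta"},
    ],
    "discriminator": {
        "propertyName": "type",
        # Needed because the values ("text", "image") are not schema names.
        "mapping": {
            "text": "#/components/schemas/TextDelta",
            "image": "#/components/schemas/ImageDelta",
        },
    },
}


def resolve_subtype(schema: dict[str, Any], payload: dict[str, Any]) -> str:
    """Return the schema reference selected by the payload's discriminator value."""
    discriminator = schema["discriminator"]
    value = payload[discriminator["propertyName"]]
    mapping = discriminator.get("mapping", {})
    # Without an explicit mapping entry, the value itself is taken as the schema name.
    return mapping.get(value, f"#/components/schemas/{value}")
```

With the mapping present, `{"type": "text"}` resolves to the `TextDelta` reference; without it, the same payload would resolve to a schema named `text`, which does not exist in this fragment.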
Ashwin Bharambe	a6d20e0f53	Update documentation (#865 ) Update docs variously	2025-01-27 09:17:51 -08:00
Ashwin Bharambe	891bf704eb	Ensure llama stack build --config <> --image-type <> works (#879 ) Fix the issues brought up in https://github.com/meta-llama/llama-stack/issues/870 Test all combinations of (conda, container) vs. (template, config) combos.	2025-01-25 11:13:36 -08:00
Bakunga Bronson	7de46e40f9	Fixed multiple typos (#878 ) # What does this PR do? In short, provide a summary of what this PR does and why. Usually, the relevant context should be present in a linked issue. - [ ] Addresses issue (#issue) ## Test Plan Please describe: - tests you ran to verify your changes with result summaries. - provide instructions so it can be reproduced. ## Sources Please link relevant resources if necessary. ## Before submitting - [x] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [ ] Ran pre-commit to handle lint / formatting issues. - [ ] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [ ] Updated relevant documentation. - [ ] Wrote necessary unit or integration tests.	2025-01-24 14:45:43 -08:00
Bakunga Bronson	33113139e8	Fixed typo (#877 ) # What does this PR do? In short, provide a summary of what this PR does and why. Usually, the relevant context should be present in a linked issue. - [ ] Addresses issue (#issue) ## Test Plan Please describe: - tests you ran to verify your changes with result summaries. - provide instructions so it can be reproduced. ## Sources Please link relevant resources if necessary. ## Before submitting - [x] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [ ] Ran pre-commit to handle lint / formatting issues. - [ ] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [ ] Updated relevant documentation. - [ ] Wrote necessary unit or integration tests.	2025-01-24 13:16:00 -08:00
Hardik Shah	632e60439a	Fix report generation for url endpoints (#876 ) Earlier, we would have some unknown magic to identify the path for remote endpoints when testing client_sdk/tests. Removed that and now you have to explicitly pass a path	2025-01-24 13:15:44 -08:00
Ashwin Bharambe	087a83f673	Bump key for faiss	2025-01-24 12:08:36 -08:00
Ashwin Bharambe	d111bad2f2	Update GH action so it correctly queries for test.pypi, etc. (#875 ) The previous curl command was wrong and did not actually check for version correctly (status code was always 200 regardless of what you retrieved.) Also added tagging latest. cc @wukaixingxp	2025-01-24 11:56:29 -08:00
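The pitfall named in the commit above (a check that passes whatever the server returns) can be avoided by branching on the HTTP status code. The following is a minimal sketch, not the command used in the GitHub Action; it assumes the per-release JSON endpoint of the package index, and `release_url` and `version_exists` are hypothetical helpers.

```python
import urllib.error
import urllib.request


def release_url(package: str, version: str, index: str = "https://test.pypi.org") -> str:
    """Build the per-release JSON URL (endpoint layout is an assumption)."""
    return f"{index}/pypi/{package}/{version}/json"


def version_exists(
    package: str,
    version: str,
    index: str = "https://test.pypi.org",
    opener=urllib.request.urlopen,
) -> bool:
    """Return True only if the index answers 200 for this exact release."""
    try:
        with opener(release_url(package, version, index), timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False  # the release is not published on this index
        raise  # other failures should surface rather than count as "missing"
```

The `opener` parameter exists only so the function can be exercised without network access.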
Hardik Shah	2cebb24d3a	Update doc templates for running safety on self-hosted templates (#874 )	2025-01-24 11:28:20 -08:00
Ashwin Bharambe	eaba6a550a	Point to 0.1.0 release notes in docs	2025-01-24 10:00:16 -08:00
Ashwin Bharambe	05d73dd4fd	Bump version to 0.1.0	2025-01-24 09:50:07 -08:00
Ashwin Bharambe	19521cb22e	More doc updates	2025-01-24 09:22:15 -08:00
Ashwin Bharambe	2118f37350	Doc updates	2025-01-23 21:31:18 -08:00
Ashwin Bharambe	9351a4b2d7	Update documentation	2025-01-23 17:10:57 -08:00
ehhuang	2fefe8dacd	Update 'first RAG agent' in gettingstarted doc (#867 ) # What does this PR do? Fix documentation to reflect new API ## Test Plan Before: User> What are the top 5 topics that were explained? Only list succinct bullet points. inference> I'm ready to help, but we haven't discussed any topics yet! This is the start of our conversation. What would you like to talk about? I can summarize our discussion at the end if you'd like. Run with the change, observe relevant response <img width="1029" alt="image" src="https://github.com/user-attachments/assets/a7dece3c-e8b4-4a60-9092-ba544c87dffd" /> ## Sources Please link relevant resources if necessary. ## Before submitting - [x] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [ ] Ran pre-commit to handle lint / formatting issues. - [ ] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [ ] Updated relevant documentation. - [ ] Wrote necessary unit or integration tests. Co-authored-by: Eric Huang (AI Platform) <erichuang@fb.com>	2025-01-23 17:02:04 -08:00
Dinesh Yeduguru	cb11336886	remove logger handler only in notebook (#868 ) remove logger handler only in notebook	2025-01-23 16:58:17 -08:00
Dinesh Yeduguru	ebffa15f40	update python sdk reference (#866 ) # What does this PR do? syncs changes from https://github.com/stainless-sdks/llama-stack-python/blob/main/api.md	2025-01-23 16:04:06 -08:00
Dinesh Yeduguru	c570a708bf	update the client reference (#864 ) # What does this PR do? Syncs changes from https://github.com/meta-llama/llama-stack-client-python/pull/96	2025-01-23 15:32:16 -08:00
Dinesh Yeduguru	a78f1fc70d	make default tool prompt format none in agent config (#863 ) # What does this PR do? Previously the tests hard-coded the tool prompt format to json, which causes them to fail when using the 3.2/3.3 family of models. This change makes the default none for the agent config and removes the specification from the tests. ## Test Plan LLAMA_STACK_BASE_URL=http://localhost:8321 pytest -v tests/client-sdk/agents/test_agents.py	2025-01-23 14:44:59 -08:00
Hardik Shah	94ffaf468c	More updates to ReadTheDocs (#861 ) Improve Contributing section	2025-01-23 12:50:38 -08:00
Dinesh Yeduguru	7df40da5fa	sync readme.md to index.md (#860 ) # What does this PR do? README has some new content that is being synced to index.md	2025-01-23 12:43:09 -08:00
Hardik Shah	a6a4270eef	Updates to ReadTheDocs (#859 ) Move the evals section into the AI Agents section, drop it from the top level, and make other minor fixes	2025-01-23 12:42:15 -08:00
Ashwin Bharambe	d78027f3b5	Move runpod provider to the correct directory Also cleanup the test code to avoid skipping tests. Let failures be known and public.	2025-01-23 12:25:12 -08:00
snova-edwardm	22dc684da6	Sambanova inference provider (#555 ) # What does this PR do? This PR adds SambaNova as one of the Provider - Add SambaNova as a provider ## Test Plan Test the functional command ``` pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_embeddings.py llama_stack/providers/tests/inference/test_prompt_adapter.py llama_stack/providers/tests/inference/test_text_inference.py llama_stack/providers/tests/inference/test_vision_inference.py --env SAMBANOVA_API_KEY=<sambanova-api-key> ``` Test the distribution template: ``` # Docker LLAMA_STACK_PORT=5001 docker run -it -p $LLAMA_STACK_PORT:$LLAMA_STACK_PORT \ llamastack/distribution-sambanova \ --port $LLAMA_STACK_PORT \ --env SAMBANOVA_API_KEY=$SAMBANOVA_API_KEY # Conda llama stack build --template sambanova --image-type conda llama stack run ./run.yaml \ --port $LLAMA_STACK_PORT \ --env SAMBANOVA_API_KEY=$SAMBANOVA_API_KEY ``` ## Source [SambaNova API Documentation](https://cloud.sambanova.ai/apis) ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [Y] Ran pre-commit to handle lint / formatting issues. - [Y] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [Y] Updated relevant documentation. - [Y ] Wrote necessary unit or integration tests. --------- Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-01-23 12:20:28 -08:00
Marut Pandya	e2b5456e48	Add Runpod Provider + Distribution (#362 ) Add Runpod as an inference provider for OpenAI-compatible managed endpoints. Testing - Configured llama stack from scratch, set `remote::runpod` as an inference provider. - Added Runpod Endpoint URL and API key. - Started llama-stack server - llama stack run my-local-stack --port 3000 ``` curl http://localhost:5000/inference/chat_completion \ -H "Content-Type: application/json" \ -d '{ "model": "Llama3.1-8B-Instruct", "messages": [ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write me a 2 sentence poem about the moon"} ], "sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512} }' ``` --------- Signed-off-by: pandyamarut <pandyamarut@gmail.com>	2025-01-23 12:19:02 -08:00
Dinesh Yeduguru	86466b71a9	update docs for adding new API providers (#855 ) # What does this PR do? update docs for adding new API providers ![Screenshot 2025-01-23 at 11 21 42 AM](https://github.com/user-attachments/assets/0d4621d4-ef7e-43cd-9c4a-3e8e0b49242f)	2025-01-23 12:05:57 -08:00
Dinesh Yeduguru	d0be9288a3	Llama_Stack_Building_AI_Applications.ipynb -> getting_started.ipynb (#854 ) Llama_Stack_Building_AI_Applications.ipynb -> getting_started.ipynb	2025-01-23 12:04:06 -08:00
Hardik Shah	a10cdc7cdb	Update README.md	2025-01-23 12:00:01 -08:00
Hardik Shah	74e933cbfd	More Updates to Read the Docs (#856 )	2025-01-23 11:39:33 -08:00
Dinesh Yeduguru	8a686270e9	remove getting started notebook (#853 ) # What does this PR do? This notebook is no longer updated and we should be using https://github.com/meta-llama/llama-stack/blob/main/docs/notebooks/Llama_Stack_Building_AI_Applications.ipynb	2025-01-23 10:09:09 -08:00
Hardik Shah	25a70ca4dc	Fixed distro documentation (#852 ) More docs	2025-01-23 08:19:51 -08:00
raghotham	e44a1a68f1	Delete docs/to_situate directory (#851 ) # What does this PR do? No need for the cookbook now. Removing the folder - [ ] Addresses issue (#issue) ## Test Plan Please describe: - tests you ran to verify your changes with result summaries. - provide instructions so it can be reproduced. ## Sources Please link relevant resources if necessary. ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [ ] Ran pre-commit to handle lint / formatting issues. - [ ] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [ ] Updated relevant documentation. - [ ] Wrote necessary unit or integration tests.	2025-01-23 07:15:47 -08:00
Sixian Yi	bfbd773b54	remove test report	2025-01-23 01:06:39 -08:00
Sixian Yi	82a28f3a24	update doc for client-sdk testing (#849 ) As title ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [ ] Ran pre-commit to handle lint / formatting issues. - [ ] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [ ] Updated relevant documentation. - [ ] Wrote necessary unit or integration tests.	2025-01-23 00:17:16 -08:00
Ashwin Bharambe	3d14a3d46f	Kill colons	2025-01-22 22:59:30 -08:00

1079 commits