Commit graph

331 commits

Author SHA1 Message Date
Jeff Tang
162cfb280e added a note that image understanding works with LS 0.1.0 and 0.1.2 2025-02-09 09:27:15 -08:00
Jeff Tang
44f1a4fd5c fixed the agent image understanding example error for LS 0.1.2 2025-02-09 09:24:15 -08:00
raghotham
7766e68e92
docs: update index.md for 0.1.2 (#1013)
2025-02-07 15:36:20 -08:00
Jeff Tang
a229de6d1e
Getting started notebook update (#936)
# What does this PR do?

Added examples (Section 4) of using the Llama Stack 0.1 Together distro
and Llama 3.2 to answer questions about an image with the LS Chat and
Agent APIs.
2025-02-07 15:36:15 -08:00
Ashwin Bharambe
62e5461da7 No spaces in ipynb tests 2025-02-07 11:56:22 -08:00
Ashwin Bharambe
a8820597ee Minor clean up of notebook 2025-02-07 11:36:29 -08:00
ehhuang
af15426ad7
doc: getting started notebook (#996)
# What does this PR do?

Fix link

2025-02-06 17:30:21 -08:00
Hardik Shah
28a0fe57cc
fix: Update rag examples to use fresh faiss index every time (#998)
# What does this PR do?
Several examples reuse the same faiss index, which means running them
multiple times fills the index with duplicates; this eventually degrades
RAG performance because multiple copies of the same irrelevant chunks
can be picked up several times.

The fix is to create a fresh index each time.
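
A minimal sketch of the pattern (assuming the llama-stack-client
`vector_dbs` API; names and parameter values here are illustrative):

```python
import uuid

from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Register a uniquely named vector DB per run so repeated executions
# never append duplicate chunks to a previously populated index.
vector_db_id = f"rag-demo-{uuid.uuid4().hex}"
client.vector_dbs.register(
    vector_db_id=vector_db_id,
    embedding_model="all-MiniLM-L6-v2",
    embedding_dimension=384,
)
```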

Resolves issue in this discussion -
https://github.com/meta-llama/llama-stack/discussions/995

## Test Plan
Re-ran the getting started guide multiple times and verified the output
stays the same.

Co-authored-by: Hardik Shah <hjshah@fb.com>
2025-02-06 16:12:29 -08:00
Maxime Lecanu
e964ec95e9
docs: Correct typos in Zero to Hero guide (#997)
# What does this PR do?

Corrects some typographical errors found in the
`docs/zero_to_hero_guide/README.md` file.

## Test Plan

N/A

Co-authored-by: Maxime Lecanu <mlecanu@fb.com>
2025-02-06 17:29:52 -05:00
Hardik Shah
a84e7669f0
feat: Add a new template for dell (#978)
- Added new template `dell` and its documentation 
- Update docs 
- [minor] uv fix I came across 
- codegen for all templates 

Tested with 

```bash
export INFERENCE_PORT=8181
export DEH_URL=http://0.0.0.0:$INFERENCE_PORT
export INFERENCE_MODEL=meta-llama/Llama-3.1-8B-Instruct
export CHROMADB_HOST=localhost
export CHROMADB_PORT=6601
export CHROMA_URL=http://$CHROMADB_HOST:$CHROMADB_PORT
export CUDA_VISIBLE_DEVICES=0
export LLAMA_STACK_PORT=8321

# build the stack template 
llama stack build --template=dell 

# start the TGI inference server 
podman run --rm -it --network host -v $HOME/.cache/huggingface:/data -e HF_TOKEN=$HF_TOKEN -p $INFERENCE_PORT:$INFERENCE_PORT --gpus $CUDA_VISIBLE_DEVICES ghcr.io/huggingface/text-generation-inference --dtype bfloat16 --usage-stats off --sharded false --cuda-memory-fraction 0.7 --model-id $INFERENCE_MODEL --port $INFERENCE_PORT --hostname 0.0.0.0

# start chroma-db for vector-io ( aka RAG )
podman run --rm -it --network host --name chromadb -v .:/chroma/chroma -e IS_PERSISTENT=TRUE chromadb/chroma:latest --port $CHROMADB_PORT --host $(hostname)

# build docker 
llama stack build --template=dell --image-type=container

# run llama stack server ( via docker )
# NOTE: mount the llama-stack / llama-model directories if testing local changes
podman run -it \
--network host \
-p $LLAMA_STACK_PORT:$LLAMA_STACK_PORT \
-v ~/.llama:/root/.llama \
-v /home/hjshah/git/llama-stack:/app/llama-stack-source \
-v /home/hjshah/git/llama-models:/app/llama-models-source \
localhost/distribution-dell:dev \
--port $LLAMA_STACK_PORT \
--env INFERENCE_MODEL=$INFERENCE_MODEL \
--env DEH_URL=$DEH_URL \
--env CHROMA_URL=$CHROMA_URL

# test the server 
cd <PATH_TO_LLAMA_STACK_REPO>
LLAMA_STACK_BASE_URL=http://0.0.0.0:$LLAMA_STACK_PORT pytest -s -v tests/client-sdk/agents/test_agents.py

```

---------

Co-authored-by: Hardik Shah <hjshah@fb.com>
2025-02-06 14:14:39 -08:00
Yuan Tang
09ed0e9c9f
Add Kubernetes deployment guide (#899)
This PR moves some content from [the recent blog
post](https://blog.vllm.ai/2025/01/27/intro-to-llama-stack-with-vllm.html)
to here as a more official guide for users who'd like to deploy Llama
Stack on Kubernetes.

---------

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-02-06 10:28:02 -08:00
ehhuang
3922999118
sys_prompt support in Agent (#938)
# What does this PR do?

The current default system prompt for llama3.2 tends to overindex on
tool calling and doesn't work well when the prompt does not require tool
calling.

This PR adds an option to override the default system prompt, and
organizes tool-related configs into a new config object.
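
A hedged sketch of overriding the default system prompt (field names
follow this PR's description of the new tool config object and may
differ from the final API):

```python
from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.agent import Agent

client = LlamaStackClient(base_url="http://localhost:8321")

# "replace" asks the stack to use `instructions` as the entire system
# prompt instead of appending it to the tool-calling default.
agent_config = {
    "model": "meta-llama/Llama-3.2-3B-Instruct",
    "instructions": "You are a concise assistant. Answer directly.",
    "tool_config": {"system_message_behavior": "replace"},
    "enable_session_persistence": False,
}
agent = Agent(client, agent_config)
```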

## Test Plan


LLAMA_STACK_CONFIG=together pytest --inference-model=meta-llama/Llama-3.3-70B-Instruct -s -v tests/client-sdk/agents/test_agents.py::test_override_system_message_behavior


2025-02-05 21:11:32 -08:00
Nathan Weinberg
e777d965a1
docs: add addn server guidance for Linux users in Quick Start (#972)
# What does this PR do?

- [x] Addresses issue #971


## Test Plan
Ran docs build locally

## Sources
See discussion linked in the issue


Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
Co-authored-by: Mert Parker <mertpaker@gmail.com>
2025-02-05 20:57:51 -08:00
Ihar Hrachyshka
f4343f7dc0
docs: clarify host.docker.internal works for recent podman (#977)
The host.docker.internal alias was implemented in podman 4.7.0 (commit
b672ddc792).

# What does this PR do?

Follow-up to the previous podman-specific doc update.


Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
2025-02-05 16:02:05 -08:00
Aakanksha Duggal
8fa642835b
Fix README.md notebook links (#976)

Signed-off-by: Aakanksha Duggal <aduggal@redhat.com>
2025-02-05 14:33:46 -08:00
Ryan Cook
2d9c8b549e
docs: missing T in import (#974)
# What does this PR do?

Missing T in import

## Test Plan

N/A doc update

2025-02-05 17:06:39 -05:00
Kamesh Akella
d9c0b4e3ba
[docs] update the zero_to_hero_guide llama stack version to 0.1.0 (#960)
# What does this PR do?

The Zero to Hero guide currently references the older llama-stack 0.0.61
release. Pointing the documentation at the most recent stable release
helps users avoid issues present in older llama-stack versions.

## Test Plan

I ran the workflow locally with the proposed version change and was able
to proceed without any issues.

2025-02-05 11:49:26 -08:00
Ihar Hrachyshka
5c8e35a9e2
docs, tests: replace datasets.rst with memory_optimizations.rst (#968)
datasets.rst was removed from the torchtune repo.

# What does this PR do?

Replace a document that now returns a 404 with one that exists. (Removed
it from the list since memory_optimizations.rst was already pulled in.)



Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
2025-02-05 11:25:56 -05:00
Ihar Hrachyshka
529708215c
[docs] Make RAG example self-contained (#962)
Before the patch, the example could not be executed verbatim without
copy-pasting the client function from the inference example. I think
it's better to have examples self-contained, especially in a getting
started guide.


# What does this PR do?

See above.

## Test Plan

Confirmed the example can now be executed verbatim.


Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
2025-02-04 16:22:50 -08:00
Ashwin Bharambe
474c4bdd7a
Make a couple properties optional (#963) 2025-02-04 16:20:24 -08:00
Ihar Hrachyshka
0cbb3e401c
docs: miscellaneous small fixes (#961)
- **[docs] Fix misc typos and formatting issues in intro docs**
- **[docs]: Export variables (e.g. INFERENCE_MODEL) in getting_started**
- **[docs] Show that `llama-stack-client configure` will ask for api
key**

# What does this PR do?

Miscellaneous fixes in the documentation; not worth reporting an issue.

## Test Plan

No code changes. Addressed issues spotted when walking through the
guide.
Confirmed locally.


---------

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
2025-02-04 15:31:30 -08:00
Bill Murdock
b0dec797a0
Add Podman instructions to Quick Start (#957)
Podman is a popular alternative to Docker, so it would be nice to make
it clear that it can also be used to deploy the container for the
server. The instructions are a little different because you have to
create the directory yourself (unlike Docker, which creates it for
you).

# What does this PR do?

Adds Podman instructions to the Quick Start.

## Test Plan

Documentation only.


## Sources

I tried it out and it worked.

2025-02-04 14:37:02 -08:00
Ashwin Bharambe
d67401c644 Several documentation fixes and fix link to API reference 2025-02-04 14:00:43 -08:00
Charlie Doern
26aef50bc5
if client.initialize fails, the example should exit (#954)
# What does this PR do?

The example script can now exit gracefully because the boolean returned
from initialize is checked properly.
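
A minimal sketch of the fixed pattern (assuming the library-client
entrypoint; the template name is illustrative):

```python
import sys

from llama_stack.distribution.library_client import LlamaStackAsLibraryClient

client = LlamaStackAsLibraryClient("ollama")

# initialize() returns False on failure; exit instead of issuing
# requests against a half-configured client.
if not client.initialize():
    print("llama stack initialization failed", file=sys.stderr)
    sys.exit(1)
```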

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-02-04 13:54:21 -08:00
Ashwin Bharambe
b17277b06a Fix the OpenAPI HTML 2025-02-04 10:38:49 -08:00
ehhuang
c9ab72fa82
Support sys_prompt behavior in inference (#937)
# What does this PR do?

The current default system prompt for llama3.2 tends to overindex on
tool calling and doesn't work well when the prompt does not require tool
calling.

This PR adds an option to override the default system prompt, and
organizes tool-related configs into a new config object.



## Test Plan

python -m unittest
llama_stack.providers.tests.inference.test_prompt_adapter


---
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with
[ReviewStack](https://reviewstack.dev/meta-llama/llama-stack/pull/937).
* #938
* __->__ #937
2025-02-03 23:35:16 -08:00
Xi Yan
62cd3c391e notebook: point to GitHub as source of truth 2025-02-03 15:08:25 -08:00
Ashwin Bharambe
753a1aa7bc Update colab link to point back to the github source 2025-02-03 15:00:21 -08:00
Ashwin Bharambe
aefd5bb619 Test notebook update 2025-02-03 14:59:06 -08:00
Nathan Weinberg
7a72082cdd
fix: formatting for ollama note in Quick Start doc (#945)
# What does this PR do?
Fixes formatting for Ollama note found here:
https://llama-stack.readthedocs.io/en/latest/getting_started/index.html#start-ollama


## Test Plan
Ran local docs build as described
[here](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md#building-the-documentation)

## Sources
N/A


Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-02-03 14:13:57 -08:00
Ashwin Bharambe
f98efe68c9
Misc fixes (#944)
- Make sure torch + torchvision go together as deps, otherwise bad stuff
happens
- Add a pre-commit for requirements.txt
2025-02-03 14:08:47 -08:00
Nathan Weinberg
0f14378135
fix: broken "core concepts" link in docs website (#940)
# What does this PR do?
The `core concepts` link on [this
page](https://llama-stack.readthedocs.io/en/latest/contributing/new_api_provider.html)
is currently broken - this PR fixes that link

## Test Plan
Ran local docs build as described
[here](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md#building-the-documentation)

## Sources
N/A


Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-02-03 13:46:34 -08:00
Nathan Weinberg
1e36721686
fix: broken link in Quick Start doc (#943)
# What does this PR do?
Ollama download link is broken on this page:
https://llama-stack.readthedocs.io/en/latest/getting_started/index.html

## Test Plan
N/A

## Sources
https://ollama.com/docs/installation ==> 404
https://ollama.com/download


Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-02-03 13:45:35 -08:00
Ashwin Bharambe
ccf0cbb903 Update release pointer 2025-02-02 12:11:57 -08:00
Ashwin Bharambe
7fdbd5b642 Add NBVAL skips to the getting started notebook 2025-02-02 07:53:07 -08:00
Yuan Tang
34ab7a3b6c
Fix precommit check after moving to ruff (#927)
Lint check in main branch is failing. This fixes the lint check after we
moved to ruff in https://github.com/meta-llama/llama-stack/pull/921. We
need to move to a `ruff.toml` file as well as fix and ignore some
additional checks.

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-02-02 06:46:45 -08:00
Hardik Shah
a7b929f17e
Sec fixes as raised by bandit (#917)
minor fixes to hashlib and jinja
2025-01-31 13:44:26 -08:00
Xi Yan
15dcc4ea5e
openapi gen return type fix for streaming/non-streaming (#910)
# What does this PR do?

We need to change

```yaml
/v1/inference/chat-completion:
    post:
      responses:
        '200':
          description: >-
            If stream=False, returns a ChatCompletionResponse with the full completion.
            If stream=True, returns an SSE event stream of ChatCompletionResponseStreamChunk
          content:
            text/event-stream:
              schema:
                oneOf:
                  - $ref: '#/components/schemas/ChatCompletionResponse'
                  - $ref: '#/components/schemas/ChatCompletionResponseStreamChunk'
```

into

```yaml
/v1/inference/chat-completion:
    post:
      responses:
        '200':
          description: >-
            If stream=False, returns a ChatCompletionResponse with the full completion.
            If stream=True, returns an SSE event stream of ChatCompletionResponseStreamChunk
          content:
            text/event-stream:
              schema:
                $ref: '#/components/schemas/ChatCompletionResponseStreamChunk'
            application/json:
              schema:
                $ref: '#/components/schemas/ChatCompletionResponse'
```

## Test Plan

**Python**
- tested in SDK sync:
https://github.com/meta-llama/llama-stack-client-python/pull/108

**Node**
- tested w/
https://gist.github.com/yanxi0830/b782f4b91e21dcccdfef8898ce55157e (SDK
update follow-up)


2025-01-30 18:03:02 -08:00
Xi Yan
94051cfe9e
fix ImageContentItem to take base64 string as image.data (#909)
# What does this PR do?

- Discussion in
https://github.com/meta-llama/llama-stack/pull/906#discussion_r1936260819

- image.data should accept a base64 string as input instead of binary
bytes; change prompt_adapter to account for that.
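
A minimal sketch of the new calling convention (message shape and model
id are illustrative, assuming the 0.1.x content-item format):

```python
import base64
from pathlib import Path

from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# image.data now takes a base64-encoded string, not raw bytes.
image_b64 = base64.b64encode(Path("example.jpg").read_bytes()).decode("utf-8")

response = client.inference.chat_completion(
    model_id="meta-llama/Llama-3.2-11B-Vision-Instruct",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image", "image": {"data": image_b64}},
                {"type": "text", "text": "Describe this image."},
            ],
        }
    ],
)
```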

## Test Plan

```
pytest -v tests/client-sdk/inference/test_inference.py
```

with test in https://github.com/meta-llama/llama-stack/pull/906

2025-01-30 15:58:23 -08:00
snova-edwardm
7fe2592795
SambaNova supports Llama 3.3 (#905)
# What does this PR do?

- Fix typo
- Support Llama 3.3 70B

## Test Plan

Run the following scripts and obtain the test results

Script
```
pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_streaming --env SAMBANOVA_API_KEY={API_KEY}
```

Result
```
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_streaming[-sambanova] PASSED

=========================================== 1 passed, 1 warning in 1.26s ============================================
```

Script
```
pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_non_streaming --env SAMBANOVA_API_KEY={API_KEY}
```

Result
```
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_non_streaming[-sambanova] PASSED

=========================================== 1 passed, 1 warning in 0.52s ============================================
```

2025-01-30 09:24:46 -08:00
Yuan Tang
d5b7de3897
Fix link to selection guide and change "docker" to "container" (#898)
The current link doesn't work. Also changed docs to be consistent with
https://github.com/meta-llama/llama-stack/pull/802.
2025-01-29 11:59:40 -08:00
Ashwin Bharambe
0d96070af9
Update OpenAPI generator to add param and field documentation (#896)
We desperately need to document our APIs. This is the basic requirement
of having a Spec :)

This PR updates the OpenAPI generator so documentation for request
parameters and object fields can be properly added to the OpenAPI specs.
From there, this should get picked up by Stainless, etc.
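
For illustration, a sketch of the kind of docstring such a generator
could consume (the exact convention the generator reads is an
assumption; names are illustrative):

```python
from typing import Protocol


class Inference(Protocol):
    async def chat_completion(self, model_id: str, stream: bool = False):
        """Generate a chat completion for the given model.

        :param model_id: Identifier of a registered model to run inference with.
        :param stream: If True, return an SSE stream of response chunks.
        """
        ...
```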

## Test Plan:

Updated client-sdk (See
https://github.com/meta-llama/llama-stack-client-python/pull/104) and
then ran:

```bash
cd tests/client-sdk
LLAMA_STACK_CONFIG=../../llama_stack/templates/fireworks/run.yaml pytest -s -v inference/test_inference.py agents/test_agents.py
```
2025-01-29 10:04:30 -08:00
Ashwin Bharambe
9f709387e2 Kill X-LlamaStack-{Client-Version, Provider-Data} from OpenAPI spec
ClientVersion: We don't need each SDK method to support this parameter
because you wouldn't be passing a different client version each time you
make an API call.

ProviderData: although in this case, you _could_ be passing different
API keys depending on which SDK call you make, it makes for a confusing
experience. It is best to initialize the LlamaStackClient with all the
keys which are then passed in each request.
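
A minimal sketch of the recommended pattern (assuming the
`provider_data` constructor argument of the Python client; the key name
is illustrative):

```python
from llama_stack_client import LlamaStackClient

# Pass provider API keys once at construction time; the client then
# sends them with every request instead of per-call overrides.
client = LlamaStackClient(
    base_url="http://localhost:8321",
    provider_data={"fireworks_api_key": "YOUR_FIREWORKS_API_KEY"},
)
```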
2025-01-28 13:30:23 -08:00
Ashwin Bharambe
ec3ebb5bcf
Use ruamel.yaml to format the OpenAPI spec (#892)
Stainless ends up reformatting the YAML when we paste it in the Studio.
We cannot have that happen if we are going to ever partially automate
stainless config updates.

Try ruamel.yaml, specifically `block_seq_indent` to avoid that.
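
A minimal sketch of the idea (indent values are illustrative;
`YAML.indent()` is the documented way to pin block-sequence
indentation):

```python
import sys

from ruamel.yaml import YAML

spec = {
    "paths": {
        "/v1/inference/chat-completion": {"post": {"tags": ["Inference"]}},
    }
}

yaml = YAML()
# Pin indentation so a round trip through other YAML tooling
# (e.g. pasting into Stainless Studio) does not reformat the file.
yaml.indent(mapping=2, sequence=4, offset=2)
yaml.dump(spec, sys.stdout)
```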
2025-01-28 11:27:40 -08:00
Ashwin Bharambe
d123e9d3d7 Update docs for RAG and improve CONTRIBUTING.md 2025-01-28 06:09:48 -08:00
Justin Lee
e4865c3510
adding readme to docs folder for easier discoverability of notebooks … (#857)
as titled

Screenshot: https://github.com/user-attachments/assets/7579d1d2-06cd-48e4-9659-79ab1ec6a4c2
2025-01-28 04:58:46 -08:00
Chris Khanoyan
5b0d778871
Update index.md (#888)
Fixing the bullets

# What does this PR do?

The bullets did not render as intended, so this fixes them.


## Test Plan

Ran the docs build; the bullets now render consistently with the rest
of the page.

## Sources

N/A

2025-01-28 04:55:41 -08:00
Ashwin Bharambe
e5936a8df8
Update discriminator to have the correct mapping (#881)
See
https://swagger.io/docs/specification/v3_0/data-models/inheritance-and-polymorphism/#discriminator

When specifying discriminators, a mapping must be specified unless the
value of the discriminator is the subtype name itself (which in our case
it is not).

The changes in the YAML are self-explanatory.
2025-01-27 09:18:13 -08:00
Bakunga Bronson
7de46e40f9
Fixed multiple typos (#878)
2025-01-24 14:45:43 -08:00
Bakunga Bronson
33113139e8
Fixed typo (#877)
2025-01-24 13:16:00 -08:00