From 0cbb3e401c6b8fea89b5aa5157fab2809c442efd Mon Sep 17 00:00:00 2001
From: Ihar Hrachyshka
Date: Tue, 4 Feb 2025 18:31:30 -0500
Subject: [PATCH] docs: miscellaneous small fixes (#961)

- **[docs] Fix misc typos and formatting issues in intro docs**
- **[docs]: Export variables (e.g. INFERENCE_MODEL) in getting_started**
- **[docs] Show that `llama-stack-client configure` will ask for api key**

# What does this PR do?

Miscellaneous fixes in the documentation; not worth reporting an issue.

## Test Plan

No code changes. Addressed issues spotted when walking through the guide. Confirmed locally.

## Sources

Please link relevant resources if necessary.

## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.

---------

Signed-off-by: Ihar Hrachyshka
---
 docs/source/distributions/selection.md         |  2 +-
 .../distributions/self_hosted_distro/ollama.md |  4 +++-
 docs/source/getting_started/index.md           | 12 +++++++-----
 3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/docs/source/distributions/selection.md b/docs/source/distributions/selection.md
index 353da21d4..da1b0df9c 100644
--- a/docs/source/distributions/selection.md
+++ b/docs/source/distributions/selection.md
@@ -23,7 +23,7 @@ Which templates / distributions to choose depends on the hardware you have for r
   - {dockerhub}`distribution-together` ([Guide](self_hosted_distro/together))
   - {dockerhub}`distribution-fireworks` ([Guide](self_hosted_distro/fireworks))

-- **Do you want to run Llama Stack inference on your iOS / Android device** Lastly, we also provide templates for running Llama Stack inference on your iOS / Android device:
+- **Do you want to run Llama Stack inference on your iOS / Android device?** Lastly, we also provide templates for running Llama Stack inference on your iOS / Android device:
   - [iOS SDK](ondevice_distro/ios_sdk)
   - [Android](ondevice_distro/android_sdk)

diff --git a/docs/source/distributions/self_hosted_distro/ollama.md b/docs/source/distributions/self_hosted_distro/ollama.md
index 92e1f7dbf..e7c729501 100644
--- a/docs/source/distributions/self_hosted_distro/ollama.md
+++ b/docs/source/distributions/self_hosted_distro/ollama.md
@@ -25,7 +25,9 @@ The `llamastack/distribution-ollama` distribution consists of the following prov
 | vector_io | `inline::faiss`, `remote::chromadb`, `remote::pgvector` |


-You should use this distribution if you have a regular desktop machine without very powerful GPUs. Of course, if you have powerful GPUs, you can still continue using this distribution since Ollama supports GPU acceleration.### Environment Variables
+You should use this distribution if you have a regular desktop machine without very powerful GPUs. Of course, if you have powerful GPUs, you can still continue using this distribution since Ollama supports GPU acceleration.
+
+### Environment Variables

 The following environment variables can be configured:

diff --git a/docs/source/getting_started/index.md b/docs/source/getting_started/index.md
index ce89919a6..0634f4b1a 100644
--- a/docs/source/getting_started/index.md
+++ b/docs/source/getting_started/index.md
@@ -1,6 +1,6 @@
 # Quick Start

-In this guide, we'll walk through how you can use the Llama Stack (server and client SDK ) to test a simple RAG agent.
+In this guide, we'll walk through how you can use the Llama Stack (server and client SDK) to test a simple RAG agent.

 A Llama Stack agent is a simple integrated system that can perform tasks by combining a Llama model for reasoning with tools (e.g., RAG, web search, code execution, etc.) for taking actions.

@@ -42,8 +42,8 @@ To get started quickly, we provide various container images for the server compo
 Lets setup some environment variables that we will use in the rest of the guide.
 ```bash
-INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct"
-LLAMA_STACK_PORT=8321
+export INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct"
+export LLAMA_STACK_PORT=8321
 ```

 Next you can create a local directory to mount into the container’s file system.
 ```bash
@@ -82,8 +82,10 @@ pip install llama-stack-client
 Let's use the `llama-stack-client` CLI to check the connectivity to the server.

 ```bash
-llama-stack-client configure --endpoint http://localhost:$LLAMA_STACK_PORT
-llama-stack-client models list
+$ llama-stack-client configure --endpoint http://localhost:$LLAMA_STACK_PORT
+> Enter the API key (leave empty if no key is needed):
+Done! You can now use the Llama Stack Client CLI with endpoint http://localhost:8321
+$ llama-stack-client models list
 ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
 ┃ identifier                       ┃ provider_id ┃ provider_resource_id      ┃ metadata ┃
 ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩