From 0cbb3e401c6b8fea89b5aa5157fab2809c442efd Mon Sep 17 00:00:00 2001
From: Ihar Hrachyshka
Date: Tue, 4 Feb 2025 18:31:30 -0500
Subject: [PATCH] docs: miscellaneous small fixes (#961)

- **[docs] Fix misc typos and formatting issues in intro docs**
- **[docs]: Export variables (e.g. INFERENCE_MODEL) in getting_started**
- **[docs] Show that `llama-stack-client configure` will ask for api key**

# What does this PR do?

Miscellaneous fixes in the documentation; not worth reporting an issue.

## Test Plan

No code changes. Addressed issues spotted when walking through the guide. Confirmed locally.

## Sources

Please link relevant resources if necessary.

## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.

---------

Signed-off-by: Ihar Hrachyshka
---
 docs/source/distributions/selection.md         |  2 +-
 .../distributions/self_hosted_distro/ollama.md |  4 +++-
 docs/source/getting_started/index.md           | 12 +++++++-----
 3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/docs/source/distributions/selection.md b/docs/source/distributions/selection.md
index 353da21d4..da1b0df9c 100644
--- a/docs/source/distributions/selection.md
+++ b/docs/source/distributions/selection.md
@@ -23,7 +23,7 @@ Which templates / distributions to choose depends on the hardware you have for r
   - {dockerhub}`distribution-together` ([Guide](self_hosted_distro/together))
   - {dockerhub}`distribution-fireworks` ([Guide](self_hosted_distro/fireworks))

-- **Do you want to run Llama Stack inference on your iOS / Android device** Lastly, we also provide templates for running Llama Stack inference on your iOS / Android device:
+- **Do you want to run Llama Stack inference on your iOS / Android device?** Lastly, we also provide templates for running Llama Stack inference on your iOS / Android device:
   - [iOS SDK](ondevice_distro/ios_sdk)
   - [Android](ondevice_distro/android_sdk)

diff --git a/docs/source/distributions/self_hosted_distro/ollama.md b/docs/source/distributions/self_hosted_distro/ollama.md
index 92e1f7dbf..e7c729501 100644
--- a/docs/source/distributions/self_hosted_distro/ollama.md
+++ b/docs/source/distributions/self_hosted_distro/ollama.md
@@ -25,7 +25,9 @@ The `llamastack/distribution-ollama` distribution consists of the following prov
 | vector_io | `inline::faiss`, `remote::chromadb`, `remote::pgvector` |


-You should use this distribution if you have a regular desktop machine without very powerful GPUs. Of course, if you have powerful GPUs, you can still continue using this distribution since Ollama supports GPU acceleration.### Environment Variables
+You should use this distribution if you have a regular desktop machine without very powerful GPUs. Of course, if you have powerful GPUs, you can still continue using this distribution since Ollama supports GPU acceleration.
+
+### Environment Variables

 The following environment variables can be configured:

diff --git a/docs/source/getting_started/index.md b/docs/source/getting_started/index.md
index ce89919a6..0634f4b1a 100644
--- a/docs/source/getting_started/index.md
+++ b/docs/source/getting_started/index.md
@@ -1,6 +1,6 @@
 # Quick Start

-In this guide, we'll walk through how you can use the Llama Stack (server and client SDK ) to test a simple RAG agent.
+In this guide, we'll walk through how you can use the Llama Stack (server and client SDK) to test a simple RAG agent.

 A Llama Stack agent is a simple integrated system that can perform tasks by combining a Llama model for reasoning with tools (e.g., RAG, web search, code execution, etc.) for taking actions.

@@ -42,8 +42,8 @@ To get started quickly, we provide various container images for the server compo
 Lets setup some environment variables that we will use in the rest of the guide.
 ```bash
-INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct"
-LLAMA_STACK_PORT=8321
+export INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct"
+export LLAMA_STACK_PORT=8321
 ```

 Next you can create a local directory to mount into the container’s file system.
 ```bash
@@ -82,8 +82,10 @@ pip install llama-stack-client
 Let's use the `llama-stack-client` CLI to check the connectivity to the server.

 ```bash
-llama-stack-client configure --endpoint http://localhost:$LLAMA_STACK_PORT
-llama-stack-client models list
+$ llama-stack-client configure --endpoint http://localhost:$LLAMA_STACK_PORT
+> Enter the API key (leave empty if no key is needed):
+Done! You can now use the Llama Stack Client CLI with endpoint http://localhost:8321
+$ llama-stack-client models list
 ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
 ┃ identifier                       ┃ provider_id ┃ provider_resource_id      ┃ metadata ┃
 ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩