docs: fix steps in the Quick Start Guide

the 'build' command didn't take into account the new
flags for the starter distro

for some reason, I was having issues with HuggingFace
access for the embedding model, so I added a tip for
that as well
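
a minimal sketch of that workaround, for anyone hitting the
same 401 (the token value below is a placeholder; generate a
real one at https://huggingface.co/settings/tokens):

```bash
# huggingface_hub reads HF_TOKEN from the environment, so export
# it before starting the Llama Stack server (placeholder value):
export HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxx
```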

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>

@@ -19,7 +19,7 @@ ollama run llama3.2:3b --keepalive 60m
 #### Step 2: Run the Llama Stack server
 We will use `uv` to run the Llama Stack server.
 ```bash
-INFERENCE_MODEL=llama3.2:3b uv run --with llama-stack llama stack build --template starter --image-type venv --run
+ENABLE_OLLAMA=ollama OLLAMA_INFERENCE_MODEL=llama3.2:3b uv run --with llama-stack llama stack build --template starter --image-type venv --run
 ```
 #### Step 3: Run the demo
 Now open up a new terminal and copy the following script into a file named `demo_script.py`.
@@ -111,6 +111,12 @@ Ultimately, great work is about making a meaningful contribution and leaving a l
 ```
 Congratulations! You've successfully built your first RAG application using Llama Stack! 🎉🥳
 
+```{admonition} HuggingFace access
+:class: tip
+
+If you are getting a **401 Client Error** from HuggingFace for the **all-MiniLM-L6-v2** model, try setting **HF_TOKEN** to a valid HuggingFace token in your environment
+```
+
 ### Next Steps
 
 Now you're ready to dive deeper into Llama Stack!