docs: fix steps in the Quick Start Guide (#2800)

# What does this PR do?
The `build` command didn't take the `ENABLE` flags for the starter distro into account.
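For reference, the corrected Step 2 invocation (before and after, as shown in the diff below):

```bash
# Before: only INFERENCE_MODEL was set, so the Ollama provider was not enabled
INFERENCE_MODEL=llama3.2:3b uv run --with llama-stack llama stack build --template starter --image-type venv --run

# After: ENABLE_OLLAMA turns the provider on and OLLAMA_INFERENCE_MODEL selects the model
ENABLE_OLLAMA=ollama OLLAMA_INFERENCE_MODEL=llama3.2:3b uv run --with llama-stack llama stack build --template starter --image-type venv --run
```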

For some reason, I was having issues with HuggingFace access for the embedding model, so I added a tip for that as well.
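The tip boils down to the following (a minimal sketch, assuming the embedding download honors the standard `HF_TOKEN` environment variable):

```bash
# Assumption: the HuggingFace client reads HF_TOKEN from the environment.
# Set it to a valid token before starting the server in Step 2.
export HF_TOKEN=<your HuggingFace token>
```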

Closes #2779

## Test Plan
I ran the described steps manually, but it would be nice if someone else could try them and verify everything still works.

We might consider adding a CI job to ensure the Quick Start Guide (QSG) remains functional; it's not a great experience for new users if they try Llama Stack for the first time and it doesn't work as we describe.
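A rough sketch of what such a smoke test could run (hypothetical, not an existing job; `demo_script.py` is the file from Step 3 of the guide, and the fixed sleep stands in for a real readiness check):

```bash
#!/usr/bin/env bash
# Hypothetical CI smoke test that walks through the Quick Start Guide end to end.
set -euo pipefail

# Step 1: pull and serve the model
ollama run llama3.2:3b --keepalive 60m &

# Step 2: build and run the Llama Stack server in the background
ENABLE_OLLAMA=ollama OLLAMA_INFERENCE_MODEL=llama3.2:3b \
  uv run --with llama-stack llama stack build --template starter --image-type venv --run &

# Crude wait for the server to come up; a real job should poll a health endpoint
sleep 60

# Step 3: run the demo script and fail the job on any error
uv run --with llama-stack-client python demo_script.py
```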

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>

@@ -19,7 +19,7 @@ ollama run llama3.2:3b --keepalive 60m
#### Step 2: Run the Llama Stack server
We will use `uv` to run the Llama Stack server.
```bash
-INFERENCE_MODEL=llama3.2:3b uv run --with llama-stack llama stack build --template starter --image-type venv --run
+ENABLE_OLLAMA=ollama OLLAMA_INFERENCE_MODEL=llama3.2:3b uv run --with llama-stack llama stack build --template starter --image-type venv --run
```
#### Step 3: Run the demo
Now open up a new terminal and copy the following script into a file named `demo_script.py`.
@@ -111,6 +111,12 @@ Ultimately, great work is about making a meaningful contribution and leaving a l
```
Congratulations! You've successfully built your first RAG application using Llama Stack! 🎉🥳
+```{admonition} HuggingFace access
+:class: tip
+If you are getting a **401 Client Error** from HuggingFace for the **all-MiniLM-L6-v2** model, try setting **HF_TOKEN** to a valid HuggingFace token in your environment.
+```
### Next Steps
Now you're ready to dive deeper into Llama Stack!