docs: fix errors in kubernetes deployment guide (#1914)

# What does this PR do?
Fixes a couple of errors in the PVC/Secret setup and adds context for the
expected Hugging Face token.
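As a sketch of the Secret fix (`your-hf-token` is a placeholder value): Kubernetes expects base64-encoded values under a Secret's `data:` field, so the token is encoded before being substituted into the manifest.

```shell
# Encode the token once, as the guide now instructs.
export HF_TOKEN=$(echo -n "your-hf-token" | base64)

# Sanity check: the value must decode back to the original token,
# since that is what Kubernetes will hand to the pod.
echo -n "$HF_TOKEN" | base64 -d
# prints: your-hf-token
```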

## Test Plan
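One check that can be re-run locally without a cluster (a sketch, not a full plan): confirm that the new `mktemp -d` build directory is created and that the Containerfile lands inside it before `podman build` would be invoked.

```shell
# Reproduce the fixed build-directory step from the guide.
tmp_dir=$(mktemp -d) && cat >$tmp_dir/Containerfile.llama-stack-run-k8s <<EOF
FROM distribution-myenv:dev
RUN apt-get update && apt-get install -y git
EOF

# The Containerfile should now exist in the unique scratch directory.
test -f "$tmp_dir/Containerfile.llama-stack-run-k8s" && echo "ok"
# prints: ok
rm -rf "$tmp_dir"
```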

Mark Campbell, 2025-04-11 12:04:13 +01:00, committed by GitHub
parent 2fcb70b789
commit 6aa459b00c


@@ -11,7 +11,12 @@ First, create a local Kubernetes cluster via Kind:
 kind create cluster --image kindest/node:v1.32.0 --name llama-stack-test
 ```
-First, create a Kubernetes PVC and Secret for downloading and storing Hugging Face model:
+First, set your Hugging Face token as an environment variable:
+```
+export HF_TOKEN=$(echo -n "your-hf-token" | base64)
+```
+Now create a Kubernetes PVC and Secret for downloading and storing the Hugging Face model:
 ```
 cat <<EOF |kubectl apply -f -
@@ -33,7 +38,8 @@ metadata:
   name: hf-token-secret
 type: Opaque
 data:
-  token: $(HF_TOKEN)
+  token: $HF_TOKEN
+EOF
 ```
@@ -120,7 +126,7 @@ providers:
 Once we have defined the run configuration for Llama Stack, we can build an image with that configuration and the server source code:
 ```
-cat >/tmp/test-vllm-llama-stack/Containerfile.llama-stack-run-k8s <<EOF
+tmp_dir=$(mktemp -d) && cat >$tmp_dir/Containerfile.llama-stack-run-k8s <<EOF
 FROM distribution-myenv:dev
 RUN apt-get update && apt-get install -y git
@@ -128,7 +134,7 @@ RUN git clone https://github.com/meta-llama/llama-stack.git /app/llama-stack-sou
 ADD ./vllm-llama-stack-run-k8s.yaml /app/config.yaml
 EOF
-podman build -f /tmp/test-vllm-llama-stack/Containerfile.llama-stack-run-k8s -t llama-stack-run-k8s /tmp/test-vllm-llama-stack
+podman build -f $tmp_dir/Containerfile.llama-stack-run-k8s -t llama-stack-run-k8s $tmp_dir
 ```
 ### Deploying Llama Stack Server in Kubernetes