Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-06-28 19:04:19 +00:00)
docs: fix errors in kubernetes deployment guide (#1914)
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant issues if applicable.]

Fixes a couple of errors in PVC/Secret setup and adds context for expected Hugging Face token

[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])

## Test Plan
[Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.*]

[//]: # (## Documentation)
parent 2fcb70b789
commit 6aa459b00c
1 changed file with 10 additions and 4 deletions
@@ -11,7 +11,12 @@ First, create a local Kubernetes cluster via Kind:
 kind create cluster --image kindest/node:v1.32.0 --name llama-stack-test
 ```
 
-First, create a Kubernetes PVC and Secret for downloading and storing Hugging Face model:
+First set your hugging face token as an environment variable.
+```
+export HF_TOKEN=$(echo -n "your-hf-token" | base64)
+```
+
+Now create a Kubernetes PVC and Secret for downloading and storing Hugging Face model:
 
 ```
 cat <<EOF |kubectl apply -f -
@@ -33,7 +38,8 @@ metadata:
   name: hf-token-secret
 type: Opaque
 data:
-  token: $(HF_TOKEN)
+  token: $HF_TOKEN
 EOF
 ```
 
+
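Two details in this hunk are easy to miss. Values under a Secret's `data` field must be base64-encoded, which is why the guide encodes the token before exporting it. Inside an unquoted heredoc, `$HF_TOKEN` expands the variable, whereas `$(HF_TOKEN)` is command substitution and would try to run a command named `HF_TOKEN`. A minimal sketch of both points follows; it uses the placeholder token string from the guide, and the variable names (`raw_token`, `encoded`, `decoded`, `manifest_fragment`) are assumptions chosen for illustration only:

```shell
# Placeholder value taken from the guide; substitute a real token when deploying.
raw_token="your-hf-token"

# Encode without a trailing newline; tr strips any line wrapping base64 may add.
encoded=$(printf '%s' "$raw_token" | base64 | tr -d '\n')

# Round-trip to confirm the encoding is reversible.
decoded=$(printf '%s' "$encoded" | base64 --decode)

# In an unquoted heredoc, $encoded is expanded by the shell before the text is used.
manifest_fragment=$(cat <<EOF
data:
  token: $encoded
EOF
)

printf '%s\n' "$manifest_fragment"
```

Kubernetes also accepts a `stringData` field that takes unencoded values, which would avoid the manual encoding step. The guide uses `data`, so the sketch follows it.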
@@ -120,7 +126,7 @@ providers:
 Once we have defined the run configuration for Llama Stack, we can build an image with that configuration and the server source code:
 
 ```
-cat >/tmp/test-vllm-llama-stack/Containerfile.llama-stack-run-k8s <<EOF
+tmp_dir=$(mktemp -d) && cat >$tmp_dir/Containerfile.llama-stack-run-k8s <<EOF
 FROM distribution-myenv:dev
 
 RUN apt-get update && apt-get install -y git
@@ -128,7 +134,7 @@ RUN git clone https://github.com/meta-llama/llama-stack.git /app/llama-stack-sou
 
 ADD ./vllm-llama-stack-run-k8s.yaml /app/config.yaml
 EOF
-podman build -f /tmp/test-vllm-llama-stack/Containerfile.llama-stack-run-k8s -t llama-stack-run-k8s /tmp/test-vllm-llama-stack
+podman build -f $tmp_dir/Containerfile.llama-stack-run-k8s -t llama-stack-run-k8s $tmp_dir
 ```
 
 ### Deploying Llama Stack Server in Kubernetes
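The switch from a fixed `/tmp/test-vllm-llama-stack` path to `mktemp -d` in the hunks above can be sketched as follows. This is a sketch under assumptions: the Containerfile body, the `config.yaml` placeholder, and the `example-image` tag are hypothetical stand-ins, and the `podman build` call is left as a comment so the snippet runs without a container runtime. Since `ADD` resolves source paths relative to the build context, any file the Containerfile adds would need to exist inside `$tmp_dir`.

```shell
# Create a unique scratch directory instead of relying on a fixed /tmp path.
tmp_dir=$(mktemp -d)

# Hypothetical stand-in for the run configuration that the Containerfile adds.
printf 'placeholder: true\n' >"$tmp_dir/config.yaml"

# Quoted heredoc delimiter: nothing inside is expanded by the shell.
cat >"$tmp_dir/Containerfile.example" <<'EOF'
FROM scratch
ADD ./config.yaml /app/config.yaml
EOF

# The build would use the same directory as both file location and context:
# podman build -f "$tmp_dir/Containerfile.example" -t example-image "$tmp_dir"
ls "$tmp_dir"
```

Quoting `$tmp_dir` guards against paths that contain spaces, and `rm -rf "$tmp_dir"` removes the directory once the image is built.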