docs: fix errors in kubernetes deployment guide (#1914)
# What does this PR do?
Fixes a couple of errors in the PVC/Secret setup and adds context for the expected Hugging Face token.

## Test Plan
This commit is contained in:
parent 2fcb70b789
commit 6aa459b00c

1 changed file with 10 additions and 4 deletions
@@ -11,7 +11,12 @@ First, create a local Kubernetes cluster via Kind:
 kind create cluster --image kindest/node:v1.32.0 --name llama-stack-test
 ```
 
-First, create a Kubernetes PVC and Secret for downloading and storing Hugging Face model:
+First, set your Hugging Face token as an environment variable.
+```
+export HF_TOKEN=$(echo -n "your-hf-token" | base64)
+```
+
+Now create a Kubernetes PVC and Secret for downloading and storing the Hugging Face model:
 
 ```
 cat <<EOF |kubectl apply -f -
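Note: Kubernetes Secrets expect `data` values to be base64-encoded, which is why the token is piped through `base64` above. The `-n` flag matters: a trailing newline would end up inside the encoded value. A quick round-trip sanity check (a sketch, assuming a shell with coreutils; `your-hf-token` is a placeholder):

```
# Encode the placeholder token and confirm it decodes back cleanly.
export HF_TOKEN=$(echo -n "your-hf-token" | base64)
echo -n "$HF_TOKEN" | base64 -d   # should print: your-hf-token
```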
@@ -33,7 +38,8 @@ metadata:
   name: hf-token-secret
 type: Opaque
 data:
-  token: $(HF_TOKEN)
+  token: $HF_TOKEN
+EOF
 ```
 
 
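Note: this hunk carries both fixes from the PR summary. In an unquoted heredoc the shell expands `$HF_TOKEN`, whereas the old `$(HF_TOKEN)` is command substitution and would try to run a command named `HF_TOKEN`; the added `EOF` terminates the heredoc so `kubectl apply` receives the full manifest. To confirm the Secret holds the expected value (a sketch, assuming the current namespace):

```
# Read the stored token back from the cluster and decode it.
kubectl get secret hf-token-secret -o jsonpath='{.data.token}' | base64 -d
```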
@@ -120,7 +126,7 @@ providers:
 Once we have defined the run configuration for Llama Stack, we can build an image with that configuration and the server source code:
 
 ```
-cat >/tmp/test-vllm-llama-stack/Containerfile.llama-stack-run-k8s <<EOF
+tmp_dir=$(mktemp -d) && cat >$tmp_dir/Containerfile.llama-stack-run-k8s <<EOF
 FROM distribution-myenv:dev
 
 RUN apt-get update && apt-get install -y git
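Note: `mktemp -d` creates a fresh, unique scratch directory, so the build no longer assumes a fixed `/tmp/test-vllm-llama-stack` path exists. If you want the directory removed automatically afterwards (optional; not part of the guide):

```
# Clean up the scratch directory when the shell session exits.
tmp_dir=$(mktemp -d)
trap 'rm -rf "$tmp_dir"' EXIT
```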
@@ -128,7 +134,7 @@ RUN git clone https://github.com/meta-llama/llama-stack.git /app/llama-stack-sou
 
 ADD ./vllm-llama-stack-run-k8s.yaml /app/config.yaml
 EOF
-podman build -f /tmp/test-vllm-llama-stack/Containerfile.llama-stack-run-k8s -t llama-stack-run-k8s /tmp/test-vllm-llama-stack
+podman build -f $tmp_dir/Containerfile.llama-stack-run-k8s -t llama-stack-run-k8s $tmp_dir
 ```
 
 ### Deploying Llama Stack Server in Kubernetes
 
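Note: an image built with `podman build` lives only in the local image store, so the Kind cluster cannot pull it directly. One way to hand it over (a sketch; the image and cluster names are assumptions based on the commands above, and the guide may instead push to a registry):

```
# Export the image to a tar archive and load it into the Kind nodes.
podman save localhost/llama-stack-run-k8s:latest -o /tmp/llama-stack-run-k8s.tar
kind load image-archive /tmp/llama-stack-run-k8s.tar --name llama-stack-test
```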