# Kubernetes Deployment Guide

Instead of starting the Llama Stack and vLLM servers locally, we can deploy them in a Kubernetes cluster.

### Prerequisites

In this guide, we'll use a local [Kind](https://kind.sigs.k8s.io/) cluster and a vLLM inference service in the same cluster for demonstration purposes.

First, create a local Kubernetes cluster via Kind:

```bash
kind create cluster --image kindest/node:v1.32.0 --name llama-stack-test
```

Next, create a Kubernetes PVC and Secret for downloading and storing the Hugging Face model, as sketched below.
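A minimal manifest sketch follows. The resource names (`vllm-models`, `hf-token-secret`) and the 50Gi storage request are illustrative assumptions, and the heredoc expects an `HF_TOKEN` environment variable to be exported in your shell:

```bash
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vllm-models          # assumed name; referenced by the vLLM Deployment below
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 50Gi          # assumed size; adjust to the model you download
---
apiVersion: v1
kind: Secret
metadata:
  name: hf-token-secret      # assumed name; holds the Hugging Face access token
type: Opaque
stringData:
  token: ${HF_TOKEN}         # expanded from your shell environment by the heredoc
EOF
```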
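Next, start the vLLM server as a Kubernetes Deployment and Service. The following is a sketch under assumed names (`vllm-server`) and an assumed model (`meta-llama/Llama-3.2-1B-Instruct`); a GPU-capable node is needed for real inference workloads:

```bash
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: vllm
  template:
    metadata:
      labels:
        app.kubernetes.io/name: vllm
    spec:
      containers:
      - name: vllm
        image: vllm/vllm-openai:latest
        command: ["/bin/sh", "-c"]
        args: ["vllm serve meta-llama/Llama-3.2-1B-Instruct"]   # assumed model
        env:
        - name: HUGGING_FACE_HUB_TOKEN
          valueFrom:
            secretKeyRef:
              name: hf-token-secret   # Secret created in the previous step
              key: token
        ports:
        - containerPort: 8000
        volumeMounts:
        - name: llama-storage
          mountPath: /root/.cache/huggingface   # default HF cache location
      volumes:
      - name: llama-storage
        persistentVolumeClaim:
          claimName: vllm-models      # PVC created in the previous step
---
apiVersion: v1
kind: Service
metadata:
  name: vllm-server
spec:
  selector:
    app.kubernetes.io/name: vllm
  ports:
  - protocol: TCP
    port: 8000
    targetPort: 8000
  type: ClusterIP
EOF
```

Other pods in the cluster can then reach vLLM's OpenAI-compatible API at `http://vllm-server.default.svc.cluster.local:8000/v1`.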
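Next, containerize the Llama Stack server so it can run in the cluster. The Containerfile is written to `/tmp/test-vllm-llama-stack/Containerfile.llama-stack-run-k8s`; the base image (for example, one produced by `llama stack build`) and the run-configuration filename in the sketch below are illustrative assumptions:

```bash
mkdir -p /tmp/test-vllm-llama-stack
cat >/tmp/test-vllm-llama-stack/Containerfile.llama-stack-run-k8s <<EOF
# Assumed base image: a Llama Stack distribution image built beforehand
FROM distribution-myenv:dev

RUN apt-get update && apt-get install -y git
RUN git clone https://github.com/meta-llama/llama-stack.git /app/llama-stack-source

# Assumed run configuration whose inference provider points at the
# in-cluster vLLM service (http://vllm-server.default.svc.cluster.local:8000/v1)
ADD ./vllm-llama-stack-run-k8s.yaml /app/config.yaml
EOF
podman build -f /tmp/test-vllm-llama-stack/Containerfile.llama-stack-run-k8s \
  -t llama-stack-run-k8s /tmp/test-vllm-llama-stack
```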
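With the image built, the Llama Stack server can be deployed as its own Deployment and Service. This sketch assumes the image has been made available to the Kind cluster (e.g. `kind load docker-image llama-stack-run-k8s:latest --name llama-stack-test`), and the server command, config path, and port 5000 are assumptions; adjust them to your build:

```bash
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llama-stack-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: llama-stack
  template:
    metadata:
      labels:
        app.kubernetes.io/name: llama-stack
    spec:
      containers:
      - name: llama-stack
        image: localhost/llama-stack-run-k8s:latest   # image built in the previous step
        imagePullPolicy: IfNotPresent
        # assumed entrypoint: start the server with the baked-in config
        command: ["python", "-m", "llama_stack.distribution.server.server",
                  "--yaml-config", "/app/config.yaml"]
        ports:
        - containerPort: 5000
---
apiVersion: v1
kind: Service
metadata:
  name: llama-stack-service
spec:
  selector:
    app.kubernetes.io/name: llama-stack
  ports:
  - protocol: TCP
    port: 5000
    targetPort: 5000
  type: ClusterIP
EOF
```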
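Finally, forward the service to a local port and send a test request. This assumes the `llama-stack-client` CLI is installed on your machine:

```bash
kubectl port-forward service/llama-stack-service 5000:5000
```

Then, in another terminal:

```bash
llama-stack-client --endpoint http://localhost:5000 \
  inference chat-completion --message "hello, what model are you?"
```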