# Kubernetes Deployment Guide

Instead of starting the Llama Stack and vLLM servers locally, we can deploy them in a Kubernetes cluster. In this guide, we'll use a local [Kind](https://kind.sigs.k8s.io/) cluster and a vLLM inference service running in the same cluster for demonstration purposes.

First, create a local Kubernetes cluster via Kind:

```bash
kind create cluster --image kindest/node:v1.32.0 --name llama-stack-test
```

Next, start the vLLM server as a Kubernetes Pod and Service:

```bash
cat </tmp/test-vllm-llama-stack/Containerfile.llama-stack-run-k8s <
```
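The manifest for the vLLM Pod and Service is cut off above. As a rough sketch only (the image tag, resource names, model, and port are illustrative assumptions, not the guide's original values), a minimal vLLM Deployment and Service could look like:

```shell
# Illustrative sketch: names, image, and model below are assumptions.
cat <<'EOF' >/tmp/vllm-k8s.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-server            # assumed name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm
  template:
    metadata:
      labels:
        app: vllm
    spec:
      containers:
      - name: vllm
        image: vllm/vllm-openai:latest                          # assumed image
        args: ["--model", "meta-llama/Llama-3.2-1B-Instruct"]   # assumed model
        ports:
        - containerPort: 8000  # vLLM's default OpenAI-compatible API port
---
apiVersion: v1
kind: Service
metadata:
  name: vllm-server
spec:
  selector:
    app: vllm
  ports:
  - port: 8000
    targetPort: 8000
EOF
# Apply it to the Kind cluster:
# kubectl apply -f /tmp/vllm-k8s.yaml
```

The Service exposes vLLM inside the cluster, so the Llama Stack server can reach it by its cluster-internal DNS name rather than a localhost address.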
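The heredoc writing `Containerfile.llama-stack-run-k8s` is also truncated above. A hypothetical sketch of what such a Containerfile might contain (the base image, install step, and run-config path are all assumptions):

```shell
# Hypothetical sketch: base image, install step, and config path are assumptions.
mkdir -p /tmp/test-vllm-llama-stack
cat <<'EOF' >/tmp/test-vllm-llama-stack/Containerfile.llama-stack-run-k8s
FROM python:3.11-slim

# Install the Llama Stack server package
RUN pip install --no-cache-dir llama-stack

# Start the stack from a run config baked into the image (path is an assumption)
ENTRYPOINT ["llama", "stack", "run", "/app/run.yaml"]
EOF
```

Such an image could then be built and loaded into the Kind cluster (for example with `kind load docker-image`) before deploying the Llama Stack server as a Pod.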