From 9ff82036f7872b472eea4907c062b4e7551001c7 Mon Sep 17 00:00:00 2001
From: Yuan Tang
Date: Mon, 24 Mar 2025 12:08:50 -0400
Subject: [PATCH] docs: Simplify vLLM deployment in K8s deployment guide (#1655)

# What does this PR do?

* Removes the use of `huggingface-cli`
* Simplifies HF cache mount path
* Simplifies vLLM server startup command
* Separates PVC/secret creation from deployment/service
* Fixes a typo: "pod" should be "deployment"

Signed-off-by: Yuan Tang
---
 .../distributions/kubernetes_deployment.md | 40 +++++++++----------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/docs/source/distributions/kubernetes_deployment.md b/docs/source/distributions/kubernetes_deployment.md
index 6cca2bc47..1b4467934 100644
--- a/docs/source/distributions/kubernetes_deployment.md
+++ b/docs/source/distributions/kubernetes_deployment.md
@@ -8,7 +8,7 @@ First, create a local Kubernetes cluster via Kind:
 kind create cluster --image kindest/node:v1.32.0 --name llama-stack-test
 ```
 
-Start vLLM server as a Kubernetes Pod and Service:
+First, create a Kubernetes PVC and Secret for downloading and storing Hugging Face model:
 
 ```bash
 cat <