Commit graph

2 commits

Author SHA1 Message Date
Yuan Tang
9ff82036f7
docs: Simplify vLLM deployment in K8s deployment guide (#1655)
# What does this PR do?

* Removes the use of `huggingface-cli`
* Simplifies HF cache mount path
* Simplifies vLLM server startup command
* Separates PVC/secret creation from deployment/service
* Fixes a typo: "pod" should be "deployment"

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-03-24 09:08:50 -07:00
Yuan Tang
09ed0e9c9f
Add Kubernetes deployment guide (#899)
This PR moves some content from [the recent blog
post](https://blog.vllm.ai/2025/01/27/intro-to-llama-stack-with-vllm.html)
into this repository as a more official guide for users who'd like to deploy
Llama Stack on Kubernetes.

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-02-06 10:28:02 -08:00