llama-stack/docs/source/distributions
Yuan Tang 9ff82036f7
docs: Simplify vLLM deployment in K8s deployment guide (#1655)
# What does this PR do?

* Removes the use of `huggingface-cli`
* Simplifies HF cache mount path
* Simplifies vLLM server startup command
* Separates PVC/secret creation from deployment/service
* Fixes a typo: "pod" should be "deployment"

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-03-24 09:08:50 -07:00

| Name | Last commit | Date |
| --- | --- | --- |
| ondevice_distro | docs: Fix trailing whitespace error (#1669) | 2025-03-17 08:53:30 -07:00 |
| remote_hosted_distro | fix: Default to port 8321 everywhere (#1734) | 2025-03-20 15:50:41 -07:00 |
| self_hosted_distro | fix: Default to port 8321 everywhere (#1734) | 2025-03-20 15:50:41 -07:00 |
| building_distro.md | docs: fixed broken tip in distro build docs (#1673) | 2025-03-17 17:22:26 -07:00 |
| configuration.md | script for running client sdk tests (#895) | 2025-02-19 22:38:06 -08:00 |
| importing_as_library.md | Fix precommit check after moving to ruff (#927) | 2025-02-02 06:46:45 -08:00 |
| index.md | Add Kubernetes deployment guide (#899) | 2025-02-06 10:28:02 -08:00 |
| kubernetes_deployment.md | docs: Simplify vLLM deployment in K8s deployment guide (#1655) | 2025-03-24 09:08:50 -07:00 |
| selection.md | docs: small fixes (#1224) | 2025-02-24 07:59:58 -05:00 |