Commit `9ff82036f7` by Yuan Tang — docs: Simplify vLLM deployment in K8s deployment guide (#1655)
# What does this PR do?

* Removes the use of `huggingface-cli` 
* Simplifies HF cache mount path
* Simplifies vLLM server startup command
* Separates PVC/secret creation from deployment/service
* Fixes a typo: "pod" should be "deployment"
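The pattern these changes describe can be sketched as a pair of manifests (all resource names, the model, and the image tag below are illustrative assumptions, not values taken from the guide): the PVC and HF token Secret live in their own manifest and are applied before the Deployment, which mounts a single Hugging Face cache path and starts vLLM directly with `vllm serve`, so no separate `huggingface-cli` download step is needed.

```yaml
# vllm-pvc-secret.yaml — created separately from the deployment/service
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vllm-models            # hypothetical name
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 50Gi
---
apiVersion: v1
kind: Secret
metadata:
  name: hf-token-secret        # hypothetical name
type: Opaque
stringData:
  token: "<your-hf-token>"
```

```yaml
# vllm-deployment.yaml — applied after the PVC and Secret exist
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-server
spec:
  replicas: 1
  selector:
    matchLabels: {app: vllm}
  template:
    metadata:
      labels: {app: vllm}
    spec:
      containers:
      - name: vllm
        image: vllm/vllm-openai:latest
        command: ["/bin/sh", "-c"]
        # vLLM downloads the model into the mounted cache on first start;
        # the model name here is illustrative
        args: ["vllm serve meta-llama/Llama-3.2-1B-Instruct"]
        env:
        - name: HUGGING_FACE_HUB_TOKEN
          valueFrom:
            secretKeyRef: {name: hf-token-secret, key: token}
        ports:
        - containerPort: 8000
        volumeMounts:
        - name: llama-storage
          mountPath: /root/.cache/huggingface   # single, simplified cache mount
      volumes:
      - name: llama-storage
        persistentVolumeClaim:
          claimName: vllm-models
```

Keeping the PVC and Secret in their own manifest means `kubectl apply -f vllm-deployment.yaml` can be re-run (or the Deployment deleted and recreated) without touching the downloaded model cache or the stored token.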

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Committed: 2025-03-24 09:08:50 -07:00
| Entry | Last commit | Date |
| --- | --- | --- |
| building_applications | fix: docker run with --pull always to fetch the latest image (#1733) | 2025-03-20 15:35:48 -07:00 |
| concepts | docs: fix typos in evaluation concepts (#1745) | 2025-03-21 12:00:53 -07:00 |
| contributing | chore: consolidate scripts under ./scripts directory (#1646) | 2025-03-17 17:56:30 -04:00 |
| distributions | docs: Simplify vLLM deployment in K8s deployment guide (#1655) | 2025-03-24 09:08:50 -07:00 |
| getting_started | fix: docker run with --pull always to fetch the latest image (#1733) | 2025-03-20 15:35:48 -07:00 |
| introduction | docs: Remove mentions of focus on Llama models (#1690) | 2025-03-19 00:17:22 -04:00 |
| playground | fix: docker run with --pull always to fetch the latest image (#1733) | 2025-03-20 15:35:48 -07:00 |
| providers | feat: Qdrant inline provider (#1273) | 2025-03-18 14:04:21 -07:00 |
| references | feat(api): (1/n) datasets api clean up (#1573) | 2025-03-17 16:55:45 -07:00 |
| conf.py | fix: fetched latest pypi version when building documentation | 2025-03-06 21:15:15 -08:00 |
| index.md | docs: Remove mentions of focus on Llama models (#1690) | 2025-03-19 00:17:22 -04:00 |