llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-03 18:00:36 +00:00

Yuan Tang 9ff82036f7 docs: Simplify vLLM deployment in K8s deployment guide (#1655)	2025-03-24 09:08:50 -07:00

# What does this PR do?

* Removes the use of `huggingface-cli`
* Simplifies HF cache mount path
* Simplifies vLLM server startup command
* Separates PVC/secret creation from deployment/service
* Fixes a typo: "pod" should be "deployment"

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

ondevice_distro	docs: Fix trailing whitespace error (#1669)	2025-03-17 08:53:30 -07:00
remote_hosted_distro	fix: Default to port 8321 everywhere (#1734)	2025-03-20 15:50:41 -07:00
self_hosted_distro	fix: Default to port 8321 everywhere (#1734)	2025-03-20 15:50:41 -07:00
building_distro.md	docs: fixed broken tip in distro build docs (#1673)	2025-03-17 17:22:26 -07:00
configuration.md	script for running client sdk tests (#895)	2025-02-19 22:38:06 -08:00
importing_as_library.md	Fix precommit check after moving to ruff (#927)	2025-02-02 06:46:45 -08:00
index.md	Add Kubernetes deployment guide (#899)	2025-02-06 10:28:02 -08:00
kubernetes_deployment.md	docs: Simplify vLLM deployment in K8s deployment guide (#1655)	2025-03-24 09:08:50 -07:00
selection.md	docs: small fixes (#1224)	2025-02-24 07:59:58 -05:00