commit 2153957ab6
parent 723a870171
Author: Young Han
Date:   2025-07-14 17:52:27 -07:00
2 changed files with 6 additions and 4 deletions


@@ -1,8 +1,9 @@
 <!-- This file was auto-generated by distro_codegen.py, please edit source -->
 # Llama Stack with llama.cpp
-This template demonstrates how to utilize Llama Stack with [llama.cpp](https://github.com/ggerganov/llama.cpp) as the inference provider. \n
-Previously, the use of quantized models with Llama Stack was restricted, but now it is fully supported through llama.cpp. \n
+This template demonstrates how to utilize Llama Stack with [llama.cpp](https://github.com/ggerganov/llama.cpp) as the inference provider.
+Previously, the use of quantized models with Llama Stack was restricted, but now it is fully supported through llama.cpp.
+You can employ any .gguf models available on [Hugging Face](https://huggingface.co/models) with this template.


@@ -1,7 +1,8 @@
 # Llama Stack with llama.cpp
-This template demonstrates how to utilize Llama Stack with [llama.cpp](https://github.com/ggerganov/llama.cpp) as the inference provider. \n
-Previously, the use of quantized models with Llama Stack was restricted, but now it is fully supported through llama.cpp. \n
+This template demonstrates how to utilize Llama Stack with [llama.cpp](https://github.com/ggerganov/llama.cpp) as the inference provider.
+Previously, the use of quantized models with Llama Stack was restricted, but now it is fully supported through llama.cpp.
+You can employ any .gguf models available on [Hugging Face](https://huggingface.co/models) with this template.