llama-stack-mirror/llama_stack/distribution
Ben Browning 404708e99d fix: Ollama should be optional in starter distro
Our starter distro required Ollama to be running (and a large list of
models available in that Ollama) to successfully start. This adjusts
things so that Ollama does not have to be running to use the starter
template / distro.

To accomplish this, a few changes were needed:

* The Ollama provider now has a configuration option that controls
whether it raises an exception or just logs a warning when it cannot
reach the Ollama server on startup. The default is to raise an
exception (same as previous behavior), but in the starter template we
set it to just log a warning so that we can bring the stack up without
needing a running Ollama server.

* The starter template no longer specifies a default list of models
for Ollama, as any models specified there need to actually be pulled
and available in Ollama. Instead, it adds a new
`OLLAMA_INFERENCE_MODEL` environment variable where users can provide
an optional model to register with the Ollama provider on
startup. Additional models can also be registered via the typical
`models.register(...)` at runtime.

* The vLLM template was adjusted to also accept an optional
`VLLM_INFERENCE_MODEL` on startup, so that vLLM and Ollama behave
consistently here and it is easy to get up and running quickly with
either one.

* The default vector store was changed from sqlite-vec to faiss,
because sqlite-vec does not ship proper arm64 binaries. This is the
same problem we previously fixed in #1530 for the ollama
distribution. sqlite-vec can still be enabled by setting the
`ENABLE_SQLITE_VEC` environment variable, like we do for chromadb and
pgvector.
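
As a rough illustration of the first bullet, the raise-or-warn choice
can be sketched as below. This is only a sketch under assumptions: the
names `check_reachable`, `probe` and `raise_on_connect_error` are
hypothetical and are not taken from the actual provider code, and the
connectivity check is passed in as a callable so the example does not
depend on any particular HTTP client.

```python
import logging
from collections.abc import Callable

logger = logging.getLogger("ollama_startup_sketch")


def check_reachable(
    probe: Callable[[], None],
    *,
    raise_on_connect_error: bool = True,  # hypothetical option name
) -> bool:
    """Run a connectivity probe and either raise or warn on failure.

    `probe` is any callable that raises OSError (for example
    ConnectionRefusedError) when the server cannot be reached.
    """
    try:
        probe()
    except OSError as exc:
        if raise_on_connect_error:
            # Default: fail startup, matching the previous behavior.
            raise RuntimeError("Ollama server is not reachable") from exc
        # Starter-template style: log and let the stack keep starting.
        logger.warning("Ollama server is not reachable: %s", exc)
        return False
    return True
```

With the flag left at its default the caller sees an exception; with
it set to `False` the caller gets `False` back and can continue
bringing up the rest of the stack.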

With this change, the following scenarios, which previously failed,
now work with the starter template:

* no Ollama running
* Ollama running but not all of the Llama models pulled locally
* Ollama running with a custom model registered on startup
* vLLM running with a custom model registered on startup
* running the starter template on linux/arm64, such as when running
containers on a Mac without Rosetta emulation

Signed-off-by: Ben Browning <bbrownin@redhat.com>
2025-06-25 09:04:45 -04:00
access_control feat: drop python 3.10 support (#2469) 2025-06-19 12:07:14 +05:30
routers feat: Add search_mode support to OpenAI vector store API (#2500) 2025-06-24 20:38:47 -04:00
routing_tables feat: fine grained access control policy (#2264) 2025-06-03 14:51:12 -07:00
server feat: drop python 3.10 support (#2469) 2025-06-19 12:07:14 +05:30
store fix(tools): do not index tools, only index toolgroups (#2261) 2025-05-25 13:27:52 -07:00
ui ci: add python package build test (#2457) 2025-06-19 18:57:32 +05:30
utils refactor: remove container from list of run image types (#2178) 2025-06-02 09:57:55 +02:00
__init__.py API Updates (#73) 2024-09-17 19:51:35 -07:00
build.py chore: bump python supported version to 3.12 (#2475) 2025-06-24 09:22:04 +05:30
build_conda_env.sh chore: fix build script bug (#2507) 2025-06-24 12:05:22 -07:00
build_container.sh chore: bump python supported version to 3.12 (#2475) 2025-06-24 09:22:04 +05:30
build_venv.sh chore: remove straggler references to llama-models (#1345) 2025-03-01 14:26:03 -08:00
client.py chore: make cprint write to stderr (#2250) 2025-05-24 23:39:57 -07:00
common.sh feat(pre-commit): enhance pre-commit hooks with additional checks (#2014) 2025-04-30 11:35:49 -07:00
configure.py feat: refactor external providers dir (#2049) 2025-05-15 20:17:03 +02:00
datatypes.py feat: fine grained access control policy (#2264) 2025-06-03 14:51:12 -07:00
distribution.py ci: fix external provider test (#2438) 2025-06-12 16:14:32 +02:00
inspect.py chore: use starlette built-in Route class (#2267) 2025-05-28 09:53:33 -07:00
library_client.py refactor: unify stream and non-stream impls for responses (#2388) 2025-06-05 17:48:09 +02:00
providers.py feat: drop python 3.10 support (#2469) 2025-06-19 12:07:14 +05:30
request_headers.py feat: fine grained access control policy (#2264) 2025-06-03 14:51:12 -07:00
resolver.py feat: support auth attributes in inference/responses stores (#2389) 2025-06-20 10:24:45 -07:00
stack.py fix: Ollama should be optional in starter distro 2025-06-25 09:04:45 -04:00
start_stack.sh refactor: remove container from list of run image types (#2178) 2025-06-02 09:57:55 +02:00