llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-05 10:23:44 +00:00

Ben Browning fa0b0c13d4 fix: Ollama should be optional in starter distro (#2482)

# What does this PR do?

Our starter distro required Ollama to be running (and a large list of models to be available in that Ollama) to start successfully. This adjusts things so that Ollama does not have to be running to use the starter template / distro.

To accomplish this, a few changes were needed:

* The Ollama provider can now be configured to either raise an exception or just log a warning when it cannot reach the Ollama server on startup. The default is to raise an exception (same as the previous behavior), but in the starter template we adjust this to just log a warning so that we can bring the stack up without needing a running Ollama server.
* The starter template no longer specifies a default list of models for Ollama, as any models specified there need to actually be pulled and available in Ollama. Instead, it adds a new `OLLAMA_INFERENCE_MODEL` environment variable where users can provide an optional model to register with the Ollama provider on startup. Additional models can also be registered via the typical `models.register(...)` at runtime.
* The vLLM template was adjusted to also allow an optional `VLLM_INFERENCE_MODEL` to be specified on startup, so that vLLM and Ollama behave consistently and it is easy to get up and running quickly.
* The default vector store was changed from sqlite-vec to faiss. sqlite-vec can be enabled by setting the `ENABLE_SQLITE_VEC` environment variable, like we do for chromadb and pgvector. This is because sqlite-vec does not ship proper arm64 binaries, an issue we previously fixed in #1530 for the ollama distribution.

## Test Plan

With this change, the following scenarios now work with the starter template that did not before:

* no Ollama running
* Ollama running but not all of the Llama models pulled locally
* Ollama running with a custom model registered on startup
* vLLM running with a custom model registered on startup
* running the starter template on linux/arm64, like when running containers on Mac without rosetta emulation

---------

Signed-off-by: Ben Browning <bbrownin@redhat.com>

2025-06-25 15:54:00 +02:00
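The startup behavior described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the provider's actual code: the `OllamaSketchConfig` class, its `raise_on_connect_error` field, and the `check_connectivity` and `optional_models` helpers are hypothetical names chosen for the example. Only the `OLLAMA_INFERENCE_MODEL` variable name comes from the commit message.

```python
import logging
import os
from dataclasses import dataclass

logger = logging.getLogger("ollama_sketch")


@dataclass
class OllamaSketchConfig:
    # Hypothetical field names; the real provider config may differ.
    url: str = "http://localhost:11434"
    # Defaults to raising, mirroring the "same as previous behavior" default.
    raise_on_connect_error: bool = True


def check_connectivity(config, probe):
    """Return True if the server answers, False if it does not.

    `probe` is any callable that takes a URL and raises on failure; it
    stands in for a real health check so the sketch needs no network.
    """
    try:
        probe(config.url)
        return True
    except Exception as exc:
        if config.raise_on_connect_error:
            # Strict mode: refuse to start without the server.
            raise RuntimeError(
                f"Ollama server not reachable at {config.url}"
            ) from exc
        # Lenient mode (what the starter template opts into): warn and go on.
        logger.warning("Ollama server not reachable at %s: %s", config.url, exc)
        return False


def optional_models(env=None):
    """Return the models to register on startup, or an empty list.

    Reads OLLAMA_INFERENCE_MODEL; an unset or blank value means no model
    is registered up front.
    """
    env = os.environ if env is None else env
    model = env.get("OLLAMA_INFERENCE_MODEL", "").strip()
    return [model] if model else []
```

With the strict default, a failing probe raises on startup; with `raise_on_connect_error=False`, the same failure only logs a warning and the caller can continue with whatever `optional_models()` returns.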
..
anthropic	chore: enable pyupgrade fixes (#1806)	2025-05-01 14:23:50 -07:00
bedrock	feat: New OpenAI compat embeddings API (#2314)	2025-05-31 22:11:47 -07:00
cerebras	feat: New OpenAI compat embeddings API (#2314)	2025-05-31 22:11:47 -07:00
cerebras_openai_compat	feat: introduce APIs for retrieving chat completion requests (#2145)	2025-05-18 21:43:19 -07:00
databricks	feat: New OpenAI compat embeddings API (#2314)	2025-05-31 22:11:47 -07:00
fireworks	feat: Add `suffix` to openai_completions (#2449)	2025-06-13 16:06:06 -07:00
fireworks_openai_compat	feat: introduce APIs for retrieving chat completion requests (#2145)	2025-05-18 21:43:19 -07:00
gemini	feat: expand set of known gemini models (#2471)	2025-06-19 12:19:37 -04:00
groq	chore: enable pyupgrade fixes (#1806)	2025-05-01 14:23:50 -07:00
groq_openai_compat	feat: introduce APIs for retrieving chat completion requests (#2145)	2025-05-18 21:43:19 -07:00
llama_openai_compat	feat: introduce APIs for retrieving chat completion requests (#2145)	2025-05-18 21:43:19 -07:00
nvidia	feat: Add `suffix` to openai_completions (#2449)	2025-06-13 16:06:06 -07:00
ollama	fix: Ollama should be optional in starter distro (#2482)	2025-06-25 15:54:00 +02:00
openai	feat: Add `suffix` to openai_completions (#2449)	2025-06-13 16:06:06 -07:00
passthrough	feat: Add `suffix` to openai_completions (#2449)	2025-06-13 16:06:06 -07:00
runpod	feat: New OpenAI compat embeddings API (#2314)	2025-05-31 22:11:47 -07:00
sambanova	fix(providers): update sambanova json schema mode (#2306)	2025-05-29 09:54:23 -07:00
sambanova_openai_compat	feat: introduce APIs for retrieving chat completion requests (#2145)	2025-05-18 21:43:19 -07:00
tgi	feat: New OpenAI compat embeddings API (#2314)	2025-05-31 22:11:47 -07:00
together	feat: Add `suffix` to openai_completions (#2449)	2025-06-13 16:06:06 -07:00
together_openai_compat	feat: introduce APIs for retrieving chat completion requests (#2145)	2025-05-18 21:43:19 -07:00
vllm	fix: Ollama should be optional in starter distro (#2482)	2025-06-25 15:54:00 +02:00
watsonx	feat: Add `suffix` to openai_completions (#2449)	2025-06-13 16:06:06 -07:00
__init__.py	`impls` -> `inline`, `adapters` -> `remote` (#381)	2024-11-06 14:54:05 -08:00