Mirror of https://github.com/meta-llama/llama-stack.git, synced 2025-06-28 02:53:30 +00:00
fix: revert to using faiss for ollama distro (#1530)
This is unfortunate because `sqlite-vec` seems promising. But its pip package is not quite complete. It does not have a binary for arm64 (I think, or maybe it even lacks 64-bit builds entirely?), which causes the arm64 container to fail with:

```
File "/usr/local/lib/python3.10/site-packages/sqlite_vec/__init__.py", line 17, in load
    conn.load_extension(loadable_path())
sqlite3.OperationalError: /usr/local/lib/python3.10/site-packages/sqlite_vec/vec0.so: wrong ELF class: ELFCLASS32
```

To get around this, I tried to install from source via `uv pip install sqlite-vec --no-binary=sqlite-vec`, but the package lacks a source distribution as well, which makes that impossible.

## Test Plan

Build the container locally using:

```bash
LLAMA_STACK_DIR=. llama stack build --template ollama --image-type container
```

Run the container as:

```
podman run --privileged -it -p $LLAMA_STACK_PORT:$LLAMA_STACK_PORT \
  -v ~/.llama:/root/.llama \
  --env INFERENCE_MODEL=$INFERENCE_MODEL \
  --env OLLAMA_URL=http://host.containers.internal:11434 \
  -v ~/local/llama-stack:/app/llama-stack-source \
  localhost/distribution-ollama:dev --port $LLAMA_STACK_PORT
```

Verify the container starts up correctly. Without this patch, it would encounter the ELFCLASS32 error.
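For anyone hitting the same thing, here is a small diagnostic sketch (not part of this change; `loadable_path()` and `load()` are the helpers the traceback above shows `sqlite_vec` exposing) that reports the ELF class of the bundled `vec0.so` before attempting to load it:

```python
# Diagnostic sketch: report the ELF class of sqlite-vec's bundled vec0.so
# before loading it, to confirm the 32-bit-binary-on-64-bit-container mismatch.
import sqlite3

import sqlite_vec  # exposes loadable_path() and load(), per the traceback above

path = sqlite_vec.loadable_path()
with open(path, "rb") as f:
    ident = f.read(5)

# Byte 4 of the ELF identification block is EI_CLASS: 1 = ELFCLASS32, 2 = ELFCLASS64.
if ident[:4] == b"\x7fELF":
    print(f"{path}: EI_CLASS={ident[4]} (2 expected on a 64-bit container)")

conn = sqlite3.connect(":memory:")
conn.enable_load_extension(True)
sqlite_vec.load(conn)  # raises sqlite3.OperationalError on the ELF class mismatch
```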
This commit is contained in:
parent 21e39633d8
commit dc84bc755a

7 changed files with 25 additions and 17 deletions
```diff
@@ -23,7 +23,7 @@ The `llamastack/distribution-ollama` distribution consists of the following prov
 | scoring | `inline::basic`, `inline::llm-as-judge`, `inline::braintrust` |
 | telemetry | `inline::meta-reference` |
 | tool_runtime | `remote::brave-search`, `remote::tavily-search`, `inline::code-interpreter`, `inline::rag-runtime`, `remote::model-context-protocol`, `remote::wolfram-alpha` |
-| vector_io | `inline::sqlite-vec`, `remote::chromadb`, `remote::pgvector` |
+| vector_io | `inline::faiss`, `remote::chromadb`, `remote::pgvector` |

 You should use this distribution if you have a regular desktop machine without very powerful GPUs. Of course, if you have powerful GPUs, you can still continue using this distribution since Ollama supports GPU acceleration.
```
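Since the revert swaps `inline::sqlite-vec` for `inline::faiss`, a minimal sanity check (a sketch only; it assumes the container's faiss provider is backed by the `faiss` Python package) is to confirm the library imports and can build an index on arm64, which is exactly where sqlite-vec's extension failed to load:

```python
# Sketch: confirm the faiss backend imports and indexes on arm64,
# where sqlite-vec's vec0.so failed with ELFCLASS32.
import faiss
import numpy as np

index = faiss.IndexFlatL2(4)                   # tiny 4-dimensional L2 index
index.add(np.zeros((2, 4), dtype="float32"))   # faiss expects float32 rows
distances, ids = index.search(np.zeros((1, 4), dtype="float32"), k=1)
print(f"faiss OK: ntotal={index.ntotal}, nearest id={ids[0][0]}")
```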