Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-12-19 08:49:40 +00:00)
fix: list models only for active providers
There has been a recurring error where we can retrieve a model for something like a chat completion, but then hit issues when trying to associate that model with an active provider.
This is a common thing that happens when:
1. you run the stack with, say, remote::ollama
2. you register a model, say llama3.2:3b
3. you run some completions, etc.
4. you kill the server
5. you `unset OLLAMA_URL`
6. you restart the stack
7. you run `llama-stack-client models list`
```
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ embedding │ all-minilm │ all-minilm:l6-v2 │ {'embedding_dimension': 384.0, │ ollama │
│ │ │ │ 'context_length': 512.0} │ │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ llm │ llama3.2:3b │ llama3.2:3b │ │ ollama │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ embedding │ ollama/all-minilm:l6-v2 │ all-minilm:l6-v2 │ {'embedding_dimension': 384.0, │ ollama │
│ │ │ │ 'context_length': 512.0} │ │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ llm │ ollama/llama3.2:3b │ llama3.2:3b │ │ ollama │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
```
This shouldn't be happening: `ollama` isn't a running provider, and the only reason the model shows up is that it's in the dist_registry (on disk).
While it's nice to have this static store, so that if I `export OLLAMA_URL=..` again the stack can read from it, it shouldn't _always_ read and return these models from the store.
Now, with this change, if you run `llama-stack-client models list`, llama3.2:3b no longer appears.
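Conceptually, the fix is to intersect what is in the on-disk registry with the providers that are actually configured in the running stack. A minimal, self-contained sketch of that idea (the `RegisteredModel` type and `list_active_models` helper here are illustrative assumptions, not llama-stack code):

```python
from dataclasses import dataclass


@dataclass
class RegisteredModel:
    identifier: str
    provider_id: str


def list_active_models(
    registry_entries: list[RegisteredModel], active_provider_ids: set[str]
) -> list[RegisteredModel]:
    # Keep only models whose provider is currently configured in the running stack.
    return [m for m in registry_entries if m.provider_id in active_provider_ids]


# With OLLAMA_URL unset, "ollama" is not an active provider, so its registry
# entries drop out of the listing even though they are still on disk.
entries = [
    RegisteredModel("llama3.2:3b", "ollama"),
    RegisteredModel("all-minilm:l6-v2", "ollama"),
]
print(list_active_models(entries, active_provider_ids=set()))  # -> []
```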
Signed-off-by: Charlie Doern <cdoern@redhat.com>
This commit is contained in:
parent 27d6becfd0
commit 14f96d7079
9 changed files with 25 additions and 17 deletions
```
@@ -171,7 +171,7 @@ class VectorIORouter(VectorIO):
         logger.debug(f"VectorIORouter.openai_list_vector_stores: limit={limit}")
         # Route to default provider for now - could aggregate from all providers in the future
         # call retrieve on each vector dbs to get list of vector stores
-        vector_dbs = await self.routing_table.get_all_with_type("vector_db")
+        vector_dbs = await self.routing_table.get_all_with_type_filtered("vector_db")
         all_stores = []
         for vector_db in vector_dbs:
             try:
```
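The router now asks the routing table for a provider-filtered view of the registry. A rough sketch of what such a lookup could look like (the attribute names `dist_registry` and `impls_by_provider_id` and the object shapes are assumptions for illustration, not the actual implementation):

```python
# Hypothetical sketch of a provider-filtered registry lookup; names and shapes
# are assumptions, not the real llama-stack routing table.
class RoutingTableSketch:
    def __init__(self, dist_registry, impls_by_provider_id: dict):
        self.dist_registry = dist_registry                # persistent registry of registered objects
        self.impls_by_provider_id = impls_by_provider_id  # only providers that are actually running

    async def get_all_with_type_filtered(self, type_: str) -> list:
        objs = await self.dist_registry.get_all()
        # Drop anything registered under a provider that is not currently active.
        return [o for o in objs if o.type == type_ and o.provider_id in self.impls_by_provider_id]
```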