llama-stack-mirror/llama_stack/core
Charlie Doern 14f96d7079 fix: list models only for active providers
There has been a recurring error where we can retrieve a model when doing something like a chat completion, but then hit issues when trying to associate that model with an active provider.

This is a common thing that happens when:
1. you run the stack with, say, remote::ollama
2. you register a model, say llama3.2:3b
3. you run some completions, etc.
4. you kill the server
5. you `unset OLLAMA_URL`
6. you restart the stack
7. you run `llama-stack-client models list`

```
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ embedding     │ all-minilm                                                                       │ all-minilm:l6-v2                                                     │ {'embedding_dimension': 384.0,        │ ollama                   │
│               │                                                                                  │                                                                      │ 'context_length': 512.0}              │                          │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ llm           │ llama3.2:3b                                                                      │ llama3.2:3b                                                          │                                       │ ollama                   │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ embedding     │ ollama/all-minilm:l6-v2                                                          │ all-minilm:l6-v2                                                     │ {'embedding_dimension': 384.0,        │ ollama                   │
│               │                                                                                  │                                                                      │ 'context_length': 512.0}              │                          │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ llm           │ ollama/llama3.2:3b                                                               │ llama3.2:3b                                                          │                                       │ ollama                   │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤

```

This shouldn't be happening: `ollama` isn't an active provider, and the only reason the model shows up is that it's in the dist_registry (on disk).
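
For context, a minimal sketch of why the record survives a restart, assuming a plain JSON file as the on-disk store (the real dist_registry is kvstore-backed; the path, key format, and helper names here are illustrative only):

```
import json
from pathlib import Path

# Illustrative only: the real dist_registry is not a single JSON file.
# The point is that registration persists on disk across server restarts.
REGISTRY_FILE = Path("~/.llama/registry.json").expanduser()  # hypothetical path

def load_registry() -> dict:
    return json.loads(REGISTRY_FILE.read_text()) if REGISTRY_FILE.exists() else {}

def save_registry(registry: dict) -> None:
    REGISTRY_FILE.parent.mkdir(parents=True, exist_ok=True)
    REGISTRY_FILE.write_text(json.dumps(registry))

# Session 1 (OLLAMA_URL set): registering the model writes it to disk.
registry = load_registry()
registry["models:llama3.2:3b"] = {"provider_id": "ollama"}
save_registry(registry)

# Session 2 (OLLAMA_URL unset): the record is read back from disk anyway,
# which is why the model kept appearing in `models list`.
print(load_registry())
```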

While it's nice to have this static store, so that if I `export OLLAMA_URL=..` again the models can be read back from it, it shouldn't _always_ be reading and returning these models from the store.

Now if you run `llama-stack-client models list` with this change, llama3.2:3b no longer appears.
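
A minimal sketch of the filtering idea behind the change (names like `ModelEntry`, `list_active_models`, and `active_providers` are illustrative, not the actual llama_stack API):

```
from dataclasses import dataclass

# Hypothetical shapes for illustration; the real routing table filters its
# registry entries against the providers instantiated in this stack instance.

@dataclass
class ModelEntry:
    identifier: str
    provider_id: str

def list_active_models(
    registry_entries: list[ModelEntry],
    active_providers: set[str],
) -> list[ModelEntry]:
    """Keep only entries whose provider is currently running."""
    return [m for m in registry_entries if m.provider_id in active_providers]

# Example: with OLLAMA_URL unset, "ollama" is absent from active_providers,
# so the persisted llama3.2:3b record is filtered out of `models list`.
entries = [ModelEntry("llama3.2:3b", "ollama"), ModelEntry("gpt-4o", "openai")]
print(list_active_models(entries, active_providers={"openai"}))
```

Note that the registry entry itself is left in place, so re-exporting OLLAMA_URL and restarting brings the model back without re-registering it.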

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-08-18 16:28:14 -04:00
access_control chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
routers fix: list models only for active providers 2025-08-18 16:28:14 -04:00
routing_tables fix: list models only for active providers 2025-08-18 16:28:14 -04:00
server feat: add batches API with OpenAI compatibility (with inference replay) (#3162) 2025-08-15 15:34:15 -07:00
store chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
ui chore: rename templates to distributions (#3035) 2025-08-04 11:34:17 -07:00
utils chore: rename templates to distributions (#3035) 2025-08-04 11:34:17 -07:00
__init__.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
build.py chore(tests): fix responses and vector_io tests (#3119) 2025-08-12 16:15:53 -07:00
build_container.sh chore: rename templates to distributions (#3035) 2025-08-04 11:34:17 -07:00
build_venv.sh refactor: remove Conda support from Llama Stack (#2969) 2025-08-02 15:52:59 -07:00
client.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
common.sh refactor: remove Conda support from Llama Stack (#2969) 2025-08-02 15:52:59 -07:00
configure.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
datatypes.py refactor: remove Conda support from Llama Stack (#2969) 2025-08-02 15:52:59 -07:00
distribution.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
external.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
inspect.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
library_client.py refactor: modify DELETE API endpoints by returning HTTP 204 No Content + empty body instead of 200 OK + response body with null (#3112) 2025-08-13 07:56:26 -07:00
providers.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
request_headers.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
resolver.py feat: add batches API with OpenAI compatibility (with inference replay) (#3162) 2025-08-15 15:34:15 -07:00
stack.py chore: rename templates to distributions (#3035) 2025-08-04 11:34:17 -07:00
start_stack.sh refactor: remove Conda support from Llama Stack (#2969) 2025-08-02 15:52:59 -07:00