An error has been cropping up where we can retrieve a model (for example when doing a chat completion) but then hit issues when trying to associate that model with an active provider.
This commonly happens when:
1. you run the stack with, say, `remote::ollama`
2. you register a model, say `llama3.2:3b`
3. you run some completions, etc.
4. you kill the server
5. you `unset OLLAMA_URL`
6. you restart the stack
7. you run `llama-stack-client models list` (an equivalent Python SDK check is sketched after the output below):
```
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ embedding │ all-minilm │ all-minilm:l6-v2 │ {'embedding_dimension': 384.0, │ ollama │
│ │ │ │ 'context_length': 512.0} │ │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ llm │ llama3.2:3b │ llama3.2:3b │ │ ollama │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ embedding │ ollama/all-minilm:l6-v2 │ all-minilm:l6-v2 │ {'embedding_dimension': 384.0, │ ollama │
│ │ │ │ 'context_length': 512.0} │ │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ llm │ ollama/llama3.2:3b │ llama3.2:3b │ │ ollama │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
```
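For reference, the same check can be done through the Python SDK. This is a hedged sketch, assuming the `llama_stack_client` package with a `LlamaStackClient` exposing `models.list()`, and a stack listening on a default local port; the base URL and exact attribute names are assumptions, not taken from this PR:

```python
# Hedged sketch: assumes the llama_stack_client Python SDK with LlamaStackClient
# and models.list(); the base URL/port below are assumptions.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# After restarting the stack with OLLAMA_URL unset, no ollama-backed models
# should show up here, even though they are still persisted on disk.
for model in client.models.list():
    print(model.model_type, model.identifier, model.provider_id)
```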
This shouldn't be happening: `ollama` isn't a running provider, and the only reason the model shows up is that it's in the dist_registry (on disk).
While it's nice to have this static store, so that if I `export OLLAMA_URL=..` again the stack can read from it, it shouldn't _always_ be reading and returning these models from the store.
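The intended behavior can be sketched roughly as follows. This is a minimal illustration of the idea, not the actual llama-stack implementation; the `Model` dataclass and `active_models` helper are hypothetical names:

```python
# Hypothetical sketch of the idea behind this change, not the real implementation:
# only return registry entries whose provider is currently configured/active.
from dataclasses import dataclass


@dataclass
class Model:
    identifier: str
    provider_id: str


def active_models(registered: list[Model], active_provider_ids: set[str]) -> list[Model]:
    """Return only the models backed by a provider that is actually running."""
    return [m for m in registered if m.provider_id in active_provider_ids]


# With OLLAMA_URL unset, `ollama` is no longer an active provider, so its
# entries persisted in the dist_registry are filtered out of `models list`.
registered = [Model("llama3.2:3b", "ollama"), Model("all-minilm:l6-v2", "ollama")]
print(active_models(registered, active_provider_ids=set()))  # -> []
```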
Now, with this change, if you run `llama-stack-client models list`, `llama3.2:3b` no longer appears.
Signed-off-by: Charlie Doern <cdoern@redhat.com>