Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-12-19 08:49:40 +00:00)
fix: list models only for active providers
There has been a recurring error where we can retrieve a model for something like a chat completion, but then hit issues when trying to associate that model with an active provider.
This is a common thing that happens when:
1. you run the stack with, say, remote::ollama
2. you register a model, say llama3.2:3b
3. you run some completions, etc.
4. you kill the server
5. you `unset OLLAMA_URL`
6. you restart the stack
7. you run `llama-stack-client models list`
```
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ embedding │ all-minilm │ all-minilm:l6-v2 │ {'embedding_dimension': 384.0, │ ollama │
│ │ │ │ 'context_length': 512.0} │ │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ llm │ llama3.2:3b │ llama3.2:3b │ │ ollama │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ embedding │ ollama/all-minilm:l6-v2 │ all-minilm:l6-v2 │ {'embedding_dimension': 384.0, │ ollama │
│ │ │ │ 'context_length': 512.0} │ │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ llm │ ollama/llama3.2:3b │ llama3.2:3b │ │ ollama │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
```
This shouldn't be happening: `ollama` isn't a running provider, and the only reason the model shows up is that it's in the dist_registry (on disk).
While it's nice to have this static store, so that if I `export OLLAMA_URL=..` again the stack can read from it, it shouldn't _always_ read and return these models from the store.
Now, with this change, if you run `llama-stack-client models list`, llama3.2:3b no longer appears.
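Conceptually, the fix is to intersect what is in the on-disk registry with the providers that are actually configured in the running stack. A minimal, self-contained sketch of that idea (the `RegisteredModel` type and `list_active_models` helper here are illustrative assumptions, not llama-stack code):

```python
from dataclasses import dataclass


@dataclass
class RegisteredModel:
    identifier: str
    provider_id: str


def list_active_models(
    registry_entries: list[RegisteredModel], active_provider_ids: set[str]
) -> list[RegisteredModel]:
    # Keep only models whose provider is currently configured in the running stack.
    return [m for m in registry_entries if m.provider_id in active_provider_ids]


# With OLLAMA_URL unset, "ollama" is not an active provider, so its registry
# entries drop out of the listing even though they are still on disk.
entries = [
    RegisteredModel("llama3.2:3b", "ollama"),
    RegisteredModel("all-minilm:l6-v2", "ollama"),
]
print(list_active_models(entries, active_provider_ids=set()))  # -> []
```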
Signed-off-by: Charlie Doern <cdoern@redhat.com>
This commit is contained in:
parent 27d6becfd0
commit 14f96d7079
9 changed files with 25 additions and 17 deletions
```
@@ -171,7 +171,7 @@ class VectorIORouter(VectorIO):
         logger.debug(f"VectorIORouter.openai_list_vector_stores: limit={limit}")
         # Route to default provider for now - could aggregate from all providers in the future
         # call retrieve on each vector dbs to get list of vector stores
-        vector_dbs = await self.routing_table.get_all_with_type("vector_db")
+        vector_dbs = await self.routing_table.get_all_with_type_filtered("vector_db")
         all_stores = []
         for vector_db in vector_dbs:
             try:
```
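The router now asks the routing table for a provider-filtered view of the registry. A rough sketch of what such a lookup could look like (the attribute names `dist_registry` and `impls_by_provider_id` and the object shapes are assumptions for illustration, not the actual implementation):

```python
# Hypothetical sketch of a provider-filtered registry lookup; names and shapes
# are assumptions, not the real llama-stack routing table.
class RoutingTableSketch:
    def __init__(self, dist_registry, impls_by_provider_id: dict):
        self.dist_registry = dist_registry                # persistent registry of registered objects
        self.impls_by_provider_id = impls_by_provider_id  # only providers that are actually running

    async def get_all_with_type_filtered(self, type_: str) -> list:
        objs = await self.dist_registry.get_all()
        # Drop anything registered under a provider that is not currently active.
        return [o for o in objs if o.type == type_ and o.provider_id in self.impls_by_provider_id]
```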