llama-stack-mirror/llama_stack/core
Charlie Doern 14f96d7079 fix: list models only for active providers
There has been a recurring error where we can retrieve a model when doing something like a chat completion, but then hit issues when trying to associate that model with an active provider.

This is a common thing that happens when:
1. you run the stack with, say, remote::ollama
2. you register a model, say llama3.2:3b
3. you run some completions, etc.
4. you kill the server
5. you `unset OLLAMA_URL`
6. you restart the stack
7. you run `llama-stack-client models list`

```
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ embedding     │ all-minilm                                                                       │ all-minilm:l6-v2                                                     │ {'embedding_dimension': 384.0,        │ ollama                   │
│               │                                                                                  │                                                                      │ 'context_length': 512.0}              │                          │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ llm           │ llama3.2:3b                                                                      │ llama3.2:3b                                                          │                                       │ ollama                   │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ embedding     │ ollama/all-minilm:l6-v2                                                          │ all-minilm:l6-v2                                                     │ {'embedding_dimension': 384.0,        │ ollama                   │
│               │                                                                                  │                                                                      │ 'context_length': 512.0}              │                          │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ llm           │ ollama/llama3.2:3b                                                               │ llama3.2:3b                                                          │                                       │ ollama                   │
├───────────────┼──────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤

```

This shouldn't be happening: `ollama` isn't an active provider, and the only reason the model shows up is that it's in the dist_registry (on disk).
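
For context, a minimal sketch of why the record survives a restart, assuming a plain JSON file as the on-disk store (the real dist_registry is kvstore-backed; the path, key format, and helper names here are illustrative only):

```
import json
from pathlib import Path

# Illustrative only: the real dist_registry is not a single JSON file.
# The point is that registration persists on disk across server restarts.
REGISTRY_FILE = Path("~/.llama/registry.json").expanduser()  # hypothetical path

def load_registry() -> dict:
    return json.loads(REGISTRY_FILE.read_text()) if REGISTRY_FILE.exists() else {}

def save_registry(registry: dict) -> None:
    REGISTRY_FILE.parent.mkdir(parents=True, exist_ok=True)
    REGISTRY_FILE.write_text(json.dumps(registry))

# Session 1 (OLLAMA_URL set): registering the model writes it to disk.
registry = load_registry()
registry["models:llama3.2:3b"] = {"provider_id": "ollama"}
save_registry(registry)

# Session 2 (OLLAMA_URL unset): the record is read back from disk anyway,
# which is why the model kept appearing in `models list`.
print(load_registry())
```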

While it's nice to have this static store, so that if I `export OLLAMA_URL=..` again the models can be read back from it, it shouldn't _always_ be reading and returning these models from the store.

Now if you run `llama-stack-client models list` with this change, llama3.2:3b no longer appears.
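
A minimal sketch of the filtering idea behind the change (names like `ModelEntry`, `list_active_models`, and `active_providers` are illustrative, not the actual llama_stack API):

```
from dataclasses import dataclass

# Hypothetical shapes for illustration; the real routing table filters its
# registry entries against the providers instantiated in this stack instance.

@dataclass
class ModelEntry:
    identifier: str
    provider_id: str

def list_active_models(
    registry_entries: list[ModelEntry],
    active_providers: set[str],
) -> list[ModelEntry]:
    """Keep only entries whose provider is currently running."""
    return [m for m in registry_entries if m.provider_id in active_providers]

# Example: with OLLAMA_URL unset, "ollama" is absent from active_providers,
# so the persisted llama3.2:3b record is filtered out of `models list`.
entries = [ModelEntry("llama3.2:3b", "ollama"), ModelEntry("gpt-4o", "openai")]
print(list_active_models(entries, active_providers={"openai"}))
```

Note that the registry entry itself is left in place, so re-exporting OLLAMA_URL and restarting brings the model back without re-registering it.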

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-08-18 16:28:14 -04:00
access_control chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
routers fix: list models only for active providers 2025-08-18 16:28:14 -04:00
routing_tables fix: list models only for active providers 2025-08-18 16:28:14 -04:00
server feat: add batches API with OpenAI compatibility (with inference replay) (#3162) 2025-08-15 15:34:15 -07:00
store chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
ui chore: rename templates to distributions (#3035) 2025-08-04 11:34:17 -07:00
utils chore: rename templates to distributions (#3035) 2025-08-04 11:34:17 -07:00
__init__.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
build.py chore(tests): fix responses and vector_io tests (#3119) 2025-08-12 16:15:53 -07:00
build_container.sh chore: rename templates to distributions (#3035) 2025-08-04 11:34:17 -07:00
build_venv.sh refactor: remove Conda support from Llama Stack (#2969) 2025-08-02 15:52:59 -07:00
client.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
common.sh refactor: remove Conda support from Llama Stack (#2969) 2025-08-02 15:52:59 -07:00
configure.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
datatypes.py refactor: remove Conda support from Llama Stack (#2969) 2025-08-02 15:52:59 -07:00
distribution.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
external.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
inspect.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
library_client.py refactor: modify DELETE API endpoints by returning HTTP 204 No Content + empty body instead of 200 OK + response body with null (#3112) 2025-08-13 07:56:26 -07:00
providers.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
request_headers.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
resolver.py feat: add batches API with OpenAI compatibility (with inference replay) (#3162) 2025-08-15 15:34:15 -07:00
stack.py chore: rename templates to distributions (#3035) 2025-08-04 11:34:17 -07:00
start_stack.sh refactor: remove Conda support from Llama Stack (#2969) 2025-08-02 15:52:59 -07:00