llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-03 09:53:45 +00:00

Ashwin Bharambe 84ffcb8c2b fix(inference): enable routing of models with provider_data alone (#3928)		2025-11-12 13:34:30 -08:00

This PR enables routing of fully qualified model IDs of the form `provider_id/model_id` even when the models are not registered with the Stack.

Here's the situation: assume a remote inference provider which works only when users provide their own API keys via `X-LlamaStack-Provider-Data` header. By definition, we cannot list models and hence update our routing registry. But because we _require_ a provider ID in the models now, we can identify which provider to route to and let that provider decide.

Note that we still try to look up our registry since it may have a pre-registered alias. Just that we don't outright fail when we are not able to look it up.

Also, updated inference router so that the responses have the _exact_ model that the request had.

Added an integration test

Closes #3929

---------

Co-authored-by: ehhuang <ehhuang@users.noreply.github.com>
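
The routing rule described in the commit message above can be sketched roughly as follows. This is an illustrative sketch only, not the code from the PR: `Route`, `resolve_route`, the `registry` dict, and the `provider_ids` set are all hypothetical names. It assumes the provider ID is everything before the first `/` in the model ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical routing result: which provider to call, and with which model ID."""
    provider_id: str
    provider_model_id: str


def resolve_route(model: str, registry: dict[str, Route], provider_ids: set[str]) -> Route:
    """Resolve a model ID to a provider, tolerating unregistered models.

    Assumption: `registry` maps registered model IDs (including aliases) to
    routes, and `provider_ids` is the set of configured provider IDs.
    """
    # Try the registry first, since it may hold a pre-registered alias.
    route = registry.get(model)
    if route is not None:
        return route

    # Not registered: do not fail outright. Treat the ID as
    # `provider_id/model_id` and split on the first slash only, so that
    # provider-side model IDs containing slashes stay intact.
    provider_id, sep, provider_model_id = model.partition("/")
    if not sep or not provider_model_id or provider_id not in provider_ids:
        raise ValueError(f"cannot route model {model!r}: unknown provider or malformed ID")

    # Let the identified provider decide whether it can serve the model.
    return Route(provider_id, provider_model_id)
```

In a sketch like this, the caller would keep the original `model` string from the request and echo it in the response, so the response carries the exact model ID that was requested rather than the provider-side one.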
apis	revert: "chore(cleanup)!: remove tool_runtime.rag_tool" (#3877)	2025-10-21 11:22:06 -07:00
cli	fix: print help for list-deps if no args (backport #4078) (#4083)	2025-11-05 14:58:47 -08:00
core	fix(inference): enable routing of models with provider_data alone (#3928)	2025-11-12 13:34:30 -08:00
distributions	fix: harden storage semantics (backport #4118) (#4138)	2025-11-12 13:01:21 -08:00
models	chore: remove dead code (#3729)	2025-10-07 20:26:02 -07:00
providers	fix(inference): enable routing of models with provider_data alone (#3928)	2025-11-12 13:34:30 -08:00
strong_typing	chore: refactor (chat)completions endpoints to use shared params struct (#3761)	2025-10-10 15:46:34 -07:00
testing	feat(ci): add support for docker:distro in tests (#3832)	2025-10-16 19:33:13 -07:00
ui	fix(ci): export UV_INDEX_STRATEGY to current shell before running uv sync (#4019)	2025-11-01 12:54:19 -07:00
__init__.py	chore(rename): move llama_stack.distribution to llama_stack.core (#2975)	2025-07-30 23:30:53 -07:00
env.py	refactor(test): move tools, evals, datasetio, scoring and post training tests (#1401)	2025-03-04 14:53:47 -08:00
log.py	fix(logs): restore uvicorn and llama_stack logger settings	2025-10-21 15:47:55 -07:00
schema_utils.py	fix(auth): allow unauthenticated access to health and version endpoints (#3736)	2025-10-10 13:41:43 -07:00