llama-stack-mirror/llama_stack/core/routers
Ashwin Bharambe 84ffcb8c2b fix(inference): enable routing of models with provider_data alone (#3928)
This PR enables routing of fully qualified model IDs of the form
`provider_id/model_id` even when the models are not registered with the
Stack.

Here's the situation: assume a remote inference provider that works
only when users supply their own API keys via the
`X-LlamaStack-Provider-Data` header. By definition, we cannot list its
models and hence cannot update our routing registry. But because we now
_require_ a provider ID in model IDs, we can identify which provider to
route to and let that provider decide. A hedged example of such a
request follows.
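A minimal sketch of such a client call. The header name comes from this
PR; the endpoint path, the provider-data key (`together_api_key`), and
the model ID are illustrative assumptions, not confirmed values:

```python
import json

import httpx

# The API key travels in the X-LlamaStack-Provider-Data header, so the
# model does not need to be pre-registered with the Stack. The model is
# fully qualified as provider_id/model_id.
resp = httpx.post(
    "http://localhost:8321/v1/openai/v1/chat/completions",  # path assumed
    headers={
        "X-LlamaStack-Provider-Data": json.dumps({"together_api_key": "sk-..."}),
    },
    json={
        "model": "together/meta-llama/Llama-3.3-70B-Instruct",
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
```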

Note that we still try the registry lookup first, since it may hold a
pre-registered alias; we just no longer fail outright when the lookup
misses. See the sketch below.
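A minimal sketch of that fallback, not the actual router code; a plain
dict stands in for the Stack's routing table:

```python
def resolve(model_id: str, registry: dict[str, str]) -> tuple[str, str]:
    """Return (provider_id, provider_model_id) for a request's model."""
    # Try the registry first: it may hold a pre-registered alias.
    if model_id in registry:
        return registry[model_id], model_id

    # Registry miss: if the ID is fully qualified as provider_id/model_id,
    # peel off the provider prefix and let that provider decide.
    if "/" in model_id:
        provider_id, provider_model_id = model_id.split("/", maxsplit=1)
        return provider_id, provider_model_id

    raise ValueError(f"Model '{model_id}' is neither registered nor qualified")


# e.g. resolve("together/meta-llama/Llama-3.3-70B", {}) returns
#      ("together", "meta-llama/Llama-3.3-70B")
```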

Also updated the inference router so that responses carry the _exact_
model ID that the request used.
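Illustrative only (the `provider` object and attribute names are
assumptions): the echo amounts to overwriting whatever model the
provider reports with the ID from the request.

```python
async def chat_completion(provider, params):
    response = await provider.openai_chat_completion(params)
    # Echo the request's model verbatim, e.g. "together/llama-3.3-70b",
    # instead of the provider's own reported model name.
    response.model = params.model
    return response
```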

Added an integration test.

Closes #3929

---------

Co-authored-by: ehhuang <ehhuang@users.noreply.github.com>
2025-11-12 13:34:30 -08:00
__init__.py chore(cleanup)!: kill vector_db references as far as possible (#3864) 2025-10-20 20:06:16 -07:00
datasets.py refactor(logging): rename llama_stack logger categories (#3065) 2025-08-21 17:31:04 -07:00
eval_scoring.py refactor(logging): rename llama_stack logger categories (#3065) 2025-08-21 17:31:04 -07:00
inference.py fix(inference): enable routing of models with provider_data alone (#3928) 2025-11-12 13:34:30 -08:00
safety.py refactor(logging): rename llama_stack logger categories (#3065) 2025-08-21 17:31:04 -07:00
tool_runtime.py revert: "chore(cleanup)!: remove tool_runtime.rag_tool" (#3877) 2025-10-21 11:22:06 -07:00
vector_io.py chore(cleanup)!: kill vector_db references as far as possible (#3864) 2025-10-20 20:06:16 -07:00