llama-stack-mirror/llama_stack/core
Ashwin Bharambe 84ffcb8c2b fix(inference): enable routing of models with provider_data alone (#3928)
This PR enables routing of fully qualified model IDs of the form
`provider_id/model_id` even when the models are not registered with the
Stack.

Here's the situation: assume a remote inference provider that works
only when users supply their own API keys via the
`X-LlamaStack-Provider-Data` header. By definition, we cannot list its
models and hence cannot populate our routing registry. But because we
now _require_ a provider ID in model IDs, we can identify which
provider to route to and let that provider decide.

Note that we still consult our registry first, since it may contain a
pre-registered alias; we just no longer fail outright when the lookup
misses.
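
In effect, the router's resolution now degrades gracefully. A minimal
sketch of that logic, assuming a simple alias-to-provider registry
(the names `resolve_provider` and `registry` are illustrative, not the
actual implementation):

```python
def resolve_provider(model_id: str, registry: dict[str, str]) -> tuple[str, str]:
    """Resolve a request's model ID to (provider_id, model ID to forward).

    First consult the registry, which may hold a pre-registered alias;
    on a miss, fall back to parsing a fully qualified
    `provider_id/model_id` and let that provider decide.
    """
    if model_id in registry:
        return registry[model_id], model_id

    if "/" in model_id:
        # Split only on the first slash: the model ID itself may contain more.
        provider_id, provider_model_id = model_id.split("/", maxsplit=1)
        return provider_id, provider_model_id

    raise ValueError(
        f"'{model_id}' is neither registered nor fully qualified as provider_id/model_id"
    )
```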

Also updated the inference router so that responses carry the _exact_
model ID that the request used.
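
So a caller can now hit an unregistered but fully qualified model, as
long as it supplies its own credentials per request. A hypothetical
example (the endpoint path, the `together` provider, the
`together_api_key` field, and the model name are placeholders; the
exact values depend on your deployment and provider):

```python
import json
import requests

resp = requests.post(
    # Assumed local Stack endpoint; adjust to your deployment.
    "http://localhost:8321/v1/openai/v1/chat/completions",
    headers={
        # Per-request credentials for the target provider.
        "X-LlamaStack-Provider-Data": json.dumps({"together_api_key": "sk-..."}),
    },
    json={
        # Fully qualified provider_id/model_id; no prior registration needed.
        "model": "together/meta-llama/Llama-3.3-70B-Instruct-Turbo",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
print(resp.json()["model"])  # echoes the exact model ID from the request
```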

Added an integration test.

Closes #3929

---------

Co-authored-by: ehhuang <ehhuang@users.noreply.github.com>
2025-11-12 13:34:30 -08:00
access_control chore(cleanup)!: kill vector_db references as far as possible (#3864) 2025-10-20 20:06:16 -07:00
conversations feat(stores)!: use backend storage references instead of configs (#3697) 2025-10-20 13:20:09 -07:00
prompts feat(stores)!: use backend storage references instead of configs (#3697) 2025-10-20 13:20:09 -07:00
routers fix(inference): enable routing of models with provider_data alone (#3928) 2025-11-12 13:34:30 -08:00
routing_tables chore(cleanup)!: kill vector_db references as far as possible (#3864) 2025-10-20 20:06:16 -07:00
server revert: "chore(cleanup)!: remove tool_runtime.rag_tool" (#3877) 2025-10-21 11:22:06 -07:00
storage feat(stores)!: use backend storage references instead of configs (#3697) 2025-10-20 13:20:09 -07:00
store feat(stores)!: use backend storage references instead of configs (#3697) 2025-10-20 13:20:09 -07:00
ui chore(cleanup)!: kill vector_db references as far as possible (#3864) 2025-10-20 20:06:16 -07:00
utils feat(cherry-pick): fixes for 0.3.1 release (#3998) 2025-10-30 21:51:42 -07:00
__init__.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
build.py feat(distro): no huggingface provider for starter (#3258) 2025-08-26 14:06:36 -07:00
client.py feat: introduce API leveling, post_training, eval to v1alpha (#3449) 2025-09-26 16:18:07 +02:00
common.sh refactor: remove Conda support from Llama Stack (#2969) 2025-08-02 15:52:59 -07:00
configure.py chore(release-0.3.x): handle missing external_providers_dir (#4011) 2025-10-31 12:55:34 -07:00
datatypes.py feat: support workers in run config (#4014) 2025-10-31 13:48:55 -07:00
distribution.py chore(cleanup)!: kill vector_db references as far as possible (#3864) 2025-10-20 20:06:16 -07:00
external.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
id_generation.py feat(tests): make inference_recorder into api_recorder (include tool_invoke) (#3403) 2025-10-09 14:27:51 -07:00
inspect.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
library_client.py fix(logging): move module-level initialization to explicit setup calls (#3874) 2025-10-21 11:08:25 -07:00
providers.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
request_headers.py chore(pre-commit): add pre-commit hook to enforce llama_stack logger usage (#3061) 2025-08-20 07:15:35 -04:00
resolver.py chore(cleanup)!: kill vector_db references as far as possible (#3864) 2025-10-20 20:06:16 -07:00
stack.py revert: "chore(cleanup)!: remove tool_runtime.rag_tool" (#3877) 2025-10-21 11:22:06 -07:00
start_stack.sh chore!: remove --env from llama stack run (#3711) 2025-10-07 20:58:15 -07:00
testing_context.py feat(ci): add support for docker:distro in tests (#3832) 2025-10-16 19:33:13 -07:00