llama-stack-mirror/llama_stack
Matthew Farrellee 477bcd4d09
feat: allow dynamic model registration for nvidia inference provider (#2726)
# What does this PR do?

Lets users register models that are available at
https://integrate.api.nvidia.com/v1/models but are not already listed in
llama_stack/providers/remote/inference/nvidia/models.py

## Test Plan

1. run the nvidia distro
2. register a model from https://integrate.api.nvidia.com/v1/models that
isn't already known; as of this writing,
nvidia/llama-3.1-nemotron-ultra-253b-v1 is a good example
3. perform inference w/ the model
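
The registration rule described above can be sketched roughly as follows. This is a minimal illustration, not the provider's actual code: `resolve_provider_model_id`, `STATIC_MODEL_IDS`, `fake_remote_listing`, and the `example/already-known-model` entry are all hypothetical names, and the remote listing is stubbed rather than fetched from https://integrate.api.nvidia.com/v1/models.

```python
from collections.abc import Callable, Iterable

# Hypothetical stand-in for the static entries in models.py.
STATIC_MODEL_IDS = {"example/already-known-model"}


def resolve_provider_model_id(
    model_id: str,
    static_ids: Iterable[str],
    list_remote_ids: Callable[[], Iterable[str]],
) -> str:
    """Return model_id if it may be registered, otherwise raise ValueError.

    The static table is consulted first; the remote listing is only
    requested when the model is not already known.
    """
    if model_id in set(static_ids):
        return model_id
    if model_id in set(list_remote_ids()):
        return model_id
    raise ValueError(f"model {model_id!r} is not available from the provider")


def fake_remote_listing() -> list[str]:
    # Stub for the provider's model listing (assumption: a list of model ids).
    return [
        "example/already-known-model",
        "nvidia/llama-3.1-nemotron-ultra-253b-v1",
    ]


# The model from the test plan is not in the static table, so it is
# accepted only because the (stubbed) remote listing contains it.
registered = resolve_provider_model_id(
    "nvidia/llama-3.1-nemotron-ultra-253b-v1",
    STATIC_MODEL_IDS,
    fake_remote_listing,
)
```

Checking the static table before the remote listing keeps registration of already-known models free of network calls; whether the real provider orders the checks this way is an assumption.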
2025-07-17 12:11:30 -07:00

| Name | Last commit | Date |
| --- | --- | --- |
| apis | chore: internal change, make Model.provider_model_id non-optional (#2690) | 2025-07-17 08:26:57 -07:00 |
| cli | chore(api): add mypy coverage to cli/stack (#2650) | 2025-07-10 16:53:38 +02:00 |
| distribution | fix: de-clutter llama stack run logs (#2783) | 2025-07-16 09:44:26 -07:00 |
| models | chore(api): add mypy coverage to prompts (#2657) | 2025-07-09 10:07:00 +02:00 |
| providers | feat: allow dynamic model registration for nvidia inference provider (#2726) | 2025-07-17 12:11:30 -07:00 |
| strong_typing | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| templates | fix: Move sentence-transformers to the top (#2703) | 2025-07-17 10:31:30 -04:00 |
| ui | fix: re-hydrate requirement and fix package (#2774) | 2025-07-16 05:46:15 -04:00 |
| \_\_init\_\_.py | export LibraryClient | 2024-12-13 12:08:00 -08:00 |
| env.py | refactor(test): move tools, evals, datasetio, scoring and post training tests (#1401) | 2025-03-04 14:53:47 -08:00 |
| log.py | chore: remove nested imports (#2515) | 2025-06-26 08:01:05 +05:30 |
| schema_utils.py | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |