llama-stack-mirror/llama_stack/providers/remote
Matthew Farrellee 477bcd4d09
feat: allow dynamic model registration for nvidia inference provider (#2726)
# What does this PR do?

Lets users register models available at
https://integrate.api.nvidia.com/v1/models that aren't already listed in
llama_stack/providers/remote/inference/nvidia/models.py

## Test Plan

1. run the nvidia distro
2. register a model from https://integrate.api.nvidia.com/v1/models that
isn't already known; as of this writing,
nvidia/llama-3.1-nemotron-ultra-253b-v1 is a good example
3. perform inference w/ the model
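
Step 2 of the test plan can be sketched as a register-model request. This is a minimal sketch, not part of the PR: it assumes a locally running stack server on the default port, and that the register endpoint is `POST /v1/models` with `model_id`, `provider_id`, and `model_type` fields; check your deployment's client docs for the exact API.

```python
import json

# Hypothetical registration payload for a model served by
# https://integrate.api.nvidia.com/v1/models but not present in the
# provider's static models.py list.
payload = {
    "model_id": "nvidia/llama-3.1-nemotron-ultra-253b-v1",
    "provider_id": "nvidia",   # route the model to the nvidia inference provider
    "model_type": "llm",
}

print(json.dumps(payload))
# Assuming a local server on the default port, registration would look like:
#   curl -X POST http://localhost:8321/v1/models \
#        -H "Content-Type: application/json" \
#        -d '<the JSON printed above>'
```

After registration, inference requests that name `nvidia/llama-3.1-nemotron-ultra-253b-v1` should resolve through the nvidia provider (step 3 of the test plan).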
2025-07-17 12:11:30 -07:00
| Directory | Last commit | Date |
| --- | --- | --- |
| agents | test: add unit test to ensure all config types are instantiable (#1601) | 2025-03-12 22:29:58 -07:00 |
| datasetio | fix: allow default empty vars for conditionals (#2570) | 2025-07-01 14:42:05 +02:00 |
| eval | refactor(env)!: enhanced environment variable substitution (#2490) | 2025-06-26 08:20:08 +05:30 |
| inference | feat: allow dynamic model registration for nvidia inference provider (#2726) | 2025-07-17 12:11:30 -07:00 |
| post_training | fix: allow default empty vars for conditionals (#2570) | 2025-07-01 14:42:05 +02:00 |
| safety | fix: sambanova shields and model validation (#2693) | 2025-07-11 16:29:15 -04:00 |
| tool_runtime | fix: allow default empty vars for conditionals (#2570) | 2025-07-01 14:42:05 +02:00 |
| vector_io | chore: Adding OpenAI Vector Stores Files API compatibility for PGVector (#2755) | 2025-07-15 15:46:49 -04:00 |
| __init__.py | impls -> inline, adapters -> remote (#381) | 2024-11-06 14:54:05 -08:00 |