llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-10-06 12:37:33 +00:00

Matthew Farrellee 477bcd4d09 feat: allow dynamic model registration for nvidia inference provider (#2726)	2025-07-17 12:11:30 -07:00

# What does this PR do?

Lets users register models that are available at https://integrate.api.nvidia.com/v1/models but aren't already in llama_stack/providers/remote/inference/nvidia/models.py.

## Test Plan

1. Run the nvidia distro.
2. Register a model from https://integrate.api.nvidia.com/v1/models that isn't already known; as of this writing, nvidia/llama-3.1-nemotron-ultra-253b-v1 is a good example.
3. Perform inference with the model.
..
__init__.py	add NVIDIA NIM inference adapter (#355)	2024-11-23 15:59:00 -08:00
config.py	fix: allow default empty vars for conditionals (#2570)	2025-07-01 14:42:05 +02:00
models.py	ci: test safety with starter (#2628)	2025-07-09 16:53:50 +02:00
NVIDIA.md	docs: Add NVIDIA platform distro docs (#1971)	2025-04-17 05:54:30 -07:00
nvidia.py	feat: allow dynamic model registration for nvidia inference provider (#2726)	2025-07-17 12:11:30 -07:00
openai_utils.py	chore: enable pyupgrade fixes (#1806)	2025-05-01 14:23:50 -07:00
utils.py	chore: enable pyupgrade fixes (#1806)	2025-05-01 14:23:50 -07:00
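
Step 2 of the test plan in the latest commit asks for a model that the endpoint serves but that is not yet in the static table in models.py. The sketch below shows one way to find such candidates. It is not the provider's code: the function names, the sample IDs in `KNOWN_IDS`, and the sample listing are all hypothetical, and it assumes the endpoint returns an OpenAI-style listing of the form `{"data": [{"id": ...}, ...]}`.

```python
import json
import urllib.request

# Hypothetical stand-in for the model IDs already listed in models.py.
KNOWN_IDS = {"meta/llama-3.1-8b-instruct"}

# Hypothetical sample of what the listing might look like (assumed OpenAI-style shape).
SAMPLE_LISTING = {
    "object": "list",
    "data": [
        {"id": "meta/llama-3.1-8b-instruct"},
        {"id": "nvidia/llama-3.1-nemotron-ultra-253b-v1"},
    ],
}


def fetch_model_listing(url="https://integrate.api.nvidia.com/v1/models", api_key=None):
    """Fetch and parse the model listing. Whether an API key is required is an assumption."""
    request = urllib.request.Request(url)
    if api_key:
        request.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def unregistered_models(listing, known_ids):
    """Return, sorted, the model IDs in the listing that are not in known_ids."""
    available = {entry["id"] for entry in listing.get("data", [])}
    return sorted(available - set(known_ids))


if __name__ == "__main__":
    # Uses the sample data so the script runs offline; swap in
    # fetch_model_listing() to query the live endpoint.
    for model_id in unregistered_models(SAMPLE_LISTING, KNOWN_IDS):
        print(model_id)
```

Any ID this reports is a candidate for step 2; registering it and running inference still go through the running nvidia distro.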