llama-stack-mirror/llama_stack/providers/remote/inference/nvidia
Matthew Farrellee 9845631d51
feat: update nvidia inference provider to use model_store (#1988)
# What does this PR do?

The NVIDIA inference provider was using `ModelRegistryHelper` to map input model IDs to provider model IDs. This updates it to use the `model_store` instead.
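For context, roughly what the change amounts to: instead of consulting a static alias map at request time, the adapter asks the stack's model store for the registered model and uses its provider-side ID. Below is a minimal sketch of that lookup path; `Model`, `ModelStore`, and `_get_provider_model_id` are illustrative stand-ins, not exact copies of the llama-stack types touched by this PR.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Model:
    """Stand-in for the stack's model resource (illustrative)."""
    identifier: str            # the id clients pass in, e.g. "meta-llama/Llama-3.1-70B-Instruct"
    provider_resource_id: str  # the id the NVIDIA NIM endpoint serves it under


class ModelStore(Protocol):
    """Assumed shape of the stack's model store: an async lookup by identifier."""
    async def get_model(self, identifier: str) -> Model: ...


class NVIDIAInferenceAdapter:
    # The stack wires in the shared model store at startup (assumption).
    model_store: ModelStore

    async def _get_provider_model_id(self, model_id: str) -> str:
        # Previously this went through ModelRegistryHelper's static alias map;
        # resolving via the model store lets dynamically registered models
        # resolve the same way as built-in ones.
        model = await self.model_store.get_model(model_id)
        return model.provider_resource_id
```

One practical consequence: models registered with the stack at runtime resolve through the same path as statically configured ones, rather than needing an entry in the provider's hard-coded mapping.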

## Test Plan

```bash
LLAMA_STACK_CONFIG=http://localhost:8321 uv run pytest -v \
  tests/integration/inference/{test_embedding.py,test_text_inference.py,test_openai_completion.py} \
  --embedding-model nvidia/llama-3.2-nv-embedqa-1b-v2 \
  --text-model=meta-llama/Llama-3.1-70B-Instruct
```
2025-04-18 10:16:43 +02:00
| File | Last commit | Date |
|------|-------------|------|
| __init__.py | add NVIDIA NIM inference adapter (#355) | 2024-11-23 15:59:00 -08:00 |
| config.py | chore: move all Llama Stack types from llama-models to llama-stack (#1098) | 2025-02-14 09:10:59 -08:00 |
| models.py | chore: add meta/llama-3.3-70b-instruct as supported nvidia inference provider model (#1985) | 2025-04-17 06:50:40 -07:00 |
| NVIDIA.md | docs: Add NVIDIA platform distro docs (#1971) | 2025-04-17 05:54:30 -07:00 |
| nvidia.py | feat: update nvidia inference provider to use model_store (#1988) | 2025-04-18 10:16:43 +02:00 |
| openai_utils.py | refactor: move all llama code to models/llama out of meta reference (#1887) | 2025-04-07 15:03:58 -07:00 |
| utils.py | style: remove prints in codebase (#1146) | 2025-02-18 19:41:37 -08:00 |