llama-stack-mirror/llama_stack/templates
Rashmi Pawar ace82836c1
feat: NVIDIA allow non-llama model registration (#1859)
# What does this PR do?
Adds custom model registration functionality to `NVIDIAInferenceAdapter`,
which lets inference run against:
- post-training models
- non-Llama models in the NVIDIA API Catalog (behind
https://integrate.api.nvidia.com and endpoints compatible with
`AsyncOpenAI`)

## Example Usage:
```python
from llama_stack.apis.models import Model, ModelType
from llama_stack.distribution.library_client import LlamaStackAsLibraryClient

client = LlamaStackAsLibraryClient("nvidia")
_ = client.initialize()

# model_name is a placeholder: use the name of the post-training or
# API Catalog model you want to serve through the NVIDIA provider.
model_name = "my-custom-model"

client.models.register(
    model_id=model_name,
    model_type=ModelType.llm,
    provider_id="nvidia",
)

response = client.inference.chat_completion(
    model_id=model_name,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a limerick about the wonders of GPU computing."},
    ],
)
```
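
The same flow covers non-Llama models from the API Catalog. The sketch below is
illustrative only: the model name is an assumption (any model your API key can
reach behind https://integrate.api.nvidia.com should work), and reading the reply
via `completion_message.content` follows the current non-streaming
`chat_completion` response schema.

```python
# Assumed example model from the NVIDIA API Catalog; substitute your own.
catalog_model = "mistralai/mixtral-8x7b-instruct-v0.1"

client.models.register(
    model_id=catalog_model,
    model_type=ModelType.llm,
    provider_id="nvidia",
)

response = client.inference.chat_completion(
    model_id=catalog_model,
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
)
# Non-streaming responses carry the assistant's text on completion_message.
print(response.completion_message.content)
```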

## Test Plan
```bash
pytest tests/unit/providers/nvidia/test_supervised_fine_tuning.py 
========================================================== test session starts ===========================================================
platform linux -- Python 3.10.0, pytest-8.3.5, pluggy-1.5.0
rootdir: /home/ubuntu/llama-stack
configfile: pyproject.toml
plugins: anyio-4.9.0
collected 6 items                                                                                                                        

tests/unit/providers/nvidia/test_supervised_fine_tuning.py ......                                                                  [100%]

============================================================ warnings summary ============================================================
../miniconda/envs/nvidia-1/lib/python3.10/site-packages/pydantic/fields.py:1076
  /home/ubuntu/miniconda/envs/nvidia-1/lib/python3.10/site-packages/pydantic/fields.py:1076: PydanticDeprecatedSince20: Using extra keyword arguments on `Field` is deprecated and will be removed. Use `json_schema_extra` instead. (Extra keys: 'contentEncoding'). Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.11/migration/
    warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
====================================================== 6 passed, 1 warning in 1.51s ======================================================
```

[//]: # (## Documentation)
Updated Readme.md

cc: @dglogo, @sumitb, @mattf
2025-04-24 17:13:33 -07:00
..
bedrock chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
cerebras chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
ci-tests test: verification on provider's OAI endpoints (#1893) 2025-04-07 23:06:28 -07:00
dell chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
dev fix: 100% OpenAI API verification for together and fireworks (#1946) 2025-04-14 08:56:29 -07:00
experimental-post-training fix: fix experimental-post-training template (#1740) 2025-03-20 23:07:19 -07:00
fireworks test: verification on provider's OAI endpoints (#1893) 2025-04-07 23:06:28 -07:00
groq fix: 100% OpenAI API verification for together and fireworks (#1946) 2025-04-14 08:56:29 -07:00
hf-endpoint chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
hf-serverless chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
meta-reference-gpu feat: add batch inference API to llama stack inference (#1945) 2025-04-12 11:41:12 -07:00
nvidia feat: NVIDIA allow non-llama model registration (#1859) 2025-04-24 17:13:33 -07:00
ollama chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
open-benchmark chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
passthrough chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
remote-vllm docs: Add tips for debugging remote vLLM provider (#1992) 2025-04-18 14:47:47 +02:00
sambanova test: verification on provider's OAI endpoints (#1893) 2025-04-07 23:06:28 -07:00
tgi chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
together test: verification on provider's OAI endpoints (#1893) 2025-04-07 23:06:28 -07:00
verification fix: 100% OpenAI API verification for together and fireworks (#1946) 2025-04-14 08:56:29 -07:00
vllm-gpu chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
__init__.py Auto-generate distro yamls + docs (#468) 2024-11-18 14:57:06 -08:00
dependencies.json feat: Add NVIDIA Eval integration (#1890) 2025-04-24 17:12:42 -07:00
template.py feat(api): (1/n) datasets api clean up (#1573) 2025-03-17 16:55:45 -07:00