llama-stack-mirror/llama_stack/templates
Rashmi Pawar ace82836c1
feat: NVIDIA allow non-llama model registration (#1859)
# What does this PR do?
Adds custom model registration functionality to `NVIDIAInferenceAdapter`,
which lets inference run against:
- post-training models
- non-Llama models in the NVIDIA API Catalog (behind
https://integrate.api.nvidia.com and endpoints compatible with
`AsyncOpenAI`)

## Example Usage:
```python
from llama_stack.apis.models import Model, ModelType
from llama_stack.distribution.library_client import LlamaStackAsLibraryClient

client = LlamaStackAsLibraryClient("nvidia")
_ = client.initialize()

# model_name is a placeholder: use the name of the post-training or
# API Catalog model you want to serve through the NVIDIA provider.
model_name = "my-custom-model"

client.models.register(
    model_id=model_name,
    model_type=ModelType.llm,
    provider_id="nvidia",
)

response = client.inference.chat_completion(
    model_id=model_name,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a limerick about the wonders of GPU computing."},
    ],
)
```
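
The same flow covers non-Llama models from the API Catalog. The sketch below is
illustrative only: the model name is an assumption (any model your API key can
reach behind https://integrate.api.nvidia.com should work), and reading the reply
via `completion_message.content` follows the current non-streaming
`chat_completion` response schema.

```python
# Assumed example model from the NVIDIA API Catalog; substitute your own.
catalog_model = "mistralai/mixtral-8x7b-instruct-v0.1"

client.models.register(
    model_id=catalog_model,
    model_type=ModelType.llm,
    provider_id="nvidia",
)

response = client.inference.chat_completion(
    model_id=catalog_model,
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
)
# Non-streaming responses carry the assistant's text on completion_message.
print(response.completion_message.content)
```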

## Test Plan
```bash
pytest tests/unit/providers/nvidia/test_supervised_fine_tuning.py 
========================================================== test session starts ===========================================================
platform linux -- Python 3.10.0, pytest-8.3.5, pluggy-1.5.0
rootdir: /home/ubuntu/llama-stack
configfile: pyproject.toml
plugins: anyio-4.9.0
collected 6 items                                                                                                                        

tests/unit/providers/nvidia/test_supervised_fine_tuning.py ......                                                                  [100%]

============================================================ warnings summary ============================================================
../miniconda/envs/nvidia-1/lib/python3.10/site-packages/pydantic/fields.py:1076
  /home/ubuntu/miniconda/envs/nvidia-1/lib/python3.10/site-packages/pydantic/fields.py:1076: PydanticDeprecatedSince20: Using extra keyword arguments on `Field` is deprecated and will be removed. Use `json_schema_extra` instead. (Extra keys: 'contentEncoding'). Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.11/migration/
    warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
====================================================== 6 passed, 1 warning in 1.51s ======================================================
```

[//]: # (## Documentation)
Updated Readme.md

cc: @dglogo, @sumitb, @mattf
2025-04-24 17:13:33 -07:00
..
bedrock chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
cerebras chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
ci-tests test: verification on provider's OAI endpoints (#1893) 2025-04-07 23:06:28 -07:00
dell chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
dev fix: 100% OpenAI API verification for together and fireworks (#1946) 2025-04-14 08:56:29 -07:00
experimental-post-training fix: fix experimental-post-training template (#1740) 2025-03-20 23:07:19 -07:00
fireworks test: verification on provider's OAI endpoints (#1893) 2025-04-07 23:06:28 -07:00
groq fix: 100% OpenAI API verification for together and fireworks (#1946) 2025-04-14 08:56:29 -07:00
hf-endpoint chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
hf-serverless chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
meta-reference-gpu feat: add batch inference API to llama stack inference (#1945) 2025-04-12 11:41:12 -07:00
nvidia feat: NVIDIA allow non-llama model registration (#1859) 2025-04-24 17:13:33 -07:00
ollama chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
open-benchmark chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
passthrough chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
remote-vllm docs: Add tips for debugging remote vLLM provider (#1992) 2025-04-18 14:47:47 +02:00
sambanova test: verification on provider's OAI endpoints (#1893) 2025-04-07 23:06:28 -07:00
tgi chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
together test: verification on provider's OAI endpoints (#1893) 2025-04-07 23:06:28 -07:00
verification fix: 100% OpenAI API verification for together and fireworks (#1946) 2025-04-14 08:56:29 -07:00
vllm-gpu chore: Revert "chore(telemetry): remove service_name entirely" (#1785) 2025-03-25 14:42:05 -07:00
__init__.py Auto-generate distro yamls + docs (#468) 2024-11-18 14:57:06 -08:00
dependencies.json feat: Add NVIDIA Eval integration (#1890) 2025-04-24 17:12:42 -07:00
template.py feat(api): (1/n) datasets api clean up (#1573) 2025-03-17 16:55:45 -07:00