forked from phoenix-oss/llama-stack-mirror
# What does this PR do? Adds custom model registration functionality to NVIDIAInferenceAdapter which let's the inference happen on: - post-training model - non-llama models in API Catalogue(behind https://integrate.api.nvidia.com and endpoints compatible with AyncOpenAI) ## Example Usage: ```python from llama_stack.apis.models import Model, ModelType from llama_stack.distribution.library_client import LlamaStackAsLibraryClient client = LlamaStackAsLibraryClient("nvidia") _ = client.initialize() client.models.register( model_id=model_name, model_type=ModelType.llm, provider_id="nvidia" ) response = client.inference.chat_completion( model_id=model_name, messages=[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Write a limerick about the wonders of GPU computing."}], ) ``` ## Test Plan ```bash pytest tests/unit/providers/nvidia/test_supervised_fine_tuning.py ========================================================== test session starts =========================================================== platform linux -- Python 3.10.0, pytest-8.3.5, pluggy-1.5.0 rootdir: /home/ubuntu/llama-stack configfile: pyproject.toml plugins: anyio-4.9.0 collected 6 items tests/unit/providers/nvidia/test_supervised_fine_tuning.py ...... [100%] ============================================================ warnings summary ============================================================ ../miniconda/envs/nvidia-1/lib/python3.10/site-packages/pydantic/fields.py:1076 /home/ubuntu/miniconda/envs/nvidia-1/lib/python3.10/site-packages/pydantic/fields.py:1076: PydanticDeprecatedSince20: Using extra keyword arguments on `Field` is deprecated and will be removed. Use `json_schema_extra` instead. (Extra keys: 'contentEncoding'). Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.11/migration/ warn( -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ====================================================== 6 passed, 1 warning in 1.51s ====================================================== ``` [//]: # (## Documentation) Updated Readme.md cc: @dglogo, @sumitb, @mattf |
||
---|---|---|
.. | ||
_static | ||
notebooks | ||
openapi_generator | ||
resources | ||
source | ||
zero_to_hero_guide | ||
conftest.py | ||
contbuild.sh | ||
dog.jpg | ||
getting_started.ipynb | ||
getting_started_llama4.ipynb | ||
license_header.txt | ||
make.bat | ||
Makefile | ||
readme.md | ||
requirements.txt |
Llama Stack Documentation
Here's a collection of comprehensive guides, examples, and resources for building AI applications with Llama Stack. For the complete documentation, visit our ReadTheDocs page.
Render locally
pip install -r requirements.txt
cd docs
python -m sphinx_autobuild source _build
You can open up the docs in your browser at http://localhost:8000
Content
Try out Llama Stack's capabilities through our detailed Jupyter notebooks:
- Building AI Applications Notebook - A comprehensive guide to building production-ready AI applications using Llama Stack
- Benchmark Evaluations Notebook - Detailed performance evaluations and benchmarking results
- Zero-to-Hero Guide - Step-by-step guide for getting started with Llama Stack