add NVIDIA NIM inference adapter
This commit is contained in:
parent
ac93dd89cf
commit
2dd8c4bcb6
12 changed files with 1115 additions and 0 deletions
26 tests/nvidia/README.md Normal file
@@ -0,0 +1,26 @@

# NVIDIA tests
## Running tests
**Install the required dependencies:**
```bash
pip install pytest pytest-asyncio pytest-httpx
```
There are three modes for testing:
1. Unit tests - this mode checks the provider's functionality and requires neither a network connection nor a running distribution. A sketch of one such test follows the command below.
```bash
pytest tests/nvidia/unit
```
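
For reference, here is a minimal sketch of what a unit test can look like with `pytest-httpx`, which intercepts outgoing `httpx` requests so no network is needed. The endpoint URL, model id, and response shape are illustrative assumptions, and the bare `httpx` client below is a stand-in for the adapter's real API, which is not shown here:

```python
import httpx
import pytest


@pytest.mark.asyncio
async def test_chat_completion_request(httpx_mock):
    # pytest-httpx's httpx_mock fixture answers the request locally,
    # so this test never touches the network.
    httpx_mock.add_response(
        url="https://integrate.api.nvidia.com/v1/chat/completions",
        json={"choices": [{"message": {"role": "assistant", "content": "Hello!"}}]},
    )

    # The real unit tests would drive the NVIDIA adapter; a bare
    # httpx client stands in for it in this sketch.
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://integrate.api.nvidia.com/v1/chat/completions",
            json={
                "model": "meta/llama-3.1-8b-instruct",
                "messages": [{"role": "user", "content": "Hi"}],
            },
        )

    assert response.status_code == 200
    assert response.json()["choices"][0]["message"]["content"] == "Hello!"
```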
2. Integration tests against hosted preview APIs - this mode checks the provider functionality against a live system and requires an API key. Get an API key by (1) going to https://build.nvidia.com, (2) selecting a Llama model, e.g. https://build.nvidia.com/meta/llama-3_1-8b-instruct, and (3) clicking "Get API Key". Store the API key in the `NVIDIA_API_KEY` environment variable.
```bash
export NVIDIA_API_KEY=...
pytest tests/nvidia/integration --base-url https://integrate.api.nvidia.com
```
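
If the integration suite fails with authentication errors, a quick way to sanity-check the key outside pytest is a single request to the hosted API. This is a sketch assuming the OpenAI-compatible `/v1/chat/completions` route and the `meta/llama-3.1-8b-instruct` model id shown on build.nvidia.com:

```python
# Verify NVIDIA_API_KEY against the hosted preview API before running
# the integration tests.
import os

import httpx

response = httpx.post(
    "https://integrate.api.nvidia.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['NVIDIA_API_KEY']}"},
    json={
        "model": "meta/llama-3.1-8b-instruct",
        "messages": [{"role": "user", "content": "Say hello."}],
        "max_tokens": 16,
    },
    timeout=30.0,
)
response.raise_for_status()  # raises on 401/403 if the key is bad
print(response.json()["choices"][0]["message"]["content"])
```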
3. Integration tests against a running distribution - this mode checks the provider functionality in the context of a running distribution. This involves running a local NIM (see https://build.nvidia.com/meta/llama-3_1-8b-instruct?snippet_tab=Docker) and creating & configuring a distribution to use it. Details to come.
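
Until those details land, here is a rough sketch of waiting for a local NIM to come up before pointing a distribution (or the tests) at it. It assumes the container's default port 8000 and a `/v1/health/ready` route, both taken from NVIDIA's NIM documentation rather than this repository:

```python
# Poll a locally running NIM until it reports ready (or give up after ~5 min).
import time

import httpx

BASE_URL = "http://localhost:8000"  # assumed default NIM port

for _ in range(60):
    try:
        if httpx.get(f"{BASE_URL}/v1/health/ready", timeout=5.0).status_code == 200:
            print("NIM is ready")
            break
    except httpx.TransportError:
        pass  # container still starting up; connection refused is expected
    time.sleep(5)
else:
    raise RuntimeError(f"NIM at {BASE_URL} did not become ready")
```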