Mirror of https://github.com/meta-llama/llama-stack.git, synced 2025-12-03 18:00:36 +00:00
# What does this PR do?

Add rerank API for NVIDIA Inference Provider.

Closes #3278

## Test Plan

Unit test:

```
pytest tests/unit/providers/nvidia/test_rerank_inference.py
```

Integration test:

```
pytest -s -v tests/integration/inference/test_rerank.py --stack-config="inference=nvidia" --rerank-model=nvidia/nvidia/nv-rerankqa-mistral-4b-v3 --env NVIDIA_API_KEY="" --env NVIDIA_BASE_URL="https://integrate.api.nvidia.com"
```

| | | |
|---|---|---|
| agents | ||
| batches | ||
| datasetio | ||
| eval | ||
| external | ||
| files | ||
| inference | ||
| post_training | ||
| safety | ||
| scoring | ||
| tool_runtime | ||
| vector_io | ||
| index.mdx | ||
| openai.mdx | ||
| openai_responses_limitations.mdx | ||