Mirror of https://github.com/meta-llama/llama-stack.git, synced 2025-12-03 18:00:36 +00:00
# What does this PR do?

Add rerank API for NVIDIA Inference Provider.

Closes #3278

## Test Plan

Unit test:

```
pytest tests/unit/providers/nvidia/test_rerank_inference.py
```

Integration test:

```
pytest -s -v tests/integration/inference/test_rerank.py --stack-config="inference=nvidia" --rerank-model=nvidia/nvidia/nv-rerankqa-mistral-4b-v3 --env NVIDIA_API_KEY="" --env NVIDIA_BASE_URL="https://integrate.api.nvidia.com"
```
| Name |
|---|
| recordings |
| __init__.py |
| dog.png |
| test_openai_completion.py |
| test_openai_embeddings.py |
| test_openai_vision_inference.py |
| test_provider_data_routing.py |
| test_rerank.py |
| test_tools_with_schemas.py |
| test_vision_inference.py |
| vision_test_1.jpg |
| vision_test_2.jpg |
| vision_test_3.jpg |