llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-03 18:00:36 +00:00

Jiayi Ni fa7699d2c3 feat: Add rerank API for NVIDIA Inference Provider (#3329)	2025-10-30 21:42:09 -07:00

# What does this PR do?

Add rerank API for NVIDIA Inference Provider.

Closes #3278

## Test Plan

Unit test:
```
pytest tests/unit/providers/nvidia/test_rerank_inference.py
```

Integration test:
```
pytest -s -v tests/integration/inference/test_rerank.py --stack-config="inference=nvidia" --rerank-model=nvidia/nvidia/nv-rerankqa-mistral-4b-v3 --env NVIDIA_API_KEY="" --env NVIDIA_BASE_URL="https://integrate.api.nvidia.com"
```

advanced_apis	chore: update doc (#3857)	2025-10-20 10:33:21 -07:00
building_applications	chore: update docs for telemetry api removal (#3900)	2025-10-24 13:57:28 -07:00
concepts	chore: update docs for telemetry api removal (#3900)	2025-10-24 13:57:28 -07:00
contributing	feat: Add static file import system for docs (#3882)	2025-10-24 14:01:33 -04:00
deploying	chore: use uvicorn to start llama stack server everywhere (#3625)	2025-10-06 14:27:40 +02:00
distributions	docs: add documentation on how to use custom run yaml in docker (#3949)	2025-10-28 16:05:44 -07:00
getting_started	feat: Add static file import system for docs (#3882)	2025-10-24 14:01:33 -04:00
providers	feat: Add rerank API for NVIDIA Inference Provider (#3329)	2025-10-30 21:42:09 -07:00
references	chore: update docs for telemetry api removal (#3900)	2025-10-24 13:57:28 -07:00
api-overview.md	docs: api separation (#3630)	2025-10-01 10:13:31 -07:00
index.mdx	chore: update docs for telemetry api removal (#3900)	2025-10-24 13:57:28 -07:00