llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-03 09:53:45 +00:00

Jiayi Ni fa7699d2c3 feat: Add rerank API for NVIDIA Inference Provider (#3329)	2025-10-30 21:42:09 -07:00

# What does this PR do?
Add rerank API for NVIDIA Inference Provider.

Closes #3278

## Test Plan
Unit test:
```
pytest tests/unit/providers/nvidia/test_rerank_inference.py
```
Integration test:
```
pytest -s -v tests/integration/inference/test_rerank.py --stack-config="inference=nvidia" --rerank-model=nvidia/nvidia/nv-rerankqa-mistral-4b-v3 --env NVIDIA_API_KEY="" --env NVIDIA_BASE_URL="https://integrate.api.nvidia.com"
```
agent	chore(mypy): part-04 resolve mypy errors in meta_reference agents (#3969)	2025-10-29 13:37:28 -07:00
agents	fix!: Enhance response API support to not fail with tool calling (#3385)	2025-10-27 09:33:02 -07:00
batches	feat(stores)!: use backend storage references instead of configs (#3697)	2025-10-20 13:20:09 -07:00
files	feat(stores)!: use backend storage references instead of configs (#3697)	2025-10-20 13:20:09 -07:00
inference	feat: add provider data keys for Cerebras, Databricks, NVIDIA, and RunPod (#3734)	2025-10-27 13:09:35 -07:00
inline	feat: Add responses and safety impl extra_body (#3781)	2025-10-15 15:01:37 -07:00
nvidia	feat: Add rerank API for NVIDIA Inference Provider (#3329)	2025-10-30 21:42:09 -07:00
utils	feat: Add rerank models and rerank API change (#3831)	2025-10-22 12:02:28 -07:00
vector_io	fix!: remove chunk_id property from Chunk class (#3954)	2025-10-29 18:59:59 -07:00
test_bedrock.py	fix: AWS Bedrock inference profile ID conversion for region-specific endpoints (#3386)	2025-09-11 11:41:53 +02:00
test_configs.py	chore(rename): move llama_stack.distribution to llama_stack.core (#2975)	2025-07-30 23:30:53 -07:00