mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-10-26 09:15:40 +00:00)

	feat: Add rerank models and rerank API change (#3831)
# What does this PR do?

- Extend the model type to include rerank models.
- Implement `rerank()` method in inference router.
- Add `rerank_model_list` to `OpenAIMixin` to enable providers to register and identify rerank models.
- Update documentation.

## Test Plan

```
pytest tests/unit/providers/utils/inference/test_openai_mixin.py
```
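
To illustrate the `rerank_model_list` idea from the summary above, here is a minimal sketch of how a provider mixin could use such a list to tag models by type. This is not the `OpenAIMixin` code from this PR: `ModelType`, `SketchMixin`, `classify`, `embedding_model_list`, `ExampleProvider`, and the model IDs are all hypothetical names chosen for the example.

```python
from enum import Enum
from typing import List


class ModelType(str, Enum):
    # Hypothetical stand-in for the stack's model type enum,
    # extended with a third "rerank" member.
    llm = "llm"
    embedding = "embedding"
    rerank = "rerank"


class SketchMixin:
    # Assumption: providers override these lists with the model IDs they serve.
    embedding_model_list: List[str] = []
    rerank_model_list: List[str] = []

    def classify(self, model_id: str) -> ModelType:
        """Return the model type, defaulting to LLM for unlisted IDs."""
        if model_id in self.rerank_model_list:
            return ModelType.rerank
        if model_id in self.embedding_model_list:
            return ModelType.embedding
        return ModelType.llm


class ExampleProvider(SketchMixin):
    # Made-up model IDs, for illustration only.
    embedding_model_list = ["example-embed-v1"]
    rerank_model_list = ["example-rerank-v1"]
```

With this shape, a provider only declares which IDs are rerank models, and the shared mixin logic decides the type at registration time.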
parent f2598d30e6
commit bb1ebb3c6b

12 changed files with 186 additions and 43 deletions

docs/static/deprecated-llama-stack-spec.yaml (vendored, 7 lines changed)
@@ -10218,13 +10218,16 @@ tags:
       embeddings.
 
 
-      This API provides the raw interface to the underlying models. Two kinds of models
-      are supported:
+      This API provides the raw interface to the underlying models. Three kinds of
+      models are supported:
 
       - LLM models: these models generate "raw" and "chat" (conversational) completions.
 
       - Embedding models: these models generate embeddings to be used for semantic
       search.
+
+      - Rerank models: these models reorder the documents based on their relevance
+      to a query.
     x-displayName: Inference
   - name: Models
     description: ''
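
To make the "Rerank models" description in the hunk above concrete, here is a toy sketch of reordering documents by relevance to a query. The score is plain word overlap, which only stands in for a real rerank model. `rerank`, `RerankResult`, and `max_num_results` are names chosen for illustration and are not claimed to match the signatures added in this PR.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RerankResult:
    index: int              # position of the document in the input list
    relevance_score: float  # higher means more relevant


def rerank(
    query: str,
    documents: List[str],
    max_num_results: Optional[int] = None,
) -> List[RerankResult]:
    """Order documents by the fraction of query words they contain."""
    query_words = set(query.lower().split())
    results = []
    for i, doc in enumerate(documents):
        doc_words = set(doc.lower().split())
        # Toy relevance: share of query words that appear in the document.
        score = len(query_words & doc_words) / len(query_words) if query_words else 0.0
        results.append(RerankResult(index=i, relevance_score=score))
    # Stable sort: documents with equal scores keep their input order.
    results.sort(key=lambda r: r.relevance_score, reverse=True)
    # Slicing with None returns every result.
    return results[:max_num_results]
```

Returning indices rather than the documents themselves lets the caller map scores back onto the original list without copying text.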