Fix OpenAI embeddings issue for asymmetric embedding NIMs

2025-12-19 15:18:40 +00:00 · 2025-08-19 10:41:45 -07:00
commit 85cae08e79
parent eb07a0f86a
2 changed files with 59 additions and 1 deletion
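
As a rough illustration of the behavior documented in the NVIDIA.md note in this diff, the sketch below shows how an adapter might attach `input_type` for asymmetric embedding models. It is not the code from this commit: the names `build_embedding_params`, `ASYMMETRIC_EMBEDDING_MODELS`, and `TASK_TYPE_TO_INPUT_TYPE` are hypothetical, and the mapping of `task_type="document"` to `input_type="passage"` is an assumption based on the note's wording about passage embeddings.

```python
from typing import Any, Optional, Sequence

# Hypothetical registry of asymmetric models; only this ID appears in the note.
ASYMMETRIC_EMBEDDING_MODELS = {"nvidia/llama-3.2-nv-embedqa-1b-v2"}

# Assumed mapping from the adapter's task_type to the NIM input_type value.
TASK_TYPE_TO_INPUT_TYPE = {"query": "query", "document": "passage"}


def build_embedding_params(
    model: str,
    texts: Sequence[str],
    task_type: Optional[str] = None,
) -> dict[str, Any]:
    """Build request parameters, adding input_type only for asymmetric models."""
    params: dict[str, Any] = {"model": model, "input": list(texts)}
    if model in ASYMMETRIC_EMBEDDING_MODELS:
        if task_type is None:
            # The OpenAI-compatible path has no task_type, so default to "query".
            params["input_type"] = "query"
        else:
            # Raises KeyError for task types this sketch does not know about.
            params["input_type"] = TASK_TYPE_TO_INPUT_TYPE[task_type]
    return params
```

Under these assumptions, a call without `task_type` mirrors the OpenAI-compatible path and yields query embeddings, while `task_type="document"` selects passage embeddings; models outside the registry are sent without `input_type`.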
--- a/llama_stack/providers/remote/inference/nvidia/NVIDIA.md
+++ b/llama_stack/providers/remote/inference/nvidia/NVIDIA.md
@@ -77,6 +77,10 @@ print(f"Response: {response.completion_message.content}")
 ```

 ### Create Embeddings
+> Note on OpenAI embeddings compatibility
+>
+> NVIDIA asymmetric embedding models (e.g., `nvidia/llama-3.2-nv-embedqa-1b-v2`) require an `input_type` parameter that the standard OpenAI embeddings API does not define. When you call the OpenAI-compatible embeddings endpoint, the NVIDIA Inference Adapter automatically sets `input_type="query"`. To generate passage embeddings instead, use the `embeddings` API with `task_type="document"`.
+
 ```python
 response = client.inference.embeddings(
    model_id="nvidia/llama-3.2-nv-embedqa-1b-v2",