fix: add missing extra_body to client.chat.completions.create() call

- test requires vLLM as provider, current is skipped in GH Action
- test:
>export VLLM_URL="http://localhost:8000"
>pytest tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_extra_body -v --stack-config="inference=remote::vllm"

Signed-off-by: Wen Zhou <wenzhou@redhat.com>
This commit is contained in:
Wen Zhou 2025-07-11 13:02:11 +02:00
parent d880c2df0e
commit ea964a13ec
2 changed files with 46 additions and 0 deletions

View file

@ -654,6 +654,7 @@ class VLLMInferenceAdapter(Inference, ModelsProtocolPrivate):
top_logprobs: int | None = None,
top_p: float | None = None,
user: str | None = None,
extra_body: dict[str, Any] | None = None,
) -> OpenAIChatCompletion | AsyncIterator[OpenAIChatCompletionChunk]:
self._lazy_initialize_client()
model_obj = await self._get_model(model)
@ -681,6 +682,7 @@ class VLLMInferenceAdapter(Inference, ModelsProtocolPrivate):
top_logprobs=top_logprobs,
top_p=top_p,
user=user,
extra_body=extra_body,
)
return await self.client.chat.completions.create(**params) # type: ignore