forked from phoenix-oss/llama-stack-mirror
fix vllm base64 image inference (#815)
# What does this PR do?

- Fix base64-based image URLs for the vLLM provider.
- Add a test case for a base64-based image_url.
- Fixes issue: https://github.com/meta-llama/llama-stack/issues/571

## Test Plan

```
LLAMA_STACK_BASE_URL=http://localhost:8321 pytest -v ./tests/client-sdk/inference/test_inference.py::test_image_chat_completion_base64_url
```

<img width="991" alt="image" src="https://github.com/user-attachments/assets/d56381ba-6777-4d23-9da9-81f73ce93566" />

## Sources

Please link relevant resources if necessary.

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
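For context, a base64 image_url is simply a `data:` URI with the encoded image bytes inlined. The sketch below shows one way such a URL can be built and sent to an OpenAI-compatible endpoint such as a standalone vLLM server. The server URL, model name, and image path are placeholders, and the payload uses the standard OpenAI chat-completions image format rather than the client-SDK helpers used by the new test.

```python
# Illustrative only: build a base64 "data:" URL for a local image and send it
# to an OpenAI-compatible endpoint (e.g. a vLLM server). The server URL, model
# name, and image path are placeholders, not values from this PR.
import base64

from openai import OpenAI

with open("dog.png", "rb") as f:  # hypothetical local image
    b64 = base64.b64encode(f.read()).decode("utf-8")

data_url = f"data:image/png;base64,{b64}"

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")  # placeholder endpoint
response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",  # placeholder model id
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what is in this image."},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```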
This commit is contained in:
parent 3d4c53dfec
commit 3e7496e835
3 changed files with 42 additions and 3 deletions
```diff
@@ -176,10 +176,8 @@ class VLLMInferenceAdapter(Inference, ModelsProtocolPrivate):
         media_present = request_has_media(request)
         if isinstance(request, ChatCompletionRequest):
             if media_present:
-                # vllm does not seem to work well with image urls, so we download the images
                 input_dict["messages"] = [
-                    await convert_message_to_openai_dict(m, download=True)
-                    for m in request.messages
+                    await convert_message_to_openai_dict(m) for m in request.messages
                 ]
             else:
                 input_dict["prompt"] = await chat_completion_request_to_prompt(
```
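Reading the hunk above: the removed `download=True` path had the adapter fetch the image behind each image_url before building the OpenAI-format message; with it gone, the URL the client supplied, including an inline base64 `data:` URI, is forwarded to vLLM as-is. One plausible reason the old path broke on base64 images is that an ordinary HTTP client cannot fetch a `data:` URI at all. A minimal illustration follows; httpx is used here only as an example client, and the adapter's actual download helper is not part of this diff.

```python
# Illustration only: a plain HTTP client refuses to "download" a data: URI,
# which is roughly why forcing a download step can break base64 image_urls.
# httpx is just an example client; it is not necessarily what the adapter uses.
import httpx

data_url = "data:image/png;base64,iVBORw0KGgo="  # tiny, truncated example payload

try:
    httpx.get(data_url)
except httpx.UnsupportedProtocol as exc:
    print(f"cannot fetch a data: URI over HTTP: {exc}")
```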