llama-stack-mirror

phoenix-oss/llama-stack-mirror

Fork 1

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-27 21:30:25 +00:00

Commit graph

Author	SHA1	Message	Date
Ben Browning	49148bb26a	fix: openai_compat messages system/assistant non-str content When converting OpenAI message content for the "system" and "assistant" roles to Llama Stack inference APIs (used for some providers when dealing with Llama models via OpenAI API requests to get proper prompt / tool handling), we were not properly converting any non-string content. I discovered this while running the new Responses AI verification suite against the Fireworks provider, but instead of fixing it as part of some ongoing work there split this out into a separate PR. This fixes that, by using the `openai_content_to_content` helper we used elsewhere to ensure content parts were mapped properly. I added a couple of new tests to `test_openai_compat` to reproduce this issue and validate its fix. I ran those as below: ``` python -m pytest -s -v tests/unit/providers/utils/inference/test_openai_compat.py ``` Signed-off-by: Ben Browning <bbrownin@redhat.com>	2025-05-02 15:31:22 -04:00
Ben Browning	6378c2a2f3	fix: resolve BuiltinTools to strings for vllm tool_call messages (#2071 ) # What does this PR do? When the result of a ToolCall gets passed back into vLLM for the model to handle the tool call result (as is often the case in agentic tool-calling workflows), we forgot to handle the case where BuiltinTool calls are not string values but instead instances of the BuiltinTool enum. This fixes that, properly converting those enums to string values before trying to serialize them into an OpenAI chat completion request to vLLM. PR #1931 fixed a bug where we weren't passing these tool calling results back into vLLM, but as a side-effect it created this serialization bug when using BuiltinTools. Closes #2070 ## Test Plan I added a new unit test to the openai_compat unit tests to cover this scenario, ensured the new test failed before this fix, and all the existing tests there plus the new one passed with this fix. ``` python -m pytest -s -v tests/unit/providers/utils/inference/test_openai_compat.py ``` Signed-off-by: Ben Browning <bbrownin@redhat.com>	2025-05-01 08:47:29 -04:00
Derek Higgins	c8797f1125	fix: Including tool call in chat (#1931 ) Include the tool call details with the chat when doing Rag with Remote vllm Fixes: #1929 With this PR the tool call is included in the chat returned to vllm, the model (meta-llama/Llama-3.1-8B-Instruct) the returns the answer as expected. Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-04-24 16:59:10 -07:00

Author

SHA1

Message

Date

Ben Browning

49148bb26a

fix: openai_compat messages system/assistant non-str content

When converting OpenAI message content for the "system" and
"assistant" roles to Llama Stack inference APIs (used for some
providers when dealing with Llama models via OpenAI API requests to
get proper prompt / tool handling), we were not properly converting
any non-string content.

I discovered this while running the new Responses AI verification
suite against the Fireworks provider, but instead of fixing it as part
of some ongoing work there split this out into a separate PR.

This fixes that, by using the `openai_content_to_content` helper we
used elsewhere to ensure content parts were mapped properly.

I added a couple of new tests to `test_openai_compat` to reproduce
this issue and validate its fix. I ran those as below:

```
python -m pytest -s -v tests/unit/providers/utils/inference/test_openai_compat.py
```

Signed-off-by: Ben Browning <bbrownin@redhat.com>

2025-05-02 15:31:22 -04:00

Ben Browning

6378c2a2f3

fix: resolve BuiltinTools to strings for vllm tool_call messages (#2071 )

# What does this PR do?

When the result of a ToolCall gets passed back into vLLM for the model
to handle the tool call result (as is often the case in agentic
tool-calling workflows), we forgot to handle the case where BuiltinTool
calls are not string values but instead instances of the BuiltinTool
enum. This fixes that, properly converting those enums to string values
before trying to serialize them into an OpenAI chat completion request
to vLLM.

PR #1931 fixed a bug where we weren't passing these tool calling results
back into vLLM, but as a side-effect it created this serialization bug
when using BuiltinTools.

Closes #2070

## Test Plan

I added a new unit test to the openai_compat unit tests to cover this
scenario, ensured the new test failed before this fix, and all the
existing tests there plus the new one passed with this fix.

```
python -m pytest -s -v tests/unit/providers/utils/inference/test_openai_compat.py
```

Signed-off-by: Ben Browning <bbrownin@redhat.com>

2025-05-01 08:47:29 -04:00

Derek Higgins

c8797f1125

fix: Including tool call in chat (#1931 )

Include the tool call details with the chat when doing Rag with Remote
vllm

Fixes: #1929

With this PR the tool call is included in the chat returned to vllm, the
model (meta-llama/Llama-3.1-8B-Instruct) the returns the answer as
expected.

Signed-off-by: Derek Higgins <derekh@redhat.com>

2025-04-24 16:59:10 -07:00

3 commits