according to gemini docs: https://ai.google.dev/gemini-api/docs/thought-signatures
thought_signature lives in the extra_content of the tool_call, add extra_content to OpenAIChatCompletionToolCall to support this
Signed-off-by: Charlie Doern <cdoern@redhat.com>
Add support for reasoning fields in OpenAI-compatible chat completion
messages to enable compatibility with vLLM reasoning parsers.
Changes:
- Add `reasoning_content` and `reasoning` fields to OpenAIAssistantMessageParam
- Add `reasoning` field to OpenAIChoiceDelta (reasoning_content already existed)
Both field names are supported for maximum compatibility:
- `reasoning_content`: Used by vLLM ≤ v0.8.4
- `reasoning`: New field name in vLLM ≥ v0.9.x
vLLM documentation recommends migrating to the shorter `reasoning` field
name, but maintains backward compatibility with `reasoning_content`.
These fields allow reasoning models to return their chain-of-thought
process alongside the final answer, which is crucial for transparency
and debugging with reasoning models.
References:
- vLLM Reasoning Outputs: https://docs.vllm.ai/en/stable/features/reasoning_outputs/
- vLLM Issue #12468: https://github.com/vllm-project/vllm/issues/12468
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
Removes stale data from llama stack about old telemetry system
**Depends on** https://github.com/llamastack/llama-stack/pull/4127
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
the directory structure was src/llama-stack-api/llama_stack_api
instead it should just be src/llama_stack_api to match the other
packages.
update the structure and pyproject/linting config
---------
Signed-off-by: Charlie Doern <cdoern@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-11-13 15:04:36 -08:00
Renamed from src/llama-stack-api/llama_stack_api/inference.py (Browse further)