feat!: Implement include parameter specifically for adding logprobs in the output message (#4261)

# Problem
As an Application Developer, I want to use the include parameter with
the value message.output_text.logprobs, so that I can receive log
probabilities for output tokens to assess the model's confidence in its
response.
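One way to turn the returned log probabilities into a confidence signal is sketched below. It is a minimal illustration, not part of this PR: the helper names and the sample data are hypothetical, and it assumes each logprob entry is a dict with `token` and `logprob` keys.

```python
import math


def token_probabilities(logprobs: list[dict]) -> list[tuple[str, float]]:
    """Map each entry to (token, probability), where probability = exp(logprob)."""
    return [(entry["token"], math.exp(entry["logprob"])) for entry in logprobs]


def sequence_probability(logprobs: list[dict]) -> float:
    """Joint probability of the whole output: exp of the summed log probabilities."""
    return math.exp(sum(entry["logprob"] for entry in logprobs))


# Hypothetical sample data, not taken from a real response.
sample_logprobs = [
    {"token": "Paris", "logprob": -0.01},
    {"token": ".", "logprob": -0.7},
]

for token, prob in token_probabilities(sample_logprobs):
    print(f"{token!r}: {prob:.3f}")
print(f"sequence: {sequence_probability(sample_logprobs):.3f}")
```

A log probability near 0 means the model assigned the token a probability near 1; strongly negative values flag tokens the model was less sure about.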

# What does this PR do?

- Introduces a `ResponseItemInclude` enum and uses it to type the `include`
  parameter (previously `list[str]`) in various resource definitions
- Updates the inline provider to return logprobs when
  `"message.output_text.logprobs"` is passed in the `include` parameter
- Converts the logprobs returned by the inference provider from chat
  completion format to responses format

Closes #[4260](https://github.com/llamastack/llama-stack/issues/4260)

## Test Plan

- Created a script to explore OpenAI behavior:
https://github.com/s-akhtar-baig/llama-stack-examples/blob/main/responses/src/include.py
- Added integration tests and new recordings

---------

Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
Shabana Baig 2025-12-11 14:11:21 -05:00 committed by GitHub
parent 76e47d811a
commit 805abf573f
26 changed files with 13524 additions and 161 deletions

@@ -5,6 +5,7 @@
 # the root directory of this source tree.

 from collections.abc import AsyncIterator
+from enum import StrEnum
 from typing import Annotated, Protocol, runtime_checkable

 from pydantic import BaseModel
@@ -40,6 +41,20 @@ class ResponseGuardrailSpec(BaseModel):
 ResponseGuardrail = str | ResponseGuardrailSpec


+class ResponseItemInclude(StrEnum):
+    """
+    Specify additional output data to include in the model response.
+    """
+
+    web_search_call_action_sources = "web_search_call.action.sources"
+    code_interpreter_call_outputs = "code_interpreter_call.outputs"
+    computer_call_output_output_image_url = "computer_call_output.output.image_url"
+    file_search_call_results = "file_search_call.results"
+    message_input_image_image_url = "message.input_image.image_url"
+    message_output_text_logprobs = "message.output_text.logprobs"
+    reasoning_encrypted_content = "reasoning.encrypted_content"
+
+
 @runtime_checkable
 class Agents(Protocol):
     """Agents
@@ -80,7 +95,7 @@ class Agents(Protocol):
         temperature: float | None = None,
         text: OpenAIResponseText | None = None,
         tools: list[OpenAIResponseInputTool] | None = None,
-        include: list[str] | None = None,
+        include: list[ResponseItemInclude] | None = None,
         max_infer_iters: int | None = 10,  # this is an extension to the OpenAI API
         guardrails: Annotated[
             list[ResponseGuardrail] | None,