llama-stack-mirror/llama_stack/apis
Dinesh Yeduguru ab7f802698
feat: add MetricResponseMixin to chat completion response types (#1050)
# What does this PR do?
Defines a MetricResponseMixin which can be inherited by any response
class. Adds it to chat completion response types.


This is a short-term solution to allow the inference API to return
metrics. The ideal way to do this is for all response types to include
metrics, and for all metric events logged to the telemetry API to be
included with the response. To do this, we will need to augment all
response types with a metrics field. We have hit a blocker from the
Stainless SDK that prevents us from doing this.
The blocker shows up if we augment the response types that have a data
field in them like so:
class ListModelsResponse(BaseModel):
    metrics: Optional[List[MetricEvent]] = None
    data: List[Models]
    ...
The client SDK would then need to access the data through a .data field,
which is not ergonomic. The Stainless SDK does support unwrapping the
response type, but it requires the response type to have only a single
field.
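
The single-field constraint can be illustrated with a small sketch. The helper `unwrap_if_single_field` and both response classes are hypothetical and only mimic the behavior described above; this is not how the Stainless SDK is implemented.

```python
from dataclasses import dataclass, fields
from typing import Any, List, Optional


@dataclass
class ListModelsResponse:
    # Single-field wrapper: eligible for unwrapping.
    data: List[str]


@dataclass
class ListModelsResponseWithMetrics:
    # Adding a second field makes the wrapper ineligible for unwrapping.
    data: List[str]
    metrics: Optional[List[dict]] = None


def unwrap_if_single_field(response: Any) -> Any:
    """Return the sole field's value, or the response unchanged if it has several fields."""
    response_fields = fields(response)
    if len(response_fields) == 1:
        return getattr(response, response_fields[0].name)
    return response


# The caller gets the list directly in the first case,
# but must go through .data in the second.
models = unwrap_if_single_field(ListModelsResponse(data=["model-a"]))
wrapped = unwrap_if_single_field(ListModelsResponseWithMetrics(data=["model-a"]))
```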

We will need a way in the client SDK to signal that the metrics are
needed. If they are, the client SDK has to return the full response
type without unwrapping it.
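
One possible shape for such a signal is an opt-in flag on the client call. This is a sketch under assumptions: the `include_metrics` parameter, `list_models`, and `_fetch_models` are hypothetical names, and no such option exists in the client SDK today.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class ListModelsResponse:
    data: List[str]
    metrics: Optional[List[dict]] = None


def _fetch_models() -> ListModelsResponse:
    # Stand-in for the HTTP call a real client would make.
    return ListModelsResponse(
        data=["model-a", "model-b"],
        metrics=[{"metric": "latency_ms", "value": 42}],
    )


def list_models(include_metrics: bool = False) -> Union[List[str], ListModelsResponse]:
    """Return the unwrapped data by default, or the full response when metrics are requested."""
    response = _fetch_models()
    if include_metrics:
        return response  # full response type, not unwrapped
    return response.data  # ergonomic default: just the list
```

Callers that do not care about metrics keep the unwrapped return value, while `list_models(include_metrics=True)` exposes both `.data` and `.metrics`.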

## Test Plan
sh run_openapi_generator.sh ./
sh stainless_sync.sh dineshyv/dev add-metrics-to-resp-v4

LLAMA_STACK_CONFIG="/Users/dineshyv/.llama/distributions/fireworks/fireworks-run.yaml" pytest -v tests/client-sdk/agents/test_agents.py
2025-02-11 14:58:12 -08:00

| Name | Last commit | Date |
| --- | --- | --- |
| agents | fix: agent config validation (#1053) | 2025-02-11 14:48:42 -08:00 |
| batch_inference | Update OpenAPI generator to add param and field documentation (#896) | 2025-01-29 10:04:30 -08:00 |
| common | fix ImageContentItem to take base64 string as image.data (#909) | 2025-01-30 15:58:23 -08:00 |
| datasetio | Fix precommit check after moving to ruff (#927) | 2025-02-02 06:46:45 -08:00 |
| datasets | More idiomatic REST API (#765) | 2025-01-15 13:20:09 -08:00 |
| eval | Fix precommit check after moving to ruff (#927) | 2025-02-02 06:46:45 -08:00 |
| eval_tasks | More idiomatic REST API (#765) | 2025-01-15 13:20:09 -08:00 |
| inference | feat: add MetricResponseMixin to chat completion response types (#1050) | 2025-02-11 14:58:12 -08:00 |
| inspect | REST API fixes (#789) | 2025-01-16 13:47:08 -08:00 |
| models | More idiomatic REST API (#765) | 2025-01-15 13:20:09 -08:00 |
| post_training | Fix precommit check after moving to ruff (#927) | 2025-02-02 06:46:45 -08:00 |
| safety | More idiomatic REST API (#765) | 2025-01-15 13:20:09 -08:00 |
| scoring | More idiomatic REST API (#765) | 2025-01-15 13:20:09 -08:00 |
| scoring_functions | Fix precommit check after moving to ruff (#927) | 2025-02-02 06:46:45 -08:00 |
| shields | More idiomatic REST API (#765) | 2025-01-15 13:20:09 -08:00 |
| synthetic_data_generation | [remove import *] clean up import *'s (#689) | 2024-12-27 15:45:44 -08:00 |
| telemetry | feat: add MetricResponseMixin to chat completion response types (#1050) | 2025-02-11 14:58:12 -08:00 |
| tools | Fix precommit check after moving to ruff (#927) | 2025-02-02 06:46:45 -08:00 |
| vector_dbs | [memory refactor][1/n] Rename Memory -> VectorIO, MemoryBanks -> VectorDBs (#828) | 2025-01-22 09:59:30 -08:00 |
| vector_io | [memory refactor][6/n] Update naming and routes (#839) | 2025-01-22 10:39:13 -08:00 |
| __init__.py | API Updates (#73) | 2024-09-17 19:51:35 -07:00 |
| datatypes.py | [memory refactor][1/n] Rename Memory -> VectorIO, MemoryBanks -> VectorDBs (#828) | 2025-01-22 09:59:30 -08:00 |
| resource.py | Fix precommit check after moving to ruff (#927) | 2025-02-02 06:46:45 -08:00 |
| version.py | llama-stack version alpha -> v1 | 2025-01-15 05:58:09 -08:00 |