Currently, metrics are only propagated in `chat_completion` and `completion`. Since most providers use the `openai_`-prefixed routes as the default in `llama-stack-client inference chat-completion`, metrics are not working as expected. To get them working, the following changes were made:

1. Get the completion as usual.
2. Use new `openai_` versions of the metric-gathering functions, which read `.usage` from the OpenAI response types, where the token counts are already populated.
3. Define a `stream_generator` which counts the tokens and computes the metrics (see the sketch below).
4. Use a NEW span with `log_metrics`, because the span of the request ends before this processing completes, so nothing is logged unless a custom span is used.
5. Add the metrics to the response.

Signed-off-by: Charlie Doern <cdoern@redhat.com>
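A minimal sketch of the streaming path described in steps 3 and 4, assuming an async chunk stream and a telemetry client exposing `span` and `log_metric` methods. The `stream_generator` signature and the telemetry helper names here are hypothetical illustrations, not the actual llama-stack APIs:

```python
from typing import Any, AsyncIterator


async def stream_generator(
    stream: AsyncIterator[Any],  # chunks from the underlying OpenAI-style call
    telemetry: Any,              # hypothetical telemetry client (span/log_metric)
    model_id: str,
) -> AsyncIterator[Any]:
    prompt_tokens = 0
    completion_tokens = 0
    async for chunk in stream:
        # OpenAI-style responses carry token counts in `.usage`; on a stream
        # it is typically only populated on the final chunk.
        usage = getattr(chunk, "usage", None)
        if usage is not None:
            prompt_tokens = usage.prompt_tokens
            completion_tokens = usage.completion_tokens
        yield chunk

    # The request's span has already ended by the time the stream is drained,
    # so the metrics must be logged inside a NEW span or they are dropped.
    with telemetry.span("inference-token-metrics"):
        telemetry.log_metric("prompt_tokens", prompt_tokens, model=model_id)
        telemetry.log_metric("completion_tokens", completion_tokens, model=model_id)
        telemetry.log_metric(
            "total_tokens", prompt_tokens + completion_tokens, model=model_id
        )
```

The key design point is that the metrics are computed lazily as the consumer drains the stream, which is why they cannot be attached to the original request span.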