Currently, metrics are only propagated in `chat_completion` and `completion`.

Since most providers use the `openai_..` routes as the default in `llama-stack-client inference chat-completion`, metrics are currently not working as expected. To get them working, the following changes were made:
1. Get the completion as usual.
2. Use new `openai_` versions of the metric-gathering functions, which read `.usage` from the `OpenAI..` response types, where the metrics are already populated (see the non-streaming sketch after this list).
3. Define a `stream_generator` that counts the tokens and computes the metrics (see the streaming sketch below).
4. Use a NEW span for `log_metrics`, because the span of the request ends before this processing completes, so nothing is logged unless a custom span is used.
5. Add the metrics to the response.
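
A minimal sketch of the non-streaming path (steps 1, 2, and 5). The helper and type names here (`openai_metrics_from_usage`, `CompletionResponse`) are illustrative stand-ins, not the exact identifiers in the patch; the `.usage` fields match the standard OpenAI response shape.

```python
from dataclasses import dataclass, field


@dataclass
class Usage:
    # mirrors the usage block on OpenAI-style responses
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int


@dataclass
class CompletionResponse:
    # illustrative stand-in for the OpenAI-style response type
    content: str
    usage: Usage | None = None
    metrics: list[dict] = field(default_factory=list)


def openai_metrics_from_usage(usage: Usage, model: str) -> list[dict]:
    """Step 2: read the already-populated usage counts instead of re-tokenizing."""
    return [
        {"metric": "prompt_tokens", "value": usage.prompt_tokens, "model": model},
        {"metric": "completion_tokens", "value": usage.completion_tokens, "model": model},
        {"metric": "total_tokens", "value": usage.total_tokens, "model": model},
    ]


async def openai_chat_completion(provider, params: dict) -> CompletionResponse:
    # step 1: get the completion as usual
    response = await provider.chat(params)
    # step 2: gather the metrics from .usage, which is already populated
    if response.usage is not None:
        # step 5: attach the metrics to the response before returning it
        response.metrics = openai_metrics_from_usage(response.usage, params["model"])
    return response
```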
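
And a sketch of the streaming path (steps 3 and 4), reusing `openai_metrics_from_usage` from above. `new_span` and `log_metrics` are stand-ins for the telemetry helpers, not the real API; the point is that the generator outlives the request span, so the metrics must be logged under a fresh span.

```python
import contextlib


@contextlib.contextmanager
def new_span(name: str):
    # stand-in for the telemetry span helper
    yield


def log_metrics(metrics: list[dict]) -> None:
    # stand-in for the telemetry logger
    for metric in metrics:
        print(metric)


async def stream_generator(stream, model: str):
    """Steps 3-4: relay chunks, count tokens, then log under a NEW span."""
    completion_tokens = 0
    usage = None
    async for chunk in stream:
        completion_tokens += 1  # rough per-chunk count as a fallback
        if getattr(chunk, "usage", None) is not None:
            usage = chunk.usage  # the final chunk may carry authoritative counts
        yield chunk
    if usage is not None:
        metrics = openai_metrics_from_usage(usage, model)
    else:
        metrics = [{"metric": "completion_tokens", "value": completion_tokens, "model": model}]
    # step 4: the request span has already ended by the time the client drains
    # the stream, so open a NEW span or nothing gets logged
    with new_span("completion_metrics"):
        log_metrics(metrics)
```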
Signed-off-by: Charlie Doern <cdoern@redhat.com>