LiteLLM Minor Fixes + Improvements (#5474)

* feat(proxy/_types.py): add lago billing to callbacks ui

Closes https://github.com/BerriAI/litellm/issues/5472
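
A minimal sketch of enabling the Lago callback from the Python SDK, for context on what the UI now exposes (env var names follow litellm's Lago integration docs; verify against your version):

```python
import os
import litellm

# Assumed Lago settings, per litellm's Lago integration docs
os.environ["LAGO_API_BASE"] = "https://api.getlago.com"
os.environ["LAGO_API_KEY"] = "your-lago-api-key"
os.environ["LAGO_API_EVENT_CODE"] = "litellm_request"

# Log successful requests to Lago for usage-based billing
litellm.success_callback = ["lago"]

litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hi"}],
)
```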

* fix(anthropic.py): return anthropic prompt caching information

Fixes https://github.com/BerriAI/litellm/issues/5364
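
A hedged sketch of what this surfaces: cache-token counts on the returned usage object. The field names mirror Anthropic's usage block and are assumptions to verify against your litellm version:

```python
import litellm

response = litellm.completion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=[
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    # A long, reusable prefix is what makes caching worthwhile
                    "text": "You are a legal assistant. " + "Background: ..." * 400,
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        },
        {"role": "user", "content": "Summarize the background."},
    ],
)

usage = response.usage
# Cache fields are assumptions mirroring Anthropic's API
print("cache created:", getattr(usage, "cache_creation_input_tokens", None))
print("cache read:", getattr(usage, "cache_read_input_tokens", None))
```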

* feat(bedrock/chat.py): support 'json_schema' for bedrock models

Closes https://github.com/BerriAI/litellm/issues/5434
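
A sketch of the new capability, assuming litellm normalizes the OpenAI-style `response_format` shape for Bedrock models (model id is an example):

```python
import litellm

response = litellm.completion(
    model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": "Describe one user as JSON."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "user",
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                },
                "required": ["name", "age"],
            },
        },
    },
)
print(response.choices[0].message.content)
```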

* fix(bedrock/embed/embeddings.py): support async embeddings for amazon titan models
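
With this fix, the async entrypoint should work for Titan embeddings; a small sketch (model id is an example, confirm regional availability):

```python
import asyncio
import litellm

async def main() -> None:
    response = await litellm.aembedding(
        model="bedrock/amazon.titan-embed-text-v1",
        input=["hello world", "goodbye world"],
    )
    print(len(response.data), "embeddings returned")

asyncio.run(main())
```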

* fix: linting fixes

* fix: handle key errors

* fix(bedrock/chat.py): fix bedrock ai21 streaming object

* feat(bedrock/embed): support bedrock embedding optional params
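
One example of an optional param being passed through; which params are accepted depends on the underlying Bedrock model (Titan v2 shown, as an assumption):

```python
import litellm

response = litellm.embedding(
    model="bedrock/amazon.titan-embed-text-v2:0",
    input=["hello world"],
    dimensions=256,  # Titan v2 supports reduced output dimensions
)
```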

* fix(databricks.py): fix usage chunk

* fix(internal_user_endpoints.py): apply internal user defaults, if user role updated

Fixes an issue where updating a user's role wouldn't apply the internal-user defaults
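
One way the fixed path is exercised, sketched against the proxy's /user/update endpoint (URL, key, and ids are placeholders):

```python
import requests

resp = requests.post(
    "http://localhost:4000/user/update",
    headers={"Authorization": "Bearer sk-1234"},
    # After this fix, changing the role should also apply
    # the internal-user defaults (e.g. default budget/models)
    json={"user_id": "user-123", "user_role": "internal_user"},
)
print(resp.json())
```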

* feat(slack_alerting.py): provide multiple slack channels for a given alert type

Multiple channels might be interested in receiving an alert of a given type

* docs(alerting.md): add multiple channel alerting to docs
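
The shape of the updated mapping, sketched as the equivalent Python structure (the proxy reads this from `alert_to_webhook_url` in its YAML config; webhook URLs are placeholders):

```python
# Each alert type can now map to a list of webhook URLs, so several
# Slack channels receive the same alert; a single URL still works.
alert_to_webhook_url = {
    "llm_exceptions": [
        "https://hooks.slack.com/services/T000/B000/xxxx",
        "https://hooks.slack.com/services/T000/B111/yyyy",
    ],
    "budget_alerts": "https://hooks.slack.com/services/T000/B222/zzzz",
}
```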

commit 11f85d883f (parent 3fbb4f8fac)
Author: Krish Dholakia
Date: 2024-09-02 14:29:57 -07:00 (committed by GitHub)
22 changed files with 720 additions and 209 deletions

databricks.py:

@@ -703,8 +703,16 @@ class ModelResponseIterator:
                 is_finished = True
                 finish_reason = processed_chunk.choices[0].finish_reason
-            if hasattr(processed_chunk, "usage"):
-                usage = processed_chunk.usage  # type: ignore
+            if hasattr(processed_chunk, "usage") and isinstance(
+                processed_chunk.usage, litellm.Usage
+            ):
+                usage_chunk: litellm.Usage = processed_chunk.usage
+                usage = ChatCompletionUsageBlock(
+                    prompt_tokens=usage_chunk.prompt_tokens,
+                    completion_tokens=usage_chunk.completion_tokens,
+                    total_tokens=usage_chunk.total_tokens,
+                )
             return GenericStreamingChunk(
                 text=text,