Litellm stable dev (#5711)

* feat(aws_base_llm.py): prevents recreating boto3 credentials during high traffic

Leads to a 100ms perf boost in local testing

* fix(base_aws_llm.py): fix credential caching to check whether a session token is set

* refactor(bedrock/chat): separate converse api and invoke api + isolate converse api transformation logic

Make it easier to see how requests are transformed for /converse
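A minimal sketch of the routing split this refactor enables, based on the `litellm.BEDROCK_CONVERSE_MODELS` check visible in the diff below. The model IDs and the `route_bedrock_request` helper are illustrative assumptions, not LiteLLM's real list or API:

```python
# Illustrative subset only; the real list lives in litellm.BEDROCK_CONVERSE_MODELS.
BEDROCK_CONVERSE_MODELS = {
    "anthropic.claude-3-sonnet-20240229-v1:0",
    "anthropic.claude-3-haiku-20240307-v1:0",
}


def route_bedrock_request(model: str) -> str:
    """Hypothetical dispatcher: Converse-capable models take the unified
    /converse transformation path; everything else falls back to the
    provider-specific /invoke path."""
    if model in BEDROCK_CONVERSE_MODELS:
        return "converse"
    return "invoke"
```

Isolating the /converse transformation behind its own handler means each path's request-shaping logic can be read and tested independently.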

* fix: fix imports

* fix(bedrock/embed): fix reordering of headers

* fix(base_aws_llm.py): fix get credential logic

* fix(converse_handler.py): fix ai21 streaming response
Krish Dholakia 2024-09-14 23:22:59 -07:00 committed by GitHub
parent 2efdd2a6a4
commit da77706c26
14 changed files with 1073 additions and 1039 deletions


@@ -2385,6 +2385,7 @@ def completion(
             )
         if model in litellm.BEDROCK_CONVERSE_MODELS:
             response = bedrock_converse_chat_completion.completion(
                 model=model,
                 messages=messages,
@@ -3570,7 +3571,7 @@ def embedding(
             client=client,
             timeout=timeout,
             aembedding=aembedding,
-            litellm_params=litellm_params,
+            litellm_params={},
             api_base=api_base,
             print_verbose=print_verbose,
             extra_headers=extra_headers,