Fix anthropic thinking + response_format (#9594)

* fix(anthropic/chat/transformation.py): Don't set tool choice on response_format conversion when thinking is enabled

Not allowed by Anthropic

Fixes https://github.com/BerriAI/litellm/issues/8901

* refactor: move test to base anthropic chat tests

ensures consistent behaviour across vertex/anthropic/bedrock

* fix(anthropic/chat/transformation.py): if thinking token is specified and max tokens is not - ensure max token to anthropic is higher than thinking tokens

* feat(converse_transformation.py): correctly handle thinking + response format on Bedrock Converse

Fixes https://github.com/BerriAI/litellm/issues/8901

* fix(converse_transformation.py): correctly handle adding max tokens

* test: handle service unavailable error
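
The first fix above can be illustrated with a minimal sketch. It is not the code from `transformation.py`: the function name `build_response_format_params` and the tool name `json_tool_call` are hypothetical, chosen only for illustration. It assumes `response_format` is converted into a single tool definition, and that the forced `tool_choice` is what must be omitted when `thinking` is enabled.

```python
from typing import Any, Dict, Optional

# Hypothetical tool name used only in this sketch.
RESPONSE_FORMAT_TOOL_NAME = "json_tool_call"


def build_response_format_params(
    json_schema: Dict[str, Any],
    thinking: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Turn a JSON schema into Anthropic-style tool params (illustrative only)."""
    tool = {"name": RESPONSE_FORMAT_TOOL_NAME, "input_schema": json_schema}
    params: Dict[str, Any] = {"tools": [tool]}

    is_thinking_enabled = thinking is not None and thinking.get("type") == "enabled"
    if not is_thinking_enabled:
        # Force the model to call the schema tool only when thinking is off;
        # per the commit message, forcing it with thinking on is not allowed.
        params["tool_choice"] = {"type": "tool", "name": RESPONSE_FORMAT_TOOL_NAME}
    return params
```

With `thinking={"type": "enabled", "budget_tokens": 1024}` the returned params contain `tools` but no `tool_choice`; without `thinking` both keys are present.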
2025-04-25 18:54:30 +00:00 · 2025-03-28 15:57:40 -07:00
commit 5f8859eda8
parent 7c1026e210
8 changed files with 96 additions and 6 deletions
--- a/litellm/constants.py
+++ b/litellm/constants.py
@@ -7,6 +7,7 @@ DEFAULT_MAX_RETRIES = 2
 DEFAULT_FAILURE_THRESHOLD_PERCENT = (
    0.5  # default cooldown a deployment if 50% of requests fail in a given minute
 )
+DEFAULT_MAX_TOKENS = 4096
 DEFAULT_REDIS_SYNC_INTERVAL = 1
 DEFAULT_COOLDOWN_TIME_SECONDS = 5
 DEFAULT_REPLICATE_POLLING_RETRIES = 5
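
As a rough sketch of how the new `DEFAULT_MAX_TOKENS` constant could support the "max tokens higher than thinking tokens" fix from the commit message: the helper `resolve_max_tokens` below is hypothetical, and adding the default on top of `budget_tokens` is an assumption about one way to satisfy that constraint, not necessarily what the transformation code does.

```python
from typing import Any, Dict, Optional

DEFAULT_MAX_TOKENS = 4096  # value added to litellm/constants.py in this commit


def resolve_max_tokens(
    max_tokens: Optional[int],
    thinking: Optional[Dict[str, Any]] = None,
) -> int:
    """Pick a max_tokens value that exceeds the thinking budget (hypothetical helper)."""
    if max_tokens is not None:
        # The caller chose a value explicitly; leave it untouched.
        return max_tokens

    budget = 0
    if thinking is not None and thinking.get("type") == "enabled":
        budget = int(thinking.get("budget_tokens", 0))

    # Assumption: leave the default amount of room for the answer
    # on top of whatever is reserved for thinking.
    return budget + DEFAULT_MAX_TOKENS
```

For example, a thinking budget of 10000 with no explicit `max_tokens` resolves to 14096, while a request without `thinking` falls back to 4096.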