* unit test test_huggingface_text_completion_logprobs
* fix return TextCompletionHandler convert_chat_to_text_completion
* fix hf rest api
* fix test_huggingface_text_completion_logprobs
* fix linting errors
* fix import LiteLLMResponseObjectHandler
* fix test for LiteLLMResponseObjectHandler
* fix test text completion
* perf: move writing key to cache, to background task
* perf(litellm_pre_call_utils.py): add otel tracing for pre-call utils
pre-call utils add 200ms on calls with pgdb connected
* fix(litellm_pre_call_utils.py): rename call_type to actual call used
* perf(proxy_server.py): remove db logic from _get_config_from_file
was causing db calls to occur on every llm request, if team_id was set on key
* fix(auth_checks.py): add check for reducing db calls if user/team id does not exist in db
reduces latency/call by ~100ms
* fix(proxy_server.py): minor fix on existing_settings not including alerting
* fix(exception_mapping_utils.py): map databricks exception string
* fix(auth_checks.py): fix auth check logic
* test: correctly mark flaky test
* fix(utils.py): handle auth token error for tokenizers.from_pretrained
* fix(dual_cache.py): update in-memory check for redis batch get cache
Fixes latency delay for async_batch_redis_cache
* fix(service_logger.py): fix race condition causing otel service logging to be overwritten if service_callbacks set
* feat(user_api_key_auth.py): add parent otel component for auth
allows us to isolate how much latency is added by auth checks
* perf(parallel_request_limiter.py): move async_set_cache_pipeline (from max parallel request limiter) out of execution path (background task)
reduces latency by 200ms
* feat(user_api_key_auth.py): have user api key auth object return user tpm/rpm limits - reduces redis calls in downstream task (parallel_request_limiter)
Reduces latency by 400-800ms
* fix(parallel_request_limiter.py): use batch get cache to reduce user/key/team usage object calls
reduces latency by 50-100ms
* fix: fix linting error
* fix(_service_logger.py): fix import
* fix(user_api_key_auth.py): fix service logging
* fix(dual_cache.py): don't pass 'self'
* fix: fix python3.8 error
* fix: fix init
* refactor: move gemini translation logic inside the transformation.py file
easier to isolate the gemini translation logic
* fix(gemini-transformation): support multiple tool calls in message body
Merges https://github.com/BerriAI/litellm/pull/6487/files
* test(test_vertex.py): add remaining tests from https://github.com/BerriAI/litellm/pull/6487
* fix(gemini-transformation): return tool calls for multiple tool calls
* fix: support passing logprobs param for vertex + gemini
* feat(vertex_ai): add logprobs support for gemini calls
* fix(anthropic/chat/transformation.py): fix disable parallel tool use flag
* fix: fix linting error
* fix(_logging.py): log stacktrace information in json logs
Closes https://github.com/BerriAI/litellm/issues/6497
* fix(utils.py): fix mem leak for async stream + completion
Uses a global executor pool instead of creating a new thread on each request
Fixes https://github.com/BerriAI/litellm/issues/6404
* fix(factory.py): handle tool call + content in assistant message for bedrock
* fix: fix import
* fix(factory.py): maintain support for content as a str in assistant response
* fix: fix import
* test: cleanup test
* fix(vertex_and_google_ai_studio/): return none for content if no str value
* test: retry flaky tests
* (UI) Fix viewing members, keys in a team + added testing (#6514)
* fix listing teams on ui
* LiteLLM Minor Fixes & Improvements (10/28/2024) (#6475)
* fix(anthropic/chat/transformation.py): support anthropic disable_parallel_tool_use param
Fixes https://github.com/BerriAI/litellm/issues/6456
* feat(anthropic/chat/transformation.py): support anthropic computer tool use
Closes https://github.com/BerriAI/litellm/issues/6427
* fix(vertex_ai/common_utils.py): parse out '$schema' when calling vertex ai
Fixes issue when trying to call vertex from vercel sdk
* fix(main.py): add 'extra_headers' support for azure on all translation endpoints
Fixes https://github.com/BerriAI/litellm/issues/6465
* fix: fix linting errors
* fix(transformation.py): handle no beta headers for anthropic
* test: cleanup test
* fix: fix linting error
* fix: fix linting errors
* fix: fix linting errors
* fix(transformation.py): handle dummy tool call
* fix(main.py): fix linting error
* fix(azure.py): pass required param
* LiteLLM Minor Fixes & Improvements (10/24/2024) (#6441)
* fix(azure.py): handle /openai/deployment in azure api base
* fix(factory.py): fix faulty anthropic tool result translation check
Fixes https://github.com/BerriAI/litellm/issues/6422
* fix(gpt_transformation.py): add support for parallel_tool_calls to azure
Fixes https://github.com/BerriAI/litellm/issues/6440
* fix(factory.py): support anthropic prompt caching for tool results
* fix(vertex_ai/common_utils): don't pop non-null required field
Fixes https://github.com/BerriAI/litellm/issues/6426
* feat(vertex_ai.py): support code_execution tool call for vertex ai + gemini
Closes https://github.com/BerriAI/litellm/issues/6434
* build(model_prices_and_context_window.json): Add 'supports_assistant_prefill' for bedrock claude-3-5-sonnet v2 models
Closes https://github.com/BerriAI/litellm/issues/6437
* fix(types/utils.py): fix linting
* test: update test to include required fields
* test: fix test
* test: handle flaky test
* test: remove e2e test - hitting gemini rate limits
* Litellm dev 10 26 2024 (#6472)
* docs(exception_mapping.md): add missing exception types
Fixes https://github.com/Aider-AI/aider/issues/2120#issuecomment-2438971183
* fix(main.py): register custom model pricing with specific key
Ensure custom model pricing is registered to the specific model+provider key combination
* test: make testing more robust for custom pricing
* fix(redis_cache.py): instrument otel logging for sync redis calls
ensures complete coverage for all redis cache calls
* (Testing) Add unit testing for DualCache - ensure in memory cache is used when expected (#6471)
* test test_dual_cache_get_set
* unit testing for dual cache
* fix async_set_cache_sadd
* test_dual_cache_local_only
* redis otel tracing + async support for latency routing (#6452)
* docs(exception_mapping.md): add missing exception types
Fixes https://github.com/Aider-AI/aider/issues/2120#issuecomment-2438971183
* fix(main.py): register custom model pricing with specific key
Ensure custom model pricing is registered to the specific model+provider key combination
* test: make testing more robust for custom pricing
* fix(redis_cache.py): instrument otel logging for sync redis calls
ensures complete coverage for all redis cache calls
* refactor: pass parent_otel_span for redis caching calls in router
allows for more observability into what calls are causing latency issues
* test: update tests with new params
* refactor: ensure e2e otel tracing for router
* refactor(router.py): add more otel tracing across router
catch all latency issues for router requests
* fix: fix linting error
* fix(router.py): fix linting error
* fix: fix test
* test: fix tests
* fix(dual_cache.py): pass ttl to redis cache
* fix: fix param
* fix(dual_cache.py): set default value for parent_otel_span
* fix(transformation.py): support 'response_format' for anthropic calls
* fix(transformation.py): check for cache_control inside 'function' block
* fix: fix linting error
* fix: fix linting errors
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
---------
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
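The `fix(utils.py)` entry above on the async stream + completion memory leak describes using a global executor pool instead of creating a new thread on each request. A minimal sketch of that pattern follows. The names `_EXECUTOR`, `blocking_completion`, and `acompletion`, and the worker count, are hypothetical and chosen for illustration; this is not the code from the fix itself.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical module-level pool, created once and shared by every request.
# The worker count is an illustrative value, not one taken from the source.
_EXECUTOR = ThreadPoolExecutor(max_workers=4)


def blocking_completion(prompt: str) -> str:
    """Stand-in for a synchronous call that must not block the event loop."""
    return f"{prompt} [handled on {threading.current_thread().name}]"


async def acompletion(prompt: str) -> str:
    """Run the blocking call on the shared pool instead of a fresh thread."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_EXECUTOR, blocking_completion, prompt)


async def run_many(n: int) -> list:
    """Issue n concurrent requests that all reuse the same pool."""
    return await asyncio.gather(*(acompletion(f"req-{i}") for i in range(n)))


results = asyncio.run(run_many(20))
# Collect the distinct worker thread names that served the 20 requests.
worker_names = {r.split("handled on ")[1].rstrip("]") for r in results}
```

Because the pool is bounded and reused, the number of threads stays capped by `max_workers` no matter how many requests arrive.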
* ui new build
* Add retry strat (#6520)
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* (fix) slack alerting - don't spam the failed cost tracking alert for the same model (#6543)
* fix use failing_model as cache key for failed_tracking_alert
* fix use standard logging payload for getting response cost
* fix kwargs.get("response_cost")
* fix getting response cost
* (feat) add XAI ChatCompletion Support (#6373)
* init commit for XAI
* add full logic for xai chat completion
* test_completion_xai
* docs xAI
* add xai/grok-beta
* test_xai_chat_config_get_openai_compatible_provider_info
* test_xai_chat_config_map_openai_params
* add xai streaming test
---------
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
* feat(router.py): add check for max fallback depth
Prevent infinite loop for fallbacks
Closes https://github.com/BerriAI/litellm/issues/6498
* test: update test
* (fix) Prometheus - Log Postgres DB latency, status on prometheus (#6484)
* fix logging DB fails on prometheus
* unit testing log to otel wrapper
* unit testing for service logger + prometheus
* use LATENCY buckets for service logging
* fix service logging
* docs clarify vertex vs gemini
* (router_strategy/) ensure all async functions use async cache methods (#6489)
* fix router strat
* use async set / get cache in router_strategy
* add coverage for router strategy
* fix imports
* fix batch_get_cache
* use async methods for least busy
* fix least busy use async methods
* fix test_dual_cache_increment
* test async_get_available_deployment when routing_strategy="least-busy"
* (fix) proxy - fix when `STORE_MODEL_IN_DB` should be set (#6492)
* set store_model_in_db at the top
* correctly use store_model_in_db global
* (fix) `PrometheusServicesLogger` `_get_metric` should return metric in Registry (#6486)
* fix logging DB fails on prometheus
* unit testing log to otel wrapper
* unit testing for service logger + prometheus
* use LATENCY buckets for service logging
* fix service logging
* fix _get_metric in prom services logger
* add clear doc string
* unit testing for prom service logger
* bump: version 1.51.0 → 1.51.1
* Add `azure/gpt-4o-mini-2024-07-18` to model_prices_and_context_window.json (#6477)
* Update utils.py (#6468)
Fixed missing keys
* (perf) Litellm redis router fix - ~100ms improvement (#6483)
* docs(exception_mapping.md): add missing exception types
Fixes https://github.com/Aider-AI/aider/issues/2120#issuecomment-2438971183
* fix(main.py): register custom model pricing with specific key
Ensure custom model pricing is registered to the specific model+provider key combination
* test: make testing more robust for custom pricing
* fix(redis_cache.py): instrument otel logging for sync redis calls
ensures complete coverage for all redis cache calls
* refactor: pass parent_otel_span for redis caching calls in router
allows for more observability into what calls are causing latency issues
* test: update tests with new params
* refactor: ensure e2e otel tracing for router
* refactor(router.py): add more otel tracing across router
catch all latency issues for router requests
* fix: fix linting error
* fix(router.py): fix linting error
* fix: fix test
* test: fix tests
* fix(dual_cache.py): pass ttl to redis cache
* fix: fix param
* perf(cooldown_cache.py): improve cooldown cache, to store cache results in memory for 5s, prevents redis call from being made on each request
reduces 100ms latency per call with caching enabled on router
* fix: fix test
* fix(cooldown_cache.py): handle if a result is None
* fix(cooldown_cache.py): add debug statements
* refactor(dual_cache.py): move to using an in-memory check for batch get cache, to prevent redis from being hit for every call
* fix(cooldown_cache.py): fix linting error
* build: merge main
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: vibhanshu-ob <115142120+vibhanshu-ob@users.noreply.github.com>
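The `perf(cooldown_cache.py)` entry above keeps cache results in memory for 5s so that Redis is not called on each request. The sketch below illustrates that idea under stated assumptions: `ShortLivedCache`, `fetch_from_redis`, and the injectable clock are hypothetical names used for the example, not the classes in `cooldown_cache.py`.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


class ShortLivedCache:
    """Serve repeated lookups from memory until a short TTL expires."""

    def __init__(
        self,
        fetch: Callable[[str], Any],
        ttl_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # slow lookup, e.g. a Redis round trip
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the demo needs no sleep
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        # Storing (expiry, value) tuples means a cached None result is still
        # treated as a hit rather than being confused with a missing entry.
        if entry is not None and entry[0] > now:
            return entry[1]
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value


calls: List[str] = []
fake_now = [0.0]


def fetch_from_redis(key: str) -> Any:
    """Hypothetical remote lookup; records each call and finds no cooldown."""
    calls.append(key)
    return None


cache = ShortLivedCache(fetch_from_redis, ttl_seconds=5.0, clock=lambda: fake_now[0])
cache.get("deployment-1")   # miss: goes to the remote store
cache.get("deployment-1")   # hit: served from memory
fake_now[0] = 6.0           # move past the 5s window
cache.get("deployment-1")   # expired: goes to the remote store again
```

The trade-off is staleness: a value can be up to `ttl_seconds` out of date, which is the price of skipping the remote call on every request.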
* fix(core_helpers.py): return None, instead of raising kwargs is None error
Closes https://github.com/BerriAI/litellm/issues/6500
* docs(cost_tracking.md): cleanup doc
* fix(vertex_and_google_ai_studio.py): handle function call with no params passed in
Closes https://github.com/BerriAI/litellm/issues/6495
* test(test_router_timeout.py): add test for router timeout + retry logic
* test: update test to use module level values
* refactor(prometheus.py): move to using standard logging payload for reading the remaining request / tokens
Ensures prometheus token tracking works for anthropic as well
* fix: fix linting error
* fix(redis_cache.py): make sure ttl is always int (handle float values)
Fixes issue where redis_client.ex was not working correctly due to float ttl
* fix: fix linting error
* test: update test
* fix: fix linting error
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: vibhanshu-ob <115142120+vibhanshu-ob@users.noreply.github.com>
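The `fix(redis_cache.py)` entry above makes sure ttl is always an int, since a float ttl broke the Redis expiry. A minimal sketch of such a guard is shown here. The helper name `normalize_ttl` is hypothetical, and rounding up (rather than truncating) is an assumption made for this example, not something stated in the source.

```python
import math
from typing import Optional, Union


def normalize_ttl(ttl: Optional[Union[int, float]]) -> Optional[int]:
    """Coerce a ttl in seconds to a positive int, or None for no expiry.

    Rounding up is an illustrative choice: truncating 0.5 would give 0,
    which is not a usable expiry in seconds.
    """
    if ttl is None:
        return None
    seconds = math.ceil(ttl)
    if seconds <= 0:
        raise ValueError(f"ttl must be positive, got {ttl!r}")
    return seconds


half_second = normalize_ttl(0.5)
one_minute = normalize_ttl(60.0)
no_expiry = normalize_ttl(None)
```

Calling the guard at the point where the expiry is handed to the Redis client keeps the float handling in one place.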
* add type for dd llm obs request ob
* working dd llm obs
* datadog use well defined type
* clean up
* unit test test_create_llm_obs_payload
* fix linting
* add datadog_llm_observability
* add datadog_llm_observability
* docs DD LLM obs
* run testing again
* document DD_ENV
* test_create_llm_obs_payload
* testing for failure events prometheus
* set set_llm_deployment_failure_metrics
* test_async_post_call_failure_hook
* unit testing for all prometheus functions
* fix linting
* fix(utils.py): support passing dynamic api base to validate_environment
Returns True if just api base is required and api base is passed
* fix(litellm_pre_call_utils.py): feature flag sending client headers to llm api
Fixes https://github.com/BerriAI/litellm/issues/6410
* fix(anthropic/chat/transformation.py): return correct error message
* fix(http_handler.py): add error response text in places where we expect it
* fix(factory.py): handle base case of no non-system messages to bedrock
Fixes https://github.com/BerriAI/litellm/issues/6411
* feat(cohere/embed): Support cohere image embeddings
Closes https://github.com/BerriAI/litellm/issues/6413
* fix(__init__.py): fix linting error
* docs(supported_embedding.md): add image embedding example to docs
* feat(cohere/embed): use cohere embedding returned usage for cost calc
* build(model_prices_and_context_window.json): add embed-english-v3.0 details (image cost + 'supports_image_input' flag)
* fix(cohere_transformation.py): fix linting error
* test(test_proxy_server.py): cleanup test
* test: cleanup test
* fix: fix linting errors