* fix(factory.py): ensure tool call converts image url
Fixes https://github.com/BerriAI/litellm/issues/6953
* fix(transformation.py): support mp4 + pdf URLs for vertex ai
Fixes https://github.com/BerriAI/litellm/issues/6936
* fix(http_handler.py): mask gemini api key in error logs
Fixes https://github.com/BerriAI/litellm/issues/6963
* docs(prometheus.md): update prometheus FAQs
* feat(auth_checks.py): ensure specific model access > wildcard model access
if wildcard model is in access group, but specific model is not - deny access
* fix(auth_checks.py): handle auth checks for team based model access groups
handles scenario where model access group used for wildcard models
* fix(internal_user_endpoints.py): support adding guardrails on `/user/update`
Fixes https://github.com/BerriAI/litellm/issues/6942
* fix(key_management_endpoints.py): fix prepare_metadata_fields helper
* fix: fix tests
* build(requirements.txt): bump openai dep version
fixes proxies argument
* test: fix tests
* fix(http_handler.py): fix error message masking
* fix(bedrock_guardrails.py): pass in prepped data
* test: fix test
* test: fix nvidia nim test
* fix(http_handler.py): return original response headers
* fix: revert maskedhttpstatuserror
* test: update tests
* test: cleanup test
* fix(key_management_endpoints.py): fix metadata field update logic
* fix(key_management_endpoints.py): maintain initial order of guardrails in key update
* fix(key_management_endpoints.py): handle prepare metadata
* fix: fix linting errors
* fix: fix linting errors
* fix: fix linting errors
* fix: fix key management errors
* fix(key_management_endpoints.py): update metadata
* test: update test
* refactor: add more debug statements
* test: skip flaky test
* test: fix test
* fix: fix test
* fix: fix update metadata logic
* fix: fix test
* ci(config.yml): change db url for e2e ui testing
* feat - allow using gemini js SDK with LiteLLM
* add auth for gemini_proxy_route
* basic local test for js
* test cost tagging gemini js requests
* add js sdk test for gemini with litellm
* add docs on gemini JS SDK
* run node.js tests
* fix google ai studio tests
* fix vertex js spend test
* fix(__init__.py): add 'watsonx_text' as mapped llm api route
Fixes https://github.com/BerriAI/litellm/issues/6663
* fix(opentelemetry.py): fix passing parallel tool calls to otel
Fixes https://github.com/BerriAI/litellm/issues/6677
* refactor(test_opentelemetry_unit_tests.py): create a base set of unit tests for all logging integrations - test for parallel tool call handling
reduces bugs in repo
* fix(__init__.py): update provider-model mapping to include all known provider-model mappings
Fixes https://github.com/BerriAI/litellm/issues/6669
* feat(anthropic): support passing document in llm api call
* docs(anthropic.md): add pdf anthropic call to docs + expose new 'supports_pdf_input' function
* fix(factory.py): fix linting error
* fix(deepseek/chat): convert content list to str
Fixes https://github.com/BerriAI/litellm/issues/6642
* test(test_deepseek_completion.py): implement base llm unit tests
increase robustness across providers
* fix(router.py): support content policy violation fallbacks with default fallbacks
* fix(opentelemetry.py): refactor to move otel imports behind flag
Fixes https://github.com/BerriAI/litellm/issues/6636
* fix(opentelemetry.py): close span on success completion
* fix(user_api_key_auth.py): allow user_role to default to none
* fix: mark flaky test
* fix(opentelemetry.py): move otelconfig.from_env to inside the init
prevent otel errors raised just by importing the litellm class
* fix(user_api_key_auth.py): fix auth error
* log error on prometheus service failure hook
* use a more accurate function name for wrapper that handles logging db metrics
* fix log_db_metrics
* test_log_db_metrics_failure_error_types
* fix linting
* fix auth checks
* perf: move writing key to cache, to background task
* perf(litellm_pre_call_utils.py): add otel tracing for pre-call utils
adds 200ms on calls with pgdb connected
* fix(litellm_pre_call_utils.py): rename call_type to actual call used
* perf(proxy_server.py): remove db logic from _get_config_from_file
was causing db calls to occur on every llm request, if team_id was set on key
* fix(auth_checks.py): add check for reducing db calls if user/team id does not exist in db
reduces latency/call by ~100ms
* fix(proxy_server.py): minor fix on existing_settings not incl alerting
* fix(exception_mapping_utils.py): map databricks exception string
* fix(auth_checks.py): fix auth check logic
* test: correctly mark flaky test
* fix(utils.py): handle auth token error for tokenizers.from_pretrained
* fix(dual_cache.py): update in-memory check for redis batch get cache
Fixes latency delay for async_batch_redis_cache
* fix(service_logger.py): fix race condition causing otel service logging to be overwritten if service_callbacks set
* feat(user_api_key_auth.py): add parent otel component for auth
allows us to isolate how much latency is added by auth checks
* perf(parallel_request_limiter.py): move async_set_cache_pipeline (from max parallel request limiter) out of execution path (background task)
reduces latency by 200ms
* feat(user_api_key_auth.py): have user api key auth object return user tpm/rpm limits - reduces redis calls in downstream task (parallel_request_limiter)
Reduces latency by 400-800ms
* fix(parallel_request_limiter.py): use batch get cache to reduce user/key/team usage object calls
reduces latency by 50-100ms
* fix: fix linting error
* fix(_service_logger.py): fix import
* fix(user_api_key_auth.py): fix service logging
* fix(dual_cache.py): don't pass 'self'
* fix: fix python3.8 error
* fix: fix init
* feat(router.py): add check for max fallback depth
Prevent infinite loop for fallbacks
Closes https://github.com/BerriAI/litellm/issues/6498
* test: update test
* (fix) Prometheus - Log Postgres DB latency, status on prometheus (#6484)
* fix logging DB fails on prometheus
* unit testing log to otel wrapper
* unit testing for service logger + prometheus
* use LATENCY buckets for service logging
* fix service logging
* docs clarify vertex vs gemini
* (router_strategy/) ensure all async functions use async cache methods (#6489)
* fix router strat
* use async set / get cache in router_strategy
* add coverage for router strategy
* fix imports
* fix batch_get_cache
* use async methods for least busy
* fix least busy use async methods
* fix test_dual_cache_increment
* test async_get_available_deployment when routing_strategy="least-busy"
* (fix) proxy - fix when `STORE_MODEL_IN_DB` should be set (#6492)
* set store_model_in_db at the top
* correctly use store_model_in_db global
* (fix) `PrometheusServicesLogger` `_get_metric` should return metric in Registry (#6486)
* fix logging DB fails on prometheus
* unit testing log to otel wrapper
* unit testing for service logger + prometheus
* use LATENCY buckets for service logging
* fix service logging
* fix _get_metric in prom services logger
* add clear doc string
* unit testing for prom service logger
* bump: version 1.51.0 → 1.51.1
* Add `azure/gpt-4o-mini-2024-07-18` to model_prices_and_context_window.json (#6477)
* Update utils.py (#6468)
Fixed missing keys
* (perf) Litellm redis router fix - ~100ms improvement (#6483)
* docs(exception_mapping.md): add missing exception types
Fixes https://github.com/Aider-AI/aider/issues/2120#issuecomment-2438971183
* fix(main.py): register custom model pricing with specific key
Ensure custom model pricing is registered to the specific model+provider key combination
* test: make testing more robust for custom pricing
* fix(redis_cache.py): instrument otel logging for sync redis calls
ensures complete coverage for all redis cache calls
* refactor: pass parent_otel_span for redis caching calls in router
allows for more observability into what calls are causing latency issues
* test: update tests with new params
* refactor: ensure e2e otel tracing for router
* refactor(router.py): add more otel tracing across router
catch all latency issues for router requests
* fix: fix linting error
* fix(router.py): fix linting error
* fix: fix test
* test: fix tests
* fix(dual_cache.py): pass ttl to redis cache
* fix: fix param
* perf(cooldown_cache.py): improve cooldown cache, to store cache results in memory for 5s, prevents redis call from being made on each request
reduces latency by 100ms per call with caching enabled on router
* fix: fix test
* fix(cooldown_cache.py): handle if a result is None
* fix(cooldown_cache.py): add debug statements
* refactor(dual_cache.py): move to using an in-memory check for batch get cache, to prevent redis from being hit for every call
* fix(cooldown_cache.py): fix linting error
* build: merge main
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: vibhanshu-ob <115142120+vibhanshu-ob@users.noreply.github.com>
* feat(custom_logger.py): expose new `async_dataset_hook` for modifying/rejecting argilla items before logging
Allows user more control on what gets logged to argilla for annotations
* feat(google_ai_studio_endpoints.py): add new `/azure/*` pass through route
enables pass-through for azure provider
* feat(utils.py): support checking ollama `/api/show` endpoint for retrieving ollama model info
Fixes https://github.com/BerriAI/litellm/issues/6322
* fix(user_api_key_auth.py): add `/key/delete` to allowed_ui_routes
Fixes https://github.com/BerriAI/litellm/issues/6236
* fix(user_api_key_auth.py): remove type ignore
* fix(user_api_key_auth.py): route ui vs. api token checks differently
Fixes https://github.com/BerriAI/litellm/issues/6238
* feat(internal_user_endpoints.py): support setting models as a default internal user param
Closes https://github.com/BerriAI/litellm/issues/6239
* fix(user_api_key_auth.py): fix exception string
* fix(user_api_key_auth.py): fix error string
* fix: fix test
* track LiteLLM_OrganizationMembership
* add add_internal_user_to_organization
* add org membership to schema
* read organization membership when reading user info in auth checks
* add check for valid organization_id
* add test for test_create_new_user_in_organization
* test test_create_new_user_in_organization
* add new ADMIN role
* add test for org admins creating teams
* add test for test_org_admin_create_user_permissions
* test_org_admin_create_user_team_wrong_org_permissions
* test_org_admin_create_user_team_wrong_org_permissions
* fix organization_role_based_access_check
* fix getting user members
* fix TeamBase
* fix types used for user role
* fix type checks
* sync prisma schema
* docs - organization admins
* fix use organization_endpoints for /organization management
* add types for org member endpoints
* fix role name for org admin
* add type for member add response
* add organization/member_add
* add error handling for adding members to an org
* add nice doc string for organization/member_add
* fix test_create_new_user_in_organization
* linting fix
* use simple route changes
* fix types
* add organization member roles
* add org admin auth checks
* add auth checks for orgs
* test for creating teams as org admin
* simplify org id usage
* fix typo
* test test_org_admin_create_user_team_wrong_org_permissions
* fix type check issue
* code quality fix
* fix schema.prisma
* fix(caching.py): set ttl for async_increment cache
fixes issue where ttl for redis client was not being set on increment_cache
Fixes https://github.com/BerriAI/litellm/issues/5609
* fix(caching.py): fix increment cache w/ ttl for sync increment cache on redis
Fixes https://github.com/BerriAI/litellm/issues/5609
* fix(router.py): support adding retry policy + allowed fails policy via config.yaml
* fix(router.py): don't cooldown single deployments
No point, as there's no other deployment to loadbalance with.
* fix(user_api_key_auth.py): support setting allowed email domains on jwt tokens
Closes https://github.com/BerriAI/litellm/issues/5605
* docs(token_auth.md): add user upsert + allowed email domain to jwt auth docs
* fix(litellm_pre_call_utils.py): fix dynamic key logging when team id is set
Fixes issue where key logging would not be set if team metadata was not None
* fix(secret_managers/main.py): load environment variables correctly
Fixes issue where os.environ/ was not being loaded correctly
* test(test_router.py): fix test
* feat(spend_tracking_utils.py): support logging additional usage params - e.g. prompt caching values for deepseek
* test: fix tests
* test: fix test
* test: fix test
* test: fix test
* test: fix test
* Minor IAM AWS OIDC Improvements (#5246)
* AWS IAM: Temporary tokens are valid across all regions after being issued, so it is wasteful to request one for each region.
* AWS IAM: Include an inline policy, to help reduce misuse of overly permissive IAM roles.
* (test_bedrock_completion.py): Ensure we are testing cross AWS region OIDC flow.
* fix(router.py): log rejected requests
Fixes https://github.com/BerriAI/litellm/issues/5498
* refactor: don't use verbose_logger.exception, if exception is raised
User might already have handling for this. But alerting systems in prod will raise this as an unhandled error.
* fix(datadog.py): support setting datadog source as an env var
Fixes https://github.com/BerriAI/litellm/issues/5508
* docs(logging.md): add dd_source to datadog docs
* fix(proxy_server.py): expose `/customer/list` endpoint for showing all customers
* (bedrock): Fix usage with Cloudflare AI Gateway, and proxies in general. (#5509)
* feat(anthropic.py): support 'cache_control' param for content when it is a string
* Revert "(bedrock): Fix usage with Cloudflare AI Gateway, and proxies in gener…" (#5519)
This reverts commit 3fac0349c2.
* refactor: ci/cd run again
---------
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>