* fix(main.py): support passing max retries to azure/openai embedding integrations
Fixes https://github.com/BerriAI/litellm/issues/7003
* feat(team_endpoints.py): allow updating team model aliases
Closes https://github.com/BerriAI/litellm/issues/6956
* feat(router.py): allow specifying model id as fallback - skips any cooldown check
Allows a default model to be checked if all models are in cooldown
s/o @micahjsmith
* docs(reliability.md): add fallback to specific model to docs
* fix(utils.py): new 'is_prompt_caching_valid_prompt' helper util
Allows user to identify if messages/tools have prompt caching
Related issue: https://github.com/BerriAI/litellm/issues/6784
* feat(router.py): store model id for prompt caching valid prompt
Allows routing to that model id on subsequent requests
* fix(router.py): only cache if prompt is valid prompt caching prompt
prevents storing unnecessary items in cache
* feat(router.py): support routing prompt caching enabled models to previous deployments
Closes https://github.com/BerriAI/litellm/issues/6784
* test: fix linting errors
* feat(databricks/): convert basemodel to dict and exclude none values
allow passing pydantic message to databricks
* fix(utils.py): ensure all chat completion messages are dict
* (feat) Track `custom_llm_provider` in LiteLLMSpendLogs (#7081)
* add custom_llm_provider to SpendLogsPayload
* add custom_llm_provider to SpendLogs
* add custom llm provider to SpendLogs payload
* test_spend_logs_payload
* Add MLflow to the side bar (#7031)
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
* (bug fix) SpendLogs update DB catch all possible DB errors for retrying (#7082)
* catch DB_CONNECTION_ERROR_TYPES
* fix DB retry mechanism for SpendLog updates
* use DB_CONNECTION_ERROR_TYPES in auth checks
* fix exp back off for writing SpendLogs
* use _raise_failed_update_spend_exception to ensure errors print as NON blocking
* test_update_spend_logs_multiple_batches_with_failure
* (Feat) Add StructuredOutputs support for Fireworks.AI (#7085)
* fix model cost map fireworks ai "supports_response_schema": true,
* fix supports_response_schema
* fix map openai params fireworks ai
* test_map_response_format
* test_map_response_format
* added deepinfra/Meta-Llama-3.1-405B-Instruct (#7084)
* bump: version 1.53.9 → 1.54.0
* fix deepinfra
* litellm db fixes LiteLLM_UserTable (#7089)
* ci/cd queue new release
* fix llama-3.3-70b-versatile
* refactor - use consistent file naming convention `AI21/` -> `ai21` (#7090)
* fix refactor - use consistent file naming convention
* ci/cd run again
* fix naming structure
* fix use consistent naming (#7092)
---------
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com>
Co-authored-by: ali sayyah <ali.sayyah2@gmail.com>
* feat(langfuse/): support langfuse prompt management
Initial working commit for langfuse prompt management support
Closes https://github.com/BerriAI/litellm/issues/6269
* test: update test
* fix(litellm_logging.py): suppress linting error
* fix get_standard_logging_object_payload
* fix async_post_call_failure_hook
* fix post_call_failure_hook
* fix change
* fix _is_proxy_only_error
* fix async_post_call_failure_hook
* fix getting request body
* remove redundant code
* use a well named original function name for auth errors
* fix logging auth fails on DD
* fix using request body
* use helper for _handle_logging_proxy_only_error
* feat(pass_through_endpoints/): support logging anthropic/gemini pass through calls to langfuse/s3/etc.
* fix(utils.py): allow disabling end user cost tracking with new param
Allows proxy admin to disable cost tracking for end user - keeps prometheus metrics small
* docs(configs.md): add disable_end_user_cost_tracking reference to docs
* feat(key_management_endpoints.py): add support for restricting access to `/key/generate` by team/proxy level role
Enables admin to restrict key creation, and assign team admins to handle distributing keys
* test(test_key_management.py): add unit testing for personal / team key restriction checks
* docs: add docs on restricting key creation
* docs(finetuned_models.md): add new guide on calling finetuned models
* docs(input.md): cleanup anthropic supported params
Closes https://github.com/BerriAI/litellm/issues/6856
* test(test_embedding.py): add test for passing extra headers via embedding
* feat(cohere/embed): pass client to async embedding
* feat(rerank.py): add `/v1/rerank` if missing for cohere base url
Closes https://github.com/BerriAI/litellm/issues/6844
* fix(main.py): pass extra_headers param to openai
Fixes https://github.com/BerriAI/litellm/issues/6836
* fix(litellm_logging.py): don't disable global callbacks when dynamic callbacks are set
Fixes issue where global callbacks (e.g. prometheus) were overridden when langfuse was set dynamically
* fix(handler.py): fix linting error
* fix: fix typing
* build: add conftest to proxy_admin_ui_tests/
* test: fix test
* fix: fix linting errors
* test: fix test
* fix: fix pass through testing
* feat(customer_endpoints.py): support passing budget duration via `/customer/new` endpoint
Closes https://github.com/BerriAI/litellm/issues/5651
* docs: add missing params to swagger + api documentation test
* docs: add documentation for all key endpoints
documents all params on swagger
* docs(internal_user_endpoints.py): document all /user/new params
Ensures all params are documented
* docs(team_endpoints.py): add missing documentation for team endpoints
Ensures 100% param documentation on swagger
* docs(organization_endpoints.py): document all org params
Adds documentation for all params in org endpoint
* docs(customer_endpoints.py): add coverage for all params on /customer endpoints
ensures all /customer/* params are documented
* ci(config.yml): add endpoint doc testing to ci/cd
* fix: fix internal_user_endpoints.py
* fix(internal_user_endpoints.py): support 'duration' param
* fix(partner_models/main.py): fix anthropic re-raise exception on vertex
* fix: fix pydantic obj
* fix(caching): convert arg to equivalent kwargs in llm caching handler
prevent unexpected errors
* fix(caching_handler.py): don't pass args to caching
* fix(caching): remove all *args from caching.py
* fix(caching): consistent function signatures + abc method
* test(caching_unit_tests.py): add unit tests for llm caching
ensures coverage for common caching scenarios across different implementations
* refactor(litellm_logging.py): move to using cache key from hidden params instead of regenerating one
* fix(router.py): drop redis password requirement
* fix(proxy_server.py): fix faulty slack alerting check
* fix(langfuse.py): avoid copying functions/thread lock objects in metadata
fixes metadata copy error when parent otel span in metadata
* test: update test
* fix raise correct error on /key/info
* add not_found_error error
* fix key not found in DB error
* use 1 helper for checking token hash
* fix error code on key info
* fix test key gen prisma
* test_generate_and_call_key_info
* test fix test_call_with_valid_model_using_all_models
* fix key info tests
* log error on prometheus service failure hook
* use a more accurate function name for wrapper that handles logging db metrics
* fix log_db_metrics
* test_log_db_metrics_failure_error_types
* fix linting
* fix auth checks
* fix debug statements
* fix assert prisma_client.health_check is called on _setup
* assert that _setup_prisma_client is called on startup proxy
* fix prisma client health_check
* add test_bad_database_url
* add strict checks on db startup
* temp remove fix to validate if check works as expected
* add health_check back
* test_proxy_server_prisma_setup_invalid_db
* perf: move writing key to cache, to background task
* perf(litellm_pre_call_utils.py): add otel tracing for pre-call utils
adds 200ms on calls with pgdb connected
* fix(litellm_pre_call_utils.py): rename call_type to actual call used
* perf(proxy_server.py): remove db logic from _get_config_from_file
was causing db calls to occur on every llm request, if team_id was set on key
* fix(auth_checks.py): add check for reducing db calls if user/team id does not exist in db
reduces latency/call by ~100ms
* fix(proxy_server.py): minor fix on existing_settings not incl alerting
* fix(exception_mapping_utils.py): map databricks exception string
* fix(auth_checks.py): fix auth check logic
* test: correctly mark flaky test
* fix(utils.py): handle auth token error for tokenizers.from_pretrained
* fix(dual_cache.py): update in-memory check for redis batch get cache
Fixes latency delay for async_batch_redis_cache
* fix(service_logger.py): fix race condition causing otel service logging to be overwritten if service_callbacks set
* feat(user_api_key_auth.py): add parent otel component for auth
allows us to isolate how much latency is added by auth checks
* perf(parallel_request_limiter.py): move async_set_cache_pipeline (from max parallel request limiter) out of execution path (background task)
reduces latency by 200ms
* feat(user_api_key_auth.py): have user api key auth object return user tpm/rpm limits - reduces redis calls in downstream task (parallel_request_limiter)
Reduces latency by 400-800ms
* fix(parallel_request_limiter.py): use batch get cache to reduce user/key/team usage object calls
reduces latency by 50-100ms
* fix: fix linting error
* fix(_service_logger.py): fix import
* fix(user_api_key_auth.py): fix service logging
* fix(dual_cache.py): don't pass 'self'
* fix: fix python3.8 error
* fix: fix init
* fix use failing_model as cache key for failed_tracking_alert
* fix use standard logging payload for getting response cost
* fix kwargs.get("response_cost")
* fix getting response cost
* fix logging DB fails on prometheus
* unit testing log to otel wrapper
* unit testing for service logger + prometheus
* use LATENCY buckets for service logging
* fix service logging
* fix _get_metric in prom services logger
* add clear doc string
* unit testing for prom service logger
* docs(exception_mapping.md): add missing exception types
Fixes https://github.com/Aider-AI/aider/issues/2120#issuecomment-2438971183
* fix(main.py): register custom model pricing with specific key
Ensure custom model pricing is registered to the specific model+provider key combination
* test: make testing more robust for custom pricing
* fix(redis_cache.py): instrument otel logging for sync redis calls
ensures complete coverage for all redis cache calls
* refactor: pass parent_otel_span for redis caching calls in router
allows for more observability into what calls are causing latency issues
* test: update tests with new params
* refactor: ensure e2e otel tracing for router
* refactor(router.py): add more otel tracing across router
catch all latency issues for router requests
* fix: fix linting error
* fix(router.py): fix linting error
* fix: fix test
* test: fix tests
* fix(dual_cache.py): pass ttl to redis cache
* fix: fix param
* feat(proxy_server.py): check if views exist on proxy server startup + refactor startup event logic to <50 LOC
* refactor(redis_cache.py): use a default cache value when writing to r… (#6358)
* refactor(redis_cache.py): use a default cache value when writing to redis
prevent redis from blowing up in high traffic
* refactor(redis_cache.py): refactor all cache writes to use self.get_ttl
ensures default ttl always used when writing to redis
Prevents redis db from blowing up in prod
* feat(proxy_cli.py): add new 'log_config' cli param (#6352)
* feat(proxy_cli.py): add new 'log_config' cli param
Allows passing logging.conf to uvicorn on startup
* docs(cli.md): add logging conf to uvicorn cli docs
* fix(get_llm_provider_logic.py): fix default api base for litellm_proxy
Fixes https://github.com/BerriAI/litellm/issues/6332
* feat(openai_like/embedding): Add support for jina ai embeddings
Closes https://github.com/BerriAI/litellm/issues/6337
* docs(deploy.md): update entrypoint.sh filepath post-refactor
Fixes outdated docs
* feat(prometheus.py): emit time_to_first_token metric on prometheus
Closes https://github.com/BerriAI/litellm/issues/6334
* fix(prometheus.py): only emit time to first token metric if stream is True
enables more accurate ttft usage
* test: handle vertex api instability
* fix(get_llm_provider_logic.py): fix import
* fix(openai.py): fix deepinfra default api base
* fix(anthropic/transformation.py): remove anthropic beta header (#6361)
* docs(sidebars.js): add jina ai embedding to docs
* docs(sidebars.js): add jina ai to left nav
* bump: version 1.50.1 → 1.50.2
* langfuse use helper for get_langfuse_logging_config
* Refactor: apply early return (#6369)
* (refactor) remove berrispendLogger - unused logging integration (#6363)
* fix remove berrispendLogger
* remove unused clickhouse logger
* fix docs configs.md
* (fix) standard logging metadata + add unit testing (#6366)
* fix setting StandardLoggingMetadata
* add unit testing for standard logging metadata
* fix otel logging test
* fix linting
* fix typing
* Revert "(fix) standard logging metadata + add unit testing (#6366)" (#6381)
This reverts commit 8359cb6fa9.
* add new 3.5 model card (#6378)
* Add claude 3 5 sonnet 20241022 models for all providers (#6380)
* Add Claude 3.5 v2 on Amazon Bedrock and Vertex AI.
* added anthropic/claude-3-5-sonnet-20241022
* add new 3.5 model card
---------
Co-authored-by: Paul Gauthier <paul@paulg.com>
Co-authored-by: lowjiansheng <15527690+lowjiansheng@users.noreply.github.com>
* test(skip-flaky-google-context-caching-test): google is not reliable. their sample code is also not working
* test(test_alangfuse.py): handle flaky langfuse test better
* (feat) Arize - Allow using Arize HTTP endpoint (#6364)
* arize use helper for get_arize_opentelemetry_config
* use helper to get Arize OTEL config
* arize add helpers for arize
* docs allow using arize http endpoint
* fix importing OTEL for Arize
* use static methods for ArizeLogger
* fix ArizeLogger tests
* Litellm dev 10 22 2024 (#6384)
* fix(utils.py): add 'disallowed_special' for token counting on .encode()
Fixes error when '<|endoftext|>' in string
* Fix metadata being overwritten in speech() (#6295)
* fix: adding missing redis cluster kwargs (#6318)
Co-authored-by: Ali Arian <ali.arian@breadfinancial.com>
* Add support for `max_completion_tokens` in Azure OpenAI (#6376)
Now that Azure supports `max_completion_tokens`, this param no longer needs special handling and is passed through as-is. More details: https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models?tabs=python-secure#api-support
* build(model_prices_and_context_window.json): add voyage-finance-2 pricing
Closes https://github.com/BerriAI/litellm/issues/6371
* build(model_prices_and_context_window.json): fix llama3.1 pricing model name on map
Closes https://github.com/BerriAI/litellm/issues/6310
* feat(realtime_streaming.py): just log specific events
Closes https://github.com/BerriAI/litellm/issues/6267
* fix(utils.py): more robust checking if unmapped vertex anthropic model belongs to that family of models
Fixes https://github.com/BerriAI/litellm/issues/6383
* Fix Ollama stream handling for tool calls with None content (#6155)
* test(test_max_completions): update test now that azure supports 'max_completion_tokens'
* fix(handler.py): fix linting error
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Low Jian Sheng <15527690+lowjiansheng@users.noreply.github.com>
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
Co-authored-by: Paul Gauthier <paul@paulg.com>
Co-authored-by: John HU <hszqqq12@gmail.com>
Co-authored-by: Ali Arian <113945203+ali-arian@users.noreply.github.com>
Co-authored-by: Ali Arian <ali.arian@breadfinancial.com>
Co-authored-by: Anand Taralika <46954145+taralika@users.noreply.github.com>
Co-authored-by: Nolan Tremelling <34580718+NolanTrem@users.noreply.github.com>
* bump: version 1.50.2 → 1.50.3
* build(deps): bump http-proxy-middleware in /docs/my-website (#6395)
Bumps [http-proxy-middleware](https://github.com/chimurai/http-proxy-middleware) from 2.0.6 to 2.0.7.
- [Release notes](https://github.com/chimurai/http-proxy-middleware/releases)
- [Changelog](https://github.com/chimurai/http-proxy-middleware/blob/v2.0.7/CHANGELOG.md)
- [Commits](https://github.com/chimurai/http-proxy-middleware/compare/v2.0.6...v2.0.7)
---
updated-dependencies:
- dependency-name: http-proxy-middleware
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* (docs + testing) Correctly document the timeout value used by litellm proxy is 6000 seconds + add to best practices for prod (#6339)
* fix docs use documented timeout
* document request timeout
* add test for litellm.request_timeout
* add test for checking value of timeout
* (refactor) move convert dict to model response to llm_response_utils/ (#6393)
* refactor move convert dict to model response
* fix imports
* fix import _handle_invalid_parallel_tool_calls
* (refactor) litellm.Router client initialization utils (#6394)
* refactor InitalizeOpenAISDKClient
* use helper func for _should_create_openai_sdk_client_for_model
* use static methods for set client on litellm router
* reduce LOC in _get_client_initialization_params
* fix _should_create_openai_sdk_client_for_model
* code quality fix
* test test_should_create_openai_sdk_client_for_model
* test test_get_client_initialization_params_openai
* fix mypy linting errors
* fix OpenAISDKClientInitializationParams
* test_get_client_initialization_params_all_env_vars
* test_get_client_initialization_params_azure_ai_studio_mistral
* test_get_client_initialization_params_default_values
* fix _get_client_initialization_params
* (fix) Langfuse key based logging (#6372)
* langfuse use helper for get_langfuse_logging_config
* fix get_langfuse_logger_for_request
* fix import
* fix get_langfuse_logger_for_request
* test_get_langfuse_logger_for_request_with_dynamic_params
* unit testing for test_get_langfuse_logger_for_request_with_no_dynamic_params
* parameterized langfuse testing
* fix langfuse test
* fix langfuse logging
* fix test_aaalangfuse_logging_metadata
* fix langfuse log metadata test
* fix langfuse logger
* use create_langfuse_logger_from_credentials
* fix test_get_langfuse_logger_for_request_with_no_dynamic_params
* fix correct langfuse/ folder structure
* use static methods for langfuse logger
* add comment on langfuse handler
* fix linting error
* add unit testing for langfuse logging
* fix linting
* fix failure handler langfuse
* Revert "(refactor) litellm.Router client initialization utils (#6394)" (#6403)
This reverts commit b70147f63b.
* def test_text_completion_with_echo(stream): (#6401)
test
* fix linting - remove # noqa PLR0915 from fixed function
* test: cleanup codestral tests - backend api unavailable
* (refactor) prometheus async_log_success_event to be under 100 LOC (#6416)
* unit testing for prometheus
* unit testing for success metrics
* use 1 helper for _increment_token_metrics
* use helper for _increment_remaining_budget_metrics
* use _increment_remaining_budget_metrics
* use _increment_top_level_request_and_spend_metrics
* use helper for _set_latency_metrics
* remove noqa violation
* fix test prometheus
* test prometheus
* unit testing for all prometheus helper functions
* fix prom unit tests
* fix unit tests prometheus
* fix unit test prom
* (refactor) router - use static methods for client init utils (#6420)
* use InitalizeOpenAISDKClient
* use InitalizeOpenAISDKClient static method
* fix # noqa: PLR0915
* (code cleanup) remove unused and undocumented logging integrations - litedebugger, berrispend (#6406)
* code cleanup remove unused and undocumented code files
* fix unused logging integrations cleanup
* bump: version 1.50.3 → 1.50.4
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Hakan Taşköprü <Haknt@users.noreply.github.com>
Co-authored-by: Low Jian Sheng <15527690+lowjiansheng@users.noreply.github.com>
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
Co-authored-by: Paul Gauthier <paul@paulg.com>
Co-authored-by: John HU <hszqqq12@gmail.com>
Co-authored-by: Ali Arian <113945203+ali-arian@users.noreply.github.com>
Co-authored-by: Ali Arian <ali.arian@breadfinancial.com>
Co-authored-by: Anand Taralika <46954145+taralika@users.noreply.github.com>
Co-authored-by: Nolan Tremelling <34580718+NolanTrem@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* fix: enable new 'disable_prisma_schema_update' flag
* build(config.yml): remove setup remote docker step
* ci(config.yml): give container time to start up
* ci(config.yml): update test
* build(config.yml): actually start docker
* build(config.yml): simplify grep check
* fix(prisma_client.py): support reading disable_schema_update via env vars
* ci(config.yml): add test to check if all general settings are documented
* build(test_General_settings.py): check available dir
* ci: check ../ repo path
* build: check ./
* build: fix test
* LiteLLM Minor Fixes & Improvements (09/26/2024) (#5925)
* fix(litellm_logging.py): don't initialize prometheus_logger if non premium user
Prevents bad error messages in logs
Fixes https://github.com/BerriAI/litellm/issues/5897
* Add Support for Custom Providers in Vision and Function Call Utils (#5688)
* Add Support for Custom Providers in Vision and Function Call Utils Lookup
* Remove parallel function call due to missing model info param
* Add Unit Tests for Vision and Function Call Changes
* fix-#5920: set header value to string to fix "'int' object has no att… (#5922)
* LiteLLM Minor Fixes & Improvements (09/24/2024) (#5880)
* LiteLLM Minor Fixes & Improvements (09/23/2024) (#5842)
* feat(auth_utils.py): enable admin to allow client-side credentials to be passed
Makes it easier for devs to experiment with finetuned fireworks ai models
* feat(router.py): allow setting configurable_clientside_auth_params for a model
Closes https://github.com/BerriAI/litellm/issues/5843
* build(model_prices_and_context_window.json): fix anthropic claude-3-5-sonnet max output token limit
Fixes https://github.com/BerriAI/litellm/issues/5850
* fix(azure_ai/): support content list for azure ai
Fixes https://github.com/BerriAI/litellm/issues/4237
* fix(litellm_logging.py): always set saved_cache_cost
Set to 0 by default
* fix(fireworks_ai/cost_calculator.py): add fireworks ai default pricing
handles calling 405b+ size models
* fix(slack_alerting.py): fix error alerting for failed spend tracking
Fixes regression with slack alerting error monitoring
* fix(vertex_and_google_ai_studio_gemini.py): handle gemini no candidates in streaming chunk error
* docs(bedrock.md): add llama3-1 models
* test: fix tests
* fix(azure_ai/chat): fix transformation for azure ai calls
* feat(azure_ai/embed): Add azure ai embeddings support
Closes https://github.com/BerriAI/litellm/issues/5861
* fix(azure_ai/embed): enable async embedding
* feat(azure_ai/embed): support azure ai multimodal embeddings
* fix(azure_ai/embed): support async multi modal embeddings
* feat(together_ai/embed): support together ai embedding calls
* feat(rerank/main.py): log source documents for rerank endpoints to langfuse
improves rerank endpoint logging
* fix(langfuse.py): support logging `/audio/speech` input to langfuse
* test(test_embedding.py): fix test
* test(test_completion_cost.py): fix helper util
* fix-#5920: set header value to string to fix "'int' object has no attribute 'encode'"
---------
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
* Revert "fix-#5920: set header value to string to fix "'int' object has no att…" (#5926)
This reverts commit a554ae2695.
* build(model_prices_and_context_window.json): add azure ai cohere rerank model pricing
Enables cost tracking for azure ai cohere rerank models
* fix(litellm_logging.py): fix debug log to be clearer
Closes https://github.com/BerriAI/litellm/issues/5909
* test(test_utils.py): fix test name
* fix(azure_ai/cost_calculator.py): support cost tracking for azure ai rerank models
* fix(azure_ai): fix azure ai base model cost tracking for rerank endpoints
* fix(converse_handler.py): support new llama 3-2 models
Fixes https://github.com/BerriAI/litellm/issues/5901
* fix(litellm_logging.py): ensure response is redacted for standard message logging
Fixes https://github.com/BerriAI/litellm/issues/5890#issuecomment-2378242360
* fix(cost_calculator.py): use 'get_model_info' for cohere rerank cost calculation
allows user to set custom cost for model
* fix(config.yml): fix docker hub auth
* build(config.yml): add docker auth to all tests
* fix(db/create_views.py): fix linting error
* fix(main.py): fix circular import
* fix(azure_ai/__init__.py): fix circular import
* fix(main.py): fix import
* fix: fix linting errors
* test: fix test
* fix(proxy_server.py): pass premium user value on startup
used for prometheus init
---------
Co-authored-by: Cole Murray <colemurray.cs@gmail.com>
Co-authored-by: bravomark <62681807+bravomark@users.noreply.github.com>
* handle streaming for azure ai studio error
* [Perf Proxy] parallel request limiter - use one cache update call (#5932)
* fix parallel request limiter - use one cache update call
* ci/cd run again
* run ci/cd again
* use docker username password
* fix config.yml
* fix config
* fix config
* fix config.yml
* ci/cd run again
* use correct typing for batch set cache
* fix async_set_cache_pipeline
* fix only check user id tpm / rpm limits when limits set
* fix test_openai_azure_embedding_with_oidc_and_cf
* test: fix test
* test(test_rerank.py): fix test
---------
Co-authored-by: Cole Murray <colemurray.cs@gmail.com>
Co-authored-by: bravomark <62681807+bravomark@users.noreply.github.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
* LiteLLM Minor Fixes & Improvements (09/23/2024) (#5842)
* fix(model_prices_and_context_window.json): add cost tracking for more vertex llama3.1 model
8b and 70b models
* fix(proxy/utils.py): handle data being none on pre-call hooks
* fix(proxy/): create views on initial proxy startup
fixes base case, where user starts proxy for first time
Fixes https://github.com/BerriAI/litellm/issues/5756
* build(config.yml): fix vertex version for test
* feat(ui/): support enabling/disabling slack alerting
Allows admin to turn on/off slack alerting through ui
* feat(rerank/main.py): support langfuse logging
* fix(proxy/utils.py): fix linting errors
* fix(langfuse.py): log clean metadata
* test(tests): replace deprecated openai model
* fix(proxy_server.py): use default azure credentials to support azure non-client secret kms
* fix(langsmith.py): raise error if credentials missing
* feat(langsmith.py): support error logging for langsmith + standard logging payload
Fixes https://github.com/BerriAI/litellm/issues/5738
* Fix hardcoding of schema in view check (#5749)
* fix - deal with case when check view exists returns None (#5740)
* Revert "fix - deal with case when check view exists returns None (#5740)" (#5741)
This reverts commit 535228159b.
* test(test_router_debug_logs.py): move to mock response
* Fix hardcoding of schema
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
* fix(proxy_server.py): allow admin to disable ui via `DISABLE_ADMIN_UI` flag
* fix(router.py): fix default model name value
Fixes 55db19a1e4 (r1763712148)
* fix(utils.py): fix unbound variable error
* feat(rerank/main.py): add azure ai rerank endpoints
Closes https://github.com/BerriAI/litellm/issues/5667
* feat(secret_detection.py): Allow configuring secret detection params
Allows admin to control what plugins to run for secret detection. Prevents overzealous secret detection.
* docs(secret_detection.md): add secret detection guardrail docs
* fix: fix linting errors
* fix - deal with case when check view exists returns None (#5740)
* Revert "fix - deal with case when check view exists returns None (#5740)" (#5741)
This reverts commit 535228159b.
* Litellm fix router testing (#5748)
* test: fix testing - azure changed content policy error logic
* test: fix tests to use mock responses
* test(test_image_generation.py): handle api instability
* test(test_image_generation.py): handle azure api instability
* fix(utils.py): fix unbounded variable error
* test: refactor test to use mock response
* test: mark flaky azure tests
* Bump next from 14.1.1 to 14.2.10 in /ui/litellm-dashboard (#5753)
Bumps [next](https://github.com/vercel/next.js) from 14.1.1 to 14.2.10.
- [Release notes](https://github.com/vercel/next.js/releases)
- [Changelog](https://github.com/vercel/next.js/blob/canary/release.js)
- [Commits](https://github.com/vercel/next.js/compare/v14.1.1...v14.2.10)
---
updated-dependencies:
- dependency-name: next
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* [Fix] o1-mini causes pydantic warnings on `reasoning_tokens` (#5754)
* add requester_metadata in standard logging payload
* log requester_metadata in metadata
* use StandardLoggingPayload for logging
* docs StandardLoggingPayload
* fix import
* include standard logging object in failure
* add test for requester metadata
* handle completion_tokens_details
* add test for completion_tokens_details
* [Feat-Proxy-DataDog] Log Redis, Postgres Failure events on DataDog (#5750)
* dd - start tracking redis status on dd
* add async_service_success_hook / failure hook in custom logger
* add async_service_failure_hook
* log service failures on dd
* fix import error
* add test for redis errors / warning
* [Fix] Router/ Proxy - Tag Based routing, raise correct error when no deployments found and tag filtering is on (#5745)
* fix tag routing - raise correct error when no model with tag based routing
* fix error string from tag based routing
* test router tag based routing
* raise 401 error when no tags available for deployment
* linting fix
* [Feat] Log Request metadata on gcs bucket logging (#5743)
* add requester_metadata in standard logging payload
* log requester_metadata in metadata
* use StandardLoggingPayload for logging
* docs StandardLoggingPayload
* fix import
* include standard logging object in failure
* add test for requester metadata
* fix(litellm_logging.py): fix logging message
* fix(rerank_api/main.py): fix linting errors
* fix(custom_guardrails.py): maintain backwards compatibility for older guardrails
* fix(rerank_api/main.py): fix cost tracking for rerank endpoints
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: steffen-sbt <148480574+steffen-sbt@users.noreply.github.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>