litellm-mirror

mirror of https://github.com/BerriAI/litellm.git synced 2025-04-25 10:44:24 +00:00

Author	SHA1	Message	Date
Sunny Wan	f9a5109203	Merge branch 'BerriAI:main' into main	2025-03-13 19:37:22 -04:00
Krish Dholakia	2d957a0ed9	Merge branch 'main' into litellm_dev_03_10_2025_p3	2025-03-12 14:56:01 -07:00
Ishaan Jaff	342741ede1	Merge branch 'main' into litellm_responses_api_support	2025-03-12 12:04:12 -07:00
Ishaan Jaff	b790f0a5c6	log input of response API	2025-03-11 22:34:18 -07:00
Ishaan Jaff	4d55212c62	add BaseResponsesAPIConfig	2025-03-11 15:57:53 -07:00
Krrish Dholakia	f56c5ca380	feat: working e2e credential management - support reusing existing credentials	2025-03-10 19:29:24 -07:00
Krrish Dholakia	fdd5ba3084	feat(credential_accessor.py): support loading in credentials from credential_list Resolves https://github.com/BerriAI/litellm/issues/9114	2025-03-10 17:15:58 -07:00
omrishiv	0674491386	add support for Amazon Nova Canvas model (#7838 ) * add initial support for Amazon Nova Canvas model Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> * adjust name to AmazonNovaCanvas and map function variables to config Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> * tighten model name check Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> * fix quality mapping Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> * add premium quality in config Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> * support all Amazon Nova Canvas tasks * remove unused import Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> * add tests for image generation tasks and fix payload Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> * add missing util file Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> * update model prices backup file Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> * remove image tasks other than text->image Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> --------- Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>	2025-03-10 08:02:00 -07:00
Krish Dholakia	4330ef8e81	Fix batches api cost tracking + Log batch models in spend logs / standard logging payload (#9077 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 42s Details * feat(batches/): fix batch cost calculation - ensure it's accurate use the correct cost value - prev. defaulting to non-batch cost * feat(batch_utils.py): log batch models to spend logs + standard logging payload makes it easy to understand how cost was calculated * fix: fix stored payload for test * test: fix test	2025-03-08 11:47:25 -08:00
Krish Dholakia	0e3caf92b9	UI - new API Playground for testing LiteLLM translation (#9073 ) * feat: initial commit - enable dev to see translated request * feat(utils.py): expose new endpoint - `/utils/transform_request` to see the raw request sent by litellm * feat(transform_request.tsx): allow user to see their transformed request * refactor(litellm_logging.py): return raw request in 3 parts - api_base, headers, request body easier to render each individually on UI vs. extracting from combined string * feat: transform_request.tsx working e2e raw request viewing * fix(litellm_logging.py): fix transform viewing for bedrock models * fix(litellm_logging.py): don't return sensitive headers in raw request headers prevent accidental leak * feat(transform_request.tsx): style improvements	2025-03-07 19:39:31 -08:00
Ishaan Jaff	f47987e673	(Refactor) `/v1/messages` to follow simpler logic for Anthropic API spec (#9013 ) * anthropic_messages_handler v0 * fix /messages * working messages with router methods * test_anthropic_messages_handler_litellm_router_non_streaming * test_anthropic_messages_litellm_router_non_streaming_with_logging * AnthropicMessagesConfig * _handle_anthropic_messages_response_logging * working with /v1/messages endpoint * working /v1/messages endpoint * refactor to use router factory function * use aanthropic_messages * use BaseConfig for Anthropic /v1/messages * track api key, team on /v1/messages endpoint * fix get_logging_payload * BaseAnthropicMessagesTest * align test config * test_anthropic_messages_with_thinking * test_anthropic_streaming_with_thinking * fix - display anthropic url for debugging * test_bad_request_error_handling * test_anthropic_messages_router_streaming_with_bad_request * fix ProxyException * test_bad_request_error_handling_streaming * use provider_specific_header * test_anthropic_messages_with_extra_headers * test_anthropic_messages_to_wildcard_model * fix gcs pub sub test * standard_logging_payload * fix unit testing for anthopic /v1/messages support * fix pass through anthropic messages api * delete dead code * fix anthropic pass through response * revert change to spend tracking utils * fix get_litellm_metadata_from_kwargs * fix spend logs payload json * proxy_pass_through_endpoint_tests * TestAnthropicPassthroughBasic * fix pass through tests * test_async_vertex_proxy_route_api_key_auth * _handle_anthropic_messages_response_logging * vertex_credentials * test_set_default_vertex_config * test_anthropic_messages_litellm_router_non_streaming_with_logging * test_ageneric_api_call_with_fallbacks_basic * test__aadapter_completion	2025-03-06 00:43:08 -08:00
Sunny Wan	a2fed4059e	added Snowflake config to ProviderConfigManager	2025-03-05 20:32:18 -05:00
Krish Dholakia	662c59adcf	Support caching on reasoning content + other fixes (#8973 ) * fix(factory.py): pass on anthropic thinking content from assistant call * fix(factory.py): fix anthropic messages to handle thinking blocks Fixes https://github.com/BerriAI/litellm/issues/8961 * fix(factory.py): fix bedrock handling for assistant content in messages Fixes https://github.com/BerriAI/litellm/issues/8961 * feat(convert_dict_to_response.py): handle reasoning content + thinking blocks in chat completion block ensures caching works for anthropic thinking block * fix(convert_dict_to_response.py): pass all message params to delta block ensures streaming delta also contains the reasoning content / thinking block * test(test_prompt_factory.py): remove redundant test anthropic now supports assistant as the first message * fix(factory.py): fix linting errors * fix: fix code qa * test: remove falsy test * fix(litellm_logging.py): fix str conversion	2025-03-04 21:12:16 -08:00
Krrish Dholakia	8ea3d4c046	build: merge litellm_dev_03_01_2025_p2	2025-03-03 23:05:41 -08:00
Krish Dholakia	a65bfab697	Fix calling claude via invoke route + response_format support for claude on invoke route (#8908 ) * fix(anthropic_claude3_transformation.py): fix amazon anthropic claude 3 tool calling transformation on invoke route move to using anthropic config as base * fix(utils.py): expose anthropic config via providerconfigmanager * fix(llm_http_handler.py): support json mode on async completion calls * fix(invoke_handler/make_call): support json mode for anthropic called via bedrock invoke * fix(anthropic/): handle 'response_format: {"type": "text"}` + migrate amazon claude 3 invoke config to inherit from anthropic config Prevents error when passing in 'response_format: {"type": "text"} * test: fix test * fix(utils.py): fix base invoke provider check * fix(anthropic_claude3_transformation.py): don't pass 'stream' param * fix: fix linting errors * fix(converse_transformation.py): handle response_format type=text for converse	2025-02-28 17:56:26 -08:00
Krish Dholakia	017c482d7b	fix(o_series_transformation.py): fix optional param check for o-serie… (#8787 ) * fix(o_series_transformation.py): fix optional param check for o-series models o3-mini and o-1 do not support parallel tool calling * fix(utils.py): support 'drop_params' for 'thinking' param across models allows switching to older claude versions (or non-anthropic models) and param to be safely dropped * fix: fix passing thinking param in optional params allows dropping thinking_param where not applicable * test: update old model * fix(utils.py): fix linting errors * fix(main.py): add param to acompletion	2025-02-26 12:26:55 -08:00
Krrish Dholakia	fcf4ea3608	build: merge squashed commit Squashed commit of the following: commit `6678e15381` Author: Ishaan Jaff <ishaanjaffer0324@gmail.com> Date: Wed Feb 26 09:29:15 2025 -0800 test_prompt_caching commit `bd86e0ac47` Author: Ishaan Jaff <ishaanjaffer0324@gmail.com> Date: Wed Feb 26 08:57:16 2025 -0800 test_prompt_caching commit `2fc21ad51e` Author: Ishaan Jaff <ishaanjaffer0324@gmail.com> Date: Wed Feb 26 08:13:45 2025 -0800 test_aprompt_caching commit `d94cff55ff` Author: Ishaan Jaff <ishaanjaffer0324@gmail.com> Date: Wed Feb 26 08:13:12 2025 -0800 test_prompt_caching commit `49c5e7811e` Author: Ishaan Jaff <ishaanjaffer0324@gmail.com> Date: Wed Feb 26 07:43:53 2025 -0800 ui new build commit `cb8d5e5917` Author: Ishaan Jaff <ishaanjaffer0324@gmail.com> Date: Wed Feb 26 07:38:56 2025 -0800 (UI) - Create Key flow for existing users (#8844) * working create user button * working create user for a key flow * allow searching users * working create user + key * use clear sections on create key * better search for users * fix create key * ui fix create key button - make it neater / cleaner * ui fix all keys table commit `335ba30467` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Wed Feb 26 08:53:17 2025 -0800 fix: fix file name commit `b8c5b31a4e` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Tue Feb 25 22:54:46 2025 -0800 fix: fix utils commit `ac6e503461` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Feb 24 10:43:31 2025 -0800 fix(main.py): fix openai message for assistant msg if role is missing - openai allows this Fixes https://github.com/BerriAI/litellm/issues/8661 commit `de3989dbc5` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Feb 24 21:19:25 2025 -0800 fix(get_litellm_params.py): handle no-log being passed in via kwargs Fixes https://github.com/BerriAI/litellm/issues/8380	2025-02-26 09:39:27 -08:00
Krish Dholakia	09462ba80c	Add cohere v2/rerank support (#8421 ) (#8605 ) * Add cohere v2/rerank support (#8421) * Support v2 endpoint cohere rerank * Add tests and docs * Make v1 default if old params used * Update docs * Update docs pt 2 * Update tests * Add e2e test * Clean up code * Use inheritence for new config * Fix linting issues (#8608) * Fix cohere v2 failing test + linting (#8672) * Fix test and unused imports * Fix tests * fix: fix linting errors * test: handle tgai instability * fix: skip service unavailable err * test: print logs for unstable test * test: skip unreliable tests --------- Co-authored-by: vibhavbhat <vibhavb00@gmail.com>	2025-02-22 22:25:29 -08:00
Krish Dholakia	b682dc4ec8	Add cost tracking for rerank via bedrock (#8691 ) * feat(bedrock/rerank): infer model region if model given as arn * test: add unit testing to ensure bedrock region name inferred from arn on rerank * feat(bedrock/rerank/transformation.py): include search units for bedrock rerank result Resolves https://github.com/BerriAI/litellm/issues/7258#issuecomment-2671557137 * test(test_bedrock_completion.py): add testing for bedrock cohere rerank * feat(cost_calculator.py): refactor rerank cost tracking to support bedrock cost tracking * build(model_prices_and_context_window.json): add amazon.rerank model to model cost map * fix(cost_calculator.py): bedrock/common_utils.py get base model from model w/ arn -> handles rerank model * build(model_prices_and_context_window.json): add bedrock cohere rerank pricing * feat(bedrock/rerank): migrate bedrock config to basererank config * Revert "feat(bedrock/rerank): migrate bedrock config to basererank config" This reverts commit `84fae1f167`. * test: add testing to ensure large doc / queries are correctly counted * Revert "test: add testing to ensure large doc / queries are correctly counted" This reverts commit `4337f1657e`. * fix(migrate-jina-ai-to-rerank-config): enables cost tracking * refactor(jina_ai/): finish migrating jina ai to base rerank config enables cost tracking * fix(jina_ai/rerank): e2e jina ai rerank cost tracking * fix: cleanup dead code * fix: fix python3.8 compatibility error * test: fix test * test: add e2e testing for azure ai rerank * fix: fix linting error * test: mark cohere as flaky	2025-02-20 21:00:18 -08:00
Krish Dholakia	f9df01fbc6	fix(utils.py): handle token counter error when invalid message passed in (#8670 ) * fix(utils.py): handle token counter error * fix(utils.py): testing fixes * fix(utils.py): fix incr for num tokens from list * fix(utils.py): fix text str token counting	2025-02-19 22:21:34 -08:00
Krrish Dholakia	9470f57e86	build: extract <think>..</think> block for amazon deepseek r1 and put in reasoning_content	2025-02-19 21:10:38 -08:00
Krish Dholakia	2340f1b31f	Pass router tags in request headers - `x-litellm-tags` (#8609 ) * feat(litellm_pre_call_utils.py): support `x-litellm-tags` request header allow tag based routing + spend tracking via request headers * docs(request_headers.md): document new `x-litellm-tags` for tag based routing and spend tracking * docs(tag_routing.md): add to docs * fix(utils.py): only pass str values for openai metadata param * fix(utils.py): drop non-str values for metadata param to openai preview-feature, otel span was being sent in	2025-02-18 08:26:22 -08:00
Krish Dholakia	a9276f27f9	fix(main.py): fix key leak error when unknown provider given (#8556 ) * fix(main.py): fix key leak error when unknown provider given don't return passed in args if unknown route on embedding * fix(main.py): remove instances of {args} being passed in exception prevent potential key leaks * test(code_coverage/prevent_key_leaks_in_codebase.py): ban usage of {args} in codebase * fix: fix linting errors * fix: remove unused variable	2025-02-15 14:02:55 -08:00
Ishaan Jaff	753290e5b8	(perf) Fix memory leak on `/completions` route (#8551 ) * initial mem util test * fix _cached_get_model_info_helper * test memory usage * fix tests * fix mem usage	2025-02-14 18:58:16 -08:00
Krish Dholakia	58141df65d	Litellm dev 02 13 2025 p2 (#8525 ) * fix(azure/chat/gpt_transformation.py): add 'prediction' as a support azure param Closes https://github.com/BerriAI/litellm/issues/8500 * build(model_prices_and_context_window.json): add new 'gemini-2.0-pro-exp-02-05' model * style: cleanup invalid json trailing commma * feat(utils.py): support passing 'tokenizer_config' to register_prompt_template enables passing complete tokenizer config of model to litellm Allows calling deepseek on bedrock with the correct prompt template * fix(utils.py): fix register_prompt_template for custom model names * test(test_prompt_factory.py): fix test * test(test_completion.py): add e2e test for bedrock invoke deepseek ft model * feat(base_invoke_transformation.py): support hf_model_name param for bedrock invoke calls enables proxy admin to set base model for ft bedrock deepseek model * feat(bedrock/invoke): support deepseek_r1 route for bedrock makes it easy to apply the right chat template to that call * feat(constants.py): store deepseek r1 chat template - allow user to get correct response from deepseek r1 without extra work * test(test_completion.py): add e2e mock test for bedrock deepseek * docs(bedrock.md): document new deepseek_r1 route for bedrock allows us to use the right config * fix(exception_mapping_utils.py): catch read operation timeout	2025-02-13 20:28:42 -08:00
Krish Dholakia	8903bd1c7f	fix(utils.py): fix vertex ai optional param handling (#8477 ) * fix(utils.py): fix vertex ai optional param handling don't pass max retries to unsupported route Fixes https://github.com/BerriAI/litellm/issues/8254 * fix(get_supported_openai_params.py): fix linting error * fix(get_supported_openai_params.py): default to openai-like spec * test: fix test * fix: fix linting error * Improved wildcard route handling on `/models` and `/model_group/info` (#8473) * fix(model_checks.py): update returning known model from wildcard to filter based on given model prefix ensures wildcard route - `vertex_ai/gemini-` just returns known vertex_ai/gemini- models test(test_proxy_utils.py): add unit testing for new 'get_known_models_from_wildcard' helper * test(test_models.py): add e2e testing for `/model_group/info` endpoint * feat(prometheus.py): support tracking total requests by user_email on prometheus adds initial support for tracking total requests by user_email * test(test_prometheus.py): add testing to ensure user email is always tracked * test: update testing for new prometheus metric * test(test_prometheus_unit_tests.py): add user email to total proxy metric * test: update tests * test: fix spend tests * test: fix test * fix(pagerduty.py): fix linting error * (Bug fix) - Using `include_usage` for /completions requests + unit testing (#8484) * pass stream options (#8419) * test_completion_streaming_usage_metrics * test_text_completion_include_usage --------- Co-authored-by: Kaushik Deka <55996465+Kaushikdkrikhanu@users.noreply.github.com> * fix naming docker stable release * build(model_prices_and_context_window.json): handle azure model update * docs(token_auth.md): clarify scopes can be a list or comma separated string * docs: fix docs * add sonar pricings (#8476) * add sonar pricings * Update model_prices_and_context_window.json * Update model_prices_and_context_window.json * Update model_prices_and_context_window_backup.json * update load testing script * fix test_async_router_context_window_fallback * pplx - fix supports tool choice openai param (#8496) * fix prom check startup (#8492) * test_async_router_context_window_fallback * ci(config.yml): mark daily docker builds with `-nightly` (#8499) Resolves https://github.com/BerriAI/litellm/discussions/8495 * (Redis Cluster) - Fixes for using redis cluster + pipeline (#8442) * update RedisCluster creation * update RedisClusterCache * add redis ClusterCache * update async_set_cache_pipeline * cleanup redis cluster usage * fix redis pipeline * test_init_async_client_returns_same_instance * fix redis cluster * update mypy_path * fix init_redis_cluster * remove stub * test redis commit * ClusterPipeline * fix import * RedisCluster import * fix redis cluster * Potential fix for code scanning alert no. 2129: Clear-text logging of sensitive information Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * fix naming of redis cluster integration * test_redis_caching_ttl_pipeline * fix async_set_cache_pipeline --------- Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * Litellm UI stable version 02 12 2025 (#8497) * fix(key_management_endpoints.py): fix `/key/list` to include `return_full_object` as a top-level query param Allows user to specify they want the keys as a list of objects * refactor(key_list.tsx): initial refactor of key table in user dashboard offloads key filtering logic to backend api prevents common error of user not being able to see their keys * fix(key_management_endpoints.py): allow internal user to query `/key/list` to see their keys * fix(key_management_endpoints.py): add validation checks and filtering to `/key/list` endpoint allow internal user to see their keys. not anybody else's * fix(view_key_table.tsx): fix issue where internal user could not see default team keys * fix: fix linting error * fix: fix linting error * fix: fix linting error * fix: fix linting error * fix: fix linting error * fix: fix linting error * fix: fix linting error * test_supports_tool_choice * test_async_router_context_window_fallback * fix: fix test (#8501) * Litellm dev 02 12 2025 p1 (#8494) * Resolves https://github.com/BerriAI/litellm/issues/6625 (#8459) - enables no auth for SMTP Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com> * add sonar pricings (#8476) * add sonar pricings * Update model_prices_and_context_window.json * Update model_prices_and_context_window.json * Update model_prices_and_context_window_backup.json * test: fix test --------- Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com> Co-authored-by: Dani Regli <1daniregli@gmail.com> Co-authored-by: Lucca Zenóbio <luccazen@gmail.com> * test: fix test * UI Fixes p2 (#8502) * refactor(admin.tsx): cleanup add new admin flow removes buggy flow. Ensures just 1 simple way to add users / update roles. * fix(user_search_modal.tsx): ensure 'add member' button is always visible * fix(edit_membership.tsx): ensure 'save changes' button always visible * fix(internal_user_endpoints.py): ensure user in org can be deleted Fixes issue where user couldn't be deleted if they were a member of an org * fix: fix linting error * add phoenix docs for observability integration (#8522) * Add files via upload * Update arize_integration.md * Update arize_integration.md * add Phoenix docs * Added custom_attributes to additional_keys which can be sent to athina (#8518) * (UI) fix log details page (#8524) * rollback changes to view logs page * ui new build * add interface for prefetch * fix spread operation * fix max size for request view page * clean up table * ui fix column on request logs page * ui new build * Add UI Support for Admins to Call /cache/ping and View Cache Analytics (#8475) (#8519) * [Bug] UI: Newly created key does not display on the View Key Page (#8039) - Fixed issue where all keys appeared blank for admin users. - Implemented filtering of data via team settings to ensure all keys are displayed correctly. * Fix: - Updated the validator to allow model editing when `keyTeam.team_alias === "Default Team"`. - Ensured other teams still follow the original validation rules. * - added some classes in global.css - added text wrap in output of request,response and metadata in index.tsx - fixed styles of table in table.tsx * - added full payload when we open single log entry - added Combined Info Card in index.tsx * fix: keys not showing on refresh for internal user * merge * main merge * cache page * ca remove * terms change * fix:places caching inside exp --------- Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Kaushik Deka <55996465+Kaushikdkrikhanu@users.noreply.github.com> Co-authored-by: Lucca Zenóbio <luccazen@gmail.com> Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> Co-authored-by: Dani Regli <1daniregli@gmail.com> Co-authored-by: exiao <exiao@users.noreply.github.com> Co-authored-by: vivek-athina <153479827+vivek-athina@users.noreply.github.com> Co-authored-by: Taha Ali <123803932+tahaali-dev@users.noreply.github.com>	2025-02-13 19:58:50 -08:00
Krish Dholakia	ce3ead6f91	Log applied guardrails on LLM API call (#8452 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 40s Details * fix(litellm_logging.py): support saving applied guardrails in logging object allows list of applied guardrails to be logged for proxy admin's knowledge * feat(spend_tracking_utils.py): log applied guardrails to spend logs makes it easy for admin to know what guardrails were applied on a request * ci(config.yml): uninstall posthog from ci/cd * test: fix tests * test: update test	2025-02-10 22:57:30 -08:00
Ishaan Jaff	b242c66a3b	(Feat) - Add `/bedrock/invoke` support for all Anthropic models (#8383 ) * use anthropic transformation for bedrock/invoke * use anthropic transforms for bedrock invoke claude * TestBedrockInvokeClaudeJson * add AmazonAnthropicClaudeStreamDecoder * pass bedrock_invoke_provider to make_call * fix _get_base_bedrock_model * fix get_bedrock_route * fix bedrock routing * fixes for bedrock invoke * test_all_model_configs * fix AWSEventStreamDecoder linting * fix code qa * test_bedrock_get_base_model * test_get_model_info_bedrock_models * test_bedrock_base_model_helper * test_bedrock_route_detection	2025-02-07 22:41:11 -08:00
Krish Dholakia	b5850b6b65	Handle azure deepseek reasoning response (#8288 ) (#8366 ) * Handle azure deepseek reasoning response (#8288) * Handle deepseek reasoning response * Add helper method + unit test * Fix: Follow infinity api url format (#8346) * Follow infinity api url format * Update test_infinity.py * fix(infinity/transformation.py): fix linting error --------- Co-authored-by: vibhavbhat <vibhavb00@gmail.com> Co-authored-by: Hao Shan <53949959+haoshan98@users.noreply.github.com>	2025-02-07 17:45:51 -08:00
Krish Dholakia	6b8b49451f	Fix azure max retries error (#8340 ) * fix(azure.py): ensure max_retries=0 is respected Fixes https://github.com/BerriAI/litellm/issues/6129 * fix(test_openai.py): add unit test to ensure openai sdk calls always respect max_retries = 0 * test(test_azure_openai.py): add unit testing for azure_text/ route * fix(azure.py): fix passing max retries on streaming * fix(azure.py): fix azure max retries on async completion + streaming * fix(completion/handler.py): fix azure text async completion + streaming * test(test_azure_openai.py): ensure azure openai max retries always respected * test(test_azure_o_series.py): add testing to ensure max retries always respected * Added gemini providers for 2.0-flash and 2.0-flash lite (#8321) * Update model_prices_and_context_window.json added gemini providers for 2.0-flash and 2.0-flash light * Update model_prices_and_context_window.json fixed URL --------- Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com> * Convert tool use arguments to string before counting tokens (#6989) In at least some cases the `messages["tool_calls"]["function"]["arguments"]` is a dict, not a string. In order to tokenize it properly it needs to be a string. In the case that it is already a string this is a noop, which is also fine. * build(model_prices_and_context_window.json): add gemini 2.0 flash lite pricing * build(model_prices_and_context_window.json): add gemini commercial rate limits * fix(utils.py): fix linting error * refactor(utils.py): refactor to maintain function size --------- Co-authored-by: Bardia Khosravi <bardiakhosravi95@gmail.com> Co-authored-by: Josh Morrow <josh@jcmorrow.com>	2025-02-06 23:20:48 -08:00
Krish Dholakia	f031926b82	fix(utils.py): handle key error in msg validation (#8325 ) * fix(utils.py): handle key error in msg validation * Support running Aim Guard during LLM call (#7918) * support running Aim Guard during LLM call * Rename header * adjust docs and fix type annotations * fix(timeout.md): doc fix for openai example on dynamic timeouts --------- Co-authored-by: Tomer Bin <117278227+hxtomer@users.noreply.github.com>	2025-02-06 18:13:46 -08:00
Ishaan Jaff	818792228c	(Refactor) - migrate bedrock invoke to `BaseLLMHTTPHandler` class (#8290 ) * initial transform for invoke * invoke transform_response * working - able to make request * working get_complete_url * working - invoke now runs on llm_http_handler * fix unused imports * track litellm overhead ms * working stream request * sign_request transform * sign_request update * use has_async_custom_stream_wrapper property * use get_async_custom_stream_wrapper in base llm http handler * fix make_call in invoke handler * fix invoke with streaming get_async_custom_stream_wrapper * working bedrock async streaming with invoke * fix make call handler for bedrock * test_all_model_configs * fix test_bedrock_custom_prompt_template * sync streaming for bedrock invoke * fix _add_stream_param_to_request_body * test_async_text_completion_bedrock * fix transform_request * fix get_supported_openai_params * fix test supports tool choice * fix test_supports_tool_choice * add unit test coverage for bedrock invoke transform * fix location of transformation files * update import loc * fix bedrock invoke unit tests * fix import for max completion tokens	2025-02-05 18:58:55 -08:00
Ishaan Jaff	3a6349d871	(Feat) - Add support for structured output on `bedrock/nova` models + add util `litellm.supports_tool_choice` (#8264 ) * fix supports_tool_choice * TestBedrockNovaJson * use supports_tool_choice * fix supports_tool_choice * add supports_tool_choice param * script to add fields to model cost map * test_supports_tool_choice * test_supports_tool_choice * fix supports tool choice check * test_supports_tool_choice_simple_tests * fix supports_tool_choice check * fix supports_tool_choice bedrock * test_supports_tool_choice * test_supports_tool_choice * fix bedrock/eu-west-3/mistral.mistral-large-2402-v1:0 * ci/cd run again * test_supports_tool_choice_simple_tests * TestGoogleAIStudioGemini temp - remove to run ci/cd * test_aaalangfuse_logging_metadata * TestGoogleAIStudioGemini * test_check_provider_match * remove add param to map	2025-02-04 21:47:16 -08:00
Krrish Dholakia	c743475aba	build: Squashed commit of the following: commit `3e4e2cb20a` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Tue Feb 4 15:10:34 2025 -0800 fix(proxy_server.py): fix redirect from `/sso/key/callback` to redirect on custom server path Fixes https://github.com/BerriAI/litellm/issues/5997	2025-02-04 21:45:33 -08:00
Krish Dholakia	1105e35538	Complete o3 model support (#8183 ) * fix(o_series_transformation.py): add 'reasoning_effort' as o series model param Closes https://github.com/BerriAI/litellm/issues/8182 * fix(main.py): ensure `reasoning_effort` is a mapped openai param * refactor(azure/): rename o1_[x] files to o_series_[x] * refactor(base_llm_unit_tests.py): refactor testing for o series reasoning effort * test(test_azure_o_series.py): have azure o series tests correctly inherit from base o series model tests * feat(base_utils.py): support translating 'developer' role to 'system' role for non-openai providers Makes it easy to switch from openai to anthropic * fix: fix linting errors * fix(base_llm_unit_tests.py): fix test * fix(main.py): add missing param	2025-02-02 22:36:37 -08:00
Krish Dholakia	23f458d2da	Improved O3 + Azure O3 support (#8181 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 13s Details * fix: support azure o3 model family for fake streaming workaround (#8162) * fix: support azure o3 model family for fake streaming workaround * refactor: rename helper to is_o_series_model for clarity * update function calling parameters for o3 models (#8178) * refactor(o1_transformation.py): refactor o1 config to be o series config, expand o series model check to o3 ensures max_tokens is correctly translated for o3 * feat(openai/): refactor o1 files to be 'o_series' files expands naming to cover o3 * fix(azure/chat/o1_handler.py): azure openai is an instance of openai - was causing resets * test(test_azure_o_series.py): assert stream faked for azure o3 mini Resolves https://github.com/BerriAI/litellm/pull/8162 * fix(o1_transformation.py): fix o1 transformation logic to handle explicit o1_series routing * docs(azure.md): update doc with `o_series/` model name --------- Co-authored-by: byrongrogan <47910641+byrongrogan@users.noreply.github.com> Co-authored-by: Low Jian Sheng <15527690+lowjiansheng@users.noreply.github.com>	2025-02-01 09:52:28 -08:00
Ishaan Jaff	9ff27809b2	(Feat) add bedrock/deepseek custom import models (#8132 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 16s Details * add support for using llama spec with bedrock * fix get_bedrock_invoke_provider * add support for using bedrock provider in mappings * working request * test_bedrock_custom_deepseek * test_bedrock_custom_deepseek * fix _get_model_id_for_llama_like_model * test_bedrock_custom_deepseek * doc DeepSeek-R1-Distill-Llama-70B * test_bedrock_custom_deepseek	2025-01-31 18:40:44 -08:00
Ishaan Jaff	2cf0daa31c	(Fixes) OpenAI Streaming Token Counting + Fixes usage track when `litellm.turn_off_message_logging=True` (#8156 ) * working streaming usage tracking * fix test_async_chat_openai_stream_options * fix await asyncio.sleep(1) * test_async_chat_azure * fix s3 logging * fix get_stream_options * fix get_stream_options * fix streaming handler * test_stream_token_counting_with_redaction * fix codeql concern	2025-01-31 15:06:37 -08:00
Krish Dholakia	de261e2120	Doc updates + management endpoint fixes (#8138 ) * Litellm dev 01 29 2025 p4 (#8107) * fix(key_management_endpoints.py): always get db team Fixes https://github.com/BerriAI/litellm/issues/7983 * test(test_key_management.py): add unit test enforcing check_db_only is always true on key generate checks * test: fix test * test: skip gemini thinking * Litellm dev 01 29 2025 p3 (#8106) * fix(__init__.py): reduces size of __init__.py and reduces scope for errors by using correct param * refactor(__init__.py): refactor init by cleaning up redundant params * refactor(__init__.py): move more constants into constants.py cleanup root * refactor(__init__.py): more cleanup * feat(__init__.py): expose new 'disable_hf_tokenizer_download' param enables hf model usage in offline env * docs(config_settings.md): document new disable_hf_tokenizer_download param * fix: fix linting error * fix: fix unsafe comparison * test: fix test * docs(public_teams.md): add doc showing how to expose public teams for users to join * docs: add beta disclaimer on public teams * test: update tests	2025-01-30 22:56:41 -08:00
Ishaan Jaff	8a235e7d38	(Refactor / QA) - Use `LoggingCallbackManager` to append callbacks and ensure no duplicate callbacks are added (#8112 ) * LoggingCallbackManager * add logging_callback_manager * use logging_callback_manager * add add_litellm_failure_callback * use add_litellm_callback * use add_litellm_async_success_callback * add_litellm_async_failure_callback * linting fix * fix logging callback manager * test_duplicate_multiple_loggers_test * use _reset_all_callbacks * fix testing with dup callbacks * test_basic_image_generation * reset callbacks for tests * fix check for _add_custom_logger_to_list * fix test_amazing_sync_embedding * fix _get_custom_logger_key * fix batches testing * fix _reset_all_callbacks * fix _check_callback_list_size * add callback_manager_test * fix test gemini-2.0-flash-thinking-exp-01-21	2025-01-30 19:35:50 -08:00
Krish Dholakia	dad24f2b52	Litellm dev 01 29 2025 p2 (#8102 ) * docs: cleanup doc * feat(bedrock/): initial commit adding bedrock/converse_like/<model> route support allows routing to a converse like endpoint Resolves https://github.com/BerriAI/litellm/issues/8085 * feat(bedrock/chat/converse_transformation.py): make converse config base config compatible enables new 'converse_like' route * feat(converse_transformation.py): enables using the proxy with converse like api endpoint Resolves https://github.com/BerriAI/litellm/issues/8085	2025-01-29 20:53:37 -08:00
Ishaan Jaff	b812286534	(fix) - proxy reliability, ensure duplicate callbacks are not added to proxy (#8067 ) * refactor _add_callbacks_from_db_config * fix check for _custom_logger_exists_in_litellm_callbacks * move loc of test utils * run ci/cd again * test_add_custom_logger_callback_to_specific_event_with_duplicates_callbacks * fix _custom_logger_class_exists_in_success_callbacks * unit testing for test_add_callbacks_from_db_config * test_custom_logger_exists_in_callbacks_individual_functions * fix config.yml * fix test test_stream_chunk_builder_openai_audio_output_usage - use direct dict comparison	2025-01-28 21:01:56 -08:00
Krish Dholakia	d9eb8f42ff	Litellm dev 01 27 2025 p3 (#8047 ) * docs(reliability.md): add doc on disabling fallbacks per request * feat(litellm_pre_call_utils.py): support reading request timeout from request headers - new `x-litellm-timeout` param Allows setting dynamic model timeouts from vercel's AI sdk * test(test_proxy_server.py): add simple unit test for reading request timeout * test(test_fallbacks.py): add e2e test to confirm timeout passed in request headers is correctly read * feat(main.py): support passing metadata to openai in preview Resolves https://github.com/BerriAI/litellm/issues/6022#issuecomment-2616119371 * fix(main.py): fix passing openai metadata * docs(request_headers.md): document new request headers * build: Merge branch 'main' into litellm_dev_01_27_2025_p3 * test: loosen test	2025-01-28 18:01:27 -08:00
Krish Dholakia	c2e3986bbc	fix(utils.py): handle failed hf tokenizer request during calls (#8032 ) * fix(utils.py): handle failed hf tokenizer request during calls prevents proxy from failing due to bad hf tokenizer calls * fix(utils.py): convert failure callback str to custom logger class Fixes https://github.com/BerriAI/litellm/issues/8013 * test(test_utils.py): fix test - avoid adding mlflow dep on ci/cd * fix: add missing env vars to test * test: cleanup redundant test	2025-01-28 17:20:36 -08:00
Krish Dholakia	03eef5a2a0	Fix custom pricing - separate provider info from model info (#7990 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 34s Details * fix(utils.py): initial commit fixing custom cost tracking refactors out provider specific model info from `get_model_info` - this was causing custom costs to be registered incorrectly * fix(utils.py): cleanup `_supports_factory` to check provider info, if model info is None some providers support features like vision across all models * fix(utils.py): refactor to use _supports_factory * test: update testing * fix: fix linting errors * test: fix testing	2025-01-25 21:49:28 -08:00
Krish Dholakia	4857a1e9ba	refactor: cleanup dead codeblock (#7936 ) * refactor: cleanup dead codeblock * fix(main.py): add extra headers to headers * fix: remove dead codeblock	2025-01-24 21:48:23 -08:00
Krish Dholakia	8ca3229b26	Ensure base_model cost tracking works across all endpoints (#7989 ) * test(test_completion_cost.py): add sdk test to ensure base model is used for cost tracking * test(test_completion_cost.py): add sdk test to ensure custom pricing works * fix(main.py): add base model cost tracking support for embedding calls Enables base model cost tracking for embedding calls when base model set as a litellm_param * fix(litellm_logging.py): update logging object with litellm params - including base model, if given ensures base model param is always tracked * fix(main.py): fix linting errors	2025-01-24 21:05:26 -08:00
Krish Dholakia	1e011b66d3	Ollama ssl verify = False + Spend Logs reliability fixes (#7931 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 13s Details * fix(http_handler.py): support passing ssl verify dynamically and using the correct httpx client based on passed ssl verify param Fixes https://github.com/BerriAI/litellm/issues/6499 * feat(llm_http_handler.py): support passing `ssl_verify=False` dynamically in call args Closes https://github.com/BerriAI/litellm/issues/6499 * fix(proxy/utils.py): prevent bad logs from breaking all cost tracking + reset list regardless of success/failure prevents malformed logs from causing all spend tracking to break since they're constantly retried * test(test_proxy_utils.py): add test to ensure bad log is dropped * test(test_proxy_utils.py): ensure in-memory spend logs reset after bad log error * test(test_user_api_key_auth.py): add unit test to ensure end user id as str works * fix(auth_utils.py): ensure extracted end user id is always a str prevents db cost tracking errors * test(test_auth_utils.py): ensure get end user id from request body always returns a string * test: update tests * test: skip bedrock test- behaviour now supported * test: fix testing * refactor(spend_tracking_utils.py): reduce size of get_logging_payload * test: fix test * bump: version 1.59.4 → 1.59.5 * Revert "bump: version 1.59.4 → 1.59.5" This reverts commit `1182b46b2e`. * fix(utils.py): fix spend logs retry logic * fix(spend_tracking_utils.py): fix get tags * fix(spend_tracking_utils.py): fix end user id spend tracking on pass-through endpoints	2025-01-23 23:05:41 -08:00
Krish Dholakia	c6e9240405	Add datadog health check support + fix bedrock converse cost tracking w/ region name specified (#7958 ) * fix(bedrock/converse_handler.py): fix bedrock region name on async calls * fix(utils.py): fix split model handling Fixes bedrock cost calculation when region name is given * feat(_health_endpoints.py): support health checking datadog integration Closes https://github.com/BerriAI/litellm/issues/7921	2025-01-23 22:17:09 -08:00
Krish Dholakia	513b1904ab	Add `attempted-retries` and `timeout` values to response headers + more testing (#7926 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 14s Details * feat(router.py): add retry headers to response makes it easy to add testing to ensure model-specific retries are respected * fix(add_retry_headers.py): clarify attempted retries vs. max retries * test(test_fallbacks.py): add test for checking if max retries set for model is respected * test(test_fallbacks.py): assert values for attempted retries and max retries are as expected * fix(utils.py): return timeout in litellm proxy response headers * test(test_fallbacks.py): add test to assert model specific timeout used on timeout error * test: add bad model with timeout to proxy * fix: fix linting error * fix(router.py): fix get model list from model alias * test: loosen test restriction - account for other events on proxy	2025-01-22 22:19:44 -08:00

1 2 3 4 5 ...

2080 commits