12117 commits

Author SHA1 Message Date
Krish Dholakia
6eb2346fd6 QA: ensure all bedrock regional models have same supported_ as base + Anthropic nested pydantic object support (#7844)
* build: ensure all regional bedrock models have same supported values as base bedrock model

prevents drift

* test(base_llm_unit_tests.py): add testing for nested pydantic objects

* fix(test_utils.py): add test_get_potential_model_names

* fix(anthropic/chat/transformation.py): support nested pydantic objects

Fixes https://github.com/BerriAI/litellm/issues/7755
2025-01-17 19:49:12 -08:00
Ishaan Jaff
bc6a9cd29c [Hashicorp - secret manager] - use vault namespace for tls auth (#7834)
* hcorp - use x-vault-namespace

* _get_tls_cert_auth_body

* HCP_VAULT_CERT_ROLE

* test_hashicorp_secret_manager_tls_cert_auth

* HCP_VAULT_CERT_ROLE
2025-01-17 19:27:56 -08:00
Ishaan Jaff
c8d6254b78 ui new build 2025-01-17 19:23:41 -08:00
Ishaan Jaff
d4d6498e14 ui new build 2025-01-17 19:14:44 -08:00
Ishaan Jaff
f37e848f7f [fix dd llm obs] - use env vars for setting dd tags, service name (#7835)
* fix custom logger

* fix debugging dd llm obs
2025-01-17 18:57:16 -08:00
Ishaan Jaff
16673ab488 (UI - View SpendLogs Table) (#7842)
* litellm log messages / responses

* add messages/response to schema.prisma

* add support for logging messages / responses in DB

* test_spend_logs_payload_with_prompts_enabled

* _get_messages_for_spend_logs_payload

* ui_view_spend_logs endpoint

* add tanstack and moment

* add uiSpendLogsCall

* ui view logs table

* ui view spendLogs table

* ui_view_spend_logs

* fix code quality

* test_spend_logs_payload_with_prompts_enabled

* _get_messages_for_spend_logs_payload

* test_spend_logs_payload_with_prompts_enabled

* test_spend_logs_payload_with_prompts_enabled

* ui view spend logs

* minor ui fix

* ui - update leftnav

* ui - clean up ui

* fix leftnav

* ui fix navbar

* ui fix moving chat ui tab
2025-01-17 18:53:45 -08:00
Krish Dholakia
fc7a931485 fix(key_management_endpoints.py): fix default allowed team member roles (#7843)
admin and user, not admin and member
2025-01-17 17:15:22 -08:00
yujonglee
b0e30906e0 add key and team level budget (#7831) 2025-01-17 09:04:12 -08:00
Ishaan Jaff
23fca72fb8 Revert "fix custom logger"
This reverts commit 9d2707ecfe.
2025-01-17 08:26:59 -08:00
Ishaan Jaff
4ef821984d fix custom logger 2025-01-17 07:39:49 -08:00
Ishaan Jaff
081da2b1c3 Revert "test_completion_mistral_api_mistral_large_function_call"
This reverts commit ef9177f0a8.
2025-01-17 07:20:46 -08:00
Ishaan Jaff
2e91ab52ec _handle_tool_call_message linting 2025-01-16 22:34:16 -08:00
Ishaan Jaff
b6d1ab6152 test_completion_mistral_api_mistral_large_function_call 2025-01-16 22:27:48 -08:00
Ishaan Jaff
e89c3bad5f sec fix minor (#7810) 2025-01-16 22:03:28 -08:00
Ishaan Jaff
b4735fbbc0 (Fix + Testing) - Add dd-trace-run to litellm ci/cd pipeline + fix bug caused by dd-trace patching OpenAI sdk (#7820)
* add dd trace to e2e docker run tests

* update dd trace v

* fix entrypoint

* dd trace fixes

* proxy_build_from_pip_tests

* build python3.13

* use py 3.13

* fix build from pip

* dd trace fix

* proxy_build_from_pip_tests

* bump build from pip
2025-01-16 22:03:09 -08:00
Ishaan Jaff
559006643d (fix) IBM Watsonx using ZenApiKey (#7821)
* ibm watsonx fix

* test ZenAPIKey

* fix zenapikey
2025-01-16 22:02:36 -08:00
Ishaan Jaff
2177bdc836 (datadog llm observability) - fixes + improvements for using datadog llm observability logging integration (#7824)
* dd llm obs fixes

* _ensure_string_content

* fix _get_dd_llm_obs_payload_metadata
2025-01-16 22:02:24 -08:00
Ishaan Jaff
a2364b1fd6 test_completion_mistral_api_mistral_large_function_call 2025-01-16 21:50:56 -08:00
Ishaan Jaff
5e9ea29ffe llama-v3p1-8b-instruct 2025-01-16 21:34:42 -08:00
Krish Dholakia
f37dc43e92 test: initial commit enforcing testing on all anthropic pass through … (#7794)
* test: initial commit enforcing testing on all anthropic pass through functions

prevents future regressions

* test(test_unit_test_anthropic_pass_through.py): add unit test for '_get_user_from_metadata' function

* test(test_unit_test_anthropic_passthrough.py): add unit test for handle_logging_anthropic_collected_chunks

* test(test_unit_test_anthropic_pass_through): add coverage for all anthropic pass through functions
2025-01-15 22:02:35 -08:00
Krish Dholakia
fbdd88d79c test: initial test to enforce all functions in user_api_key_auth.py h… (#7797)
* test: initial test to enforce all functions in user_api_key_auth.py have direct testing

* test(test_user_api_key_auth.py): add is_allowed_route unit test

* test(test_user_api_key_auth.py): add more tests

* test(test_user_api_key_auth.py): add complete testing coverage for all functions in `user_api_key_auth.py`

* test(test_db_schema_changes.py): add a unit test to ensure all db schema changes are backwards compatible

gives user an easy rollback path

* test: fix schema compatibility test filepath

* test: fix test
2025-01-15 21:52:45 -08:00
Krish Dholakia
543655adc7 Litellm dev 01 14 2025 p2 (#7772)
* feat(pass_through_endpoints.py): fix anthropic end user cost tracking

* fix(anthropic/chat/transformation.py): use returned provider model for anthropic

handles anthropic `-latest` tag in request body throwing cost calculation errors

ensures we can be accurate in our model cost tracking

* feat(model_prices_and_context_window.json): add gemini-2.0-flash-thinking-exp pricing

* test: update test to use assumption that user_api_key_dict can get anthropic user id

* test: fix test

* fix: fix test

* fix(anthropic_pass_through.py): uncomment previous anthropic end-user cost tracking code block

can't guarantee user api key dict always has end user id - too many code paths

* fix(user_api_key_auth.py): this allows end user id from request body to always be read and set in auth object

* fix(auth_check.py): fix linting error

* test: fix auth check

* fix(auth_utils.py): fix get end user id to handle metadata = None
2025-01-15 21:34:50 -08:00
Krish Dholakia
26c1b86f4e Litellm dev 01 2025 p4 (#7776)
* fix(gemini/): support gemini 'frequency_penalty' and 'presence_penalty'

Closes https://github.com/BerriAI/litellm/issues/7748

* feat(proxy_server.py): new env var to disable prisma health check on startup

* test: fix test
2025-01-14 21:49:25 -08:00
Krish Dholakia
142662a504 build(pyproject.toml): bump uvicorn dependency requirement (#7773)
* build(pyproject.toml): bump uvicorn dependency requirement

Fixes https://github.com/BerriAI/litellm/issues/7768

* fix(anthropic/chat/transformation.py): fix is_vertex_request check to actually use optional param passed in

Fixes https://github.com/BerriAI/litellm/issues/6898#issuecomment-2590860695

* fix(o1_transformation.py): fix azure o1 'is_o1_model' check to just check for o1 in model string

https://github.com/BerriAI/litellm/issues/7743

* test: load vertex creds
2025-01-14 21:47:11 -08:00
Ishaan Jaff
6a4e8c33b3 (fix) BaseAWSLLM - cache IAM role credentials when used (#7775)
* fix base aws llm

* fix auth with aws role

* test aws base llm

* fix base aws llm init

* run ci/cd again

* fix get_credentials

* ci/cd run again

* _auth_with_aws_role
2025-01-14 20:16:22 -08:00
Ishaan Jaff
25ae1e9117 (Feat) prometheus - emit remaining team budget metric on proxy startup (#7777)
* fix get_paginated_teams

* use _initialize_remaining_budget_metrics

* fix prom metric

* run ci/cd again

* fix run async func

* fix _initialize_prometheus_startup_metrics

* fix _initialize_prometheus_startup_metrics

* prom unit tests

* test_get_paginated_teams
2025-01-14 20:08:23 -08:00
Krish Dholakia
178cfe3c57 Litellm dev 01 13 2025 p2 (#7758)
* fix(factory.py): fix bedrock document url check

Make check more generic - if starts with 'text' or 'application' assume it's a document and let it go through

Fixes https://github.com/BerriAI/litellm/issues/7746

* feat(key_management_endpoints.py): support writing new key alias to aws secret manager - on key rotation

adds rotation endpoint to aws key management hook - allows for rotated litellm virtual keys with new key alias to be written to it

* feat(key_management_event_hooks.py): support rotating keys and updating secret manager

* refactor(base_secret_manager.py): support rotate secret at the base level

since it's just an abstraction function, it's easy to implement at the base manager level

* style: cleanup unused imports
2025-01-14 17:04:01 -08:00
Krish Dholakia
d7a13ad561 Support temporary budget increases on keys (#7754)
* fix(gpt_transformation.py): fix response_format translation check for 4o models

Fixes https://github.com/BerriAI/litellm/issues/7616

* feat(key_management_endpoints.py): support 'temp_budget_increase' and 'temp_budget_expiry' fields

Allow proxy admin to grant temporary budget increases to keys

* fix(proxy/_types.py): enforce temp_budget_increase and temp_budget_expiry are always passed together

* feat(user_api_key_auth.py): initial working temp budget increase logic

ensures key budget exceeded error checks for temp budget in key metadata

* feat(proxy_server.py): return the key max budget and key spend in the response headers

Allows clientside user to know their remaining limits

* test: add unit testing for new proxy utils

Ensures new key budget is correctly handled

* docs(temporary_budget_increase.md): add doc on temporary budget increase

* fix(utils.py): remove 3.5 from response_format check for now

not all azure 3.5 models support response_format

* fix(user_api_key_auth.py): return valid user api key auth object on all paths
2025-01-14 17:03:11 -08:00
Krish Dholakia
000d3152a8 Litellm dev 01 14 2025 p1 (#7771)
* First-class Aim Guardrails support (#7738)

* initial aim support

* add tests

* docs(langsmith_integration.md): cleanup

* style: cleanup unused imports

---------

Co-authored-by: Tomer Bin <117278227+hxtomer@users.noreply.github.com>
2025-01-14 16:18:21 -08:00
Ishaan Jaff
f30c87f4f0 (fix) health check - allow setting health_check_model (#7752)
* use _update_litellm_params_for_health_check

* fix Wildcard Routes

* test_update_litellm_params_for_health_check

* test_perform_health_check_with_health_check_model

* fix doc string

* huggingface/mistralai/Mistral-7B-Instruct-v0.3
2025-01-13 20:16:44 -08:00
Ishaan Jaff
640b71e4af (prometheus - minor bug fix) - litellm_llm_api_time_to_first_token_metric not populating for bedrock models (#7740)
* fix prometheus ttft

* fix test_set_latency_metrics

* fix _set_latency_metrics

* fix _set_latency_metrics

* fix test_set_latency_metrics

* test_async_log_success_event

* huggingface/mistralai/Mistral-7B-Instruct-v0.3
2025-01-13 20:16:34 -08:00
Ishaan Jaff
a9b9df1d2e (litellm SDK perf improvements) - handle cases when unable to lookup model in model cost map (#7750)
* use lru cache wrapper

* use lru_cache_wrapper for _cached_get_model_info_helper

* fix _get_traceback_str_for_error

* huggingface/mistralai/Mistral-7B-Instruct-v0.3
2025-01-13 19:58:46 -08:00
Ishaan Jaff
87ebe3fde2 fix http parsing utils (#7753) 2025-01-13 19:58:26 -08:00
Ishaan Jaff
392eb265f9 (core sdk fix) - fix fallbacks stuck in infinite loop (#7751)
* test_acompletion_fallbacks_basic

* use common run_async_function

* fix completion_with_fallbacks

* fix completion with fallbacks

* fix fallback utils

* test_acompletion_fallbacks_basic

* test_completion_fallbacks_sync

* huggingface/mistralai/Mistral-7B-Instruct-v0.3
2025-01-13 19:34:34 -08:00
Ishaan Jaff
3716107899 (proxy perf) - only read request body 1 time per request (#7728)
* req body

* fix linting
2025-01-12 22:00:59 -08:00
Ishaan Jaff
b528010491 fix svc logger (#7727) 2025-01-12 22:00:25 -08:00
Krish Dholakia
01e2e26bd1 add azure o1 pricing (#7715)
* build(model_prices_and_context_window.json): add azure o1 pricing

Closes https://github.com/BerriAI/litellm/issues/7712

* refactor: replace regex with string method for whitespace check in stop-sequences handling (#7713)

* Allows overriding keep_alive time in ollama (#7079)

* Allows overriding keep_alive time in ollama

* Also adds to ollama_chat

* Adds some info on the docs about this parameter

* fix: together ai warning (#7688)

Co-authored-by: Carl Senze <carl.senze@aleph-alpha.com>

* fix(proxy_server.py): handle config containing thread locked objects when using get_config_state

* fix(proxy_server.py): add exception to debug

* build(model_prices_and_context_window.json): update 'supports_vision' for azure o1

---------

Co-authored-by: Wolfram Ravenwolf <52386626+WolframRavenwolf@users.noreply.github.com>
Co-authored-by: Regis David Souza Mesquita <github@rdsm.dev>
Co-authored-by: Carl <45709281+capsenz@users.noreply.github.com>
Co-authored-by: Carl Senze <carl.senze@aleph-alpha.com>
2025-01-12 18:15:35 -08:00
Ishaan Jaff
f778865836 Revert "fix _read_request_body to re-use parsed body already (#7722)" (#7724)
This reverts commit 95183f2103.
2025-01-12 16:45:26 -08:00
Ishaan Jaff
87c7da0eb3 fixes for img gen cost cal 2025-01-12 16:41:18 -08:00
Ishaan Jaff
76a6a57556 fix img gen cost 2025-01-12 16:31:04 -08:00
Ishaan Jaff
0587528acb use set for public routes 2025-01-12 16:22:56 -08:00
Ishaan Jaff
413ea4a6fd fix optimize get llm provider 2025-01-12 16:21:23 -08:00
Ishaan Jaff
21bacaf498 fix _read_request_body to re-use parsed body already (#7722) 2025-01-12 15:41:40 -08:00
Ishaan Jaff
b1e9b84987 (litellm sdk speedup) - use _model_contains_known_llm_provider in response_cost_calculator to check if the model contains a known litellm provider (#7721)
* define _cached_get_model_info_helper

* use _cached_get_model_info_helper

* speed up _select_model_name_for_cost_calc
2025-01-12 15:40:05 -08:00
Ishaan Jaff
73a782c792 (litellm SDK perf improvement) - use verbose_logger.debug and _cached_get_model_info_helper in _response_cost_calculator (#7720)
* define _cached_get_model_info_helper

* use _cached_get_model_info_helper
2025-01-12 15:27:54 -08:00
Ishaan Jaff
b4a99afee3 (litellm sdk speedup router) - adds a helper _cached_get_model_group_info to use when trying to get deployment tpm/rpm limits (#7719)
* fix _cached_get_model_group_info

* fixes get_remaining_model_group_usage

* test_cached_get_model_group_info
2025-01-12 15:14:54 -08:00
Krish Dholakia
8ee79dd5d9 [BETA] Add OpenAI /images/variations + Topaz API support (#7700)
* feat(main.py): initial commit for `/image/variations` endpoint support

* refactor(base_llm/): introduce new base llm base config for image variation endpoints

* refactor(openai/image_variations/transformation.py): implement openai image variation transformation handler

* fix: test

* feat(openai/): working openai `/image/variation` endpoint calls via sdk

* feat(topaz/): topaz sync image variation call support

Addresses https://github.com/BerriAI/litellm/issues/7593

* fix(topaz/transformation.py): fix linting errors

* fix(openai/image_variations/handler.py): fix passing json data

* fix(main.py): image_variation/

support async image variation route - `aimage_variation`

* fix(test_get_model_info.py): fix test

* fix: cleanup unused imports

* feat(openai/): add async `/image/variations` endpoint support

* feat(topaz/): support async `/image/variations` calls

* fix: test

* fix(utils.py): fix get_model_info_helper for no model info w/ provider config

handles situation where model info is not known but provider config exists

* test(test_router_fallbacks.py): mark flaky test

* fix: fix unused imports

* test: bump otel load test perf threshold - accounts for current load tests hitting same server
2025-01-11 23:27:46 -08:00
Ishaan Jaff
d21e4dedbd (perf) - only use response_cost_calculator 1 time per request. (Don't re-use the same helper twice per call) (#7709)
* fix get_llm_provider for aiohttp openai

* fix _success_handler_helper_fn
2025-01-11 23:23:01 -08:00
Ishaan Jaff
9db7893402 (sdk perf fix) - only print args passed to litellm when debugging mode is on (#7708)
* use _is_debugging_on

* fix unused imports
2025-01-11 22:56:20 -08:00
Ishaan Jaff
6fce45e2e9 Merge branch 'litellm_aiohttp_openai_speedup' 2025-01-11 22:26:26 -08:00