Commit graph

18754 commits

Krish Dholakia
c95351e70f Litellm dev 12 24 2024 p2 (#7400)
* fix(utils.py): default custom_llm_provider=None for 'supports_response_schema'

Closes https://github.com/BerriAI/litellm/issues/7397

* refactor(langfuse/): call langfuse logger inside CustomLogger-compatible langfuse class, refactor langfuse logger to use verbose_logger.debug instead of print_verbose

* refactor(litellm_pre_call_utils.py): move config based team callbacks inside dynamic team callback logic

enables simpler unit testing for config-based team callbacks

* fix(proxy/_types.py): handle TeamCallbackMetadata - None values

drop None values if present; if all are None, use a default dict to avoid downstream errors

* test(test_proxy_utils.py): add unit test preventing future issues - asserts team_id in config state not popped off across calls

Fixes https://github.com/BerriAI/litellm/issues/6787

* fix(langfuse_prompt_management.py): add success + failure logging event support

* fix: fix linting error

* test: fix test

* test: fix test

* test: override o1 prompt caching - openai currently not working

* test: fix test
2024-12-24 20:33:41 -08:00
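
The `supports_response_schema` fix above makes `custom_llm_provider` optional; a minimal sketch of the resulting call pattern (model names here are illustrative, not taken from the PR):

```python
import litellm

# custom_llm_provider now defaults to None, so the provider is inferred
# from the model string instead of having to be passed explicitly.
print(litellm.supports_response_schema(model="gemini/gemini-1.5-pro"))

# The explicit form still works.
print(litellm.supports_response_schema(model="gemini-1.5-pro", custom_llm_provider="vertex_ai"))
```
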
Ishaan Jaff
d790ba0897 bump: version 1.55.11 → 1.55.12 2024-12-24 20:24:41 -08:00
Krish Dholakia
f929a1f309 Litellm dev 12 24 2024 p4 (#7407)
* fix(invoke_handler.py): fix mock response iterator to handle tool calling

returns tool call if returned by model response

* fix(prometheus.py): add new 'tokens_by_tag' metric on prometheus

allows tracking 'token usage' by task

* feat(prometheus.py): add input + output token tracking by tag

* feat(prometheus.py): add tag based deployment failure tracking

allows admin to track failure by use-case
2024-12-24 20:24:06 -08:00
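
The tag-based prometheus metrics above depend on tags attached to each request; a hedged sketch of how a caller might send them through a LiteLLM proxy (base URL, key, model, and tag names are placeholders):

```python
from openai import OpenAI

# Standard OpenAI client pointed at a LiteLLM proxy; the proxy's prometheus
# callback can then attribute input/output tokens and failures to these tags.
client = OpenAI(base_url="http://0.0.0.0:4000", api_key="sk-1234")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hello"}],
    extra_body={"metadata": {"tags": ["summarization", "team-a"]}},
)
print(response.choices[0].message.content)
```
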
Ishaan Jaff
54cb64d03d (Feat) add `"/v1/batches/{batch_id:path}/cancel"` endpoint (#7406)
* use 1 file for azure batches handling

* add cancel_batch endpoint

* add cancel batch for OpenAI

* add cancel_batch endpoint

* add cancel batches to test

* remove unused imports

* test_batches_operations

* update test_batches_operations
2024-12-24 20:23:50 -08:00
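
The new `/v1/batches/{batch_id:path}/cancel` route mirrors the OpenAI Batches API, so the stock OpenAI SDK can cancel a batch through the proxy; a sketch with placeholder base URL, key, and batch id:

```python
from openai import OpenAI

# Point the OpenAI client at a running LiteLLM proxy.
client = OpenAI(base_url="http://0.0.0.0:4000/v1", api_key="sk-1234")

# Issues POST /v1/batches/{batch_id}/cancel under the hood.
cancelled = client.batches.cancel("batch_abc123")
print(cancelled.id, cancelled.status)
```
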
Krish Dholakia
440009fb32 Litellm dev 12 24 2024 p3 (#7403)
* fix(model_prices_and_context_window.json): specify meta llama is a bedrock converse model route

Fixes https://github.com/BerriAI/litellm/issues/7385

* test(test_get_model_info.py): enforce all new bedrock chat models added have the bedrock_converse route

Prevents https://github.com/BerriAI/litellm/issues/7385 and https://github.com/BerriAI/litellm/discussions/7325

* fix(get_supported_openai_params.py): use vertex gemini config by default for vertex ai route

Fixes https://github.com/BerriAI/litellm/issues/7378

* refactor(vertex_ai/gemini/): rename VertexAIConfig to VertexAIBaseConfig - make it clear VertexAIConfig = VertexGeminiConfig

* build(model_prices_and_context_window.json): add gpt-4o-audio-preview-2024-12-17

Closes https://github.com/BerriAI/litellm/issues/7367

* test: fix test

* test: fix o1 tests

* fix: handle llm api errors

* fix: fix linting errors
2024-12-24 18:07:53 -08:00
Ishaan Jaff
e98f1d16fd (feat) /batches - track user_api_key_alias, user_api_key_team_alias etc for /batch requests (#7401)
* run azure testing on ci/cd

* update docs on azure batches endpoints

* add input azure.jsonl

* refactor - use separate file for batches endpoints

* fixes for passing custom llm provider to /batch endpoints

* pass custom llm provider to files endpoints

* update azure batches doc

* add info for azure batches api

* update batches endpoints

* use simple helper for raising proxy exception

* update config.yml

* fix imports

* add type hints to get_litellm_params

* update get_litellm_params

* update get_litellm_params

* update get slp

* QOL - stop double logging create batch operations on custom loggers

* reuse the standard logging payload (slp) from the original event

* _create_standard_logging_object_for_completed_batch

* fix linting errors

* reduce num changes in PR

* update BATCH_STATUS_POLL_MAX_ATTEMPTS
2024-12-24 17:44:28 -08:00
Ishaan Jaff
0627450808 (feat) /batches Add support for using /batches endpoints in OAI format (#7402)
* run azure testing on ci/cd

* update docs on azure batches endpoints

* add input azure.jsonl

* refactor - use separate file for batches endpoints

* fixes for passing custom llm provider to /batch endpoints

* pass custom llm provider to files endpoints

* update azure batches doc

* add info for azure batches api

* update batches endpoints

* use simple helper for raising proxy exception

* update config.yml

* fix imports

* update tests

* use existing settings

* update env var used

* update configs

* update config.yml

* update ft testing
2024-12-24 16:58:05 -08:00
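
With the OpenAI-format `/batches` support above, a batch can be created end to end with the standard SDK; a sketch assuming a local proxy and a prepared `batch_requests.jsonl` file (both placeholders):

```python
from openai import OpenAI

client = OpenAI(base_url="http://0.0.0.0:4000/v1", api_key="sk-1234")

# Upload the .jsonl request file, then create a batch that runs it.
batch_file = client.files.create(file=open("batch_requests.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(batch.id, batch.status)
```
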
Krrish Dholakia
34d8386926 test: override openai o1 prompt caching test - openai backend not caching right now 2024-12-24 16:49:48 -08:00
Krrish Dholakia
3e339d1600 test(test_cost_calc.py): fix test to handle llm api errors 2024-12-24 16:49:02 -08:00
Krish Dholakia
7403d7b046 Add 'end_user', 'user' and 'requested_model' on more prometheus metrics (#7399)
* fix(prometheus.py): support streaming end user litellm_proxy_total_requests_metric tracking

* fix(prometheus.py): add 'requested_model' and 'end_user_id' to 'litellm_request_total_latency_metric_bucket'

enables latency tracking by end user + requested model

* fix(prometheus.py): add end user, user and requested model metrics to 'litellm_llm_api_latency_metric'

* test: update prometheus unit tests

* test(test_prometheus.py): update tests

* test(test_prometheus.py): fix test

* test: reorder test
2024-12-24 14:08:30 -08:00
Krrish Dholakia
cd8ec35540 bump: version 1.55.10 → 1.55.11 2024-12-24 08:30:35 -08:00
Krish Dholakia
8fe1356406 LiteLLM Minor Fixes & Improvements (12/23/2024) - p3 (#7394)
* build(model_prices_and_context_window.json): add gemini-1.5-flash context caching

* fix(context_caching/transformation.py): just use last identified cache point

Fixes https://github.com/BerriAI/litellm/issues/6738

* fix(context_caching/transformation.py): pick first contiguous block - handles system message error from google

Fixes https://github.com/BerriAI/litellm/issues/6738

* fix(vertex_ai/gemini/): track context caching tokens

* refactor(gemini/): place transformation.py inside `chat/` folder

make it easy for user to know we support the equivalent endpoint

* fix: fix import

* refactor(vertex_ai/): move vertex_ai cost calc inside vertex_ai/ folder

make it easier to see cost calculation logic

* fix: fix linting errors

* fix: fix circular import

* feat(gemini/cost_calculator.py): support gemini context caching cost calculation

generifies anthropic's cost calculation function and uses it across anthropic + gemini

* build(model_prices_and_context_window.json): add cost tracking for gemini-1.5-flash-002 w/ context caching

Closes https://github.com/BerriAI/litellm/issues/6891

* docs(gemini.md): add gemini context caching architecture diagram

make it easier for user to understand how context caching works

* docs(gemini.md): link to relevant gemini context caching code

* docs(gemini/context_caching): add readme in github, make it easy for dev to know context caching is supported + where to go for code

* fix(llm_cost_calc/utils.py): handle gemini 128k token diff cost calc scenario

* fix(deepseek/cost_calculator.py): support deepseek context caching cost calculation

* test: fix test
2024-12-23 22:02:52 -08:00
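
The gemini context-caching work above reuses the anthropic-style `cache_control` marker on message content; a rough sketch (message content and model are illustrative, see the gemini.md docs referenced in the commit for the exact interface):

```python
import litellm

# Mark a large, reusable block of context as cacheable; per the commit,
# the transformation picks the first contiguous cached block.
response = litellm.completion(
    model="gemini/gemini-1.5-flash-002",
    messages=[
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": "<long shared context, repeated across requests>",
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        },
        {"role": "user", "content": "Summarize the shared context."},
    ],
)
print(response.usage)
```
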
Ishaan Jaff
905e89bf60 update release notes 2024-12-23 21:48:33 -08:00
Ishaan Jaff
918e5a3b67 update release notes 2024-12-23 21:43:47 -08:00
Ishaan Jaff
2690d7485a release notes 2024-12-23 21:38:56 -08:00
Ishaan Jaff
d3e163cae3 docs batches 2024-12-23 21:24:06 -08:00
Ishaan Jaff
eb9108cf51 docs add files to supported endpoints 2024-12-23 20:51:34 -08:00
Ishaan Jaff
73569c3196 bump: version 1.55.9 → 1.55.10 2024-12-23 19:52:49 -08:00
Ishaan Jaff
904fe618cb dd logger fix - handle objects that can't be JSON dumped (#7393)
* dd logger fix - handle objects that can't be dumped

* test_datadog_non_serializable_messages
2024-12-23 18:21:49 -08:00
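
The Datadog logger fix above deals with payloads containing objects that `json.dumps` rejects; one generic way to handle this (not necessarily the exact implementation in the PR) is a fallback serializer:

```python
import json
from datetime import datetime

def safe_dumps(payload: dict) -> str:
    """Serialize a log payload, falling back to str() for values that
    json cannot encode natively (datetimes, custom classes, etc.)."""
    return json.dumps(payload, default=str)

print(safe_dumps({"created_at": datetime.now(), "request": object()}))
```
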
Ishaan Jaff
2301858379 test_router_get_available_deployments 2024-12-23 18:21:27 -08:00
Ishaan Jaff
00544b97c8 (feat) Add cost tracking for /batches requests OpenAI (#7384)
* add basic logging for `create_batch`

* add create_batch as a call type

* add basic dd logging for batches

* basic batch creation logging on DD

* batch endpoints add cost calc

* fix batches_async_logging

* separate folder for batches testing

* new job for batches tests

* test batches logging

* fix validation logic

* add vertex_batch_completions.jsonl

* test test_async_create_batch

* test_async_create_batch

* update tests

* test_completion_with_no_model

* remove dead code

* update load_vertex_ai_credentials

* test_avertex_batch_prediction

* update get async httpx client

* fix get_async_httpx_client

* update test_avertex_batch_prediction

* fix batches testing config.yaml

* add google deps

* fix vertex files handler
2024-12-23 17:47:26 -08:00
Ishaan Jaff
9d66976162 (feat) Add basic logging support for /batches endpoints (#7381)
* add basic logging for `create_batch`

* add create_batch as a call type

* add basic dd logging for batches

* basic batch creation logging on DD
2024-12-23 17:45:03 -08:00
Ishaan Jaff
4a90bd03e8 (Feat) Add input_cost_per_token_batches, output_cost_per_token_batches for OpenAI cost tracking Batches API (#7391)
* add input_cost_per_token_batches

* input_cost_per_token_batches
2024-12-23 17:42:58 -08:00
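
The new batch-specific pricing keys can also be set on custom model entries; a hypothetical sketch using `litellm.register_model` (model name and prices are made up):

```python
import litellm

# Hypothetical entry: discounted per-token pricing applied only to /batches traffic.
litellm.register_model({
    "my-org/gpt-4o-batch": {
        "litellm_provider": "openai",
        "mode": "chat",
        "input_cost_per_token_batches": 1.25e-06,
        "output_cost_per_token_batches": 5e-06,
    }
})
```
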
Ishaan Jaff
5ba0fb27d8 [Bug Fix]: Errors in LiteLLM When Using Embeddings Model with Usage-Based Routing (#7390)
* use slp for usage based routing v2

* update error msg

* fix usage based routing v2

* test_tpm_rpm_updated

* fix unused imports

* fix unused imports
2024-12-23 17:42:24 -08:00
Krish Dholakia
51f9f75c85 LiteLLM Minor Fixes & Improvements (12/23/2024) - P2 (#7386)
* fix(main.py): support 'mock_timeout=true' param

allows mock requests on proxy to have a time delay, for testing

* fix(main.py): ensure mock timeouts raise litellm.Timeout error

triggers retry/fallbacks

* fix: fix fallback + mock timeout testing

* fix(router.py): always return remaining tpm/rpm limits, if limits are known

allows for rate limit headers to be guaranteed

* docs(timeout.md): add docs on mock timeout = true

* fix(main.py): fix linting errors

* test: fix test
2024-12-23 17:41:27 -08:00
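
Per the commit above, `mock_timeout=true` lets a mocked request sleep and then raise `litellm.Timeout`, which is what triggers retries and fallbacks; a sketch of SDK-level usage inferred from the PR description:

```python
import litellm

try:
    litellm.completion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "hi"}],
        mock_timeout=True,  # simulate a hung request
        timeout=1,          # seconds before litellm.Timeout is raised
    )
except litellm.Timeout as err:
    print("timeout raised as expected:", err)
```
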
Krish Dholakia
a89b0d5c39 Litellm dev 12 23 2024 p1 (#7383)
* feat(guardrails_endpoint.py): new `/guardrails/list` endpoint

Allow users to view what the available guardrails are

* docs: document new `/guardrails/list` endpoint

* docs(enterprise.md): update docs

* fix(openai/transcription/handler.py): support cost tracking on vtt + srt formats

* fix(openai/transcriptions/handler.py): default to 'verbose_json' response format if 'text' or 'json' response_format received. ensures 'duration' param is received for all audio transcription requests

* fix: fix linting errors

* fix: remove unused import
2024-12-23 16:33:31 -08:00
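
The new `/guardrails/list` endpoint is a plain authenticated GET against the proxy; a sketch with placeholder host and key:

```python
import httpx

resp = httpx.get(
    "http://0.0.0.0:4000/guardrails/list",
    headers={"Authorization": "Bearer sk-1234"},
)
print(resp.json())  # guardrails configured on the proxy
```
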
Ishaan Jaff
82b298acac (security fix) - update base image for all docker images to python:3.13.1-slim (#7388)
* update base image for all docker files

* remove unused files

* fix sec vuln
2024-12-23 16:20:47 -08:00
Ishaan Jaff
b721fd1797 cleanup ui folder (#7363) 2024-12-23 15:18:41 -08:00
Krish Dholakia
71f659d26b Complete 'requests' library removal (#7350)
* refactor: initial commit moving watsonx_text to base_llm_http_handler + clarifying new provider directory structure

* refactor(watsonx/completion/handler.py): move to using base llm http handler

removes 'requests' library usage

* fix(watsonx_text/transformation.py): fix result transformation

migrates to transformation.py, for usage with base llm http handler

* fix(streaming_handler.py): migrate watsonx streaming to transformation.py

ensures streaming works with base llm http handler

* fix(streaming_handler.py): fix streaming linting errors and remove watsonx conditional logic

* fix(watsonx/): fix chat route post completion route refactor

* refactor(watsonx/embed): refactor watsonx to use base llm http handler for embedding calls as well

* refactor(base.py): remove requests library usage from litellm

* build(pyproject.toml): remove requests library usage

* fix: fix linting errors

* fix: fix linting errors

* fix(types/utils.py): fix validation errors for modelresponsestream

* fix(replicate/handler.py): fix linting errors

* fix(litellm_logging.py): handle modelresponsestream object

* fix(streaming_handler.py): fix modelresponsestream args

* fix: remove unused imports

* test: fix test

* fix: fix test

* test: fix test

* test: fix tests

* test: fix test

* test: fix patch target

* test: fix test
2024-12-22 07:21:25 -08:00
Ishaan Jaff
4b0bef1823 add img to release notes 2024-12-21 21:24:16 -08:00
Ishaan Jaff
bbc391f852 update release notes 2024-12-21 21:20:13 -08:00
Krish Dholakia
6c2a348cc9 Litellm docs update (#7365)
* docs: fix release notes url + fix url

* docs(index.md): add link to github releases

* docs(index.md): add linkedin social url to release notes
2024-12-21 21:09:50 -08:00
Ishaan Jaff
7c3eef8c04 docs 2024-12-21 20:52:39 -08:00
Ishaan Jaff
2de656f2da docs add 1.55.8 changelog 2024-12-21 20:51:39 -08:00
Ishaan Jaff
2049b25d66 release notes v1.55.8 2024-12-21 20:31:54 -08:00
Krish Dholakia
5c34870edf Document team admins + Enforce assigning team admins as an enterprise feature (#7359)
* fix(team_endpoints.py): enforce assigning team admins as an enterprise feature

* fix(proxy/_types.py): fix common proxy error to link to trial key

* fix: fix linting errors
2024-12-21 20:28:31 -08:00
Krish Dholakia
ae7f54498f Litellm enforce enterprise features (#7357)
* fix(proxy_server.py): enforce team id based model add only works if enterprise user

* fix(auth_checks.py): enforce common_checks can only be imported by user_api_key_auth.py

* fix(auth_checks.py): insert not premium user error message on failed common checks run
2024-12-21 19:14:13 -08:00
Ishaan Jaff
152056375a test fix 2024-12-21 18:48:16 -08:00
Ishaan Jaff
26f93faa40 ui - new build 2024-12-21 15:01:17 -08:00
Krrish Dholakia
118e20f9fb fix(__init__.py): correctly return azure_text models in models_by_provider dictionary 2024-12-21 14:56:07 -08:00
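
The `models_by_provider` fix above means `azure_text` models are now listed under their own key; a quick check (output depends on the installed litellm version):

```python
import litellm

# Previously azure_text models were missing from this mapping.
print(litellm.models_by_provider.get("azure_text", []))
```
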
Ishaan Jaff
1d7780d458 apply linting fixes 2024-12-21 14:31:23 -08:00
Ishaan Jaff
25f3cd00b7 (Admin UI) - maintain history on chat UI (#7351)
* ui fix - allow searching model list + fix bug on filtering

* qa fix - use correct provider name for azure_text

* ui wrap content onto next line

* ui fix - allow selecting current UI session when logging in

* ui session budgets

* ui show provider models on wildcard models

* test provider name appears in model list

* ui fix auto scroll on chat ui tab

* ui - maintain chat history
2024-12-21 14:25:35 -08:00
Ishaan Jaff
49ea75830a (Admin UI) correctly render provider name in /models with wildcard routing (#7349)
* ui fix - allow searching model list + fix bug on filtering

* qa fix - use correct provider name for azure_text

* ui wrap content onto next line

* ui fix - allow selecting current UI session when logging in

* ui session budgets

* ui show provider models on wildcard models

* test provider name appears in model list

* ui fix auto scroll on chat ui tab
2024-12-21 14:19:12 -08:00
Ishaan Jaff
23d277f167 (chore) - enforce model budgets on virtual keys as enterprise feature (#7353)
* docs - enforce model budget as enterprise feature

* docs link to correct place
2024-12-21 14:18:53 -08:00
Ishaan Jaff
d80307b7bb (refactor) - fix from enterprise.utils import ui_get_spend_by_tags (#7352)
* ui - refactor ui_get_spend_by_tags

* fix typing
2024-12-21 14:17:12 -08:00
Ishaan Jaff
4d9eaa3531 (Admin UI) - Test Key Tab - Allow using UI Session instead of manually creating a virtual key (#7348)
* ui fix - allow searching model list + fix bug on filtering

* qa fix - use correct provider name for azure_text

* ui wrap content onto next line

* ui fix - allow selecting current UI session when logging in

* ui session budgets
2024-12-21 13:14:15 -08:00
Ishaan Jaff
2d2c30b72b (Admin UI) - Test Key Tab - Allow typing in model name + Add wrapping for text response (#7347)
* ui fix - allow searching model list + fix bug on filtering

* qa fix - use correct provider name for azure_text

* ui wrap content onto next line
2024-12-21 13:14:01 -08:00
Krrish Dholakia
8fa08dfef8 ci(reset_stable.yml): modify to work with all kinds of releases 2024-12-21 12:13:26 -08:00
Ishaan Jaff
46a150fc1b use helper for image gen tests (#7343) 2024-12-20 21:28:32 -08:00
Ishaan Jaff
451215b106 (fix) LiteLLM Proxy fix GET "/files/{file_id:path}/content" endpoint (#7342)
* fix order of get_file_content

* update e2 files tests

* add e2 batches endpoint testing

* update config.yml

* write content to file

* use correct oai_misc_config

* fixes for openai batches endpoint testing

* remove extra out file

* fix input.jsonl
2024-12-20 21:27:45 -08:00
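
The `/files/{file_id:path}/content` fix above restores OpenAI-compatible file downloads (for example, batch output files) through the proxy; a sketch with placeholder base URL, key, and file id:

```python
from openai import OpenAI

client = OpenAI(base_url="http://0.0.0.0:4000/v1", api_key="sk-1234")

# Issues GET /files/{file_id}/content under the hood.
content = client.files.content("file-abc123")
print(content.text)
```
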