litellm-mirror

mirror of https://github.com/BerriAI/litellm.git synced 2025-04-25 10:44:24 +00:00

Author	SHA1	Message	Date
Krish Dholakia	8ee32291e0	Squashed commit of the following: (#9709 ) commit `b12a9892b7` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Wed Apr 2 08:09:56 2025 -0700 fix(utils.py): don't modify openai_token_counter commit `294de31803` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Mar 24 21:22:40 2025 -0700 fix: fix linting error commit `cb6e9fbe40` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Mar 24 19:52:45 2025 -0700 refactor: complete migration commit `bfc159172d` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Mar 24 19:09:59 2025 -0700 refactor: refactor more constants commit `43ffb6a558` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Mar 24 18:45:24 2025 -0700 fix: test commit `04dbe4310c` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Mar 24 18:28:58 2025 -0700 refactor: refactor: move more constants into constants.py commit `3c26284aff` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Mar 24 18:14:46 2025 -0700 refactor: migrate hardcoded constants out of __init__.py commit `c11e0de69d` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Mar 24 18:11:21 2025 -0700 build: migrate all constants into constants.py commit `7882bdc787` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Mar 24 18:07:37 2025 -0700 build: initial test banning hardcoded numbers in repo	2025-04-02 21:24:54 -07:00
Krish Dholakia	9b7ebb6a7d	build(pyproject.toml): add new dev dependencies - for type checking (#9631 ) * build(pyproject.toml): add new dev dependencies - for type checking * build: reformat files to fit black * ci: reformat to fit black * ci(test-litellm.yml): make tests run clear * build(pyproject.toml): add ruff * fix: fix ruff checks * build(mypy/): fix mypy linting errors * fix(hashicorp_secret_manager.py): fix passing cert for tls auth * build(mypy/): resolve all mypy errors * test: update test * fix: fix black formatting * build(pre-commit-config.yaml): use poetry run black * fix(proxy_server.py): fix linting error * fix: fix ruff safe representation error	2025-03-29 11:02:13 -07:00
Ishaan Jaff	b60178f534	fix azure chat logic	2025-03-18 12:42:24 -07:00
Ishaan Jaff	80a5cfa01d	test_azure_embedding_max_retries_0	2025-03-18 12:35:34 -07:00
Ishaan Jaff	d4b3082ca2	fix azure embedding test	2025-03-18 12:19:12 -07:00
Ishaan Jaff	38e2dd00cc	fix amebedding issue on ssl azure	2025-03-18 11:42:11 -07:00
Ishaan Jaff	a0c5fb81b8	fix logic for intializing openai clients	2025-03-18 10:23:30 -07:00
Ishaan Jaff	34142a1b62	_init_azure_client_for_cloudflare_ai_gateway	2025-03-18 10:11:54 -07:00
Ishaan Jaff	edfbf21c39	fix re-using azure openai client	2025-03-18 10:06:56 -07:00
Ishaan Jaff	f2026ef907	fix - correctly re-use azure openai client	2025-03-18 09:51:28 -07:00
Ishaan Jaff	b74f3cb76c	_get_azure_openai_client	2025-03-18 09:38:27 -07:00
Ishaan Jaff	26be805ad3	rename to _get_azure_openai_client	2025-03-18 09:25:26 -07:00
Krrish Dholakia	5ffd3f56f8	fix(azure.py): track azure llm api latency metric	2025-03-13 14:47:35 -07:00
Krish Dholakia	2d957a0ed9	Merge branch 'main' into litellm_dev_03_10_2025_p3	2025-03-12 14:56:01 -07:00
Krrish Dholakia	2469072c50	fix: remove unused imports	2025-03-11 18:15:10 -07:00
Krrish Dholakia	58888f117c	feat(azure.py): fix azure client init	2025-03-11 18:05:11 -07:00
Krrish Dholakia	9af73f339a	test: fix tests	2025-03-11 17:42:36 -07:00
Krrish Dholakia	2c2404dac9	refactor(azure.py): working client init logic in azure image generation	2025-03-11 14:22:25 -07:00
Krrish Dholakia	152bc67d22	refactor(azure.py): working azure client init on audio speech endpoint	2025-03-11 14:19:45 -07:00
Krrish Dholakia	f7d9cce536	refactor(azure.py): refactor acompletion to use base azure sdk client	2025-03-11 13:59:13 -07:00
Krrish Dholakia	b58edb7fa1	test(test_azure_common_utils.py): add unit testing for common azure client params function	2025-03-11 12:24:08 -07:00
Krrish Dholakia	69839b3720	refactor(azure/common_utils.py): refactor azure client param logic create common util for azure client param logic	2025-03-11 12:14:50 -07:00
Krrish Dholakia	bfbe26b91d	feat(azure.py): add azure bad request error support	2025-03-10 15:59:06 -07:00
Krish Dholakia	f651d51f26	Litellm dev 02 07 2025 p2 (#8377 ) * fix(caching_routes.py): mask redis password on `/cache/ping` route * fix(caching_routes.py): fix linting erro * fix(caching_routes.py): fix linting error on caching routes * fix: fix test - ignore mask_dict - has a breakpoint * fix(azure.py): add timeout param + elapsed time in azure timeout error * fix(http_handler.py): add elapsed time to http timeout request makes it easier to debug how long request took before failing	2025-02-07 17:30:38 -08:00
Krish Dholakia	6b8b49451f	Fix azure max retries error (#8340 ) * fix(azure.py): ensure max_retries=0 is respected Fixes https://github.com/BerriAI/litellm/issues/6129 * fix(test_openai.py): add unit test to ensure openai sdk calls always respect max_retries = 0 * test(test_azure_openai.py): add unit testing for azure_text/ route * fix(azure.py): fix passing max retries on streaming * fix(azure.py): fix azure max retries on async completion + streaming * fix(completion/handler.py): fix azure text async completion + streaming * test(test_azure_openai.py): ensure azure openai max retries always respected * test(test_azure_o_series.py): add testing to ensure max retries always respected * Added gemini providers for 2.0-flash and 2.0-flash lite (#8321) * Update model_prices_and_context_window.json added gemini providers for 2.0-flash and 2.0-flash light * Update model_prices_and_context_window.json fixed URL --------- Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com> * Convert tool use arguments to string before counting tokens (#6989) In at least some cases the `messages["tool_calls"]["function"]["arguments"]` is a dict, not a string. In order to tokenize it properly it needs to be a string. In the case that it is already a string this is a noop, which is also fine. * build(model_prices_and_context_window.json): add gemini 2.0 flash lite pricing * build(model_prices_and_context_window.json): add gemini commercial rate limits * fix(utils.py): fix linting error * refactor(utils.py): refactor to maintain function size --------- Co-authored-by: Bardia Khosravi <bardiakhosravi95@gmail.com> Co-authored-by: Josh Morrow <josh@jcmorrow.com>	2025-02-06 23:20:48 -08:00
Krish Dholakia	97b8de17ab	LiteLLM Minor Fixes & Improvements (01/16/2025) - p2 (#7828 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 14s Details * fix(vertex_ai/gemini/transformation.py): handle 'http://' image urls * test: add base test for `http:` url's * fix(factory.py/get_image_details): follow redirects allows http calls to work * fix(codestral/): fix stream chunk parsing on last chunk of stream * Azure ad token provider (#6917) * Update azure.py Added optional parameter azure ad token provider * Added parameter to main.py * Found token provider arg location * Fixed embeddings * Fixed ad token provider --------- Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com> * fix: fix linting errors * fix(main.py): leave out o1 route for azure ad token provider, for now get v0 out for sync azure gpt route to begin with * test: skip http:// test for fireworks ai model does not support it * refactor: cleanup dead code * fix: revert http:// url passthrough for gemini google ai studio raises errors * test: fix test --------- Co-authored-by: bahtman <anton@baht.dk>	2025-02-02 23:17:50 -08:00
Ishaan Jaff	1e06ee3162	(Refactor) - Re use litellm.completion/litellm.embedding etc for health checks (#7455 ) * add mode: realtime * add _realtime_health_check * test_realtime_health_check * azure _realtime_health_check * _realtime_health_check * Realtime Models * fix code quality * delete OAI / Azure custom health check code * simplest version of ahealth check * update tests * working health check post refactor * working aspeech health check * fix realtime health checks * test_audio_transcription_health_check * use get_audio_file_for_health_check * test_text_completion_health_check * ahealth_check * simplify health check code * update ahealth_check * fix import * fix unused imports * fix ahealth_check * fix local testing * test_async_realtime_health_check	2024-12-28 18:38:54 -08:00
Ishaan Jaff	4e65722a00	(Bug Fix) Add health check support for realtime models (#7453 ) * add mode: realtime * add _realtime_health_check * test_realtime_health_check * azure _realtime_health_check * _realtime_health_check * Realtime Models * fix code quality	2024-12-28 18:15:00 -08:00
Krish Dholakia	9237357bcc	Litellm dev 12 25 2024 p1 (#7411 ) * test(test_watsonx.py): e2e unit test for watsonx custom header covers https://github.com/BerriAI/litellm/issues/7408 * fix(common_utils.py): handle auth token already present in headers (watsonx + openai-like base handler) Fixes https://github.com/BerriAI/litellm/issues/7408 * fix(watsonx/chat): fix chat route Fixes https://github.com/BerriAI/litellm/issues/7408 * fix(huggingface/chat/handler.py): fix huggingface async completion calls * Correct handling of max_retries=0 to disable AzureOpenAI retries (#7379) * test: fix test --------- Co-authored-by: Minh Duc <phamminhduc0711@gmail.com>	2024-12-25 17:36:30 -08:00
Ishaan Jaff	81be0b4090	(Feat) add `"/v1/batches/{batch_id:path}/cancel" endpoint (#7406 ) * use 1 file for azure batches handling * add cancel_batch endpoint * add a cancel batch on open ai * add cancel_batch endpoint * add cancel batches to test * remove unused imports * test_batches_operations * update test_batches_operations	2024-12-24 20:23:50 -08:00
Krish Dholakia	404bf2974b	Litellm dev 2024 12 20 p1 (#7335 ) * fix(utils.py): e2e azure tts cost tracking working moves tts response obj to include hidden params (allows for litellm call id, etc. to be sent in response headers) ; fixes spend_Tracking_utils logging payload to account for non-base model use-case Fixes https://github.com/BerriAI/litellm/issues/7223 * fix: fix linting errors * build(model_prices_and_context_window.json): add bedrock llama 3.3 Closes https://github.com/BerriAI/litellm/issues/7329 * fix(openai.py): fix return type for sync openai httpx response * test: update test * fix(spend_tracking_utils.py): fix if check * fix(spend_tracking_utils.py): fix if check * test: improve debugging for test * fix: fix import	2024-12-20 21:22:31 -08:00
Ishaan Jaff	c7f14e936a	(code quality) run ruff rule to ban unused imports (#7313 ) * remove unused imports * fix AmazonConverseConfig * fix test * fix import * ruff check fixes * test fixes * fix testing * fix imports	2024-12-19 12:33:42 -08:00
Krish Dholakia	516c2a6a70	Litellm remove circular imports (#7232 ) * fix(utils.py): initial commit to remove circular imports - moves llmproviders to utils.py * fix(router.py): fix 'litellm.EmbeddingResponse' import from router.py ' * refactor: fix litellm.ModelResponse import on pass through endpoints * refactor(litellm_logging.py): fix circular import for custom callbacks literal * fix(factory.py): fix circular imports inside prompt factory * fix(cost_calculator.py): fix circular import for 'litellm.Usage' * fix(proxy_server.py): fix potential circular import with `litellm.Router' * fix(proxy/utils.py): fix potential circular import in `litellm.Router` * fix: remove circular imports in 'auth_checks' and 'guardrails/' * fix(prompt_injection_detection.py): fix router impor t * fix(vertex_passthrough_logging_handler.py): fix potential circular imports in vertex pass through * fix(anthropic_pass_through_logging_handler.py): fix potential circular imports * fix(slack_alerting.py-+-ollama_chat.py): fix modelresponse import * fix(base.py): fix potential circular import * fix(handler.py): fix potential circular ref in codestral + cohere handler's * fix(azure.py): fix potential circular imports * fix(gpt_transformation.py): fix modelresponse import * fix(litellm_logging.py): add logging base class - simplify typing makes it easy for other files to type check the logging obj without introducing circular imports * fix(azure_ai/embed): fix potential circular import on handler.py * fix(databricks/): fix potential circular imports in databricks/ * fix(vertex_ai/): fix potential circular imports on vertex ai embeddings * fix(vertex_ai/image_gen): fix import * fix(watsonx-+-bedrock): cleanup imports * refactor(anthropic-pass-through-+-petals): cleanup imports * refactor(huggingface/): cleanup imports * fix(ollama-+-clarifai): cleanup circular imports * fix(openai_like/): fix impor t * fix(openai_like/): fix embedding handler cleanup imports * refactor(openai.py): cleanup imports * fix(sagemaker/transformation.py): fix import * ci(config.yml): add circular import test to ci/cd	2024-12-14 16:28:34 -08:00
Krish Dholakia	e68bb4e051	Litellm dev 12 12 2024 (#7203 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 47s Details * fix(azure/): support passing headers to azure openai endpoints Fixes https://github.com/BerriAI/litellm/issues/6217 * fix(utils.py): move default tokenizer to just openai hf tokenizer makes network calls when trying to get the tokenizer - this slows down execution time calls * fix(router.py): fix pattern matching router - add generic "" to it as well Fixes issue where generic "" model access group wouldn't show up * fix(pattern_match_deployments.py): match to more specific pattern match to more specific pattern allows setting generic wildcard model access group and excluding specific models more easily * fix(proxy_server.py): fix _delete_deployment to handle base case where db_model list is empty don't delete all router models b/c of empty list Fixes https://github.com/BerriAI/litellm/issues/7196 * fix(anthropic/): fix handling response_format for anthropic messages with anthropic api * fix(fireworks_ai/): support passing response_format + tool call in same message Addresses https://github.com/BerriAI/litellm/issues/7135 * Revert "fix(fireworks_ai/): support passing response_format + tool call in same message" This reverts commit `6a30dc6929`. * test: fix test * fix(replicate/): fix replicate default retry/polling logic * test: add unit testing for router pattern matching * test: update test to use default oai tokenizer * test: mark flaky test * test: skip flaky test	2024-12-13 08:54:03 -08:00
Krish Dholakia	350cfc36f7	Litellm merge pr (#7161 ) * build: merge branch * test: fix openai naming * fix(main.py): fix openai renaming * style: ignore function length for config factory * fix(sagemaker/): fix routing logic * fix: fix imports * fix: fix override	2024-12-10 22:49:26 -08:00
Krish Dholakia	0c0498dd60	Litellm dev 12 07 2024 (#7086 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 11s Details * fix(main.py): support passing max retries to azure/openai embedding integrations Fixes https://github.com/BerriAI/litellm/issues/7003 * feat(team_endpoints.py): allow updating team model aliases Closes https://github.com/BerriAI/litellm/issues/6956 * feat(router.py): allow specifying model id as fallback - skips any cooldown check Allows a default model to be checked if all models in cooldown s/o @micahjsmith * docs(reliability.md): add fallback to specific model to docs * fix(utils.py): new 'is_prompt_caching_valid_prompt' helper util Allows user to identify if messages/tools have prompt caching Related issue: https://github.com/BerriAI/litellm/issues/6784 * feat(router.py): store model id for prompt caching valid prompt Allows routing to that model id on subsequent requests * fix(router.py): only cache if prompt is valid prompt caching prompt prevents storing unnecessary items in cache * feat(router.py): support routing prompt caching enabled models to previous deployments Closes https://github.com/BerriAI/litellm/issues/6784 * test: fix linting errors * feat(databricks/): convert basemodel to dict and exclude none values allow passing pydantic message to databricks * fix(utils.py): ensure all chat completion messages are dict * (feat) Track `custom_llm_provider` in LiteLLMSpendLogs (#7081) * add custom_llm_provider to SpendLogsPayload * add custom_llm_provider to SpendLogs * add custom llm provider to SpendLogs payload * test_spend_logs_payload * Add MLflow to the side bar (#7031) Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * (bug fix) SpendLogs update DB catch all possible DB errors for retrying (#7082) * catch DB_CONNECTION_ERROR_TYPES * fix DB retry mechanism for SpendLog updates * use DB_CONNECTION_ERROR_TYPES in auth checks * fix exp back off for writing SpendLogs * use _raise_failed_update_spend_exception to ensure errors print as NON blocking * test_update_spend_logs_multiple_batches_with_failure * (Feat) Add StructuredOutputs support for Fireworks.AI (#7085) * fix model cost map fireworks ai "supports_response_schema": true, * fix supports_response_schema * fix map openai params fireworks ai * test_map_response_format * test_map_response_format * added deepinfra/Meta-Llama-3.1-405B-Instruct (#7084) * bump: version 1.53.9 → 1.54.0 * fix deepinfra * litellm db fixes LiteLLM_UserTable (#7089) * ci/cd queue new release * fix llama-3.3-70b-versatile * refactor - use consistent file naming convention `AI21/` -> `ai21` (#7090) * fix refactor - use consistent file naming convention * ci/cd run again * fix naming structure * fix use consistent naming (#7092) --------- Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com> Co-authored-by: ali sayyah <ali.sayyah2@gmail.com>	2024-12-08 00:30:33 -08:00
Ishaan Jaff	36e99ebce7	fix use consistent naming (#7092 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 11s Details	2024-12-07 22:01:00 -08:00

37 commits