Commit graph

11905 commits

Author SHA1 Message Date
Krish Dholakia
a42f008cd0 Litellm dev 12 12 2024 (#7203)
* fix(azure/): support passing headers to azure openai endpoints

Fixes https://github.com/BerriAI/litellm/issues/6217

* fix(utils.py): move default tokenizer to just openai

hf tokenizer makes network calls when trying to get the tokenizer - this slows down call execution time

* fix(router.py): fix pattern matching router - add generic "*" to it as well

Fixes issue where generic "*" model access group wouldn't show up

* fix(pattern_match_deployments.py): match to more specific pattern

match to more specific pattern

allows setting generic wildcard model access group and excluding specific models more easily

* fix(proxy_server.py): fix _delete_deployment to handle base case where db_model list is empty

don't delete all router models because of an empty list

Fixes https://github.com/BerriAI/litellm/issues/7196

* fix(anthropic/): fix handling response_format for anthropic messages with anthropic api

* fix(fireworks_ai/): support passing response_format + tool call in same message

Addresses https://github.com/BerriAI/litellm/issues/7135

* Revert "fix(fireworks_ai/): support passing response_format + tool call in same message"

This reverts commit 6a30dc6929.

* test: fix test

* fix(replicate/): fix replicate default retry/polling logic

* test: add unit testing for router pattern matching

* test: update test to use default oai tokenizer

* test: mark flaky test

* test: skip flaky test
2024-12-13 08:54:03 -08:00
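
The first bullet of #7203 above adds header passthrough for Azure OpenAI endpoints. A minimal usage sketch, assuming `extra_headers` is the kwarg that gets forwarded to the Azure deployment (the deployment name, api_base, and key below are placeholders):

```python
import litellm

# Sketch: forward custom headers to an Azure OpenAI deployment.
response = litellm.completion(
    model="azure/my-gpt-4o-deployment",               # hypothetical deployment name
    api_base="https://my-resource.openai.azure.com",  # placeholder resource
    api_version="2024-08-01-preview",
    api_key="AZURE_API_KEY_HERE",
    messages=[{"role": "user", "content": "Hello"}],
    extra_headers={"X-Correlation-Id": "abc-123"},    # passed through to the Azure endpoint
)
print(response.choices[0].message.content)
```
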
Ishaan Jaff
e65f990319 bump: version 1.55.0 → 1.55.1 2024-12-12 20:50:45 -08:00
Ishaan Jaff
b56e29db36 (fix) latency fix - revert prompt caching check on litellm router (#7211)
* attempt to fix latency issue

* fix latency issues for router prompt caching
2024-12-12 20:50:16 -08:00
Ishaan Jaff
01b20f0bb8 (minor fix proxy) Clarify Proxy Rate limit errors are showing hash of litellm virtual key (#7210)
* clarify that rate limit errors show the hash of the litellm virtual key

* fix constants.py

* update test

* fix test parallel limiter
2024-12-12 20:13:14 -08:00
Ishaan Jaff
b1c3e2d4ef (feat) UI - Disable Usage Tab once SpendLogs is 1M+ Rows (#7208)
* use utils to set proxy spend logs row count

* store proxy state variables

* fix check for _has_user_setup_sso

* fix proxyStateVariables

* fix dup code

* rename getProxyUISettings

* add fixes

* ui emit num spend logs rows

* test_proxy_server_prisma_setup

* move MAX_SPENDLOG_ROWS_TO_QUERY to constants

* test_get_ui_settings_spend_logs_threshold
2024-12-12 18:43:17 -08:00
Ishaan Jaff
8c7605a164 fix: Support WebP image format and avoid token calculation error (#7182)
* fix get_image_dimensions

* attempt without pillow

* add clear type hints

* fix run_async_function_within_sync_function

* fix calculate_img_tokens

* fix is_prompt_caching_valid_prompt

* fix naming

* fix calculate_img_tokens

* fix unused imports

* fix calculate_img_tokens

* test test_is_prompt_caching_enabled_error_handling

* test_is_prompt_caching_enabled_return_default_image_dimensions

* fix openai_token_counter

* fix get_image_dimensions

* test_token_counter_with_image_url_with_detail_high

* test_img_url_token_counter

* fix test utils

* fix testing

* test_is_prompt_caching_enabled
2024-12-12 14:32:39 -08:00
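
The WebP fix in #7182 above touches get_image_dimensions and the image token counter. A small sketch of counting tokens for a vision-style message that references a WebP image, assuming `litellm.token_counter` accepts OpenAI-format image_url content (the URL is a placeholder):

```python
import litellm

# Sketch: estimate prompt tokens for a message carrying a WebP image URL.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {
                "type": "image_url",
                "image_url": {"url": "https://example.com/photo.webp", "detail": "high"},
            },
        ],
    }
]

num_tokens = litellm.token_counter(model="gpt-4o", messages=messages)
print(f"estimated prompt tokens: {num_tokens}")
```
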
Ishaan Jaff
0862a233be (feat) add error_code, error_class, llm_provider to StandardLoggingPayload (#7200)
* add StandardLoggingPayloadErrorInformation to error

* test_get_error_information
2024-12-12 12:18:10 -08:00
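
The error fields from #7200 above land on the StandardLoggingPayload handed to logging callbacks. A hedged sketch of reading them from a custom logger, assuming the payload is exposed as `kwargs["standard_logging_object"]` and the new fields sit under an `error_information` key (field names follow the commit wording, not a verified schema):

```python
import litellm
from litellm.integrations.custom_logger import CustomLogger


class ErrorInfoLogger(CustomLogger):
    async def async_log_failure_event(self, kwargs, response_obj, start_time, end_time):
        # Assumption: the standard logging payload carries the new error fields.
        payload = kwargs.get("standard_logging_object") or {}
        error_info = payload.get("error_information") or {}
        print(
            "llm_provider=", error_info.get("llm_provider"),
            "error_class=", error_info.get("error_class"),
            "error_code=", error_info.get("error_code"),
        )


litellm.callbacks = [ErrorInfoLogger()]  # register the callback
```
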
Ishaan Jaff
02fc8d8738 (Feat) DataDog Logger - Add HOSTNAME and POD_NAME to DataDog logs (#7189)
* add unit test for test_datadog_static_methods

* docs dd vars

* test_datadog_payload_environment_variables

* test_datadog_static_methods

* docs env vars

* fix table
2024-12-12 12:06:26 -08:00
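
The DataDog change in #7189 above attaches HOSTNAME and POD_NAME to every log record. A sketch of wiring it up from Python, assuming the logger is enabled via `litellm.success_callback` together with the usual DD_API_KEY / DD_SITE variables (all values are placeholders):

```python
import os
import litellm

os.environ["DD_API_KEY"] = "dd-api-key-here"
os.environ["DD_SITE"] = "us5.datadoghq.com"
os.environ["HOSTNAME"] = "litellm-proxy-0"   # now included in DataDog payloads
os.environ["POD_NAME"] = "litellm-pod-abc"   # now included in DataDog payloads

litellm.success_callback = ["datadog"]

litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "ping"}],
    mock_response="pong",  # avoid a real provider call while testing logging
)
```
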
Ishaan Jaff
2185587b4d (feat) add response_time to StandardLoggingPayload - logged on datadog, gcs_bucket, s3_bucket etc (#7199)
* feat - add response_time to slp

* test_get_response_time

* docs slp

* fix test_datadog_logging_http_request
2024-12-12 12:04:43 -08:00
Ishaan Jaff
f1fa2d3fef ci/cd run release pipeline 2024-12-12 10:48:47 -08:00
Ishaan Jaff
bd8f39419d fix hf failing streaming test 2024-12-12 10:48:00 -08:00
Krish Dholakia
a9aeb21d0b fix(acompletion): support fallbacks on acompletion (#7184)
* fix(acompletion): support fallbacks on acompletion

allows health checks for wildcard routes to use fallback models

* test: update cohere generate api testing

* add max tokens to health check (#7000)

* fix: fix health check test

* test: update testing

---------

Co-authored-by: Cameron <561860+wallies@users.noreply.github.com>
2024-12-11 19:20:54 -08:00
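
The fix in #7184 above lets async calls use fallbacks, e.g. for wildcard-route health checks. A usage sketch, assuming `fallbacks` accepts a list of backup model names on acompletion (model names are illustrative):

```python
import asyncio
import litellm


async def main():
    # Sketch: if the primary model errors, litellm retries against the fallbacks in order.
    response = await litellm.acompletion(
        model="gpt-4o",
        messages=[{"role": "user", "content": "health check"}],
        fallbacks=["gpt-4o-mini", "claude-3-haiku-20240307"],
    )
    print(response.choices[0].message.content)


asyncio.run(main())
```
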
Krrish Dholakia
5fe77499d2 build(model_prices_and_context_window.json): add new dbrx llama 3.3 model
fixes llama cost calc on databricks
2024-12-11 13:01:22 -08:00
Ishaan Jaff
74917d7b16 fix test_vertexai_model_garden_model_completion 2024-12-11 12:07:32 -08:00
Krish Dholakia
c466f494f2 fix(get_supported_openai_params.py): cleanup (#7176) 2024-12-11 01:15:53 -08:00
Ishaan Jaff
dfba7e7481 fix merge conflicts 2024-12-11 01:11:53 -08:00
Krrish Dholakia
982ef7ca04 build: Squashed commit of https://github.com/BerriAI/litellm/pull/7171
Closes https://github.com/BerriAI/litellm/pull/7171
2024-12-11 01:10:12 -08:00
Ishaan Jaff
7a69e93b33 fix merge conflicts 2024-12-11 01:08:43 -08:00
Ishaan Jaff
a3f8b88228 fix - handle merge conflicts 2024-12-11 01:06:40 -08:00
Krrish Dholakia
fd97b9d966 build: Squashed commit of https://github.com/BerriAI/litellm/pull/7170
Closes https://github.com/BerriAI/litellm/pull/7170
2024-12-11 01:03:57 -08:00
Krrish Dholakia
6493eaf2ee build: Squashed commit of https://github.com/BerriAI/litellm/pull/7165
Closes https://github.com/BerriAI/litellm/pull/7165
2024-12-11 01:00:33 -08:00
Ishaan Jaff
b79db3616c (Refactor) Code Quality improvement - rename text_completion_codestral.py -> codestral/completion/ (#7172)
* rename files

* fix codestral fim organization

* fix CodestralTextCompletionConfig

* fix import CodestralTextCompletion

* fix BaseLLM

* fix imports

* fix CodestralTextCompletionConfig

* fix imports CodestralTextCompletion
2024-12-11 00:55:47 -08:00
Ishaan Jaff
3afd7be40d Code Quality Improvement - move aleph_alpha to deprecated_providers (#7168)
* move aleph alpha to deprecated providers

* fix import location

* fix aleph_alpha

* pytest skip

* undo change to test file
2024-12-11 00:50:40 -08:00
Ishaan Jaff
e09d3761d8 Code Quality Improvement - use vertex_ai/ as folder name for vertexAI (#7166)
* fix rename vertex ai

* run ci/cd again
2024-12-11 00:32:41 -08:00
Ishaan Jaff
26918487d6 (Refactor) Code Quality improvement - remove /prompt_templates/ , base_aws_llm.py from /llms folder (#7164)
* fix move base_aws_llm

* fix import

* update enforce llms folder style

* move prompt_templates

* update prompt_templates location

* fix imports

* fix imports

* fix imports

* fix imports

* fix checks
2024-12-11 00:02:46 -08:00
Ishaan Jaff
76a00247ea Code Quality Improvement - remove tokenizers/ from /llms (#7163)
* move tokenizers out of /llms

* use updated tokenizers location

* fix test_google_secret_manager_read_in_memory
2024-12-10 23:50:15 -08:00
Krish Dholakia
93000bd8d3 Litellm merge pr (#7161)
* build: merge branch

* test: fix openai naming

* fix(main.py): fix openai renaming

* style: ignore function length for config factory

* fix(sagemaker/): fix routing logic

* fix: fix imports

* fix: fix override
2024-12-10 22:49:26 -08:00
Krish Dholakia
cd9b92b402 Litellm vllm refactor (#7158)
* refactor(vllm/): move vllm to use base llm config

* test: mark flaky test
2024-12-10 21:48:35 -08:00
Krish Dholakia
e9fbefca5d Litellm ollama refactor (#7162)
* refactor(ollama/): refactor ollama `/api/generate` to use base llm config

Addresses https://github.com/andrewyng/aisuite/issues/113#issuecomment-2512369132

* test: skip unresponsive test

* test(test_secret_manager.py): mark flaky test

* test: fix google sm test

* fix: fix init.py
2024-12-10 21:45:35 -08:00
Krish Dholakia
6c6834dde7 Revert "LiteLLM Common Base LLM Config (pt.4): Move Ollama to Base LLM Config…" (#7160)
This reverts commit 40a22eb4c6.
2024-12-10 21:44:54 -08:00
Ishaan Jaff
91581bc2db Code Quality Improvement - remove file_apis, fine_tuning_apis from /llms (#7156)
* remove files_apis from /llms

* fix imports

* move fine tuning api from /llms

* fix importing fine tuning handlers

* fix imports
2024-12-10 21:44:25 -08:00
Krish Dholakia
71eaedac6f LiteLLM Common Base LLM Config (pt.4): Move Ollama to Base LLM Config (#7157)
* refactor(ollama/): refactor ollama `/api/generate` to use base llm config

Addresses https://github.com/andrewyng/aisuite/issues/113#issuecomment-2512369132

* test: skip unresponsive test

* test(test_secret_manager.py): mark flaky test

* test: fix google sm test
2024-12-10 21:39:28 -08:00
Ishaan Jaff
d912e562ac remove symlink (#7155) 2024-12-10 21:04:21 -08:00
Ishaan Jaff
0cecff016e fix import 2024-12-10 20:26:16 -08:00
Ishaan Jaff
5ad57dd54b rename llms/OpenAI/ -> llms/openai/ (#7154)
* rename OpenAI -> openai

* fix file rename

* fix rename changes

* fix organization of openai/transcription

* fix import OA fine tuning API

* fix openai ft handler

* fix handler import
2024-12-10 20:14:07 -08:00
Krish Dholakia
61afdab228 refactor(sagemaker/): separate chat + completion routes + make them b… (#7151)
* refactor(sagemaker/): separate chat + completion routes + make them both use base llm config

Addresses https://github.com/andrewyng/aisuite/issues/113#issuecomment-2512369132

* fix(main.py): pass hf model name + custom prompt dict to litellm params
2024-12-10 19:40:05 -08:00
Krish Dholakia
df12f87a64 LiteLLM Common Base LLM Config (pt.3): Move all OAI compatible providers to base llm config (#7148)
* refactor(fireworks_ai/): inherit from openai like base config

refactors fireworks ai to use a common config

* test: fix import in test

* refactor(watsonx/): refactor watsonx to use llm base config

refactors chat + completion routes to base config path

* fix: fix linting error

* refactor: inherit base llm config for oai compatible routes

* test: fix test

* test: fix test
2024-12-10 17:12:42 -08:00
Krish Dholakia
4eeaaeeacd refactor(fireworks_ai/): inherit from openai like base config (#7146)
* refactor(fireworks_ai/): inherit from openai like base config

refactors fireworks ai to use a common config

* test: fix import in test

* refactor(watsonx/): refactor watsonx to use llm base config

refactors chat + completion routes to base config path

* fix: fix linting error

* test: fix test

* fix: fix test
2024-12-10 16:15:19 -08:00
Ishaan Jaff
6a9225fac2 (Refactor) Code Quality improvement - stop redefining LiteLLMBase (#7147)
* fix stop redefining LiteLLMBase

* use better name for base pydantic obj
2024-12-10 15:49:01 -08:00
Ishaan Jaff
0df4dc51de (Refactor) Code Quality improvement - Use Common base handler for anthropic_text/ (#7143)
* add anthropic text provider

* add ANTHROPIC_TEXT to LlmProviders

* fix anthropic text implementation

* working anthropic text claude-2

* test_acompletion_claude2_stream

* add param mapping for anthropic text

* fix unused imports

* fix anthropic completion handler.py
2024-12-10 12:23:58 -08:00
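
The anthropic_text refactor in #7143 above covers Anthropic's legacy text-completion route used by claude-2. A usage sketch, assuming claude-2 still resolves to that route and ANTHROPIC_API_KEY is set in the environment:

```python
import litellm

# Sketch: claude-2 goes through the anthropic text (legacy /v1/complete) handler.
response = litellm.completion(
    model="claude-2",
    messages=[{"role": "user", "content": "Say hello in one word."}],
    max_tokens=10,
)
print(response.choices[0].message.content)

# Streaming flows through the same handler (cf. test_acompletion_claude2_stream).
for chunk in litellm.completion(
    model="claude-2",
    messages=[{"role": "user", "content": "Count to three."}],
    stream=True,
):
    print(chunk.choices[0].delta.content or "", end="")
```
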
Ishaan Jaff
1b377d5229 (Refactor) Code Quality improvement - Use Common base handler for Cohere /generate API (#7122)
* use validate_environment in common utils

* use transform request / response for cohere

* remove unused file

* use cohere base_llm_http_handler

* working cohere generate api on llm http handler

* streaming cohere generate api

* fix get_model_response_iterator

* fix streaming handler

* fix get_model_response_iterator

* test_cohere_generate_api_completion

* fix linting error

* fix testing cohere raising error

* fix get_model_response_iterator type

* add testing cohere generate api
2024-12-10 10:44:42 -08:00
Ishaan Jaff
9c2316b7ec (Refactor) Code Quality improvement - Use Common base handler for cloudflare/ provider (#7127)
* add get_complete_url to base config

* cloudflare - refactor to follow the existing pattern

* migrate cloudflare chat completions to base llm http handler

* fix unused import

* fix fake stream in cloudflare

* fix cloudflare transformation

* fix naming for BaseModelResponseIterator

* add async cloudflare streaming test

* test cloudflare

* add handler.py

* add handler.py in cohere handler.py
2024-12-10 10:12:22 -08:00
Ishaan Jaff
28ff38e35d (Refactor) Code Quality improvement - Use Common base handler for clarifai/ (#7125)
* use base_llm_http_handler for clarifai

* fix clarifai completion

* handle faking streaming base llm http handler

* add fake streaming for clarifai

* add FakeStreamResponseIterator for base model iterator

* fix get_model_response_iterator

* fix base model iterator

* fix base model iterator

* add support for faking sync streams for clarifai

* add fake streaming for clarifai

* remove unused code

* fix import

* fix llm http handler

* test_async_completion_clarifai

* fix clarifai tests

* fix linting
2024-12-09 21:04:48 -08:00
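
Several of these refactors (#7125, #7127) fake streaming for providers without a native streaming endpoint: the full response is fetched once and replayed as a single chunk. A hypothetical sketch of the idea, named after the FakeStreamResponseIterator mentioned in the commit but not the actual litellm implementation:

```python
from typing import Any, Dict, Iterator


class FakeStreamResponseIterator:
    """Sketch: wrap a complete (non-streaming) response and expose it as a
    one-chunk stream, so stream=True callers work for providers with no
    native streaming endpoint."""

    def __init__(self, complete_response: Dict[str, Any]):
        self.complete_response = complete_response
        self._done = False

    def __iter__(self) -> Iterator[Dict[str, Any]]:
        return self

    def __next__(self) -> Dict[str, Any]:
        if self._done:
            raise StopIteration
        self._done = True
        # Emit the full text as a single streaming-style chunk.
        content = self.complete_response["choices"][0]["message"]["content"]
        return {"choices": [{"delta": {"content": content}}]}


# Usage sketch
fake = FakeStreamResponseIterator(
    {"choices": [{"message": {"content": "hello from a non-streaming provider"}}]}
)
for chunk in fake:
    print(chunk["choices"][0]["delta"]["content"])
```
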
Ishaan Jaff
c5e0407703 (Refactor) Code Quality improvement - use Common base handler for Cohere (#7117)
* fix use new format for Cohere config

* fix base llm http handler

* Litellm code qa common config (#7116)

* feat(base_llm): initial commit for common base config class

Addresses code qa critique https://github.com/andrewyng/aisuite/issues/113#issuecomment-2512369132

* feat(base_llm/): add transform request/response abstract methods to base config class

---------

Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>

* use base transform helpers

* use base_llm_http_handler for cohere

* working cohere using base llm handler

* add async cohere chat completion support on base handler

* fix completion code

* working sync cohere stream

* add async support cohere_chat

* fix types get_model_response_iterator

* async / sync tests cohere

* feat cohere using base llm class

* fix linting errors

* fix _abc error

* add cohere params to transformation

* remove old cohere file

* fix type error

* fix merge conflicts

* fix cohere merge conflicts

* fix linting error

* fix litellm.llms.custom_httpx.http_handler.HTTPHandler.post

* fix passing cohere specific params

---------

Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
2024-12-09 17:45:29 -08:00
Krish Dholakia
501885d653 Litellm code qa common config (#7113)
* feat(base_llm): initial commit for common base config class

Addresses code qa critique https://github.com/andrewyng/aisuite/issues/113#issuecomment-2512369132

* feat(base_llm/): add transform request/response abstract methods to base config class

* feat(cohere-+-clarifai): refactor integrations to use common base config class

* fix: fix linting errors

* refactor(anthropic/): move anthropic + vertex anthropic to use base config

* test: fix xai test

* test: fix tests

* fix: fix linting errors

* test: comment out WIP test

* fix(transformation.py): fix is pdf used check

* fix: fix linting error
2024-12-09 15:58:25 -08:00
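
The common base config introduced in #7113/#7116 above gives every provider one interface for request/response transformation, driven by a shared HTTP handler. A hypothetical sketch of that shape (class and method names follow the commit wording, not necessarily the exact litellm signatures):

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional


class BaseLLMConfig(ABC):
    """Sketch of the common base config each provider implements."""

    @abstractmethod
    def validate_environment(self, headers: dict, model: str, api_key: Optional[str]) -> dict:
        """Return the auth headers required for this provider."""

    @abstractmethod
    def transform_request(self, model: str, messages: List[dict], optional_params: dict) -> dict:
        """Map OpenAI-style inputs to the provider's request body."""

    @abstractmethod
    def transform_response(self, model: str, raw_response: Dict[str, Any]) -> Dict[str, Any]:
        """Map the provider's raw response back to an OpenAI-style response."""


class CohereChatConfig(BaseLLMConfig):
    """Illustrative only: each provider supplies its own mappings."""

    def validate_environment(self, headers, model, api_key):
        return {**headers, "Authorization": f"bearer {api_key}"}

    def transform_request(self, model, messages, optional_params):
        return {"model": model, "message": messages[-1]["content"], **optional_params}

    def transform_response(self, model, raw_response):
        return {
            "choices": [
                {"message": {"role": "assistant", "content": raw_response.get("text", "")}}
            ]
        }
```
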
Krish Dholakia
70c4e1b4d2 Litellm dev 12 07 2024 (#7086)
* fix(main.py): support passing max retries to azure/openai embedding integrations

Fixes https://github.com/BerriAI/litellm/issues/7003

* feat(team_endpoints.py): allow updating team model aliases

Closes https://github.com/BerriAI/litellm/issues/6956

* feat(router.py): allow specifying model id as fallback - skips any cooldown check

Allows a default model to be checked if all models are in cooldown

s/o @micahjsmith

* docs(reliability.md): add fallback to specific model to docs

* fix(utils.py): new 'is_prompt_caching_valid_prompt' helper util

Allows user to identify if messages/tools have prompt caching

Related issue: https://github.com/BerriAI/litellm/issues/6784

* feat(router.py): store model id for prompt caching valid prompt

Allows routing to that model id on subsequent requests

* fix(router.py): only cache if prompt is valid prompt caching prompt

prevents storing unnecessary items in cache

* feat(router.py): support routing prompt caching enabled models to previous deployments

Closes https://github.com/BerriAI/litellm/issues/6784

* test: fix linting errors

* feat(databricks/): convert basemodel to dict and exclude none values

allow passing pydantic message to databricks

* fix(utils.py): ensure all chat completion messages are dict

* (feat) Track `custom_llm_provider` in LiteLLMSpendLogs (#7081)

* add custom_llm_provider to SpendLogsPayload

* add custom_llm_provider to SpendLogs

* add custom llm provider to SpendLogs payload

* test_spend_logs_payload

* Add MLflow to the side bar (#7031)

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* (bug fix) SpendLogs update DB catch all possible DB errors for retrying  (#7082)

* catch DB_CONNECTION_ERROR_TYPES

* fix DB retry mechanism for SpendLog updates

* use DB_CONNECTION_ERROR_TYPES in auth checks

* fix exponential backoff for writing SpendLogs

* use _raise_failed_update_spend_exception to ensure errors print as NON blocking

* test_update_spend_logs_multiple_batches_with_failure

* (Feat) Add StructuredOutputs support for Fireworks.AI (#7085)

* fix model cost map fireworks ai "supports_response_schema": true,

* fix supports_response_schema

* fix map openai params fireworks ai

* test_map_response_format

* test_map_response_format

* added deepinfra/Meta-Llama-3.1-405B-Instruct (#7084)

* bump: version 1.53.9 → 1.54.0

* fix deepinfra

* litellm db fixes LiteLLM_UserTable (#7089)

* ci/cd queue new release

* fix llama-3.3-70b-versatile

* refactor - use consistent file naming convention `AI21/` -> `ai21`  (#7090)

* fix refactor - use consistent file naming convention

* ci/cd run again

* fix naming structure

* fix use consistent naming (#7092)

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com>
Co-authored-by: ali sayyah <ali.sayyah2@gmail.com>
2024-12-08 00:30:33 -08:00
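
The first bullet of #7086 above adds max-retry support to the Azure/OpenAI embedding integrations. A usage sketch, assuming `max_retries` is accepted on `litellm.embedding` as the commit describes (deployment name, api_base, and key are placeholders):

```python
import litellm

# Sketch: retry transient failures on the embedding call itself.
embedding = litellm.embedding(
    model="azure/text-embedding-3-large",              # hypothetical deployment name
    api_base="https://my-resource.openai.azure.com",   # placeholder resource
    api_key="AZURE_API_KEY_HERE",
    input=["litellm supports retries on embeddings"],
    max_retries=3,                                     # forwarded to the Azure/OpenAI client
)
print(len(embedding.data[0]["embedding"]))
```
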
Ishaan Jaff
664d82ca9e fix use consistent naming (#7092) 2024-12-07 22:01:00 -08:00
Ishaan Jaff
249506065e refactor - use consistent file naming convention AI21/ -> ai21 (#7090)
* fix refactor - use consistent file naming convention

* ci/cd run again

* fix naming structure
2024-12-07 21:46:34 -08:00
Ishaan Jaff
2a35de0868 fix llama-3.3-70b-versatile 2024-12-07 20:19:02 -08:00
Ishaan Jaff
f34bde2eca ci/cd queue new release 2024-12-07 19:09:57 -08:00