Commit graph

3942 commits

Author SHA1 Message Date
Krish Dholakia
51f9f75c85 LiteLLM Minor Fixes & Improvements (12/23/2024) - P2 (#7386)
* fix(main.py): support 'mock_timeout=true' param

allows mock requests on proxy to have a time delay, for testing

* fix(main.py): ensure mock timeouts raise litellm.Timeout error

triggers retry/fallbacks

* fix: fix fallback + mock timeout testing

* fix(router.py): always return remaining tpm/rpm limits, if limits are known

allows for rate limit headers to be guaranteed

* docs(timeout.md): add docs on mock timeout = true

* fix(main.py): fix linting errors

* test: fix test
2024-12-23 17:41:27 -08:00
Krish Dholakia
a89b0d5c39 Litellm dev 12 23 2024 p1 (#7383)
* feat(guardrails_endpoint.py): new `/guardrails/list` endpoint

Allow users to view what the available guardrails are

* docs: document new `/guardrails/list` endpoint

* docs(enterprise.md): update docs

* fix(openai/transcription/handler.py): support cost tracking on vtt + srt formats

* fix(openai/transcriptions/handler.py): default to 'verbose_json' response format if 'text' or 'json' response_format received. ensures 'duration' param is received for all audio transcription requests

* fix: fix linting errors

* fix: remove unused import
2024-12-23 16:33:31 -08:00
Ishaan Jaff
82b298acac (security fix) - update base image for all docker images to python:3.13.1-slim (#7388)
* update base image for all docker files

* remove unused files

* fix sec vuln
2024-12-23 16:20:47 -08:00
Krish Dholakia
5c34870edf Document team admins + Enforce assigning team admins as an enterprise feature (#7359)
* fix(team_endpoints.py): enforce assigning team admins as an enterprise feature

* fix(proxy/_types.py): fix common proxy error to link to trial key

* fix: fix linting errors
2024-12-21 20:28:31 -08:00
Krish Dholakia
ae7f54498f Litellm enforce enterprise features (#7357)
* fix(proxy_server.py): enforce team id based model add only works if enterprise user

* fix(auth_checks.py): enforce common_checks can only be imported by user_api_key_auth.py

* fix(auth_checks.py): insert not premium user error message on failed common checks run
2024-12-21 19:14:13 -08:00
Ishaan Jaff
26f93faa40 ui - new build 2024-12-21 15:01:17 -08:00
Ishaan Jaff
1d7780d458 apply linting fixes 2024-12-21 14:31:23 -08:00
Ishaan Jaff
49ea75830a (Admin UI) correctly render provider name in /models with wildcard routing (#7349)
* ui fix - allow searching model list + fix bug on filtering

* qa fix - use correct provider name for azure_text

* ui wrap content onto next line

* ui fix - allow selecting current UI session when logging in

* ui session budgets

* ui show provider models on wildcard models

* test provider name appears in model list

* ui fix auto scroll on chat ui tab
2024-12-21 14:19:12 -08:00
Ishaan Jaff
23d277f167 (chore) - enforce model budgets on virtual keys as enterprise feature (#7353)
* docs - enforce model budget as enterprise feature

* docs link to correct place
2024-12-21 14:18:53 -08:00
Ishaan Jaff
d80307b7bb (refactor) - fix from enterprise.utils import ui_get_spend_by_tags (#7352)
* ui - refactor ui_get_spend_by_tags

* fix typing
2024-12-21 14:17:12 -08:00
Ishaan Jaff
4d9eaa3531 (Admin UI) - Test Key Tab - Allow using UI Session instead of manually creating a virtual key (#7348)
* ui fix - allow searching model list + fix bug on filtering

* qa fix - use correct provider name for azure_text

* ui wrap content onto next line

* ui fix - allow selecting current UI session when logging in

* ui session budgets
2024-12-21 13:14:15 -08:00
Ishaan Jaff
2d2c30b72b (Admin UI) - Test Key Tab - Allow typing in model name + Add wrapping for text response (#7347)
* ui fix - allow searching model list + fix bug on filtering

* qa fix - use correct provider name for azure_text

* ui wrap content onto next line
2024-12-21 13:14:01 -08:00
Ishaan Jaff
451215b106 (fix) LiteLLM Proxy fix GET /files/{file_id:path}/content endpoint (#7342)
* fix order of get_file_content

* update e2e files tests

* add e2e batches endpoint testing

* update config.yml

* write content to file

* use correct oai_misc_config

* fixes for openai batches endpoint testing

* remove extra out file

* fix input.jsonl
2024-12-20 21:27:45 -08:00
Krish Dholakia
4f3ddebf81 Litellm dev 2024 12 20 p1 (#7335)
* fix(utils.py): e2e azure tts cost tracking working

moves tts response obj to include hidden params (allows for litellm call id, etc. to be sent in response headers); fixes spend_tracking_utils logging payload to account for non-base model use-case

Fixes https://github.com/BerriAI/litellm/issues/7223

* fix: fix linting errors

* build(model_prices_and_context_window.json): add bedrock llama 3.3

Closes https://github.com/BerriAI/litellm/issues/7329

* fix(openai.py): fix return type for sync openai httpx response

* test: update test

* fix(spend_tracking_utils.py): fix if check

* fix(spend_tracking_utils.py): fix if check

* test: improve debugging for test

* fix: fix import
2024-12-20 21:22:31 -08:00
Krish Dholakia
61b4c41c3c Litellm dev 12 20 2024 p3 (#7339)
* fix(proxy_track_cost_callback.py): log to db if only end user param given

* fix: allows for jwt-auth based end user id spend tracking to work

* fix(utils.py): fix 'get_end_user_id_for_cost_tracking' to use 'user_api_key_end_user_id'

more stable - works with jwt-auth based end user tracking as well

* test(test_jwt.py): add e2e unit test to confirm end user cost tracking works for spend logs

* test: update test to use end_user api key hash param

* fix(langfuse.py): support end user cost tracking via jwt auth + langfuse

logs end user to langfuse if decoded from jwt token

* fix: fix linting errors

* test: fix test

* test: fix test

* fix: fix end user id extraction

* fix: run test earlier
2024-12-20 21:13:32 -08:00
Krish Dholakia
e8d2cb4935 Litellm dev 12 19 2024 p2 (#7315)
* fix(proxy_server.py): only update k,v pair if v is not empty/null

Fixes https://github.com/BerriAI/litellm/issues/6787

* test(test_router.py): cleanup duplicate calls

* test: add new test stream options drop params test

* test: update optional params / stream options test to test for vertex ai mistral route specifically

Addresses https://github.com/BerriAI/litellm/issues/7309

* fix(proxy_server.py): fix linting errors

* fix: fix linting errors
2024-12-19 20:28:16 -08:00
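Note on e8d2cb4935: the first fix above describes updating a key/value pair only when the value is not empty or null. A minimal Python sketch of that idea follows, assuming a plain dict merge. The helper name `update_non_empty` and the exact definition of "empty" (None, or an empty string or container) are illustrative assumptions, not code taken from proxy_server.py.

```python
from typing import Any, Dict


def update_non_empty(target: Dict[str, Any], updates: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `target` with `updates` applied, skipping empty/null values.

    Hypothetical helper for illustration only.
    """
    merged = dict(target)
    for key, value in updates.items():
        # Assumption: None and empty strings/containers mean "no value supplied".
        # 0 and False are kept because they can be meaningful settings.
        is_empty_container = isinstance(value, (str, list, dict, tuple, set)) and len(value) == 0
        if value is None or is_empty_container:
            continue
        merged[key] = value
    return merged
```

With this rule, an incoming null or empty value leaves the existing setting in place instead of overwriting it.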
Ishaan Jaff
62a1cdec47 (code quality) run ruff rule to ban unused imports (#7313)
* remove unused imports

* fix AmazonConverseConfig

* fix test

* fix import

* ruff check fixes

* test fixes

* fix testing

* fix imports
2024-12-19 12:33:42 -08:00
Ishaan Jaff
acffdab966 fix use 1 file _PROXY_track_cost_callback (#7304) 2024-12-18 19:46:26 -08:00
Ishaan Jaff
087ab41d66 (proxy admin ui) - show Teams sorted by Team Alias (#7296)
* ui - sort teams by team alias

* test assert /user/info returns teams in a sorted order

* fix team_alias check on team
2024-12-18 19:43:19 -08:00
Ishaan Jaff
6220e17ebf (feat proxy) v2 - model max budgets (#7302)
* clean up unused code

* add _PROXY_VirtualKeyModelMaxBudgetLimiter

* adjust type imports

* working _PROXY_VirtualKeyModelMaxBudgetLimiter

* fix user_api_key_model_max_budget

* fix user_api_key_model_max_budget

* update naming

* update naming

* fix changes to RouterBudgetLimiting

* test_call_with_key_over_model_budget

* test_call_with_key_over_model_budget

* handle _get_request_model_budget_config

* e2e test for test_call_with_key_over_model_budget

* clean up test

* run ci/cd again

* add validate_model_max_budget

* docs fix

* update doc

* add e2e testing for _PROXY_VirtualKeyModelMaxBudgetLimiter

* test_unit_test_max_model_budget_limiter.py
2024-12-18 19:42:46 -08:00
Ishaan Jaff
70883bc1b8 (feat - proxy) Add status_code to litellm_proxy_total_requests_metric_total (#7293)
* fix _select_model_name_for_cost_calc docstring

* add STATUS_CODE to prometheus

* test prometheus unit tests

* test_prometheus_unit_tests.py

* update Proxy Level Tracking Metrics docs

* fix test_proxy_failure_metrics

* fix test_proxy_failure_metrics
2024-12-18 15:55:02 -08:00
Ishaan Jaff
225e0581a7 Replace deprecated Pydantic Config class with model_config BerriAI/litellm#6902 (#6903) (#7300)
Co-authored-by: Dan Siwiec <daniel.siwiec@gmail.com>
2024-12-18 15:30:08 -08:00
Krish Dholakia
e7918f097b fix(proxy_server.py): pass model access groups to get_key/get_team mo… (#7281)
* fix(proxy_server.py): pass model access groups to get_key/get_team models

allows end user to see actual models they have access to, instead of default models

* fix(auth_checks.py): fix linting errors

* fix: fix linting errors
2024-12-18 09:33:33 -08:00
Krish Dholakia
f966e279a6 LiteLLM Minor Fixes & Improvements (12/16/2024) - p1 (#7263)
* fix(factory.py): skip empty text blocks for bedrock user messages

Fixes https://github.com/BerriAI/litellm/issues/7169

* Add support for Gemini 2.0 GoogleSearch tool (#7257)

* Add support for google_search tool in gemini 2.0

* Add/modify tests

* Fix grounding check

* Remove 2.0 grounding test; exclude experimental model in VERTEX_MODELS_TO_NOT_TEST

* Swap order of tools

* Fix formatting

* fix(get_api_base.py): return api base in streaming response

Fixes https://github.com/BerriAI/litellm/issues/7249

Closes https://github.com/BerriAI/litellm/pull/7250

* fix(cost_calculator.py): only set base model to model if not none

Fixes https://github.com/BerriAI/litellm/issues/7223

* fix(cost_calculator.py): enforce stricter order when picking model for cost calculation

* fix(cost_calculator.py): fix '_select_model_name_for_cost_calc' to return model name with region name prefix if provided

* fix(utils.py): fix 'get_model_info()' to handle edge case where model name starts with custom llm provider AND custom llm provider is given

* fix(cost_calculator.py): handle `custom_llm_provider-` scenario

* fix(cost_calculator.py): e2e working tts cost tracking

ensures initial message is passed in, to cost calculator

* fix(factory.py): suppress linting errors

* fix(cost_calculator.py): strip llm provider from model name after selecting cost calc model

* fix(litellm_logging.py): store initial request in 'input' field + accept base_model to be passed in litellm_params directly

* test: handle none env var value in flaky test

* fix(litellm_logging.py): fix linting errors

---------

Co-authored-by: Sam B <samlingx@gmail.com>
2024-12-17 15:33:36 -08:00
Krish Dholakia
57809cfbf4 Litellm dev 12 17 2024 p2 (#7277)
* fix(openai/transcription/handler.py): call 'log_pre_api_call' on async calls

* fix(openai/transcriptions/handler.py): call 'logging.pre_call' on sync whisper calls as well

* fix(proxy_cli.py): remove default proxy_cli timeout param

gets passed in as a dynamic request timeout and overrides config values

* fix(langfuse.py): pass litellm httpx client - contains ssl certs (#7052)

Fixes https://github.com/BerriAI/litellm/issues/7046
2024-12-17 14:05:14 -08:00
Krish Dholakia
03e711e3e4 LITELLM: Remove requests library usage (#7235)
* fix(generic_api_callback.py): remove requests lib usage

* fix(budget_manager.py): remove requests lib usage

* fix(main.py): cleanup requests lib usage

* fix(utils.py): remove requests lib usage

* fix(argilla.py): fix argilla test

* fix(athina.py): replace 'requests' lib usage with litellm module

* fix(greenscale.py): replace 'requests' lib usage with httpx

* fix: remove unused 'requests' lib import + replace usage in some places

* fix(prompt_layer.py): remove 'requests' lib usage from prompt layer

* fix(ollama_chat.py): remove 'requests' lib usage

* fix(baseten.py): replace 'requests' lib usage

* fix(codestral/): replace 'requests' lib usage

* fix(predibase/): replace 'requests' lib usage

* refactor: cleanup unused 'requests' lib imports

* fix(oobabooga.py): cleanup 'requests' lib usage

* fix(invoke_handler.py): remove unused 'requests' lib usage

* refactor: cleanup unused 'requests' lib import

* fix: fix linting errors

* refactor(ollama/): move ollama to using base llm http handler

removes 'requests' lib dep for ollama integration

* fix(ollama_chat.py): fix linting errors

* fix(ollama/completion/transformation.py): convert non-jpeg/png image to jpeg/png before passing to ollama
2024-12-17 12:50:04 -08:00
Ishaan Jaff
427f2173d2 (feat) Add Bedrock knowledge base pass through endpoints (#7267)
* bugfix: Proxy Routing for Bedrock Knowledgebase URLs are incorrect (#7097)

* Fixing routing bug where bedrock knowledgebase urls were being generated incorrectly

* Preparing for PR

* Preparing for PR

* Preparing for PR

---------

Co-authored-by: Luke Birk <lb0737@att.com>

* fix _is_bedrock_agent_runtime_route

* docs - Query Knowledge Base

* test_is_bedrock_agent_runtime_route

* fix bedrock_proxy_route

---------

Co-authored-by: LBirk <2731718+LBirk@users.noreply.github.com>
Co-authored-by: Luke Birk <lb0737@att.com>
2024-12-16 22:19:34 -08:00
Krish Dholakia
194acfa95c Litellm dev 12 14 2024 p1 (#7231)
* fix(router.py): fix reading + using deployment-specific num retries on router

Fixes https://github.com/BerriAI/litellm/issues/7001

* fix(router.py): ensure 'timeout' in litellm_params overrides any value in router settings

Refactors all routes to use common '_update_kwargs_with_deployment' which has the timeout handling

* fix(router.py): fix timeout check
2024-12-14 22:22:29 -08:00
Ishaan Jaff
2459f9735d (feat) Add Tag-based budgets on litellm router / proxy (#7236)
* add BudgetConfig

* add _get_tags_from_request_kwargs

* test_tag_budgets_e2e_test_expect_to_fail

* add a check for request tags

* fix _async_get_cache_keys_for_router_budget_limiting

* fix test

* fix _sync_in_memory_spend_with_redis

* _async_get_cache_keys_for_router_budget_limiting

* fix _init_tag_budgets

* fix type casting

* docs show error for tag budget limit hit

* fix _get_tags_from_request_kwargs

* fix undo change
2024-12-14 17:28:36 -08:00
Ishaan Jaff
3fdd164fee ui new build 2024-12-14 17:15:31 -08:00
Krish Dholakia
edbf5eeeb3 Litellm remove circular imports (#7232)
* fix(utils.py): initial commit to remove circular imports - moves llmproviders to utils.py

* fix(router.py): fix 'litellm.EmbeddingResponse' import from router.py

* refactor: fix litellm.ModelResponse import on pass through endpoints

* refactor(litellm_logging.py): fix circular import for custom callbacks literal

* fix(factory.py): fix circular imports inside prompt factory

* fix(cost_calculator.py): fix circular import for 'litellm.Usage'

* fix(proxy_server.py): fix potential circular import with `litellm.Router`

* fix(proxy/utils.py): fix potential circular import in `litellm.Router`

* fix: remove circular imports in 'auth_checks' and 'guardrails/'

* fix(prompt_injection_detection.py): fix router import

* fix(vertex_passthrough_logging_handler.py): fix potential circular imports in vertex pass through

* fix(anthropic_pass_through_logging_handler.py): fix potential circular imports

* fix(slack_alerting.py-+-ollama_chat.py): fix modelresponse import

* fix(base.py): fix potential circular import

* fix(handler.py): fix potential circular ref in codestral + cohere handlers

* fix(azure.py): fix potential circular imports

* fix(gpt_transformation.py): fix modelresponse import

* fix(litellm_logging.py): add logging base class - simplify typing

makes it easy for other files to type check the logging obj without introducing circular imports

* fix(azure_ai/embed): fix potential circular import on handler.py

* fix(databricks/): fix potential circular imports in databricks/

* fix(vertex_ai/): fix potential circular imports on vertex ai embeddings

* fix(vertex_ai/image_gen): fix import

* fix(watsonx-+-bedrock): cleanup imports

* refactor(anthropic-pass-through-+-petals): cleanup imports

* refactor(huggingface/): cleanup imports

* fix(ollama-+-clarifai): cleanup circular imports

* fix(openai_like/): fix import

* fix(openai_like/): fix embedding handler

cleanup imports

* refactor(openai.py): cleanup imports

* fix(sagemaker/transformation.py): fix import

* ci(config.yml): add circular import test to ci/cd
2024-12-14 16:28:34 -08:00
Ishaan Jaff
a987a49595 ui new build 2024-12-14 14:16:15 -08:00
Ishaan Jaff
73dcbf8d4e (proxy) - Auth fix, ensure re-using safe request body for checking model field (#7222)
* litellm fix auth check

* fix _read_request_body

* test_auth_with_form_data_and_model

* fix auth check

* fix _read_request_body

* fix _safe_get_request_headers
2024-12-14 12:01:25 -08:00
Krish Dholakia
d02b9a111a fix(main.py): fix retries being multiplied when using openai sdk (#7221)
* fix(main.py): fix retries being multiplied when using openai sdk

Closes https://github.com/BerriAI/litellm/pull/7130

* docs(prompt_management.md): add langfuse prompt management doc

* feat(team_endpoints.py): allow teams to add their own models

Enables teams to call their own finetuned models via the proxy

* test: add better enforcement check testing for `/model/new` now that teams can add their own models

* docs(team_model_add.md): tutorial for allowing teams to add their own models

* test: fix test
2024-12-14 11:56:55 -08:00
Ishaan Jaff
bc46916bb3 (feat - Router / Proxy ) Allow setting budget limits per LLM deployment (#7220)
* fix test_deployment_budget_limits_e2e_test

* refactor async_log_success_event to track spend for provider + deployment

* fix format

* rename class to RouterBudgetLimiting

* rename func

* rename types used for budgets

* add new types for deployment budgets

* add budget limits for deployments

* fix checking budgets set for provider

* update file names

* fix linting error

* _track_provider_remaining_budget_prometheus

* async_filter_deployments

* fix model list passed to router

* update error

* test_deployment_budgets_e2e_test_expect_to_fail

* fix test case

* run deployment budget limits
2024-12-13 19:15:51 -08:00
Krish Dholakia
c3f637012b Litellm dev 12 13 2024 p1 (#7219)
* fix(litellm_logging.py): pass user metadata to langsmith on sdk calls

* fix(litellm_logging.py): pass nested user metadata to logging integration - e.g. langsmith

* fix(exception_mapping_utils.py): catch and clarify watsonx `/text/chat` endpoint not supported error message.

Closes https://github.com/BerriAI/litellm/issues/7213

* fix(watsonx/common_utils.py): accept new 'WATSONX_IAM_URL' env var

allows user to use local watsonx

Fixes https://github.com/BerriAI/litellm/issues/4991

* fix(litellm_logging.py): cleanup unused function

* test: skip bad ibm test
2024-12-13 19:01:28 -08:00
Krish Dholakia
a42f008cd0 Litellm dev 12 12 2024 (#7203)
* fix(azure/): support passing headers to azure openai endpoints

Fixes https://github.com/BerriAI/litellm/issues/6217

* fix(utils.py): move default tokenizer to just openai

hf tokenizer makes network calls when trying to get the tokenizer - this slows down execution time

* fix(router.py): fix pattern matching router - add generic "*" to it as well

Fixes issue where generic "*" model access group wouldn't show up

* fix(pattern_match_deployments.py): match to more specific pattern

allows setting generic wildcard model access group and excluding specific models more easily

* fix(proxy_server.py): fix _delete_deployment to handle base case where db_model list is empty

don't delete all router models b/c of empty list

Fixes https://github.com/BerriAI/litellm/issues/7196

* fix(anthropic/): fix handling response_format for anthropic messages with anthropic api

* fix(fireworks_ai/): support passing response_format + tool call in same message

Addresses https://github.com/BerriAI/litellm/issues/7135

* Revert "fix(fireworks_ai/): support passing response_format + tool call in same message"

This reverts commit 6a30dc6929.

* test: fix test

* fix(replicate/): fix replicate default retry/polling logic

* test: add unit testing for router pattern matching

* test: update test to use default oai tokenizer

* test: mark flaky test

* test: skip flaky test
2024-12-13 08:54:03 -08:00
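Note on a42f008cd0: the pattern_match_deployments.py fix above describes matching a model name to the more specific wildcard pattern, so that a generic "*" group can coexist with narrower ones. The sketch below shows one way to express that rule in Python. It assumes "more specific" means "more literal (non-wildcard) characters"; the function names and that tie-breaking rule are illustrative assumptions and are not taken from the LiteLLM source.

```python
import re
from typing import List, Optional


def pattern_to_regex(pattern: str) -> str:
    """Convert a wildcard pattern such as 'openai/*' into an anchored regex."""
    # re.escape turns '*' into '\*'; swap that back into a regex wildcard.
    return "^" + re.escape(pattern).replace(r"\*", ".*") + "$"


def most_specific_match(model: str, patterns: List[str]) -> Optional[str]:
    """Return the matching pattern with the most literal characters, or None."""
    matches = [p for p in patterns if re.match(pattern_to_regex(p), model)]
    if not matches:
        return None
    # Assumption: fewer wildcard characters and more literal text means more specific.
    return max(matches, key=lambda p: len(p.replace("*", "")))
```

Under this rule a bare "*" still matches every model, but it only wins when no narrower pattern applies.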
Ishaan Jaff
e65f990319 bump: version 1.55.0 → 1.55.1 2024-12-12 20:50:45 -08:00
Ishaan Jaff
01b20f0bb8 (minor fix proxy) Clarify Proxy Rate limit errors are showing hash of litellm virtual key (#7210)
* fix clarify rate limit errors are showing litellm virtual key

* fix constants.py

* update test

* fix test parallel limiter
2024-12-12 20:13:14 -08:00
Ishaan Jaff
b1c3e2d4ef (feat) UI - Disable Usage Tab once SpendLogs is 1M+ Rows (#7208)
* use utils to set proxy spend logs row count

* store proxy state variables

* fix check for _has_user_setup_sso

* fix proxyStateVariables

* fix dup code

* rename getProxyUISettings

* add fixes

* ui emit num spend logs rows

* test_proxy_server_prisma_setup

* use MAX_SPENDLOG_ROWS_TO_QUERY to constants

* test_get_ui_settings_spend_logs_threshold
2024-12-12 18:43:17 -08:00
Ishaan Jaff
8c7605a164 fix: Support WebP image format and avoid token calculation error (#7182)
* fix get_image_dimensions

* attempt without pillow

* add clear type hints

* fix run_async_function_within_sync_function

* fix calculate_img_tokens

* fix is_prompt_caching_valid_prompt

* fix naming

* fix calculate_img_tokens

* fix unused imports

* fix calculate_img_tokens

* test test_is_prompt_caching_enabled_error_handling

* test_is_prompt_caching_enabled_return_default_image_dimensions

* fix openai_token_counter

* fix get_image_dimensions

* test_token_counter_with_image_url_with_detail_high

* test_img_url_token_counter

* fix test utils

* fix testing

* test_is_prompt_caching_enabled
2024-12-12 14:32:39 -08:00
Ishaan Jaff
02fc8d8738 (Feat) DataDog Logger - Add HOSTNAME and POD_NAME to DataDog logs (#7189)
* add unit test for test_datadog_static_methods

* docs dd vars

* test_datadog_payload_environment_variables

* test_datadog_static_methods

* docs env vars

* fix table
2024-12-12 12:06:26 -08:00
Ishaan Jaff
2185587b4d (feat) add response_time to StandardLoggingPayload - logged on datadog, gcs_bucket, s3_bucket etc (#7199)
* feat - add response_time to slp

* test_get_response_time

* docs slp

* fix test_datadog_logging_http_request
2024-12-12 12:04:43 -08:00
Ishaan Jaff
e09d3761d8 Code Quality Improvement - use vertex_ai/ as folder name for vertexAI (#7166)
* fix rename vertex ai

* run ci/cd again
2024-12-11 00:32:41 -08:00
Ishaan Jaff
26918487d6 (Refactor) Code Quality improvement - remove /prompt_templates/ , base_aws_llm.py from /llms folder (#7164)
* fix move base_aws_llm

* fix import

* update enforce llms folder style

* move prompt_templates

* update prompt_templates location

* fix imports

* fix imports

* fix imports

* fix imports

* fix checks
2024-12-11 00:02:46 -08:00
Krish Dholakia
93000bd8d3 Litellm merge pr (#7161)
* build: merge branch

* test: fix openai naming

* fix(main.py): fix openai renaming

* style: ignore function length for config factory

* fix(sagemaker/): fix routing logic

* fix: fix imports

* fix: fix override
2024-12-10 22:49:26 -08:00
Ishaan Jaff
6a9225fac2 (Refactor) Code Quality improvement - stop redefining LiteLLMBase (#7147)
* fix stop redefining LiteLLMBase

* use better name for base pydantic obj
2024-12-10 15:49:01 -08:00
Krish Dholakia
501885d653 Litellm code qa common config (#7113)
* feat(base_llm): initial commit for common base config class

Addresses code qa critique https://github.com/andrewyng/aisuite/issues/113#issuecomment-2512369132

* feat(base_llm/): add transform request/response abstract methods to base config class

* feat(cohere-+-clarifai): refactor integrations to use common base config class

* fix: fix linting errors

* refactor(anthropic/): move anthropic + vertex anthropic to use base config

* test: fix xai test

* test: fix tests

* fix: fix linting errors

* test: comment out WIP test

* fix(transformation.py): fix is pdf used check

* fix: fix linting error
2024-12-09 15:58:25 -08:00
Krish Dholakia
70c4e1b4d2 Litellm dev 12 07 2024 (#7086)
* fix(main.py): support passing max retries to azure/openai embedding integrations

Fixes https://github.com/BerriAI/litellm/issues/7003

* feat(team_endpoints.py): allow updating team model aliases

Closes https://github.com/BerriAI/litellm/issues/6956

* feat(router.py): allow specifying model id as fallback - skips any cooldown check

Allows a default model to be checked if all models in cooldown

s/o @micahjsmith

* docs(reliability.md): add fallback to specific model to docs

* fix(utils.py): new 'is_prompt_caching_valid_prompt' helper util

Allows user to identify if messages/tools have prompt caching

Related issue: https://github.com/BerriAI/litellm/issues/6784

* feat(router.py): store model id for prompt caching valid prompt

Allows routing to that model id on subsequent requests

* fix(router.py): only cache if prompt is valid prompt caching prompt

prevents storing unnecessary items in cache

* feat(router.py): support routing prompt caching enabled models to previous deployments

Closes https://github.com/BerriAI/litellm/issues/6784

* test: fix linting errors

* feat(databricks/): convert basemodel to dict and exclude none values

allow passing pydantic message to databricks

* fix(utils.py): ensure all chat completion messages are dict

* (feat) Track `custom_llm_provider` in LiteLLMSpendLogs (#7081)

* add custom_llm_provider to SpendLogsPayload

* add custom_llm_provider to SpendLogs

* add custom llm provider to SpendLogs payload

* test_spend_logs_payload

* Add MLflow to the side bar (#7031)

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* (bug fix) SpendLogs update DB catch all possible DB errors for retrying (#7082)

* catch DB_CONNECTION_ERROR_TYPES

* fix DB retry mechanism for SpendLog updates

* use DB_CONNECTION_ERROR_TYPES in auth checks

* fix exp back off for writing SpendLogs

* use _raise_failed_update_spend_exception to ensure errors print as NON blocking

* test_update_spend_logs_multiple_batches_with_failure

* (Feat) Add StructuredOutputs support for Fireworks.AI (#7085)

* fix model cost map fireworks ai "supports_response_schema": true,

* fix supports_response_schema

* fix map openai params fireworks ai

* test_map_response_format

* test_map_response_format

* added deepinfra/Meta-Llama-3.1-405B-Instruct (#7084)

* bump: version 1.53.9 → 1.54.0

* fix deepinfra

* litellm db fixes LiteLLM_UserTable (#7089)

* ci/cd queue new release

* fix llama-3.3-70b-versatile

* refactor - use consistent file naming convention `AI21/` -> `ai21` (#7090)

* fix refactor - use consistent file naming convention

* ci/cd run again

* fix naming structure

* fix use consistent naming (#7092)

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com>
Co-authored-by: ali sayyah <ali.sayyah2@gmail.com>
2024-12-08 00:30:33 -08:00
Ishaan Jaff
92a8f09655 litellm db fixes LiteLLM_UserTable (#7089) 2024-12-07 19:08:37 -08:00