* fix(main.py): support 'mock_timeout=true' param
allows mock requests on the proxy to have a time delay, for testing
* fix(main.py): ensure mock timeouts raise litellm.Timeout error
triggers retry/fallbacks
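A minimal sketch of the new behavior (model name and timeout values are placeholders, not from this PR):

```python
import litellm

try:
    litellm.completion(
        model="gpt-3.5-turbo",  # placeholder model
        messages=[{"role": "user", "content": "hi"}],
        mock_timeout=True,  # new param: delay the mock response
        timeout=1,          # mock delay exceeds this, so litellm.Timeout is raised
    )
except litellm.Timeout:
    print("timeout raised - retry/fallback logic can now trigger")
```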
* fix: fix fallback + mock timeout testing
* fix(router.py): always return remaining tpm/rpm limits, if limits are known
allows rate limit headers to be guaranteed in responses
* docs(timeout.md): add docs on `mock_timeout=true`
* fix(main.py): fix linting errors
* test: fix test
* feat(guardrails_endpoint.py): new `/guardrails/list` endpoint
Allows users to view which guardrails are available
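A hedged usage sketch of the new endpoint (base URL and key are placeholders):

```python
import requests

resp = requests.get(
    "http://localhost:4000/guardrails/list",
    headers={"Authorization": "Bearer sk-1234"},  # placeholder proxy key
)
print(resp.json())  # lists the guardrails configured on the proxy
```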
* docs: document new `/guardrails/list` endpoint
* docs(enterprise.md): update docs
* fix(openai/transcription/handler.py): support cost tracking on vtt + srt formats
* fix(openai/transcriptions/handler.py): default to 'verbose_json' response format if 'text' or 'json' response_format is received. Ensures the 'duration' param is received for all audio transcription requests
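Illustrative sketch of the new default (assumes a local `audio.wav`; the upgrade to 'verbose_json' happens inside the handler):

```python
import litellm

with open("audio.wav", "rb") as f:
    resp = litellm.transcription(
        model="whisper-1",
        file=f,
        response_format="text",  # handler requests 'verbose_json' upstream so 'duration' is available
    )
```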
* fix: fix linting errors
* fix: remove unused import
* fix(team_endpoints.py): enforce assigning team admins as an enterprise feature
* fix(proxy/_types.py): fix common proxy error to link to trial key
* fix: fix linting errors
* fix(proxy_server.py): enforce team id based model add only works if enterprise user
* fix(auth_checks.py): enforce common_checks can only be imported by user_api_key_auth.py
* fix(auth_checks.py): insert not premium user error message on failed common checks run
* ui fix - allow searching model list + fix bug on filtering
* qa fix - use correct provider name for azure_text
* ui wrap content onto next line
* ui fix - allow selecting current UI session when logging in
* ui session budgets
* ui show provider models on wildcard models
* test provider name appears in model list
* ui fix auto scroll on chat ui tab
* fix(utils.py): e2e azure tts cost tracking working
moves the tts response obj to include hidden params (allows litellm call id, etc. to be sent in response headers); fixes the spend_tracking_utils logging payload to account for the non-base-model use-case
Fixes https://github.com/BerriAI/litellm/issues/7223
* fix: fix linting errors
* build(model_prices_and_context_window.json): add bedrock llama 3.3
Closes https://github.com/BerriAI/litellm/issues/7329
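Hedged usage sketch; the exact bedrock model id may differ by region:

```python
import litellm

resp = litellm.completion(
    model="bedrock/us.meta.llama3-3-70b-instruct-v1:0",  # assumed model id
    messages=[{"role": "user", "content": "hello"}],
)
```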
* fix(openai.py): fix return type for sync openai httpx response
* test: update test
* fix(spend_tracking_utils.py): fix if check
* test: improve debugging for test
* fix: fix import
* fix(proxy_track_cost_callback.py): log to db if only end user param given
* fix: allows for jwt-auth based end user id spend tracking to work
* fix(utils.py): fix 'get_end_user_id_for_cost_tracking' to use 'user_api_key_end_user_id'
more stable - works with jwt-auth based end user tracking as well
* test(test_jwt.py): add e2e unit test to confirm end user cost tracking works for spend logs
* test: update test to use end_user api key hash param
* fix(langfuse.py): support end user cost tracking via jwt auth + langfuse
logs end user to langfuse if decoded from jwt token
* fix: fix linting errors
* test: fix test
* fix: fix end user id extraction
* fix: run test earlier
* fix(proxy_server.py): only update k,v pair if v is not empty/null
Fixes https://github.com/BerriAI/litellm/issues/6787
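Illustrative sketch of the guard (not the exact proxy_server.py code):

```python
existing = {"max_budget": 10.0, "team_alias": "ml-team"}
incoming = {"max_budget": None, "team_alias": ""}

# only apply k,v pairs whose value is non-empty and non-null
updates = {k: v for k, v in incoming.items() if v is not None and v != ""}
existing.update(updates)  # existing values survive empty/null overwrites
```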
* test(test_router.py): cleanup duplicate calls
* test: add new test stream options drop params test
* test: update optional params / stream options test to test for vertex ai mistral route specifically
Addresses https://github.com/BerriAI/litellm/issues/7309
* fix(proxy_server.py): fix linting errors
* fix: fix linting errors
* fix(proxy_server.py): pass model access groups to get_key/get_team models
allows end users to see the actual models they have access to, instead of the default models
* fix(auth_checks.py): fix linting errors
* fix: fix linting errors
* fix(factory.py): skip empty text blocks for bedrock user messages
Fixes https://github.com/BerriAI/litellm/issues/7169
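Illustrative sketch of the filter (not the exact factory.py code):

```python
content = [
    {"text": "describe this image"},
    {"text": "   "},  # empty text block that bedrock rejects
]
# skip blocks whose text is empty/whitespace-only
content = [block for block in content if block.get("text", "").strip()]
```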
* Add support for Gemini 2.0 GoogleSearch tool (#7257)
* Add support for google_search tool in gemini 2.0
* Add/modify tests
* Fix grounding check
* Remove 2.0 grounding test; exclude experimental model in VERTEX_MODELS_TO_NOT_TEST
* Swap order of tools
* Fix formatting
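Hedged sketch of calling the tool through litellm (model name and tool spec follow Gemini 2.0 conventions and are assumptions here):

```python
import litellm

resp = litellm.completion(
    model="gemini/gemini-2.0-flash-exp",  # assumed model name
    messages=[{"role": "user", "content": "what changed in the latest release?"}],
    tools=[{"googleSearch": {}}],  # Gemini 2.0 grounding tool
)
```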
* fix(get_api_base.py): return api base in streaming response
Fixes https://github.com/BerriAI/litellm/issues/7249
Closes https://github.com/BerriAI/litellm/pull/7250
* fix(cost_calculator.py): only set base model to model if not none
Fixes https://github.com/BerriAI/litellm/issues/7223
* fix(cost_calculator.py): enforce stricter order when picking model for cost calculation
* fix(cost_calculator.py): fix '_select_model_name_for_cost_calc' to return model name with region name prefix if provided
* fix(utils.py): fix 'get_model_info()' to handle edge case where model name starts with custom llm provider AND custom llm provider is given
* fix(cost_calculator.py): handle `custom_llm_provider-` scenario
* fix(cost_calculator.py): e2e working tts cost tracking
ensures the initial message is passed in to the cost calculator
* fix(factory.py): suppress linting errors
* fix(cost_calculator.py): strip llm provider from model name after selecting cost calc model
* fix(litellm_logging.py): store initial request in 'input' field + accept base_model to be passed in litellm_params directly
* test: handle none env var value in flaky test
* fix(litellm_logging.py): fix linting errors
---------
Co-authored-by: Sam B <samlingx@gmail.com>
* fix(router.py): fix reading + using deployment-specific num retries on router
Fixes https://github.com/BerriAI/litellm/issues/7001
* fix(router.py): ensure 'timeout' in litellm_params overrides any value in router settings
Refactors all routes to use the common '_update_kwargs_with_deployment', which handles the timeout logic
* fix(router.py): fix timeout check
* fix(main.py): fix retries being multiplied when using openai sdk
Closes https://github.com/BerriAI/litellm/pull/7130
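Hedged sketch: deployment-level num_retries/timeout in litellm_params now take precedence over router settings (names/values are placeholders):

```python
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-3.5-turbo",
            "litellm_params": {
                "model": "gpt-3.5-turbo",
                "num_retries": 3,  # overrides router-level num_retries
                "timeout": 10,     # overrides router-level timeout
            },
        }
    ],
    num_retries=0,  # router default, superseded by the deployment value above
)
```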
* docs(prompt_management.md): add langfuse prompt management doc
* feat(team_endpoints.py): allow teams to add their own models
Enables teams to call their own finetuned models via the proxy
* test: add better enforcement check testing for `/model/new` now that teams can add their own models
* docs(team_model_add.md): tutorial for allowing teams to add their own models
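Hedged sketch of a team admin registering a team-scoped model (endpoint per the commits above; field names and values are placeholders/assumptions):

```python
import requests

requests.post(
    "http://localhost:4000/model/new",
    headers={"Authorization": "Bearer sk-team-admin-key"},  # placeholder key
    json={
        "model_name": "my-finetuned-model",
        "litellm_params": {"model": "openai/ft:gpt-3.5-turbo:my-org::abc123"},  # placeholder
        "model_info": {"team_id": "team-1"},  # assumed field for team scoping
    },
)
```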
* test: fix test
* fix test_deployment_budget_limits_e2e_test
* refactor async_log_success_event to track spend for provider + deployment
* fix format
* rename class to RouterBudgetLimiting
* rename func
* rename types used for budgets
* add new types for deployment budgets
* add budget limits for deployments
* fix checking budgets set for provider
* update file names
* fix linting error
* _track_provider_remaining_budget_prometheus
* async_filter_deployments
* fix model list passed to router
* update error
* test_deployment_budgets_e2e_test_expect_to_fail
* fix test case
* run deployment budget limits
* fix(litellm_logging.py): pass user metadata to langsmith on sdk calls
* fix(litellm_logging.py): pass nested user metadata to logging integration - e.g. langsmith
* fix(exception_mapping_utils.py): catch and clarify watsonx `/text/chat` endpoint not supported error message.
Closes https://github.com/BerriAI/litellm/issues/7213
* fix(watsonx/common_utils.py): accept new 'WATSONX_IAM_URL' env var
allows users to use a local watsonx instance
Fixes https://github.com/BerriAI/litellm/issues/4991
* fix(litellm_logging.py): cleanup unused function
* test: skip bad ibm test
* fix(azure/): support passing headers to azure openai endpoints
Fixes https://github.com/BerriAI/litellm/issues/6217
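Hedged sketch of forwarding per-request headers to an azure deployment (names are placeholders):

```python
import litellm

resp = litellm.completion(
    model="azure/my-gpt-4o-deployment",  # placeholder deployment name
    messages=[{"role": "user", "content": "hi"}],
    headers={"x-custom-trace-id": "abc-123"},  # forwarded to the azure endpoint per this fix
)
```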
* fix(utils.py): move default tokenizer to just openai
the hf tokenizer makes network calls when fetching the tokenizer - this slows down execution
* fix(router.py): fix pattern matching router - add generic "*" to it as well
Fixes issue where generic "*" model access group wouldn't show up
* fix(pattern_match_deployments.py): match to more specific pattern
allows setting a generic wildcard model access group while excluding specific models more easily
* fix(proxy_server.py): fix _delete_deployment to handle base case where db_model list is empty
don't delete all router models because of an empty list
Fixes https://github.com/BerriAI/litellm/issues/7196
* fix(anthropic/): fix handling response_format for anthropic messages with anthropic api
* fix(fireworks_ai/): support passing response_format + tool call in same message
Addresses https://github.com/BerriAI/litellm/issues/7135
* Revert "fix(fireworks_ai/): support passing response_format + tool call in same message"
This reverts commit 6a30dc6929.
* test: fix test
* fix(replicate/): fix replicate default retry/polling logic
* test: add unit testing for router pattern matching
* test: update test to use default oai tokenizer
* test: mark flaky test
* test: skip flaky test
* add unit test for test_datadog_static_methods
* docs dd vars
* test_datadog_payload_environment_variables
* test_datadog_static_methods
* docs env vars
* fix table
* feat(base_llm): initial commit for common base config class
Addresses code qa critique https://github.com/andrewyng/aisuite/issues/113#issuecomment-2512369132
* feat(base_llm/): add transform request/response abstract methods to base config class
* feat(cohere-+-clarifai): refactor integrations to use common base config class
* fix: fix linting errors
* refactor(anthropic/): move anthropic + vertex anthropic to use base config
* test: fix xai test
* test: fix tests
* fix: fix linting errors
* test: comment out WIP test
* fix(transformation.py): fix 'is pdf used' check
* fix: fix linting error
* fix(main.py): support passing max retries to azure/openai embedding integrations
Fixes https://github.com/BerriAI/litellm/issues/7003
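Hedged usage sketch (deployment name is a placeholder):

```python
import litellm

resp = litellm.embedding(
    model="azure/text-embedding-ada-002",  # placeholder deployment
    input=["hello world"],
    max_retries=3,  # now forwarded to the azure/openai embedding client
)
```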
* feat(team_endpoints.py): allow updating team model aliases
Closes https://github.com/BerriAI/litellm/issues/6956
* feat(router.py): allow specifying model id as fallback - skips any cooldown check
Allows a default model to be checked if all models are in cooldown
s/o @micahjsmith
* docs(reliability.md): add fallback to specific model to docs
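Hedged sketch of a fallback pinned to a deployment id (whether the fallback list accepts an `id` here is taken from the commit message; names are placeholders):

```python
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-4",
            "litellm_params": {"model": "gpt-4"},
            "model_info": {"id": "my-default-deployment"},  # explicit deployment id
        }
    ],
    fallbacks=[{"gpt-4": ["my-default-deployment"]}],  # id-based fallback skips cooldown checks
)
```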
* fix(utils.py): new 'is_prompt_caching_valid_prompt' helper util
Allows users to identify if messages/tools have prompt caching enabled
Related issue: https://github.com/BerriAI/litellm/issues/6784
* feat(router.py): store model id for prompt caching valid prompt
Allows routing to that model id on subsequent requests
* fix(router.py): only cache if prompt is valid prompt caching prompt
prevents storing unnecessary items in cache
* feat(router.py): support routing prompt caching enabled models to previous deployments
Closes https://github.com/BerriAI/litellm/issues/6784
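Illustrative sketch of a prompt-caching-valid prompt (anthropic-style `cache_control` block); per the commits above, the router remembers which deployment served it and routes follow-ups there:

```python
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "<long shared context>",
                "cache_control": {"type": "ephemeral"},  # marks the prompt as cacheable
            }
        ],
    }
]
```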
* test: fix linting errors
* feat(databricks/): convert basemodel to dict and exclude none values
allows passing pydantic messages to databricks
* fix(utils.py): ensure all chat completion messages are dict
* (feat) Track `custom_llm_provider` in LiteLLMSpendLogs (#7081)
* add custom_llm_provider to SpendLogsPayload
* add custom_llm_provider to SpendLogs
* add custom llm provider to SpendLogs payload
* test_spend_logs_payload
* Add MLflow to the side bar (#7031)
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
* (bug fix) SpendLogs update DB catch all possible DB errors for retrying (#7082)
* catch DB_CONNECTION_ERROR_TYPES
* fix DB retry mechanism for SpendLog updates
* use DB_CONNECTION_ERROR_TYPES in auth checks
* fix exp back off for writing SpendLogs
* use _raise_failed_update_spend_exception to ensure errors print as NON blocking
* test_update_spend_logs_multiple_batches_with_failure
* (Feat) Add StructuredOutputs support for Fireworks.AI (#7085)
* fix fireworks ai model cost map: set "supports_response_schema": true
* fix supports_response_schema
* fix map openai params fireworks ai
* test_map_response_format
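Hedged sketch of structured outputs against fireworks ai, using the openai-style json_schema convention that litellm maps to the provider format (model name is a placeholder):

```python
import litellm

resp = litellm.completion(
    model="fireworks_ai/accounts/fireworks/models/llama-v3p1-70b-instruct",  # placeholder
    messages=[{"role": "user", "content": "give me a city and its country"}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "city_info",
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "country": {"type": "string"},
                },
            },
        },
    },
)
```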
* added deepinfra/Meta-Llama-3.1-405B-Instruct (#7084)
* bump: version 1.53.9 → 1.54.0
* fix deepinfra
* litellm db fixes LiteLLM_UserTable (#7089)
* ci/cd queue new release
* fix llama-3.3-70b-versatile
* refactor - use consistent file naming convention `AI21/` -> `ai21` (#7090)
* fix refactor - use consistent file naming convention
* ci/cd run again
* fix naming structure
* fix use consistent naming (#7092)
---------
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com>
Co-authored-by: ali sayyah <ali.sayyah2@gmail.com>