litellm-mirror

mirror of https://github.com/BerriAI/litellm.git synced 2025-04-26 03:04:13 +00:00

Author	SHA1	Message	Date
Ishaan Jaff	442d309bcd	update release notes	2024-12-23 21:48:33 -08:00
Ishaan Jaff	f4a2143b82	update release notes	2024-12-23 21:43:47 -08:00
Ishaan Jaff	1f466ec9cf	release notes	2024-12-23 21:38:56 -08:00
Ishaan Jaff	957fbc6dfb	docs batches	2024-12-23 21:24:06 -08:00
Ishaan Jaff	43077a88d5	docs add files to supported endpoints	2024-12-23 20:51:34 -08:00
Krish Dholakia	48316520f4	LiteLLM Minor Fixes & Improvements (12/23/2024) - P2 (#7386 ) * fix(main.py): support 'mock_timeout=true' param allows mock requests on proxy to have a time delay, for testing * fix(main.py): ensure mock timeouts raise litellm.Timeout error triggers retry/fallbacks * fix: fix fallback + mock timeout testing * fix(router.py): always return remaining tpm/rpm limits, if limits are known allows for rate limit headers to be guaranteed * docs(timeout.md): add docs on mock timeout = true * fix(main.py): fix linting errors * test: fix test	2024-12-23 17:41:27 -08:00
Krish Dholakia	db59e08958	Litellm dev 12 23 2024 p1 (#7383 ) * feat(guardrails_endpoint.py): new `/guardrails/list` endpoint Allow users to view what the available guardrails are * docs: document new `/guardrails/list` endpoint * docs(enterprise.md): update docs * fix(openai/transcription/handler.py): support cost tracking on vtt + srt formats * fix(openai/transcriptions/handler.py): default to 'verbose_json' response format if 'text' or 'json' response_format received. ensures 'duration' param is received for all audio transcription requests * fix: fix linting errors * fix: remove unused import	2024-12-23 16:33:31 -08:00
Krish Dholakia	3671829e39	Complete 'requests' library removal (#7350 ) * refactor: initial commit moving watsonx_text to base_llm_http_handler + clarifying new provider directory structure * refactor(watsonx/completion/handler.py): move to using base llm http handler removes 'requests' library usage * fix(watsonx_text/transformation.py): fix result transformation migrates to transformation.py, for usage with base llm http handler * fix(streaming_handler.py): migrate watsonx streaming to transformation.py ensures streaming works with base llm http handler * fix(streaming_handler.py): fix streaming linting errors and remove watsonx conditional logic * fix(watsonx/): fix chat route post completion route refactor * refactor(watsonx/embed): refactor watsonx to use base llm http handler for embedding calls as well * refactor(base.py): remove requests library usage from litellm * build(pyproject.toml): remove requests library usage * fix: fix linting errors * fix: fix linting errors * fix(types/utils.py): fix validation errors for modelresponsestream * fix(replicate/handler.py): fix linting errors * fix(litellm_logging.py): handle modelresponsestream object * fix(streaming_handler.py): fix modelresponsestream args * fix: remove unused imports * test: fix test * fix: fix test * test: fix test * test: fix tests * test: fix test * test: fix patch target * test: fix test	2024-12-22 07:21:25 -08:00
Ishaan Jaff	8b1ea40e7b	add img to release notes	2024-12-21 21:24:16 -08:00
Ishaan Jaff	bb801ad3d9	update release notes	2024-12-21 21:20:13 -08:00
Krish Dholakia	1aed988c11	Litellm docs update (#7365 ) * docs: fix release notes url + fix urlk * docs(index.md): add link to github releases * docs(index.md): add linkedin social url to release notes	2024-12-21 21:09:50 -08:00
Ishaan Jaff	31f5dcf8e5	docs	2024-12-21 20:52:39 -08:00
Ishaan Jaff	291229ee46	docs add 1.55.8 changelog	2024-12-21 20:51:39 -08:00
Ishaan Jaff	5a8f67c171	release notes v1.55.8	2024-12-21 20:31:54 -08:00
Ishaan Jaff	a3e732de39	(chore) - enforce model budgets on virtual keys as enterprise feature (#7353 ) * docs - enforce model budget as enterprise feature * docs link to correct place	2024-12-21 14:18:53 -08:00
Ishaan Jaff	6107f9f3f3	[Bug fix ]: Triton /infer handler incompatible with batch responses (#7337 ) * migrate triton to base llm http handler * clean up triton handler.py * use transform functions for triton * add TritonConfig * get openai params for triton * use triton embedding config * test_completion_triton_generate_api * test_completion_triton_infer_api * fix TritonConfig doc string * use TritonResponseIterator * fix triton embeddings * docs triton chat usage	2024-12-20 20:59:40 -08:00
Krish Dholakia	70a9ea99f2	Controll fallback prompts client-side (#7334 ) * feat(router.py): support passing model-specific messages in fallbacks * docs(routing.md): separate router timeouts into separate doc allow for 1 fallbacks doc (across proxy/router) * docs(routing.md): cleanup router docs * docs(reliability.md): cleanup docs * docs(reliability.md): cleaned up fallback doc just have 1 doc across sdk/proxy simplifies docs * docs(reliability.md): add setting model-specific fallback prompts * fix: fix linting errors * test: skip test causing openai rate limit errros * test: fix test * test: run vertex test first to catch error	2024-12-20 19:09:53 -08:00
jravi-fireworks	f8cf11f6d5	Fix LiteLLM documentation (#7333 ) Co-authored-by: Jetashree Ravi <jetashreeravi@Jetashrees-MBP.attlocal.net>	2024-12-20 15:04:23 -08:00
Krish Dholakia	27a4d08604	Litellm dev 2024 12 19 p3 (#7322 ) * fix(utils.py): remove unsupported optional params (if drop_params=True) before passing into map openai params Fixes https://github.com/BerriAI/litellm/issues/7242 * test: new test for langfuse prompt management hook Addresses https://github.com/BerriAI/litellm/issues/3893#issuecomment-2549080296 * feat(main.py): add 'get_chat_completion_prompt' customlogger hook allows for langfuse prompt management Addresses https://github.com/BerriAI/litellm/issues/3893#issuecomment-2549080296 * feat(langfuse_prompt_management.py): working e2e langfuse prompt management works with `langfuse/` route * feat(main.py): initial tracing for dynamic langfuse params allows admin to specify langfuse keys by model in model_list * feat(main.py): support passing langfuse credentials dynamically * fix(langfuse_prompt_management.py): create langfuse client based on dynamic callback params allows dynamic langfuse params to work * fix: fix linting errors * docs(prompt_management.md): refactor docs for sdk + proxy prompt management tutorial * docs(prompt_management.md): cleanup doc * docs: cleanup topnav * docs(prompt_management.md): update docs to be easier to use * fix: remove unused imports * docs(prompt_management.md): add architectural overview doc * fix(litellm_logging.py): fix dynamic param passing * fix(langfuse_prompt_management.py): fix linting errors * fix: fix linting errors * fix: use typing_extensions for typealias to ensure python3.8 compatibility * test: use stream_options in test to account for tiktoken diff * fix: improve import error message, and check run test earlier	2024-12-20 13:30:16 -08:00
Krrish Dholakia	834067c570	docs: refactor admin ui docs	2024-12-20 10:20:07 -08:00
Ishaan Jaff	35076212ef	docs infinity rerank api docs	2024-12-19 18:51:55 -08:00
Ishaan Jaff	3a6dba8853	docs base rerank config	2024-12-19 18:33:32 -08:00
Ishaan Jaff	236561cdb8	docs add ref prs	2024-12-19 18:31:38 -08:00
Ishaan Jaff	617ac63d14	(feat) add infinity rerank models (#7321 ) * Support Infinity Reranker (custom reranking models) (#7247) * Support Infinity Reranker * Clean code * Included transformation.py * Clean code * Added Infinity reranker test * Clean code --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> * transform_rerank_response * update handler.py * infinity rerank updates * ci/cd run again * add infinity unit tests * docs add instruction on how to add a new provider for rerank --------- Co-authored-by: Hao Shan <53949959+haoshan98@users.noreply.github.com>	2024-12-19 18:30:28 -08:00
Ishaan Jaff	6261ec3599	(feat proxy) v2 - model max budgets (#7302 ) * clean up unused code * add _PROXY_VirtualKeyModelMaxBudgetLimiter * adjust type imports * working _PROXY_VirtualKeyModelMaxBudgetLimiter * fix user_api_key_model_max_budget * fix user_api_key_model_max_budget * update naming * update naming * fix changes to RouterBudgetLimiting * test_call_with_key_over_model_budget * test_call_with_key_over_model_budget * handle _get_request_model_budget_config * e2e test for test_call_with_key_over_model_budget * clean up test * run ci/cd again * add validate_model_max_budget * docs fix * update doc * add e2e testing for _PROXY_VirtualKeyModelMaxBudgetLimiter * test_unit_test_max_model_budget_limiter.py	2024-12-18 19:42:46 -08:00
Krish Dholakia	5253f639cd	fix(health.md): add rerank model health check information (#7295 ) * fix(health.md): add rerank model health check information * build(model_prices_and_context_window.json): add gemini 2.0 for google ai studio - pricing + commercial rate limits * build(model_prices_and_context_window.json): add gemini-2.0 supports audio output = true * docs(team_model_add.md): clarify allowing teams to add models is an enterprise feature * fix(o1_transformation.py): add support for 'n', 'response_format' and 'stop' params for o1 and 'stream_options' param for o1-mini * build(model_prices_and_context_window.json): add 'supports_system_message' to supporting openai models needed as o1-preview, and o1-mini models don't support 'system message * fix(o1_transformation.py): translate system message based on if o1 model supports it * fix(o1_transformation.py): return 'stream' param support if o1-mini/o1-preview o1 currently doesn't support streaming, but the other model versions do Fixes https://github.com/BerriAI/litellm/issues/7292 * fix(o1_transformation.py): return tool calling/response_format in supported params if model map says so Fixes https://github.com/BerriAI/litellm/issues/7292 * fix: fix linting errors * fix: update '_transform_messages' * fix(o1_transformation.py): fix provider passed for supported param checks * test(base_llm_unit_tests.py): skip test if api takes >5s to respond * fix(utils.py): return false in 'supports_factory' if can't find value * fix(o1_transformation.py): always return stream + stream_options as supported params + handle stream options being passed in for azure o1 * feat(openai.py): support stream faking natively in openai handler Allows o1 calls to be faked for just the "o1" model, allows native streaming for o1-mini, o1-preview Fixes https://github.com/BerriAI/litellm/issues/7292 * fix(openai.py): use inference param instead of original optional param	2024-12-18 19:18:10 -08:00
Krish Dholakia	6a45ee1ef7	fix(hosted_vllm/transformation.py): return fake api key, if none give… (#7301 ) * fix(hosted_vllm/transformation.py): return fake api key, if none give. Prevents httpx error Fixes https://github.com/BerriAI/litellm/issues/7291 * test: fix test * fix(main.py): add hosted_vllm/ support for embeddings endpoint Closes https://github.com/BerriAI/litellm/issues/7290 * docs(vllm.md): add docs on vllm embeddings usage * fix(__init__.py): fix sambanova model test * fix(base_llm_unit_tests.py): skip pydantic obj test if model takes >5s to respond	2024-12-18 18:41:53 -08:00
Ishaan Jaff	246e3bafc8	(feat - proxy) Add `status_code` to `litellm_proxy_total_requests_metric_total` (#7293 ) * fix _select_model_name_for_cost_calc docstring * add STATUS_CODE to prometheus * test prometheus unit tests * test_prometheus_unit_tests.py * update Proxy Level Tracking Metrics docs * fix test_proxy_failure_metrics * fix test_proxy_failure_metrics	2024-12-18 15:55:02 -08:00
Krish Dholakia	2f08341a08	Litellm dev readd prompt caching (#7299 ) * fix(router.py): re-add saving model id on prompt caching valid successful deployment * fix(router.py): introduce optional pre_call_checks isolate prompt caching logic in a separate file * fix(prompt_caching_deployment_check.py): fix import * fix(router.py): new 'async_filter_deployments' event hook allows custom logger to filter deployments returned to routing strategy * feat(prompt_caching_deployment_check.py): initial working commit of prompt caching based routing * fix(cooldown_callbacks.py): fix linting error * fix(budget_limiter.py): move budget logger to async_filter_deployment hook * test: add unit test * test(test_router_helper_utils.py): add unit testing * fix(budget_limiter.py): fix linting errors * docs(config_settings.md): add 'optional_pre_call_checks' to router_settings param docs	2024-12-18 15:13:49 -08:00
Ishaan Jaff	523beedb4c	update docs	2024-12-18 11:16:30 -08:00
Ishaan Jaff	9c30387e66	tag budgets fixes	2024-12-18 10:28:37 -08:00
Ishaan Jaff	f3c546b79e	(feat) proxy Azure Blob Storage - Add support for `AZURE_STORAGE_ACCOUNT_KEY` Auth (#7280 ) * add upload_to_azure_data_lake_with_azure_account_key * async_upload_payload_to_azure_blob_storage * docs add AZURE_STORAGE_ACCOUNT_KEY * add azure-storage-file-datalake	2024-12-17 17:35:45 -08:00
Krish Dholakia	cd5bdfcb7a	docs(input.md): document 'extra_headers' param support (#7268 ) * docs(input.md): document 'extra_headers' param support * fix: #7239 to move Nova topK parameter to `additionalModelRequestFields` (#7240) Co-authored-by: Ryan Hoium <rhoium> --------- Co-authored-by: ryanh-ai <3118399+ryanh-ai@users.noreply.github.com>	2024-12-17 07:19:14 -08:00
Ishaan Jaff	f3b13a9af3	(feat) Add Bedrock knowledge base pass through endpoints (#7267 ) * bugfix: Proxy Routing for Bedrock Knowledgebase URLs are incorrect (#7097) * Fixing routing bug where bedrock knowledgebase urls were being generated incorrectly * Preparing for PR * Preparing for PR * Preparing for PR --------- Co-authored-by: Luke Birk <lb0737@att.com> * fix _is_bedrock_agent_runtime_route * docs - Query Knowledge Base * test_is_bedrock_agent_runtime_route * fix bedrock_proxy_route --------- Co-authored-by: LBirk <2731718+LBirk@users.noreply.github.com> Co-authored-by: Luke Birk <lb0737@att.com>	2024-12-16 22:19:34 -08:00
Ishaan Jaff	3c984ed60e	(feat) Add Azure Blob Storage Logging Integration (#7265 ) * add path to http handler * AzureBlobStorageLogger * test_azure_blob_storage * use constants for Azure storage * use helper get_azure_ad_token_from_entrata_id * azure blob storage support * get_azure_ad_token_from_azure_storage * fix import * azure logging * docs azure storage * add docs on azure blobs * add premium user check * add azure_storage as identified logging callback * async_upload_payload_to_azure_blob_storage * docs azure storage * callback_class_str_to_classType	2024-12-16 22:18:22 -08:00
Ishaan Jaff	d1124a736c	docs add response format on main pages	2024-12-16 08:41:12 -08:00
Ishaan Jaff	5a4b6cd9f5	docs update	2024-12-16 08:06:06 -08:00
Ishaan Jaff	7103198805	(feat) Add Tag-based budgets on litellm router / proxy (#7236 ) * add BudgetConfig * add _get_tags_from_request_kwargs * test_tag_budgets_e2e_test_expect_to_fail * add a check for request tags * fix _async_get_cache_keys_for_router_budget_limiting * fix test * fix _sync_in_memory_spend_with_redis * _async_get_cache_keys_for_router_budget_limiting * fix _init_tag_budgets * fix type casting * docs show error for tag budget limit hit * fix _get_tags_from_request_kwargs * fix undo change	2024-12-14 17:28:36 -08:00
Krish Dholakia	ec36353b41	fix(main.py): fix retries being multiplied when using openai sdk (#7221 ) * fix(main.py): fix retries being multiplied when using openai sdk Closes https://github.com/BerriAI/litellm/pull/7130 * docs(prompt_management.md): add langfuse prompt management doc * feat(team_endpoints.py): allow teams to add their own models Enables teams to call their own finetuned models via the proxy * test: add better enforcement check testing for `/model/new` now that teams can add their own models * docs(team_model_add.md): tutorial for allowing teams to add their own models * test: fix test	2024-12-14 11:56:55 -08:00
Ishaan Jaff	163529b40b	(feat - Router / Proxy ) Allow setting budget limits per LLM deployment (#7220 ) * fix test_deployment_budget_limits_e2e_test * refactor async_log_success_event to track spend for provider + deployment * fix format * rename class to RouterBudgetLimiting * rename func * rename types used for budgets * add new types for deployment budgets * add budget limits for deployments * fix checking budgets set for provider * update file names * fix linting error * _track_provider_remaining_budget_prometheus * async_filter_deployments * fix model list passed to router * update error * test_deployment_budgets_e2e_test_expect_to_fail * fix test case * run deployment budget limits	2024-12-13 19:15:51 -08:00
Krish Dholakia	b150faff90	Litellm dev 12 13 2024 p1 (#7219 ) * fix(litellm_logging.py): pass user metadata to langsmith on sdk calls * fix(litellm_logging.py): pass nested user metadata to logging integration - e.g. langsmith * fix(exception_mapping_utils.py): catch and clarify watsonx `/text/chat` endpoint not supported error message. Closes https://github.com/BerriAI/litellm/issues/7213 * fix(watsonx/common_utils.py): accept new 'WATSONX_IAM_URL' env var allows user to use local watsonx Fixes https://github.com/BerriAI/litellm/issues/4991 * fix(litellm_logging.py): cleanup unused function * test: skip bad ibm test	2024-12-13 19:01:28 -08:00
Ishaan Jaff	621c713400	(docs) Document StandardLoggingPayload Spec (#7201 ) * add slp spec to docs * docs slp * test slp enforcement	2024-12-12 14:00:42 -08:00
Ishaan Jaff	b45777c268	(Feat) DataDog Logger - Add `HOSTNAME` and `POD_NAME` to DataDog logs (#7189 ) * add unit test for test_datadog_static_methods * docs dd vars * test_datadog_payload_environment_variables * test_datadog_static_methods * docs env vars * fix table	2024-12-12 12:06:26 -08:00
Ishaan Jaff	153ab055d6	(feat) add `response_time` to StandardLoggingPayload - logged on `datadog`, `gcs_bucket`, `s3_bucket` etc (#7199 ) * feat - add response_time to slp * test_get_response_time * docs slp * fix test_datadog_logging_http_request	2024-12-12 12:04:43 -08:00
dependabot[bot]	b328d42ebc	build(deps): bump nanoid from 3.3.7 to 3.3.8 in /docs/my-website (#7159 ) Bumps [nanoid](https://github.com/ai/nanoid) from 3.3.7 to 3.3.8. - [Release notes](https://github.com/ai/nanoid/releases) - [Changelog](https://github.com/ai/nanoid/blob/main/CHANGELOG.md) - [Commits](https://github.com/ai/nanoid/compare/3.3.7...3.3.8) --- updated-dependencies: - dependency-name: nanoid dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-12-10 23:51:05 -08:00
Krish Dholakia	350cfc36f7	Litellm merge pr (#7161 ) * build: merge branch * test: fix openai naming * fix(main.py): fix openai renaming * style: ignore function length for config factory * fix(sagemaker/): fix routing logic * fix: fix imports * fix: fix override	2024-12-10 22:49:26 -08:00
Krish Dholakia	0c0498dd60	Litellm dev 12 07 2024 (#7086 ) * fix(main.py): support passing max retries to azure/openai embedding integrations Fixes https://github.com/BerriAI/litellm/issues/7003 * feat(team_endpoints.py): allow updating team model aliases Closes https://github.com/BerriAI/litellm/issues/6956 * feat(router.py): allow specifying model id as fallback - skips any cooldown check Allows a default model to be checked if all models in cooldown s/o @micahjsmith * docs(reliability.md): add fallback to specific model to docs * fix(utils.py): new 'is_prompt_caching_valid_prompt' helper util Allows user to identify if messages/tools have prompt caching Related issue: https://github.com/BerriAI/litellm/issues/6784 * feat(router.py): store model id for prompt caching valid prompt Allows routing to that model id on subsequent requests * fix(router.py): only cache if prompt is valid prompt caching prompt prevents storing unnecessary items in cache * feat(router.py): support routing prompt caching enabled models to previous deployments Closes https://github.com/BerriAI/litellm/issues/6784 * test: fix linting errors * feat(databricks/): convert basemodel to dict and exclude none values allow passing pydantic message to databricks * fix(utils.py): ensure all chat completion messages are dict * (feat) Track `custom_llm_provider` in LiteLLMSpendLogs (#7081) * add custom_llm_provider to SpendLogsPayload * add custom_llm_provider to SpendLogs * add custom llm provider to SpendLogs payload * test_spend_logs_payload * Add MLflow to the side bar (#7031) Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * (bug fix) SpendLogs update DB catch all possible DB errors for retrying (#7082) * catch DB_CONNECTION_ERROR_TYPES * fix DB retry mechanism for SpendLog updates * use DB_CONNECTION_ERROR_TYPES in auth checks * fix exp back off for writing SpendLogs * use _raise_failed_update_spend_exception to ensure errors print as NON blocking * test_update_spend_logs_multiple_batches_with_failure * (Feat) Add StructuredOutputs support for Fireworks.AI (#7085) * fix model cost map fireworks ai "supports_response_schema": true, * fix supports_response_schema * fix map openai params fireworks ai * test_map_response_format * test_map_response_format * added deepinfra/Meta-Llama-3.1-405B-Instruct (#7084) * bump: version 1.53.9 → 1.54.0 * fix deepinfra * litellm db fixes LiteLLM_UserTable (#7089) * ci/cd queue new release * fix llama-3.3-70b-versatile * refactor - use consistent file naming convention `AI21/` -> `ai21` (#7090) * fix refactor - use consistent file naming convention * ci/cd run again * fix naming structure * fix use consistent naming (#7092) --------- Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com> Co-authored-by: ali sayyah <ali.sayyah2@gmail.com>	2024-12-08 00:30:33 -08:00
Yuki Watanabe	581712d6e7	Add MLflow to the side bar (#7031 ) Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>	2024-12-07 14:30:32 -08:00
Ishaan Jaff	d43aa6f472	(feat) Allow enabling logging message / response for specific virtual keys (#7071 ) * redact_message_input_output_from_logging * initialize_standard_callback_dynamic_params * allow dynamically opting out of redaction * test_redact_msgs_from_logs_with_dynamic_params * fix AddTeamCallback * _get_turn_off_message_logging_from_dynamic_params * test_global_redaction_with_dynamic_params * test_dynamic_turn_off_message_logging * docs Disable/Enable Message redaction * fix doe qual check * _get_turn_off_message_logging_from_dynamic_params	2024-12-06 21:25:36 -08:00
Ishaan Jaff	87ca62943b	Provider Budget Routing - Get Budget, Spend Details (#7063 ) * add async_get_ttl to dual cache * add ProviderBudgetResponse * add provider_budgets * test_redis_get_ttl * _init_or_get_provider_budget_in_cache * test_init_or_get_provider_budget_in_cache * use _init_provider_budget_in_cache * test_get_current_provider_budget_reset_at * doc Get Budget, Spend Details * doc Provider Budget Routing	2024-12-06 21:14:12 -08:00

3415 commits