litellm-mirror

mirror of https://github.com/BerriAI/litellm.git synced 2025-04-25 02:34:29 +00:00

Author	SHA1	Message	Date
Krish Dholakia	4330ef8e81	Fix batches api cost tracking + Log batch models in spend logs / standard logging payload (#9077 ) * feat(batches/): fix batch cost calculation - ensure it's accurate use the correct cost value - prev. defaulting to non-batch cost * feat(batch_utils.py): log batch models to spend logs + standard logging payload makes it easy to understand how cost was calculated * fix: fix stored payload for test * test: fix test	2025-03-08 11:47:25 -08:00
Krish Dholakia	5e386c28b2	Litellm dev 03 04 2025 p3 (#8997 ) * fix(core_helpers.py): handle litellm_metadata instead of 'metadata' * feat(batches/): ensure batches logs are written to db makes batches response dict compatible * fix(cost_calculator.py): handle batch response being a dictionary * fix(batches/main.py): modify retrieve endpoints to use @client decorator enables logging to work on retrieve call * fix(batches/main.py): fix retrieve batch response type to be 'dict' compatible * fix(spend_tracking_utils.py): send unique uuid for retrieve batch call type create batch and retrieve batch share the same id * fix(spend_tracking_utils.py): prevent duplicate retrieve batch calls from being double counted * refactor(batches/): refactor cost tracking for batches - do it on retrieve, and within the established litellm_logging pipeline ensures cost is always logged to db * fix: fix linting errors * fix: fix linting error	2025-03-04 21:58:03 -08:00
Krish Dholakia	09462ba80c	Add cohere v2/rerank support (#8421 ) (#8605 ) * Add cohere v2/rerank support (#8421) * Support v2 endpoint cohere rerank * Add tests and docs * Make v1 default if old params used * Update docs * Update docs pt 2 * Update tests * Add e2e test * Clean up code * Use inheritance for new config * Fix linting issues (#8608) * Fix cohere v2 failing test + linting (#8672) * Fix test and unused imports * Fix tests * fix: fix linting errors * test: handle tgai instability * fix: skip service unavailable err * test: print logs for unstable test * test: skip unreliable tests --------- Co-authored-by: vibhavbhat <vibhavb00@gmail.com>	2025-02-22 22:25:29 -08:00
Krish Dholakia	b682dc4ec8	Add cost tracking for rerank via bedrock (#8691 ) * feat(bedrock/rerank): infer model region if model given as arn * test: add unit testing to ensure bedrock region name inferred from arn on rerank * feat(bedrock/rerank/transformation.py): include search units for bedrock rerank result Resolves https://github.com/BerriAI/litellm/issues/7258#issuecomment-2671557137 * test(test_bedrock_completion.py): add testing for bedrock cohere rerank * feat(cost_calculator.py): refactor rerank cost tracking to support bedrock cost tracking * build(model_prices_and_context_window.json): add amazon.rerank model to model cost map * fix(cost_calculator.py): bedrock/common_utils.py get base model from model w/ arn -> handles rerank model * build(model_prices_and_context_window.json): add bedrock cohere rerank pricing * feat(bedrock/rerank): migrate bedrock config to basererank config * Revert "feat(bedrock/rerank): migrate bedrock config to basererank config" This reverts commit `84fae1f167`. * test: add testing to ensure large doc / queries are correctly counted * Revert "test: add testing to ensure large doc / queries are correctly counted" This reverts commit `4337f1657e`. * fix(migrate-jina-ai-to-rerank-config): enables cost tracking * refactor(jina_ai/): finish migrating jina ai to base rerank config enables cost tracking * fix(jina_ai/rerank): e2e jina ai rerank cost tracking * fix: cleanup dead code * fix: fix python3.8 compatibility error * test: fix test * test: add e2e testing for azure ai rerank * fix: fix linting error * test: mark cohere as flaky	2025-02-20 21:00:18 -08:00
Krish Dholakia	03eef5a2a0	Fix custom pricing - separate provider info from model info (#7990 ) * fix(utils.py): initial commit fixing custom cost tracking refactors out provider specific model info from `get_model_info` - this was causing custom costs to be registered incorrectly * fix(utils.py): cleanup `_supports_factory` to check provider info, if model info is None some providers support features like vision across all models * fix(utils.py): refactor to use _supports_factory * test: update testing * fix: fix linting errors * test: fix testing	2025-01-25 21:49:28 -08:00
Krish Dholakia	8ca3229b26	Ensure base_model cost tracking works across all endpoints (#7989 ) * test(test_completion_cost.py): add sdk test to ensure base model is used for cost tracking * test(test_completion_cost.py): add sdk test to ensure custom pricing works * fix(main.py): add base model cost tracking support for embedding calls Enables base model cost tracking for embedding calls when base model set as a litellm_param * fix(litellm_logging.py): update logging object with litellm params - including base model, if given ensures base model param is always tracked * fix(main.py): fix linting errors	2025-01-24 21:05:26 -08:00
Krish Dholakia	c6e9240405	Add datadog health check support + fix bedrock converse cost tracking w/ region name specified (#7958 ) * fix(bedrock/converse_handler.py): fix bedrock region name on async calls * fix(utils.py): fix split model handling Fixes bedrock cost calculation when region name is given * feat(_health_endpoints.py): support health checking datadog integration Closes https://github.com/BerriAI/litellm/issues/7921	2025-01-23 22:17:09 -08:00
Ishaan Jaff	b7e68eccdd	fixes for img gen cost cal	2025-01-12 16:41:18 -08:00
Ishaan Jaff	bb1489eced	fix optimize get llm provider	2025-01-12 16:21:23 -08:00
Ishaan Jaff	2c25ea5737	(litellm sdk speedup) - use `_model_contains_known_llm_provider` in `response_cost_calculator` to check if the model contains a known litellm provider (#7721 ) * define _cached_get_model_info_helper * use _cached_get_model_info_helper * speed up _select_model_name_for_cost_calc	2025-01-12 15:40:05 -08:00
Ishaan Jaff	6518bc70a0	(litellm SDK perf improvement) - use `verbose_logger.debug` and `_cached_get_model_info_helper` in `_response_cost_calculator` (#7720 ) * define _cached_get_model_info_helper * use _cached_get_model_info_helper	2025-01-12 15:27:54 -08:00
Ishaan Jaff	dab7bebaf2	use _get_model_info_helper (#7703 )	2025-01-11 21:08:15 -08:00
Krish Dholakia	4af23353d6	Allow assigning teams to org on UI + OpenAI `omni-moderation` cost model tracking (#7566 ) * feat(cost_calculator.py): add cost tracking ($0) for openai moderations endpoint removes sentry cost tracking errors caused by this * build(teams.tsx): allow assigning teams to orgs	2025-01-08 16:58:21 -08:00
Krish Dholakia	4e69711411	Litellm dev 01 07 2025 p1 (#7618 ) * fix(main.py): pass custom llm provider on litellm logging provider update * fix(cost_calculator.py): don't append provider name to return model if existing llm provider Fixes https://github.com/BerriAI/litellm/issues/7607 * fix(prometheus_services.py): fix prometheus system health error logging Fixes https://github.com/BerriAI/litellm/issues/7611	2025-01-07 21:22:31 -08:00
Krish Dholakia	c3edfc2c92	LiteLLM Minor Fixes & Improvements (12/23/2024) - p3 (#7394 ) * build(model_prices_and_context_window.json): add gemini-1.5-flash context caching * fix(context_caching/transformation.py): just use last identified cache point Fixes https://github.com/BerriAI/litellm/issues/6738 * fix(context_caching/transformation.py): pick first contiguous block - handles system message error from google Fixes https://github.com/BerriAI/litellm/issues/6738 * fix(vertex_ai/gemini/): track context caching tokens * refactor(gemini/): place transformation.py inside `chat/` folder make it easy for user to know we support the equivalent endpoint * fix: fix import * refactor(vertex_ai/): move vertex_ai cost calc inside vertex_ai/ folder make it easier to see cost calculation logic * fix: fix linting errors * fix: fix circular import * feat(gemini/cost_calculator.py): support gemini context caching cost calculation generifies anthropic's cost calculation function and uses it across anthropic + gemini * build(model_prices_and_context_window.json): add cost tracking for gemini-1.5-flash-002 w/ context caching Closes https://github.com/BerriAI/litellm/issues/6891 * docs(gemini.md): add gemini context caching architecture diagram make it easier for user to understand how context caching works * docs(gemini.md): link to relevant gemini context caching code * docs(gemini/context_caching): add readme in github, make it easy for dev to know context caching is supported + where to go for code * fix(llm_cost_calc/utils.py): handle gemini 128k token diff cost calc scenario * fix(deepseek/cost_calculator.py): support deepseek context caching cost calculation * test: fix test	2024-12-23 22:02:52 -08:00
Krish Dholakia	db59e08958	Litellm dev 12 23 2024 p1 (#7383 ) * feat(guardrails_endpoint.py): new `/guardrails/list` endpoint Allow users to view what the available guardrails are * docs: document new `/guardrails/list` endpoint * docs(enterprise.md): update docs * fix(openai/transcription/handler.py): support cost tracking on vtt + srt formats * fix(openai/transcriptions/handler.py): default to 'verbose_json' response format if 'text' or 'json' response_format received. ensures 'duration' param is received for all audio transcription requests * fix: fix linting errors * fix: remove unused import	2024-12-23 16:33:31 -08:00
Ishaan Jaff	11e5960462	use helper for image gen tests (#7343 )	2024-12-20 21:28:32 -08:00
Ishaan Jaff	c7f14e936a	(code quality) run ruff rule to ban unused imports (#7313 ) * remove unused imports * fix AmazonConverseConfig * fix test * fix import * ruff check fixes * test fixes * fix testing * fix imports	2024-12-19 12:33:42 -08:00
Ishaan Jaff	4622e35d96	fix _select_model_name_for_cost_calc docstring	2024-12-18 09:39:31 -08:00
Krish Dholakia	179d2f56b7	LiteLLM Minor Fixes & Improvements (12/16/2024) - p1 (#7263 ) * fix(factory.py): skip empty text blocks for bedrock user messages Fixes https://github.com/BerriAI/litellm/issues/7169 * Add support for Gemini 2.0 GoogleSearch tool (#7257) * Add support for google_search tool in gemini 2.0 * Add/modify tests * Fix grounding check * Remove 2.0 grounding test; exclude experimental model in VERTEX_MODELS_TO_NOT_TEST * Swap order of tools * DFix formatting * fix(get_api_base.py): return api base in streaming response Fixes https://github.com/BerriAI/litellm/issues/7249 Closes https://github.com/BerriAI/litellm/pull/7250 * fix(cost_calculator.py): only set base model to model if not none Fixes https://github.com/BerriAI/litellm/issues/7223 * fix(cost_calculator.py): enforce stricter order when picking model for cost calculation * fix(cost_calculator.py): fix '_select_model_name_for_cost_calc' to return model name with region name prefix if provided * fix(utils.py): fix 'get_model_info()' to handle edge case where model name starts with custom llm provider AND custom llm provider is given * fix(cost_calculator.py): handle `custom_llm_provider-` scenario * fix(cost_calculator.py): e2e working tts cost tracking ensures initial message is passed in, to cost calculator * fix(factory.py): suppress linting errors * fix(cost_calculator.py): strip llm provider from model name after selecting cost calc model * fix(litellm_logging.py): store initial request in 'input' field + accept base_model to be passed in litellm_params directly * test: handle none env var value in flaky test * fix(litellm_logging.py): fix linting errors --------- Co-authored-by: Sam B <samlingx@gmail.com>	2024-12-17 15:33:36 -08:00
Krish Dholakia	224ead1531	fix(utils.py): fix openai-like api response format parsing (#7273 ) * fix(utils.py): fix openai-like api response format parsing Fixes issue passing structured output to litellm_proxy/ route * fix(cost_calculator.py): fix whisper transcription cost calc to use file duration, not response time * test: skip test if credentials not found	2024-12-17 12:49:09 -08:00
Krish Dholakia	516c2a6a70	Litellm remove circular imports (#7232 ) * fix(utils.py): initial commit to remove circular imports - moves llmproviders to utils.py * fix(router.py): fix 'litellm.EmbeddingResponse' import from router.py ' * refactor: fix litellm.ModelResponse import on pass through endpoints * refactor(litellm_logging.py): fix circular import for custom callbacks literal * fix(factory.py): fix circular imports inside prompt factory * fix(cost_calculator.py): fix circular import for 'litellm.Usage' * fix(proxy_server.py): fix potential circular import with `litellm.Router' * fix(proxy/utils.py): fix potential circular import in `litellm.Router` * fix: remove circular imports in 'auth_checks' and 'guardrails/' * fix(prompt_injection_detection.py): fix router impor t * fix(vertex_passthrough_logging_handler.py): fix potential circular imports in vertex pass through * fix(anthropic_pass_through_logging_handler.py): fix potential circular imports * fix(slack_alerting.py-+-ollama_chat.py): fix modelresponse import * fix(base.py): fix potential circular import * fix(handler.py): fix potential circular ref in codestral + cohere handler's * fix(azure.py): fix potential circular imports * fix(gpt_transformation.py): fix modelresponse import * fix(litellm_logging.py): add logging base class - simplify typing makes it easy for other files to type check the logging obj without introducing circular imports * fix(azure_ai/embed): fix potential circular import on handler.py * fix(databricks/): fix potential circular imports in databricks/ * fix(vertex_ai/): fix potential circular imports on vertex ai embeddings * fix(vertex_ai/image_gen): fix import * fix(watsonx-+-bedrock): cleanup imports * refactor(anthropic-pass-through-+-petals): cleanup imports * refactor(huggingface/): cleanup imports * fix(ollama-+-clarifai): cleanup circular imports * fix(openai_like/): fix impor t * fix(openai_like/): fix embedding handler cleanup imports * refactor(openai.py): cleanup imports 
* fix(sagemaker/transformation.py): fix import * ci(config.yml): add circular import test to ci/cd	2024-12-14 16:28:34 -08:00
Ishaan Jaff	21003c4337	Code Quality Improvement - use `vertex_ai/` as folder name for vertexAI (#7166 ) * fix rename vertex ai * run ci/cd again	2024-12-11 00:32:41 -08:00
Ishaan Jaff	bfb6891eb7	rename `llms/OpenAI/` -> `llms/openai/` (#7154 ) * rename OpenAI -> openai * fix file rename * fix rename changes * fix organization of openai/transcription * fix import OA fine tuning API * fix openai ft handler * fix handler import	2024-12-10 20:14:07 -08:00
Ishaan Jaff	36e99ebce7	fix use consistent naming (#7092 )	2024-12-07 22:01:00 -08:00
Krish Dholakia	816f0ef8d2	LiteLLM Minor Fixes & Improvements (12/05/2024) (#7051 ) * fix(cost_calculator.py): move to using `.get_model_info()` for cost per token calculations ensures cost tracking is reliable - handles edge cases of parsing model cost map * build(model_prices_and_context_window.json): add 'supports_response_schema' for select tgai models Fixes https://github.com/BerriAI/litellm/pull/7037#discussion_r1872157329 * build(model_prices_and_context_window.json): remove 'pdf input' and 'vision' support from nova micro in model map Bedrock docs indicate no support for micro - https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference-supported-models-features.html * fix(converse_transformation.py): support amazon nova tool use * fix(opentelemetry): Add missing LLM request type attribute to spans (#7041) * feat(opentelemetry): add LLM request type attribute to spans * lint * fix: curl usage (#7038) curl -d, --data <data> is lowercase d curl -D, --dump-header <filename> is uppercase D references: https://curl.se/docs/manpage.html#-d https://curl.se/docs/manpage.html#-D * fix(spend_tracking.py): handle empty 'id' in model response - when creating spend log Fixes https://github.com/BerriAI/litellm/issues/7023 * fix(streaming_chunk_builder.py): handle initial id being empty string Fixes https://github.com/BerriAI/litellm/issues/7023 * fix(anthropic_passthrough_logging_handler.py): add end user cost tracking for anthropic pass through endpoint * docs(pass_through/): refactor docs location + add table on supported features for pass through endpoints * feat(anthropic_passthrough_logging_handler.py): support end user cost tracking via anthropic sdk * docs(anthropic_completion.md): add docs on passing end user param for cost tracking on anthropic sdk * fix(litellm_logging.py): use standard logging payload if present in kwargs prevent datadog logging error for pass through endpoints * docs(bedrock.md): add rerank api usage example to docs * 
bugfix/change dummy tool name format (#7053) * fix viewing keys (#7042) * ui new build * build(model_prices_and_context_window.json): add bedrock region models to model cost map (#7044) * bye (#6982) * (fix) litellm router.aspeech (#6962) * doc Migrating Databases * fix aspeech on router * test_audio_speech_router * test_audio_speech_router * docs show supported providers on batches api doc * change dummy tool name format --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com> Co-authored-by: yujonglee <yujonglee.dev@gmail.com> * fix: fix linting errors * test: update test * fix(litellm_logging.py): fix pass through check * fix(test_otel_logging.py): fix test * fix(cost_calculator.py): update handling for cost per second * fix(cost_calculator.py): fix cost check * test: fix test * (fix) adding public routes when using custom header (#7045) * get_api_key_from_custom_header * add test_get_api_key_from_custom_header * fix testing use 1 file for test user api key auth * fix test user api key auth * test_custom_api_key_header_name * build: update ui build --------- Co-authored-by: Doron Kopit <83537683+doronkopit5@users.noreply.github.com> Co-authored-by: lloydchang <lloydchang@gmail.com> Co-authored-by: hgulersen <haymigulersen@gmail.com> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: yujonglee <yujonglee.dev@gmail.com>	2024-12-06 14:29:53 -08:00
Ishaan Jaff	c03351328f	fix imagegeneration output_cost_per_image on model cost map (#6752 )	2024-11-14 20:37:21 -08:00
Ishaan Jaff	73c7b73aa0	(feat) Add cost tracking for Azure Dall-e-3 Image Generation + use base class to ensure basic image generation tests pass (#6716 ) * add BaseImageGenTest * use 1 class for unit testing * add debugging to BaseImageGenTest * TestAzureOpenAIDalle3 * fix response_cost_calculator * test_basic_image_generation * fix img gen basic test * fix _select_model_name_for_cost_calc * fix test_aimage_generation_bedrock_with_optional_params * fix undo changes cost tracking * fix response_cost_calculator * fix test_cost_azure_gpt_35	2024-11-12 20:02:16 -08:00
Ishaan Jaff	25bae4cc23	(feat) add cost tracking stable diffusion 3 on Bedrock (#6676 ) * add cost tracking for sd3 * test_image_generation_bedrock * fix get model info for image cost * add cost_calculator for stability 1 models * add unit testing for bedrock image cost calc * test_cost_calculator_with_no_optional_params * add test_cost_calculator_basic * correctly allow size Optional * fix cost_calculator * sd3 unit tests cost calc	2024-11-11 20:21:44 -08:00
Krish Dholakia	7cc12bd5c6	LiteLLM Minor Fixes & Improvements (10/18/2024) (#6320 ) * fix(converse_transformation.py): handle cross region model name when getting openai param support Fixes https://github.com/BerriAI/litellm/issues/6291 * LiteLLM Minor Fixes & Improvements (10/17/2024) (#6293) * fix(ui_sso.py): fix faulty admin only check Fixes https://github.com/BerriAI/litellm/issues/6286 * refactor(sso_helper_utils.py): refactor /sso/callback to use helper utils, covered by unit testing Prevent future regressions * feat(prompt_factory): support 'ensure_alternating_roles' param Closes https://github.com/BerriAI/litellm/issues/6257 * fix(proxy/utils.py): add dailytagspend to expected views * feat(auth_utils.py): support setting regex for clientside auth credentials Fixes https://github.com/BerriAI/litellm/issues/6203 * build(cookbook): add tutorial for mlflow + langchain + litellm proxy tracing * feat(argilla.py): add argilla logging integration Closes https://github.com/BerriAI/litellm/issues/6201 * fix: fix linting errors * fix: fix ruff error * test: fix test * fix: update vertex ai assumption - parts not always guaranteed (#6296) * docs(configs.md): add argila env var to docs * docs(user_keys.md): add regex doc for clientside auth params * docs(argilla.md): add doc on argilla logging * docs(argilla.md): add sampling rate to argilla calls * bump: version 1.49.6 → 1.49.7 * add gpt-4o-audio models to model cost map (#6306) * (code quality) add ruff check PLR0915 for `too-many-statements` (#6309) * ruff add PLR0915 * add noqa for PLR0915 * fix noqa * add # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * add # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * doc fix Turn on / off caching per Key. 
(#6297) * (feat) Support `audio`, `modalities` params (#6304) * add audio, modalities param * add test for gpt audio models * add get_supported_openai_params for GPT audio models * add supported params for audio * test_audio_output_from_model * bump openai to openai==1.52.0 * bump openai on pyproject * fix audio test * fix test mock_chat_response * handle audio for Message * fix handling audio for OAI compatible API endpoints * fix linting * fix mock dbrx test * (feat) Support audio param in responses streaming (#6312) * add audio, modalities param * add test for gpt audio models * add get_supported_openai_params for GPT audio models * add supported params for audio * test_audio_output_from_model * bump openai to openai==1.52.0 * bump openai on pyproject * fix audio test * fix test mock_chat_response * handle audio for Message * fix handling audio for OAI compatible API endpoints * fix linting * fix mock dbrx test * add audio to Delta * handle model_response.choices.delta.audio * fix linting * build(model_prices_and_context_window.json): add gpt-4o-audio audio token cost tracking * refactor(model_prices_and_context_window.json): refactor 'supports_audio' to be 'supports_audio_input' and 'supports_audio_output' Allows for flag to be used for openai + gemini models (both support audio input) * feat(cost_calculation.py): support cost calc for audio model Closes https://github.com/BerriAI/litellm/issues/6302 * feat(utils.py): expose new `supports_audio_input` and `supports_audio_output` functions Closes https://github.com/BerriAI/litellm/issues/6303 * feat(handle_jwt.py): support single dict list * fix(cost_calculator.py): fix linting errors * fix: fix linting error * fix(cost_calculator): move to using standard openai usage cached tokens value * test: fix test --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>	2024-10-19 22:23:27 -07:00
Ishaan Jaff	610974b4fc	(code quality) add ruff check PLR0915 for `too-many-statements` (#6309 ) * ruff add PLR0915 * add noqa for PLR0915 * fix noqa * add # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * add # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915	2024-10-18 15:36:49 +05:30
Krish Dholakia	f252350881	LiteLLM Minor Fixes & Improvements (10/17/2024) (#6293 ) * fix(ui_sso.py): fix faulty admin only check Fixes https://github.com/BerriAI/litellm/issues/6286 * refactor(sso_helper_utils.py): refactor /sso/callback to use helper utils, covered by unit testing Prevent future regressions * feat(prompt_factory): support 'ensure_alternating_roles' param Closes https://github.com/BerriAI/litellm/issues/6257 * fix(proxy/utils.py): add dailytagspend to expected views * feat(auth_utils.py): support setting regex for clientside auth credentials Fixes https://github.com/BerriAI/litellm/issues/6203 * build(cookbook): add tutorial for mlflow + langchain + litellm proxy tracing * feat(argilla.py): add argilla logging integration Closes https://github.com/BerriAI/litellm/issues/6201 * fix: fix linting errors * fix: fix ruff error * test: fix test * fix: update vertex ai assumption - parts not always guaranteed (#6296) * docs(configs.md): add argila env var to docs	2024-10-17 22:09:11 -07:00
Krish Dholakia	6005450c8f	LiteLLM Minor Fixes & Improvements (10/09/2024) (#6139 ) * fix(utils.py): don't return 'none' response headers Fixes https://github.com/BerriAI/litellm/issues/6123 * fix(vertex_and_google_ai_studio_gemini.py): support parsing out additional properties and strict value for tool calls Fixes https://github.com/BerriAI/litellm/issues/6136 * fix(cost_calculator.py): set default character value to none Fixes https://github.com/BerriAI/litellm/issues/6133#issuecomment-2403290196 * fix(google.py): fix cost per token / cost per char conversion Fixes https://github.com/BerriAI/litellm/issues/6133#issuecomment-2403370287 * build(model_prices_and_context_window.json): update gemini pricing Fixes https://github.com/BerriAI/litellm/issues/6133 * build(model_prices_and_context_window.json): update gemini pricing * fix(litellm_logging.py): fix streaming caching logging when 'turn_off_message_logging' enabled Stores unredacted response in cache * build(model_prices_and_context_window.json): update gemini-1.5-flash pricing * fix(cost_calculator.py): fix default prompt_character count logic Fixes error in gemini cost calculation * fix(cost_calculator.py): fix cost calc for tts models	2024-10-10 00:42:11 -07:00
Krish Dholakia	9695c1af10	LiteLLM Minor Fixes & Improvements (10/08/2024) (#6119 ) * refactor(cost_calculator.py): move error line to debug - https://github.com/BerriAI/litellm/issues/5683#issuecomment-2398599498 * fix(migrate-hidden-params-to-read-from-standard-logging-payload): Fixes https://github.com/BerriAI/litellm/issues/5546#issuecomment-2399994026 * fix(types/utils.py): mark weight as a litellm param Fixes https://github.com/BerriAI/litellm/issues/5781 * feat(internal_user_endpoints.py): fix /user/info + show user max budget as default max budget Fixes https://github.com/BerriAI/litellm/issues/6117 * feat: support returning team member budget in `/user/info` Sets user max budget in team as max budget on ui Closes https://github.com/BerriAI/litellm/issues/6117 * bug fix for optional parameter passing to replicate (#6067) Signed-off-by: Mandana Vaziri <mvaziri@us.ibm.com> * fix(o1_transformation.py): handle o1 temperature=0 o1 doesn't support temp=0, allow admin to drop this param * test: fix test --------- Signed-off-by: Mandana Vaziri <mvaziri@us.ibm.com> Co-authored-by: Mandana Vaziri <mvaziri@us.ibm.com>	2024-10-08 21:57:03 -07:00
Krish Dholakia	fac3b2ee42	Add pyright to ci/cd + Fix remaining type-checking errors (#6082 ) * fix: fix type-checking errors * fix: fix additional type-checking errors * fix: additional type-checking error fixes * fix: fix additional type-checking errors * fix: additional type-check fixes * fix: fix all type-checking errors + add pyright to ci/cd * fix: fix incorrect import * ci(config.yml): use mypy on ci/cd * fix: fix type-checking errors in utils.py * fix: fix all type-checking errors on main.py * fix: fix mypy linting errors * fix(anthropic/cost_calculator.py): fix linting errors * fix: fix mypy linting errors * fix: fix linting errors	2024-10-05 17:04:00 -04:00
Ishaan Jaff	ab0b536143	(feat) add azure openai cost tracking for prompt caching (#6077 ) * add azure o1 models to model cost map * add azure o1 cost tracking * fix azure cost calc * add get llm provider test	2024-10-05 15:04:18 +05:30
Ishaan Jaff	3682f661d8	(feat) add cost tracking for OpenAI prompt caching (#6055 ) * add cache_read_input_token_cost for prompt caching models * add prompt caching for latest models * add openai cost calculator * add openai prompt caching test * fix lint check * add note on how usage._cache_read_input_tokens is used * fix cost calc whisper openai * use output_cost_per_second * add input_cost_per_second	2024-10-05 14:20:15 +05:30
Krish Dholakia	bd17424c4b	LiteLLM Minor Fixes & Improvements (09/26/2024) (#5925 ) (#5937 ) * LiteLLM Minor Fixes & Improvements (09/26/2024) (#5925) * fix(litellm_logging.py): don't initialize prometheus_logger if non premium user Prevents bad error messages in logs Fixes https://github.com/BerriAI/litellm/issues/5897 * Add Support for Custom Providers in Vision and Function Call Utils (#5688) * Add Support for Custom Providers in Vision and Function Call Utils Lookup * Remove parallel function call due to missing model info param * Add Unit Tests for Vision and Function Call Changes * fix-#5920: set header value to string to fix "'int' object has no att… (#5922) * LiteLLM Minor Fixes & Improvements (09/24/2024) (#5880) * LiteLLM Minor Fixes & Improvements (09/23/2024) (#5842) * feat(auth_utils.py): enable admin to allow client-side credentials to be passed Makes it easier for devs to experiment with finetuned fireworks ai models * feat(router.py): allow setting configurable_clientside_auth_params for a model Closes https://github.com/BerriAI/litellm/issues/5843 * build(model_prices_and_context_window.json): fix anthropic claude-3-5-sonnet max output token limit Fixes https://github.com/BerriAI/litellm/issues/5850 * fix(azure_ai/): support content list for azure ai Fixes https://github.com/BerriAI/litellm/issues/4237 * fix(litellm_logging.py): always set saved_cache_cost Set to 0 by default * fix(fireworks_ai/cost_calculator.py): add fireworks ai default pricing handles calling 405b+ size models * fix(slack_alerting.py): fix error alerting for failed spend tracking Fixes regression with slack alerting error monitoring * fix(vertex_and_google_ai_studio_gemini.py): handle gemini no candidates in streaming chunk error * docs(bedrock.md): add llama3-1 models * test: fix tests * fix(azure_ai/chat): fix transformation for azure ai calls * feat(azure_ai/embed): Add azure ai embeddings support Closes https://github.com/BerriAI/litellm/issues/5861 * fix(azure_ai/embed): 
enable async embedding * feat(azure_ai/embed): support azure ai multimodal embeddings * fix(azure_ai/embed): support async multi modal embeddings * feat(together_ai/embed): support together ai embedding calls * feat(rerank/main.py): log source documents for rerank endpoints to langfuse improves rerank endpoint logging * fix(langfuse.py): support logging `/audio/speech` input to langfuse * test(test_embedding.py): fix test * test(test_completion_cost.py): fix helper util * fix-#5920: set header value to string to fix "'int' object has no attribute 'encode'" --------- Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com> * Revert "fix-#5920: set header value to string to fix "'int' object has no att…" (#5926) This reverts commit `a554ae2695`. * build(model_prices_and_context_window.json): add azure ai cohere rerank model pricing Enables cost tracking for azure ai cohere rerank models * fix(litellm_logging.py): fix debug log to be clearer Closes https://github.com/BerriAI/litellm/issues/5909 * test(test_utils.py): fix test name * fix(azure_ai/cost_calculator.py): support cost tracking for azure ai rerank models * fix(azure_ai): fix azure ai base model cost tracking for rerank endpoints * fix(converse_handler.py): support new llama 3-2 models Fixes https://github.com/BerriAI/litellm/issues/5901 * fix(litellm_logging.py): ensure response is redacted for standard message logging Fixes https://github.com/BerriAI/litellm/issues/5890#issuecomment-2378242360 * fix(cost_calculator.py): use 'get_model_info' for cohere rerank cost calculation allows user to set custom cost for model * fix(config.yml): fix docker hub auht * build(config.yml): add docker auth to all tests * fix(db/create_views.py): fix linting error * fix(main.py): fix circular import * fix(azure_ai/__init__.py): fix circular import * fix(main.py): fix import * fix: fix linting errors * test: fix test * fix(proxy_server.py): pass premium user value on startup used for prometheus init --------- Co-authored-by: 
Cole Murray <colemurray.cs@gmail.com> Co-authored-by: bravomark <62681807+bravomark@users.noreply.github.com> * handle streaming for azure ai studio error * [Perf Proxy] parallel request limiter - use one cache update call (#5932) * fix parallel request limiter - use one cache update call * ci/cd run again * run ci/cd again * use docker username password * fix config.yml * fix config * fix config * fix config.yml * ci/cd run again * use correct typing for batch set cache * fix async_set_cache_pipeline * fix only check user id tpm / rpm limits when limits set * fix test_openai_azure_embedding_with_oidc_and_cf * test: fix test * test(test_rerank.py): fix test --------- Co-authored-by: Cole Murray <colemurray.cs@gmail.com> Co-authored-by: bravomark <62681807+bravomark@users.noreply.github.com> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>	2024-09-27 17:54:13 -07:00
Krish Dholakia	16c0307eab	LiteLLM Minor Fixes & Improvements (09/24/2024) (#5880 ) * LiteLLM Minor Fixes & Improvements (09/23/2024) (#5842) * feat(auth_utils.py): enable admin to allow client-side credentials to be passed Makes it easier for devs to experiment with finetuned fireworks ai models * feat(router.py): allow setting configurable_clientside_auth_params for a model Closes https://github.com/BerriAI/litellm/issues/5843 * build(model_prices_and_context_window.json): fix anthropic claude-3-5-sonnet max output token limit Fixes https://github.com/BerriAI/litellm/issues/5850 * fix(azure_ai/): support content list for azure ai Fixes https://github.com/BerriAI/litellm/issues/4237 * fix(litellm_logging.py): always set saved_cache_cost Set to 0 by default * fix(fireworks_ai/cost_calculator.py): add fireworks ai default pricing handles calling 405b+ size models * fix(slack_alerting.py): fix error alerting for failed spend tracking Fixes regression with slack alerting error monitoring * fix(vertex_and_google_ai_studio_gemini.py): handle gemini no candidates in streaming chunk error * docs(bedrock.md): add llama3-1 models * test: fix tests * fix(azure_ai/chat): fix transformation for azure ai calls * feat(azure_ai/embed): Add azure ai embeddings support Closes https://github.com/BerriAI/litellm/issues/5861 * fix(azure_ai/embed): enable async embedding * feat(azure_ai/embed): support azure ai multimodal embeddings * fix(azure_ai/embed): support async multi modal embeddings * feat(together_ai/embed): support together ai embedding calls * feat(rerank/main.py): log source documents for rerank endpoints to langfuse improves rerank endpoint logging * fix(langfuse.py): support logging `/audio/speech` input to langfuse * test(test_embedding.py): fix test * test(test_completion_cost.py): fix helper util	2024-09-25 22:11:57 -07:00
Krish Dholakia	2488e4b45f	Cost tracking improvements (#5828 ) * feat(litellm_logging.py): update standard logging payload to include debug information for cost failures Also includes fixes for cohere rerank cost tracking + databricks llama2 model cost tracking Easier to repro cost failures and improve reliability in prod * fix(proxy_server.py): emit cost failure debug info for slack alerting Improves debug information for cost tracking failures, on slack alerting	2024-09-21 21:47:50 -07:00
Krish Dholakia	d46660ea0f	LiteLLM Minor Fixes & Improvements (09/18/2024) (#5772 ) * fix(proxy_server.py): fix azure key vault logic to not require client id/secret * feat(cost_calculator.py): support fireworks ai cost tracking * build(docker-compose.yml): add lines for mounting config.yaml to docker compose Closes https://github.com/BerriAI/litellm/issues/5739 * fix(input.md): update docs to clarify litellm supports content as a list of dictionaries Fixes https://github.com/BerriAI/litellm/issues/5755 * fix(input.md): update input.md to include all message values * fix(image_handling.py): follow image url redirects Fixes https://github.com/BerriAI/litellm/issues/5763 * fix(router.py): Fix model key/base leak in error message Fixes https://github.com/BerriAI/litellm/issues/5762 * fix(http_handler.py): fix linting error * fix(azure.py): fix logging to show azure_ad_token being used Fixes https://github.com/BerriAI/litellm/issues/5767 * fix(_redis.py): add redis sentinel support Closes https://github.com/BerriAI/litellm/issues/4381 * feat(_redis.py): add redis sentinel support Closes https://github.com/BerriAI/litellm/issues/4381 * test(test_completion_cost.py): fix test * Databricks Integration: Integrate Databricks SDK as optional mechanism for fetching API base and token, if unspecified (#5746) * LiteLLM Minor Fixes & Improvements (09/16/2024) (#5723) * coverage (#5713) Signed-off-by: dbczumar <corey.zumar@databricks.com> * Move (#5714) Signed-off-by: dbczumar <corey.zumar@databricks.com> * fix(litellm_logging.py): fix logging client re-init (#5710) Fixes https://github.com/BerriAI/litellm/issues/5695 * fix(presidio.py): Fix logging_hook response and add support for additional presidio variables in guardrails config Fixes https://github.com/BerriAI/litellm/issues/5682 * feat(o1_handler.py): fake streaming for openai o1 models Fixes https://github.com/BerriAI/litellm/issues/5694 * docs: deprecated traceloop integration in favor of native otel (#5249) * fix: fix 
linting errors * fix: fix linting errors * fix(main.py): fix o1 import --------- Signed-off-by: dbczumar <corey.zumar@databricks.com> Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com> Co-authored-by: Nir Gazit <nirga@users.noreply.github.com> * feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view (#5730) * feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view Supports having `MonthlyGlobalSpend` view be a material view, and exposes an endpoint to refresh it * fix(custom_logger.py): reset calltype * fix: fix linting errors * fix: fix linting error * fix Signed-off-by: dbczumar <corey.zumar@databricks.com> * fix: fix import * Fix Signed-off-by: dbczumar <corey.zumar@databricks.com> * fix Signed-off-by: dbczumar <corey.zumar@databricks.com> * DB test Signed-off-by: dbczumar <corey.zumar@databricks.com> * Coverage Signed-off-by: dbczumar <corey.zumar@databricks.com> * progress Signed-off-by: dbczumar <corey.zumar@databricks.com> * fix Signed-off-by: dbczumar <corey.zumar@databricks.com> * fix Signed-off-by: dbczumar <corey.zumar@databricks.com> * fix Signed-off-by: dbczumar <corey.zumar@databricks.com> * fix test name Signed-off-by: dbczumar <corey.zumar@databricks.com> --------- Signed-off-by: dbczumar <corey.zumar@databricks.com> Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com> Co-authored-by: Nir Gazit <nirga@users.noreply.github.com> * test: fix test * test(test_databricks.py): fix test * fix(databricks/chat.py): handle custom endpoint (e.g. 
sagemaker) * Apply code scanning fix for clear-text logging of sensitive information Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * fix(__init__.py): fix known fireworks ai models --------- Signed-off-by: dbczumar <corey.zumar@databricks.com> Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com> Co-authored-by: Nir Gazit <nirga@users.noreply.github.com> Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>	2024-09-19 13:25:29 -07:00
Krish Dholakia	98c335acd0	LiteLLM Minor Fixes & Improvements (09/17/2024) (#5742 ) * fix(proxy_server.py): use default azure credentials to support azure non-client secret kms * fix(langsmith.py): raise error if credentials missing * feat(langsmith.py): support error logging for langsmith + standard logging payload Fixes https://github.com/BerriAI/litellm/issues/5738 * Fix hardcoding of schema in view check (#5749) * fix - deal with case when check view exists returns None (#5740) * Revert "fix - deal with case when check view exists returns None (#5740)" (#5741) This reverts commit `535228159b`. * test(test_router_debug_logs.py): move to mock response * Fix hardcoding of schema --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com> * fix(proxy_server.py): allow admin to disable ui via `DISABLE_ADMIN_UI` flag * fix(router.py): fix default model name value Fixes `55db19a1e4 (r1763712148)` * fix(utils.py): fix unbound variable error * feat(rerank/main.py): add azure ai rerank endpoints Closes https://github.com/BerriAI/litellm/issues/5667 * feat(secret_detection.py): Allow configuring secret detection params Allows admin to control what plugins to run for secret detection. Prevents overzealous secret detection. * docs(secret_detection.md): add secret detection guardrail docs * fix: fix linting errors * fix - deal with case when check view exists returns None (#5740) * Revert "fix - deal with case when check view exists returns None (#5740)" (#5741) This reverts commit `535228159b`. 
* Litellm fix router testing (#5748) * test: fix testing - azure changed content policy error logic * test: fix tests to use mock responses * test(test_image_generation.py): handle api instability * test(test_image_generation.py): handle azure api instability * fix(utils.py): fix unbounded variable error * fix(utils.py): fix unbounded variable error * test: refactor test to use mock response * test: mark flaky azure tests * Bump next from 14.1.1 to 14.2.10 in /ui/litellm-dashboard (#5753) Bumps [next](https://github.com/vercel/next.js) from 14.1.1 to 14.2.10. - [Release notes](https://github.com/vercel/next.js/releases) - [Changelog](https://github.com/vercel/next.js/blob/canary/release.js) - [Commits](https://github.com/vercel/next.js/compare/v14.1.1...v14.2.10) --- updated-dependencies: - dependency-name: next dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [Fix] o1-mini causes pydantic warnings on `reasoning_tokens` (#5754) * add requester_metadata in standard logging payload * log requester_metadata in metadata * use StandardLoggingPayload for logging * docs StandardLoggingPayload * fix import * include standard logging object in failure * add test for requester metadata * handle completion_tokens_details * add test for completion_tokens_details * [Feat-Proxy-DataDog] Log Redis, Postgres Failure events on DataDog (#5750) * dd - start tracking redis status on dd * add async_service_succes_hook / failure hook in custom logger * add async_service_failure_hook * log service failures on dd * fix import error * add test for redis errors / warning * [Fix] Router/ Proxy - Tag Based routing, raise correct error when no deployments found and tag filtering is on (#5745) * fix tag routing - raise correct error when no model with tag based routing * fix error string from tag based routing * test router tag based routing * raise 401 error when no 
tags avialable for deploymen * linting fix * [Feat] Log Request metadata on gcs bucket logging (#5743) * add requester_metadata in standard logging payload * log requester_metadata in metadata * use StandardLoggingPayload for logging * docs StandardLoggingPayload * fix import * include standard logging object in failure * add test for requester metadata * fix(litellm_logging.py): fix logging message * fix(rerank_api/main.py): fix linting errors * fix(custom_guardrails.py): maintain backwards compatibility for older guardrails * fix(rerank_api/main.py): fix cost tracking for rerank endpoints --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: steffen-sbt <148480574+steffen-sbt@users.noreply.github.com> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-09-17 23:00:04 -07:00
Krish Dholakia	0295a22561	LiteLLM Minor Fixes and Improvements (09/10/2024) (#5618 ) * fix(cost_calculator.py): move to debug for noisy warning message on cost calculation error Fixes https://github.com/BerriAI/litellm/issues/5610 * fix(databricks/cost_calculator.py): Handles model name issues for databricks models * fix(main.py): fix stream chunk builder for multiple tool calls Fixes https://github.com/BerriAI/litellm/issues/5591 * fix: correctly set user_alias when passed in Fixes https://github.com/BerriAI/litellm/issues/5612 * fix(types/utils.py): allow passing role for message object https://github.com/BerriAI/litellm/issues/5621 * fix(litellm_logging.py): Fix langfuse logging across multiple projects Fixes issue where langfuse logger was re-using the old logging object * feat(proxy/_types.py): support adding key-based tags for tag-based routing Enable tag based routing at key-level * fix(proxy/_types.py): fix inheritance * test(test_key_generate_prisma.py): fix test * test: fix test * fix(litellm_logging.py): return used callback object	2024-09-11 11:30:29 -07:00
Krish Dholakia	2d2282101b	LiteLLM Minor Fixes and Improvements (09/09/2024) (#5602 ) * fix(main.py): pass default azure api version as alternative in completion call Fixes api error caused due to api version Closes https://github.com/BerriAI/litellm/issues/5584 * Fixed gemini-1.5-flash pricing (#5590) * add /key/list endpoint * bump: version 1.44.21 → 1.44.22 * docs architecture * Fixed gemini-1.5-flash pricing --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> * fix(bedrock/chat.py): fix converse api stop sequence param mapping Fixes https://github.com/BerriAI/litellm/issues/5592 * fix(databricks/cost_calculator.py): handle databricks model name changes Fixes https://github.com/BerriAI/litellm/issues/5597 * fix(azure.py): support azure api version 2024-08-01-preview Closes https://github.com/BerriAI/litellm/issues/5377 * fix(proxy/_types.py): allow dev keys to call cohere /rerank endpoint Fixes issue where only admin could call rerank endpoint * fix(azure.py): check if model is gpt-4o * fix(proxy/_types.py): support /v1/rerank on non-admin routes as well * fix(cost_calculator.py): fix split on `/` logic in cost calculator --------- Co-authored-by: F1bos <44951186+F1bos@users.noreply.github.com> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>	2024-09-09 21:56:12 -07:00
Ishaan Jaff	3c16fcff1b	fix linting errors	2024-09-06 16:41:47 -07:00
Ishaan Jaff	e095daf2e4	add cost tracking for rerank	2024-09-06 16:04:54 -07:00
Ishaan Jaff	4a0fdc40f1	add cost tracking for pass through imagen	2024-09-02 18:10:46 -07:00
Krish Dholakia	9c8f1d7815	anthropic prompt caching cost tracking (#5453 ) * fix(utils.py): support 'drop_params' for embedding requests Fixes https://github.com/BerriAI/litellm/issues/5444 * feat(anthropic/cost_calculation.py): Support calculating cost for prompt caching on anthropic * feat(types/utils.py): allows us to migrate to openai's equivalent, once that comes out * fix: fix linting errors * test: mark flaky test	2024-08-31 14:09:35 -07:00
Krrish Dholakia	55217fa8d7	feat(cost_calculator.py): only override base model if custom pricing is set	2024-08-19 16:05:49 -07:00
Krish Dholakia	1a3b686580	Merge pull request #5219 from dhlidongming/fix-messages-length-check Fix incorrect message length check in cost calculator	2024-08-17 14:01:59 -07:00

80 commits