Commit graph

80 commits

Author SHA1 Message Date
Krish Dholakia
4330ef8e81
Fix batches api cost tracking + Log batch models in spend logs / standard logging payload (#9077)
* feat(batches/): fix batch cost calculation - ensure it's accurate

use the correct cost value - prev. defaulting to non-batch cost

* feat(batch_utils.py): log batch models to spend logs + standard logging payload

makes it easy to understand how cost was calculated

* fix: fix stored payload for test

* test: fix test
2025-03-08 11:47:25 -08:00
Krish Dholakia
5e386c28b2
Litellm dev 03 04 2025 p3 (#8997)
* fix(core_helpers.py): handle litellm_metadata instead of 'metadata'

* feat(batches/): ensure batches logs are written to db

makes batches response dict compatible

* fix(cost_calculator.py): handle batch response being a dictionary

* fix(batches/main.py): modify retrieve endpoints to use @client decorator

enables logging to work on retrieve call

* fix(batches/main.py): fix retrieve batch response type to be 'dict' compatible

* fix(spend_tracking_utils.py): send unique uuid for retrieve batch call type

create batch and retrieve batch share the same id

* fix(spend_tracking_utils.py): prevent duplicate retrieve batch calls from being double counted

* refactor(batches/): refactor cost tracking for batches - do it on retrieve, and within the established litellm_logging pipeline

ensures cost is always logged to db

* fix: fix linting errors

* fix: fix linting error
2025-03-04 21:58:03 -08:00
Krish Dholakia
09462ba80c
Add cohere v2/rerank support (#8421) (#8605)
* Add cohere v2/rerank support (#8421)

* Support v2 endpoint cohere rerank

* Add tests and docs

* Make v1 default if old params used

* Update docs

* Update docs pt 2

* Update tests

* Add e2e test

* Clean up code

* Use inheritance for new config

* Fix linting issues (#8608)

* Fix cohere v2 failing test + linting (#8672)

* Fix test and unused imports

* Fix tests

* fix: fix linting errors

* test: handle tgai instability

* fix: skip service unavailable err

* test: print logs for unstable test

* test: skip unreliable tests

---------

Co-authored-by: vibhavbhat <vibhavb00@gmail.com>
2025-02-22 22:25:29 -08:00
Krish Dholakia
b682dc4ec8
Add cost tracking for rerank via bedrock (#8691)
* feat(bedrock/rerank): infer model region if model given as arn

* test: add unit testing to ensure bedrock region name inferred from arn on rerank

* feat(bedrock/rerank/transformation.py): include search units for bedrock rerank result

Resolves https://github.com/BerriAI/litellm/issues/7258#issuecomment-2671557137

* test(test_bedrock_completion.py): add testing for bedrock cohere rerank

* feat(cost_calculator.py): refactor rerank cost tracking to support bedrock cost tracking

* build(model_prices_and_context_window.json): add amazon.rerank model to model cost map

* fix(cost_calculator.py): bedrock/common_utils.py

get base model from model w/ arn -> handles rerank model

* build(model_prices_and_context_window.json): add bedrock cohere rerank pricing

* feat(bedrock/rerank): migrate bedrock config to basererank config

* Revert "feat(bedrock/rerank): migrate bedrock config to basererank config"

This reverts commit 84fae1f167.

* test: add testing to ensure large doc / queries are correctly counted

* Revert "test: add testing to ensure large doc / queries are correctly counted"

This reverts commit 4337f1657e.

* fix(migrate-jina-ai-to-rerank-config): enables cost tracking

* refactor(jina_ai/): finish migrating jina ai to base rerank config

enables cost tracking

* fix(jina_ai/rerank): e2e jina ai rerank cost tracking

* fix: cleanup dead code

* fix: fix python3.8 compatibility error

* test: fix test

* test: add e2e testing for azure ai rerank

* fix: fix linting error

* test: mark cohere as flaky
2025-02-20 21:00:18 -08:00
Krish Dholakia
03eef5a2a0
Fix custom pricing - separate provider info from model info (#7990)
* fix(utils.py): initial commit fixing custom cost tracking

refactors out provider specific model info from `get_model_info` - this was causing custom costs to be registered incorrectly

* fix(utils.py): cleanup `_supports_factory` to check provider info, if model info is None

some providers support features like vision across all models

* fix(utils.py): refactor to use _supports_factory

* test: update testing

* fix: fix linting errors

* test: fix testing
2025-01-25 21:49:28 -08:00
Krish Dholakia
8ca3229b26
Ensure base_model cost tracking works across all endpoints (#7989)
* test(test_completion_cost.py): add sdk test to ensure base model is used for cost tracking

* test(test_completion_cost.py): add sdk test to ensure custom pricing works

* fix(main.py): add base model cost tracking support for embedding calls

Enables base model cost tracking for embedding calls when base model set as a litellm_param

* fix(litellm_logging.py): update logging object with litellm params - including base model, if given

ensures base model param is always tracked

* fix(main.py): fix linting errors
2025-01-24 21:05:26 -08:00
Krish Dholakia
c6e9240405
Add datadog health check support + fix bedrock converse cost tracking w/ region name specified (#7958)
* fix(bedrock/converse_handler.py): fix bedrock region name on async calls

* fix(utils.py): fix split model handling

Fixes bedrock cost calculation when region name is given

* feat(_health_endpoints.py): support health checking datadog integration

Closes https://github.com/BerriAI/litellm/issues/7921
2025-01-23 22:17:09 -08:00
Ishaan Jaff
b7e68eccdd fixes for img gen cost calc 2025-01-12 16:41:18 -08:00
Ishaan Jaff
bb1489eced fix optimize get llm provider 2025-01-12 16:21:23 -08:00
Ishaan Jaff
2c25ea5737
(litellm sdk speedup) - use _model_contains_known_llm_provider in response_cost_calculator to check if the model contains a known litellm provider (#7721)
* define _cached_get_model_info_helper

* use _cached_get_model_info_helper

* speed up _select_model_name_for_cost_calc
2025-01-12 15:40:05 -08:00
Ishaan Jaff
6518bc70a0
(litellm SDK perf improvement) - use verbose_logger.debug and _cached_get_model_info_helper in _response_cost_calculator (#7720)
* define _cached_get_model_info_helper

* use _cached_get_model_info_helper
2025-01-12 15:27:54 -08:00
Ishaan Jaff
dab7bebaf2
use _get_model_info_helper (#7703) 2025-01-11 21:08:15 -08:00
Krish Dholakia
4af23353d6
Allow assigning teams to org on UI + OpenAI omni-moderation cost model tracking (#7566)
* feat(cost_calculator.py): add cost tracking ($0) for openai moderations endpoint

removes sentry cost tracking errors caused by this

* build(teams.tsx): allow assigning teams to orgs
2025-01-08 16:58:21 -08:00
Krish Dholakia
4e69711411
Litellm dev 01 07 2025 p1 (#7618)
* fix(main.py): pass custom llm provider on litellm logging provider update

* fix(cost_calculator.py): don't append provider name to return model if existing llm provider

Fixes https://github.com/BerriAI/litellm/issues/7607

* fix(prometheus_services.py): fix prometheus system health error logging

Fixes https://github.com/BerriAI/litellm/issues/7611
2025-01-07 21:22:31 -08:00
Krish Dholakia
c3edfc2c92
LiteLLM Minor Fixes & Improvements (12/23/2024) - p3 (#7394)
* build(model_prices_and_context_window.json): add gemini-1.5-flash context caching

* fix(context_caching/transformation.py): just use last identified cache point

Fixes https://github.com/BerriAI/litellm/issues/6738

* fix(context_caching/transformation.py): pick first contiguous block - handles system message error from google

Fixes https://github.com/BerriAI/litellm/issues/6738

* fix(vertex_ai/gemini/): track context caching tokens

* refactor(gemini/): place transformation.py inside `chat/` folder

make it easy for user to know we support the equivalent endpoint

* fix: fix import

* refactor(vertex_ai/): move vertex_ai cost calc inside vertex_ai/ folder

make it easier to see cost calculation logic

* fix: fix linting errors

* fix: fix circular import

* feat(gemini/cost_calculator.py): support gemini context caching cost calculation

generifies anthropic's cost calculation function and uses it across anthropic + gemini

* build(model_prices_and_context_window.json): add cost tracking for gemini-1.5-flash-002 w/ context caching

Closes https://github.com/BerriAI/litellm/issues/6891

* docs(gemini.md): add gemini context caching architecture diagram

make it easier for user to understand how context caching works

* docs(gemini.md): link to relevant gemini context caching code

* docs(gemini/context_caching): add readme in github, make it easy for dev to know context caching is supported + where to go for code

* fix(llm_cost_calc/utils.py): handle gemini 128k token diff cost calc scenario

* fix(deepseek/cost_calculator.py): support deepseek context caching cost calculation

* test: fix test
2024-12-23 22:02:52 -08:00
Krish Dholakia
db59e08958
Litellm dev 12 23 2024 p1 (#7383)
* feat(guardrails_endpoint.py): new `/guardrails/list` endpoint

Allow users to view what the available guardrails are

* docs: document new `/guardrails/list` endpoint

* docs(enterprise.md): update docs

* fix(openai/transcription/handler.py): support cost tracking on vtt + srt formats

* fix(openai/transcriptions/handler.py): default to 'verbose_json' response format if 'text' or 'json' response_format received. ensures 'duration' param is received for all audio transcription requests

* fix: fix linting errors

* fix: remove unused import
2024-12-23 16:33:31 -08:00
Ishaan Jaff
11e5960462
use helper for image gen tests (#7343)
2024-12-20 21:28:32 -08:00
Ishaan Jaff
c7f14e936a
(code quality) run ruff rule to ban unused imports (#7313)
* remove unused imports

* fix AmazonConverseConfig

* fix test

* fix import

* ruff check fixes

* test fixes

* fix testing

* fix imports
2024-12-19 12:33:42 -08:00
Ishaan Jaff
4622e35d96 fix _select_model_name_for_cost_calc docstring 2024-12-18 09:39:31 -08:00
Krish Dholakia
179d2f56b7
LiteLLM Minor Fixes & Improvements (12/16/2024) - p1 (#7263)
* fix(factory.py): skip empty text blocks for bedrock user messages

Fixes https://github.com/BerriAI/litellm/issues/7169

* Add support for Gemini 2.0 GoogleSearch tool (#7257)

* Add support for google_search tool in gemini 2.0

* Add/modify tests

* Fix grounding check

* Remove 2.0 grounding test; exclude experimental model in VERTEX_MODELS_TO_NOT_TEST

* Swap order of tools

* Fix formatting

* fix(get_api_base.py): return api base in streaming response

Fixes https://github.com/BerriAI/litellm/issues/7249

Closes https://github.com/BerriAI/litellm/pull/7250

* fix(cost_calculator.py): only set base model to model if not none

Fixes https://github.com/BerriAI/litellm/issues/7223

* fix(cost_calculator.py): enforce stricter order when picking model for cost calculation

* fix(cost_calculator.py): fix '_select_model_name_for_cost_calc' to return model name with region name prefix if provided

* fix(utils.py): fix 'get_model_info()' to handle edge case where model name starts with custom llm provider AND custom llm provider is given

* fix(cost_calculator.py): handle `custom_llm_provider-` scenario

* fix(cost_calculator.py): e2e working tts cost tracking

ensures initial message is passed in, to cost calculator

* fix(factory.py): suppress linting errors

* fix(cost_calculator.py): strip llm provider from model name after selecting cost calc model

* fix(litellm_logging.py): store initial request in 'input' field + accept base_model to be passed in litellm_params directly

* test: handle none env var value in flaky test

* fix(litellm_logging.py): fix linting errors

---------

Co-authored-by: Sam B <samlingx@gmail.com>
2024-12-17 15:33:36 -08:00
Krish Dholakia
224ead1531
fix(utils.py): fix openai-like api response format parsing (#7273)
* fix(utils.py): fix openai-like api response format parsing

Fixes issue passing structured output to litellm_proxy/ route

* fix(cost_calculator.py): fix whisper transcription cost calc to use file duration, not response time

* test: skip test if credentials not found
2024-12-17 12:49:09 -08:00
Krish Dholakia
516c2a6a70
Litellm remove circular imports (#7232)
* fix(utils.py): initial commit to remove circular imports - moves llmproviders to utils.py

* fix(router.py): fix 'litellm.EmbeddingResponse' import from router.py

* refactor: fix litellm.ModelResponse import on pass through endpoints

* refactor(litellm_logging.py): fix circular import for custom callbacks literal

* fix(factory.py): fix circular imports inside prompt factory

* fix(cost_calculator.py): fix circular import for 'litellm.Usage'

* fix(proxy_server.py): fix potential circular import with `litellm.Router`

* fix(proxy/utils.py): fix potential circular import in `litellm.Router`

* fix: remove circular imports in 'auth_checks' and 'guardrails/'

* fix(prompt_injection_detection.py): fix router import

* fix(vertex_passthrough_logging_handler.py): fix potential circular imports in vertex pass through

* fix(anthropic_pass_through_logging_handler.py): fix potential circular imports

* fix(slack_alerting.py-+-ollama_chat.py): fix modelresponse import

* fix(base.py): fix potential circular import

* fix(handler.py): fix potential circular ref in codestral + cohere handlers

* fix(azure.py): fix potential circular imports

* fix(gpt_transformation.py): fix modelresponse import

* fix(litellm_logging.py): add logging base class - simplify typing

makes it easy for other files to type check the logging obj without introducing circular imports

* fix(azure_ai/embed): fix potential circular import on handler.py

* fix(databricks/): fix potential circular imports in databricks/

* fix(vertex_ai/): fix potential circular imports on vertex ai embeddings

* fix(vertex_ai/image_gen): fix import

* fix(watsonx-+-bedrock): cleanup imports

* refactor(anthropic-pass-through-+-petals): cleanup imports

* refactor(huggingface/): cleanup imports

* fix(ollama-+-clarifai): cleanup circular imports

* fix(openai_like/): fix import

* fix(openai_like/): fix embedding handler

cleanup imports

* refactor(openai.py): cleanup imports

* fix(sagemaker/transformation.py): fix import

* ci(config.yml): add circular import test to ci/cd
2024-12-14 16:28:34 -08:00
Ishaan Jaff
21003c4337
Code Quality Improvement - use vertex_ai/ as folder name for vertexAI (#7166)
* fix rename vertex ai

* run ci/cd again
2024-12-11 00:32:41 -08:00
Ishaan Jaff
bfb6891eb7
rename llms/OpenAI/ -> llms/openai/ (#7154)
* rename OpenAI -> openai

* fix file rename

* fix rename changes

* fix organization of openai/transcription

* fix import OA fine tuning API

* fix openai ft handler

* fix handler import
2024-12-10 20:14:07 -08:00
Ishaan Jaff
36e99ebce7
fix use consistent naming (#7092)
2024-12-07 22:01:00 -08:00
Krish Dholakia
816f0ef8d2
LiteLLM Minor Fixes & Improvements (12/05/2024) (#7051)
* fix(cost_calculator.py): move to using `.get_model_info()` for cost per token calculations

ensures cost tracking is reliable - handles edge cases of parsing model cost map

* build(model_prices_and_context_window.json): add 'supports_response_schema' for select tgai models

Fixes https://github.com/BerriAI/litellm/pull/7037#discussion_r1872157329

* build(model_prices_and_context_window.json): remove 'pdf input' and 'vision' support from nova micro in model map

Bedrock docs indicate no support for micro - https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference-supported-models-features.html

* fix(converse_transformation.py): support amazon nova tool use

* fix(opentelemetry): Add missing LLM request type attribute to spans (#7041)

* feat(opentelemetry): add LLM request type attribute to spans

* lint

* fix: curl usage (#7038)

curl -d, --data <data> is lowercase d
curl -D, --dump-header <filename> is uppercase D

references:
https://curl.se/docs/manpage.html#-d
https://curl.se/docs/manpage.html#-D

* fix(spend_tracking.py): handle empty 'id' in model response - when creating spend log

Fixes https://github.com/BerriAI/litellm/issues/7023

* fix(streaming_chunk_builder.py): handle initial id being empty string

Fixes https://github.com/BerriAI/litellm/issues/7023

* fix(anthropic_passthrough_logging_handler.py): add end user cost tracking for anthropic pass through endpoint

* docs(pass_through/): refactor docs location + add table on supported features for pass through endpoints

* feat(anthropic_passthrough_logging_handler.py): support end user cost tracking via anthropic sdk

* docs(anthropic_completion.md): add docs on passing end user param for cost tracking on anthropic sdk

* fix(litellm_logging.py): use standard logging payload if present in kwargs

prevent datadog logging error for pass through endpoints

* docs(bedrock.md): add rerank api usage example to docs

* bugfix/change dummy tool name format (#7053)

* fix viewing keys (#7042)

* ui new build

* build(model_prices_and_context_window.json): add bedrock region models to model cost map (#7044)

* bye (#6982)

* (fix) litellm router.aspeech  (#6962)

* doc Migrating Databases

* fix aspeech on router

* test_audio_speech_router

* test_audio_speech_router

* docs show supported providers on batches api doc

* change dummy tool name format

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>

* fix: fix linting errors

* test: update test

* fix(litellm_logging.py): fix pass through check

* fix(test_otel_logging.py): fix test

* fix(cost_calculator.py): update handling for cost per second

* fix(cost_calculator.py): fix cost check

* test: fix test

* (fix) adding public routes when using custom header  (#7045)

* get_api_key_from_custom_header

* add test_get_api_key_from_custom_header

* fix testing use 1 file for test user api key auth

* fix test user api key auth

* test_custom_api_key_header_name

* build: update ui build

---------

Co-authored-by: Doron Kopit <83537683+doronkopit5@users.noreply.github.com>
Co-authored-by: lloydchang <lloydchang@gmail.com>
Co-authored-by: hgulersen <haymigulersen@gmail.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
2024-12-06 14:29:53 -08:00
Ishaan Jaff
c03351328f
fix imagegeneration output_cost_per_image on model cost map (#6752) 2024-11-14 20:37:21 -08:00
Ishaan Jaff
73c7b73aa0
(feat) Add cost tracking for Azure Dall-e-3 Image Generation + use base class to ensure basic image generation tests pass (#6716)
* add BaseImageGenTest

* use 1 class for unit testing

* add debugging to BaseImageGenTest

* TestAzureOpenAIDalle3

* fix response_cost_calculator

* test_basic_image_generation

* fix img gen basic test

* fix _select_model_name_for_cost_calc

* fix test_aimage_generation_bedrock_with_optional_params

* fix undo changes cost tracking

* fix response_cost_calculator

* fix test_cost_azure_gpt_35
2024-11-12 20:02:16 -08:00
Ishaan Jaff
25bae4cc23
(feat) add cost tracking stable diffusion 3 on Bedrock (#6676)
* add cost tracking for sd3

* test_image_generation_bedrock

* fix get model info for image cost

* add cost_calculator for stability 1 models

* add unit testing for bedrock image cost calc

* test_cost_calculator_with_no_optional_params

* add test_cost_calculator_basic

* correctly allow size Optional

* fix cost_calculator

* sd3 unit tests cost calc
2024-11-11 20:21:44 -08:00
Krish Dholakia
7cc12bd5c6
LiteLLM Minor Fixes & Improvements (10/18/2024) (#6320)
* fix(converse_transformation.py): handle cross region model name when getting openai param support

Fixes https://github.com/BerriAI/litellm/issues/6291

* LiteLLM Minor Fixes & Improvements (10/17/2024)  (#6293)

* fix(ui_sso.py): fix faulty admin only check

Fixes https://github.com/BerriAI/litellm/issues/6286

* refactor(sso_helper_utils.py): refactor /sso/callback to use helper utils, covered by unit testing

Prevent future regressions

* feat(prompt_factory): support 'ensure_alternating_roles' param

Closes https://github.com/BerriAI/litellm/issues/6257

* fix(proxy/utils.py): add dailytagspend to expected views

* feat(auth_utils.py): support setting regex for clientside auth credentials

Fixes https://github.com/BerriAI/litellm/issues/6203

* build(cookbook): add tutorial for mlflow + langchain + litellm proxy tracing

* feat(argilla.py): add argilla logging integration

Closes https://github.com/BerriAI/litellm/issues/6201

* fix: fix linting errors

* fix: fix ruff error

* test: fix test

* fix: update vertex ai assumption - parts not always guaranteed (#6296)

* docs(configs.md): add argilla env var to docs

* docs(user_keys.md): add regex doc for clientside auth params

* docs(argilla.md): add doc on argilla logging

* docs(argilla.md): add sampling rate to argilla calls

* bump: version 1.49.6 → 1.49.7

* add gpt-4o-audio models to model cost map (#6306)

* (code quality) add ruff check PLR0915 for `too-many-statements`  (#6309)

* ruff add PLR0915

* add noqa for PLR0915

* fix noqa

* add # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* add # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* doc fix Turn on / off caching per Key. (#6297)

* (feat) Support `audio`,  `modalities` params (#6304)

* add audio, modalities param

* add test for gpt audio models

* add get_supported_openai_params for GPT audio models

* add supported params for audio

* test_audio_output_from_model

* bump openai to openai==1.52.0

* bump openai on pyproject

* fix audio test

* fix test mock_chat_response

* handle audio for Message

* fix handling audio for OAI compatible API endpoints

* fix linting

* fix mock dbrx test

* (feat) Support audio param in responses streaming (#6312)

* add audio, modalities param

* add test for gpt audio models

* add get_supported_openai_params for GPT audio models

* add supported params for audio

* test_audio_output_from_model

* bump openai to openai==1.52.0

* bump openai on pyproject

* fix audio test

* fix test mock_chat_response

* handle audio for Message

* fix handling audio for OAI compatible API endpoints

* fix linting

* fix mock dbrx test

* add audio to Delta

* handle model_response.choices.delta.audio

* fix linting

* build(model_prices_and_context_window.json): add gpt-4o-audio audio token cost tracking

* refactor(model_prices_and_context_window.json): refactor 'supports_audio' to be 'supports_audio_input' and 'supports_audio_output'

Allows for flag to be used for openai + gemini models (both support audio input)

* feat(cost_calculation.py): support cost calc for audio model

Closes https://github.com/BerriAI/litellm/issues/6302

* feat(utils.py): expose new `supports_audio_input` and `supports_audio_output` functions

Closes https://github.com/BerriAI/litellm/issues/6303

* feat(handle_jwt.py): support single dict list

* fix(cost_calculator.py): fix linting errors

* fix: fix linting error

* fix(cost_calculator): move to using standard openai usage cached tokens value

* test: fix test

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-10-19 22:23:27 -07:00
Ishaan Jaff
610974b4fc
(code quality) add ruff check PLR0915 for too-many-statements (#6309)
* ruff add PLR0915

* add noqa for PLR0915

* fix noqa

* add # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* add # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915
2024-10-18 15:36:49 +05:30
Krish Dholakia
f252350881
LiteLLM Minor Fixes & Improvements (10/17/2024) (#6293)
* fix(ui_sso.py): fix faulty admin only check

Fixes https://github.com/BerriAI/litellm/issues/6286

* refactor(sso_helper_utils.py): refactor /sso/callback to use helper utils, covered by unit testing

Prevent future regressions

* feat(prompt_factory): support 'ensure_alternating_roles' param

Closes https://github.com/BerriAI/litellm/issues/6257

* fix(proxy/utils.py): add dailytagspend to expected views

* feat(auth_utils.py): support setting regex for clientside auth credentials

Fixes https://github.com/BerriAI/litellm/issues/6203

* build(cookbook): add tutorial for mlflow + langchain + litellm proxy tracing

* feat(argilla.py): add argilla logging integration

Closes https://github.com/BerriAI/litellm/issues/6201

* fix: fix linting errors

* fix: fix ruff error

* test: fix test

* fix: update vertex ai assumption - parts not always guaranteed (#6296)

* docs(configs.md): add argilla env var to docs
2024-10-17 22:09:11 -07:00
Krish Dholakia
6005450c8f
LiteLLM Minor Fixes & Improvements (10/09/2024) (#6139)
* fix(utils.py): don't return 'none' response headers

Fixes https://github.com/BerriAI/litellm/issues/6123

* fix(vertex_and_google_ai_studio_gemini.py): support parsing out additional properties and strict value for tool calls

Fixes https://github.com/BerriAI/litellm/issues/6136

* fix(cost_calculator.py): set default character value to none

Fixes https://github.com/BerriAI/litellm/issues/6133#issuecomment-2403290196

* fix(google.py): fix cost per token / cost per char conversion

Fixes https://github.com/BerriAI/litellm/issues/6133#issuecomment-2403370287

* build(model_prices_and_context_window.json): update gemini pricing

Fixes https://github.com/BerriAI/litellm/issues/6133

* build(model_prices_and_context_window.json): update gemini pricing

* fix(litellm_logging.py): fix streaming caching logging when 'turn_off_message_logging' enabled

Stores unredacted response in cache

* build(model_prices_and_context_window.json): update gemini-1.5-flash pricing

* fix(cost_calculator.py): fix default prompt_character count logic

Fixes error in gemini cost calculation

* fix(cost_calculator.py): fix cost calc for tts models
2024-10-10 00:42:11 -07:00
Krish Dholakia
9695c1af10
LiteLLM Minor Fixes & Improvements (10/08/2024) (#6119)
* refactor(cost_calculator.py): move error line to debug - https://github.com/BerriAI/litellm/issues/5683#issuecomment-2398599498

* fix(migrate-hidden-params-to-read-from-standard-logging-payload): Fixes https://github.com/BerriAI/litellm/issues/5546#issuecomment-2399994026

* fix(types/utils.py): mark weight as a litellm param

Fixes https://github.com/BerriAI/litellm/issues/5781

* feat(internal_user_endpoints.py): fix /user/info + show user max budget as default max budget

Fixes https://github.com/BerriAI/litellm/issues/6117

* feat: support returning team member budget in `/user/info`

Sets user max budget in team as max budget on ui

Closes https://github.com/BerriAI/litellm/issues/6117

* bug fix for optional parameter passing to replicate (#6067)

Signed-off-by: Mandana Vaziri <mvaziri@us.ibm.com>

* fix(o1_transformation.py): handle o1 temperature=0

o1 doesn't support temp=0, allow admin to drop this param

* test: fix test

---------

Signed-off-by: Mandana Vaziri <mvaziri@us.ibm.com>
Co-authored-by: Mandana Vaziri <mvaziri@us.ibm.com>
2024-10-08 21:57:03 -07:00
Krish Dholakia
fac3b2ee42
Add pyright to ci/cd + Fix remaining type-checking errors (#6082)
* fix: fix type-checking errors

* fix: fix additional type-checking errors

* fix: additional type-checking error fixes

* fix: fix additional type-checking errors

* fix: additional type-check fixes

* fix: fix all type-checking errors + add pyright to ci/cd

* fix: fix incorrect import

* ci(config.yml): use mypy on ci/cd

* fix: fix type-checking errors in utils.py

* fix: fix all type-checking errors on main.py

* fix: fix mypy linting errors

* fix(anthropic/cost_calculator.py): fix linting errors

* fix: fix mypy linting errors

* fix: fix linting errors
2024-10-05 17:04:00 -04:00
Ishaan Jaff
ab0b536143
(feat) add azure openai cost tracking for prompt caching (#6077)
* add azure o1 models to model cost map

* add azure o1 cost tracking

* fix azure cost calc

* add get llm provider test
2024-10-05 15:04:18 +05:30
Ishaan Jaff
3682f661d8
(feat) add cost tracking for OpenAI prompt caching (#6055)
* add cache_read_input_token_cost for prompt caching models

* add prompt caching for latest models

* add openai cost calculator

* add openai prompt caching test

* fix lint check

* add note on how usage._cache_read_input_tokens is used

* fix cost calc whisper openai

* use output_cost_per_second

* add input_cost_per_second
2024-10-05 14:20:15 +05:30
Krish Dholakia
bd17424c4b
LiteLLM Minor Fixes & Improvements (09/26/2024) (#5925) (#5937)
* LiteLLM Minor Fixes & Improvements (09/26/2024)  (#5925)

* fix(litellm_logging.py): don't initialize prometheus_logger if non premium user

Prevents bad error messages in logs

Fixes https://github.com/BerriAI/litellm/issues/5897

* Add Support for Custom Providers in Vision and Function Call Utils (#5688)

* Add Support for Custom Providers in Vision and Function Call Utils Lookup

* Remove parallel function call due to missing model info param

* Add Unit Tests for Vision and Function Call Changes

* fix-#5920: set header value to string to fix "'int' object has no att… (#5922)

* LiteLLM Minor Fixes & Improvements (09/24/2024) (#5880)

* LiteLLM Minor Fixes & Improvements (09/23/2024)  (#5842)

* feat(auth_utils.py): enable admin to allow client-side credentials to be passed

Makes it easier for devs to experiment with finetuned fireworks ai models

* feat(router.py): allow setting configurable_clientside_auth_params for a model

Closes https://github.com/BerriAI/litellm/issues/5843

* build(model_prices_and_context_window.json): fix anthropic claude-3-5-sonnet max output token limit

Fixes https://github.com/BerriAI/litellm/issues/5850

* fix(azure_ai/): support content list for azure ai

Fixes https://github.com/BerriAI/litellm/issues/4237

* fix(litellm_logging.py): always set saved_cache_cost

Set to 0 by default

* fix(fireworks_ai/cost_calculator.py): add fireworks ai default pricing

handles calling 405b+ size models

* fix(slack_alerting.py): fix error alerting for failed spend tracking

Fixes regression with slack alerting error monitoring

* fix(vertex_and_google_ai_studio_gemini.py): handle gemini no candidates in streaming chunk error

* docs(bedrock.md): add llama3-1 models

* test: fix tests

* fix(azure_ai/chat): fix transformation for azure ai calls

* feat(azure_ai/embed): Add azure ai embeddings support

Closes https://github.com/BerriAI/litellm/issues/5861

* fix(azure_ai/embed): enable async embedding

* feat(azure_ai/embed): support azure ai multimodal embeddings

* fix(azure_ai/embed): support async multi modal embeddings

* feat(together_ai/embed): support together ai embedding calls

* feat(rerank/main.py): log source documents for rerank endpoints to langfuse

improves rerank endpoint logging

* fix(langfuse.py): support logging `/audio/speech` input to langfuse

* test(test_embedding.py): fix test

* test(test_completion_cost.py): fix helper util

* fix-#5920: set header value to string to fix "'int' object has no attribute 'encode'"

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* Revert "fix-#5920: set header value to string to fix "'int' object has no att…" (#5926)

This reverts commit a554ae2695.

* build(model_prices_and_context_window.json): add azure ai cohere rerank model pricing

Enables cost tracking for azure ai cohere rerank models

* fix(litellm_logging.py): fix debug log to be clearer

Closes https://github.com/BerriAI/litellm/issues/5909

* test(test_utils.py): fix test name

* fix(azure_ai/cost_calculator.py): support cost tracking for azure ai rerank models

* fix(azure_ai): fix azure ai base model cost tracking for rerank endpoints

* fix(converse_handler.py): support new llama 3-2 models

Fixes https://github.com/BerriAI/litellm/issues/5901

* fix(litellm_logging.py): ensure response is redacted for standard message logging

Fixes https://github.com/BerriAI/litellm/issues/5890#issuecomment-2378242360

* fix(cost_calculator.py): use 'get_model_info' for cohere rerank cost calculation

allows user to set custom cost for model

* fix(config.yml): fix docker hub auth

* build(config.yml): add docker auth to all tests

* fix(db/create_views.py): fix linting error

* fix(main.py): fix circular import

* fix(azure_ai/__init__.py): fix circular import

* fix(main.py): fix import

* fix: fix linting errors

* test: fix test

* fix(proxy_server.py): pass premium user value on startup

used for prometheus init

---------

Co-authored-by: Cole Murray <colemurray.cs@gmail.com>
Co-authored-by: bravomark <62681807+bravomark@users.noreply.github.com>

* handle streaming for azure ai studio error

* [Perf Proxy] parallel request limiter - use one cache update call (#5932)

* fix parallel request limiter - use one cache update call

* ci/cd run again

* run ci/cd again

* use docker username password

* fix config.yml

* fix config

* fix config

* fix config.yml

* ci/cd run again

* use correct typing for batch set cache

* fix async_set_cache_pipeline

* fix only check user id tpm / rpm limits when limits set

* fix test_openai_azure_embedding_with_oidc_and_cf

* test: fix test

* test(test_rerank.py): fix test

---------

Co-authored-by: Cole Murray <colemurray.cs@gmail.com>
Co-authored-by: bravomark <62681807+bravomark@users.noreply.github.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-09-27 17:54:13 -07:00
Krish Dholakia
16c0307eab
LiteLLM Minor Fixes & Improvements (09/24/2024) (#5880)
* LiteLLM Minor Fixes & Improvements (09/23/2024)  (#5842)

* feat(auth_utils.py): enable admin to allow client-side credentials to be passed

Makes it easier for devs to experiment with finetuned fireworks ai models

* feat(router.py): allow setting configurable_clientside_auth_params for a model

Closes https://github.com/BerriAI/litellm/issues/5843

* build(model_prices_and_context_window.json): fix anthropic claude-3-5-sonnet max output token limit

Fixes https://github.com/BerriAI/litellm/issues/5850

* fix(azure_ai/): support content list for azure ai

Fixes https://github.com/BerriAI/litellm/issues/4237

* fix(litellm_logging.py): always set saved_cache_cost

Set to 0 by default

* fix(fireworks_ai/cost_calculator.py): add fireworks ai default pricing

handles calling 405b+ size models

* fix(slack_alerting.py): fix error alerting for failed spend tracking

Fixes regression with slack alerting error monitoring

* fix(vertex_and_google_ai_studio_gemini.py): handle gemini no candidates in streaming chunk error

* docs(bedrock.md): add llama3-1 models

* test: fix tests

* fix(azure_ai/chat): fix transformation for azure ai calls

* feat(azure_ai/embed): Add azure ai embeddings support

Closes https://github.com/BerriAI/litellm/issues/5861

* fix(azure_ai/embed): enable async embedding

* feat(azure_ai/embed): support azure ai multimodal embeddings

* fix(azure_ai/embed): support async multi modal embeddings

* feat(together_ai/embed): support together ai embedding calls

* feat(rerank/main.py): log source documents for rerank endpoints to langfuse

improves rerank endpoint logging

* fix(langfuse.py): support logging `/audio/speech` input to langfuse

* test(test_embedding.py): fix test

* test(test_completion_cost.py): fix helper util
2024-09-25 22:11:57 -07:00
Krish Dholakia
2488e4b45f
Cost tracking improvements (#5828)
* feat(litellm_logging.py): update standard logging payload to include debug information for cost failures

Also includes fixes for cohere rerank cost tracking + databricks llama2 model cost tracking

Easier to repro cost failures and improve reliability in prod

* fix(proxy_server.py): emit cost failure debug info for slack alerting

Improves debug information for cost tracking failures, on slack alerting
2024-09-21 21:47:50 -07:00
Krish Dholakia
d46660ea0f
LiteLLM Minor Fixes & Improvements (09/18/2024) (#5772)
* fix(proxy_server.py): fix azure key vault logic to not require client id/secret

* feat(cost_calculator.py): support fireworks ai cost tracking

* build(docker-compose.yml): add lines for mounting config.yaml to docker compose

Closes https://github.com/BerriAI/litellm/issues/5739

* fix(input.md): update docs to clarify litellm supports content as a list of dictionaries

Fixes https://github.com/BerriAI/litellm/issues/5755

* fix(input.md): update input.md to include all message values

* fix(image_handling.py): follow image url redirects

Fixes https://github.com/BerriAI/litellm/issues/5763

* fix(router.py): Fix model key/base leak in error message

Fixes https://github.com/BerriAI/litellm/issues/5762

* fix(http_handler.py): fix linting error

* fix(azure.py): fix logging to show azure_ad_token being used

Fixes https://github.com/BerriAI/litellm/issues/5767

* fix(_redis.py): add redis sentinel support

Closes https://github.com/BerriAI/litellm/issues/4381

* feat(_redis.py): add redis sentinel support

Closes https://github.com/BerriAI/litellm/issues/4381

* test(test_completion_cost.py): fix test

* Databricks Integration: Integrate Databricks SDK as optional mechanism for fetching API base and token, if unspecified (#5746)

* LiteLLM Minor Fixes & Improvements (09/16/2024)  (#5723)

* coverage (#5713)

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* Move (#5714)

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix(litellm_logging.py): fix logging client re-init (#5710)

Fixes https://github.com/BerriAI/litellm/issues/5695

* fix(presidio.py): Fix logging_hook response and add support for additional presidio variables in guardrails config

Fixes https://github.com/BerriAI/litellm/issues/5682

* feat(o1_handler.py): fake streaming for openai o1 models

Fixes https://github.com/BerriAI/litellm/issues/5694

* docs: deprecated traceloop integration in favor of native otel (#5249)

* fix: fix linting errors

* fix: fix linting errors

* fix(main.py): fix o1 import

---------

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>

* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating materialized view (#5730)

* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating materialized view

Supports having the `MonthlyGlobalSpend` view be a materialized view, and exposes an endpoint to refresh it

* fix(custom_logger.py): reset calltype

* fix: fix linting errors

* fix: fix linting error

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix: fix import

* Fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* DB test

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* Coverage

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* progress

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix test name

Signed-off-by: dbczumar <corey.zumar@databricks.com>

---------

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>

* test: fix test

* test(test_databricks.py): fix test

* fix(databricks/chat.py): handle custom endpoint (e.g. sagemaker)

* Apply code scanning fix for clear-text logging of sensitive information

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* fix(__init__.py): fix known fireworks ai models

---------

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2024-09-19 13:25:29 -07:00
Krish Dholakia
98c335acd0
LiteLLM Minor Fixes & Improvements (09/17/2024) (#5742)
* fix(proxy_server.py): use default azure credentials to support azure non-client secret kms

* fix(langsmith.py): raise error if credentials missing

* feat(langsmith.py): support error logging for langsmith + standard logging payload

Fixes https://github.com/BerriAI/litellm/issues/5738

* Fix hardcoding of schema in view check (#5749)

* fix - deal with case when check view exists returns None (#5740)

* Revert "fix - deal with case when check view exists returns None (#5740)" (#5741)

This reverts commit 535228159b.

* test(test_router_debug_logs.py): move to mock response

* Fix hardcoding of schema

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>

* fix(proxy_server.py): allow admin to disable ui via `DISABLE_ADMIN_UI` flag

* fix(router.py): fix default model name value

Fixes 55db19a1e4 (r1763712148)

* fix(utils.py): fix unbound variable error

* feat(rerank/main.py): add azure ai rerank endpoints

Closes https://github.com/BerriAI/litellm/issues/5667

* feat(secret_detection.py): Allow configuring secret detection params

Allows admin to control what plugins to run for secret detection. Prevents overzealous secret detection.

* docs(secret_detection.md): add secret detection guardrail docs

* fix: fix linting errors

* fix - deal with case when check view exists returns None (#5740)

* Revert "fix - deal with case when check view exists returns None (#5740)" (#5741)

This reverts commit 535228159b.

* Litellm fix router testing (#5748)

* test: fix testing - azure changed content policy error logic

* test: fix tests to use mock responses

* test(test_image_generation.py): handle api instability

* test(test_image_generation.py): handle azure api instability

* fix(utils.py): fix unbound variable error

* fix(utils.py): fix unbound variable error

* test: refactor test to use mock response

* test: mark flaky azure tests

* Bump next from 14.1.1 to 14.2.10 in /ui/litellm-dashboard (#5753)

Bumps [next](https://github.com/vercel/next.js) from 14.1.1 to 14.2.10.
- [Release notes](https://github.com/vercel/next.js/releases)
- [Changelog](https://github.com/vercel/next.js/blob/canary/release.js)
- [Commits](https://github.com/vercel/next.js/compare/v14.1.1...v14.2.10)

---
updated-dependencies:
- dependency-name: next
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [Fix] o1-mini causes pydantic warnings on `reasoning_tokens`  (#5754)

* add requester_metadata in standard logging payload

* log requester_metadata in metadata

* use StandardLoggingPayload for logging

* docs StandardLoggingPayload

* fix import

* include standard logging object in failure

* add test for requester metadata

* handle completion_tokens_details

* add test for completion_tokens_details

* [Feat-Proxy-DataDog] Log Redis, Postgres Failure events on DataDog  (#5750)

* dd - start tracking redis status on dd

* add async_service_success_hook / failure hook in custom logger

* add async_service_failure_hook

* log service failures on dd

* fix import error

* add test for redis errors / warning

* [Fix] Router/ Proxy - Tag Based routing, raise correct error when no deployments found and tag filtering is on  (#5745)

* fix tag routing - raise correct error when no model with tag based routing

* fix error string from tag based routing

* test router tag based routing

* raise 401 error when no tags available for deployment

* linting fix

* [Feat] Log Request metadata on gcs bucket logging (#5743)

* add requester_metadata in standard logging payload

* log requester_metadata in metadata

* use StandardLoggingPayload for logging

* docs StandardLoggingPayload

* fix import

* include standard logging object in failure

* add test for requester metadata

* fix(litellm_logging.py): fix logging message

* fix(rerank_api/main.py): fix linting errors

* fix(custom_guardrails.py): maintain backwards compatibility for older guardrails

* fix(rerank_api/main.py): fix cost tracking for rerank endpoints

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: steffen-sbt <148480574+steffen-sbt@users.noreply.github.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-17 23:00:04 -07:00
Krish Dholakia
0295a22561
LiteLLM Minor Fixes and Improvements (09/10/2024) (#5618)
* fix(cost_calculator.py): move to debug for noisy warning message on cost calculation error

Fixes https://github.com/BerriAI/litellm/issues/5610

* fix(databricks/cost_calculator.py): Handles model name issues for databricks models

* fix(main.py): fix stream chunk builder for multiple tool calls

Fixes https://github.com/BerriAI/litellm/issues/5591

* fix: correctly set user_alias when passed in

Fixes https://github.com/BerriAI/litellm/issues/5612

* fix(types/utils.py): allow passing role for message object

https://github.com/BerriAI/litellm/issues/5621

* fix(litellm_logging.py): Fix langfuse logging across multiple projects

Fixes issue where langfuse logger was re-using the old logging object

* feat(proxy/_types.py): support adding key-based tags for tag-based routing

Enable tag based routing at key-level

* fix(proxy/_types.py): fix inheritance

* test(test_key_generate_prisma.py): fix test

* test: fix test

* fix(litellm_logging.py): return used callback object
2024-09-11 11:30:29 -07:00
Krish Dholakia
2d2282101b
LiteLLM Minor Fixes and Improvements (09/09/2024) (#5602)
* fix(main.py): pass default azure api version as alternative in completion call

Fixes API error caused by the API version

Closes https://github.com/BerriAI/litellm/issues/5584

* Fixed gemini-1.5-flash pricing (#5590)

* add /key/list endpoint

* bump: version 1.44.21 → 1.44.22

* docs architecture

* Fixed gemini-1.5-flash pricing

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>

* fix(bedrock/chat.py): fix converse api stop sequence param mapping

Fixes https://github.com/BerriAI/litellm/issues/5592

* fix(databricks/cost_calculator.py): handle databricks model name changes

Fixes https://github.com/BerriAI/litellm/issues/5597

* fix(azure.py): support azure api version 2024-08-01-preview

Closes https://github.com/BerriAI/litellm/issues/5377

* fix(proxy/_types.py): allow dev keys to call cohere /rerank endpoint

Fixes issue where only admin could call rerank endpoint

* fix(azure.py): check if model is gpt-4o

* fix(proxy/_types.py): support /v1/rerank on non-admin routes as well

* fix(cost_calculator.py): fix split on `/` logic in cost calculator

---------

Co-authored-by: F1bos <44951186+F1bos@users.noreply.github.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-09-09 21:56:12 -07:00
Ishaan Jaff
3c16fcff1b
fix linting errors
2024-09-06 16:41:47 -07:00
Ishaan Jaff
e095daf2e4
add cost tracking for rerank
2024-09-06 16:04:54 -07:00
Ishaan Jaff
4a0fdc40f1
add cost tracking for pass through imagen
2024-09-02 18:10:46 -07:00
Krish Dholakia
9c8f1d7815
anthropic prompt caching cost tracking (#5453)
* fix(utils.py): support 'drop_params' for embedding requests

Fixes https://github.com/BerriAI/litellm/issues/5444

* feat(anthropic/cost_calculation.py): Support calculating cost for prompt caching on anthropic

* feat(types/utils.py): allows us to migrate to openai's equivalent, once that comes out

* fix: fix linting errors

* test: mark flaky test
2024-08-31 14:09:35 -07:00
Krrish Dholakia
55217fa8d7
feat(cost_calculator.py): only override base model if custom pricing is set
2024-08-19 16:05:49 -07:00
Krish Dholakia
1a3b686580
Merge pull request #5219 from dhlidongming/fix-messages-length-check
Fix incorrect message length check in cost calculator
2024-08-17 14:01:59 -07:00