Commit graph

461 commits

Author · SHA1 · Message · Date
Krish Dholakia
d58fe5a9f9 Add OpenAI gpt-4o-transcribe support (#9517)
* refactor: introduce new transformation config for gpt-4o-transcribe models

* refactor: expose new transformation configs for audio transcription

* ci: fix config yml

* feat(openai/transcriptions): support provider config transformation on openai audio transcriptions

allows the gpt-4o and whisper audio transcription transformations to work as expected

* refactor: migrate fireworks ai + deepgram to new transform request pattern

* feat(openai/): working support for gpt-4o-transcribe

* build(model_prices_and_context_window.json): add gpt-4o-transcribe to model cost map

* build(model_prices_and_context_window.json): specify what endpoints are supported for `/audio/transcriptions`

* fix(get_supported_openai_params.py): fix return

* refactor(deepgram/): migrate unit test to deepgram handler

* refactor: cleanup unused imports

* fix(get_supported_openai_params.py): fix linting error

* test: update test
2025-03-26 23:10:25 -07:00
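
Editor's note on the PR above: a minimal sketch of calling the newly added model through litellm's transcription API, assuming `OPENAI_API_KEY` is set and a local `speech.mp3` exists.

```python
import litellm

# minimal sketch: transcribe a local file with the model added in #9517;
# "whisper-1" continues to work through the same interface
with open("speech.mp3", "rb") as audio_file:
    transcript = litellm.transcription(
        model="gpt-4o-transcribe",
        file=audio_file,
    )
print(transcript.text)
```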
Krish Dholakia
9c083e7d2c Support Gemini audio token cost tracking + fix openai audio input token cost tracking (#9535)
* fix(vertex_and_google_ai_studio_gemini.py): log gemini audio tokens in usage object

enables accurate cost tracking

* refactor(vertex_ai/cost_calculator.py): refactor 128k+ token cost calculation to only run if model info has it

Google has moved away from this for gemini-2.0 models

* refactor(vertex_ai/cost_calculator.py): migrate to usage object for more flexible data passthrough

* fix(llm_cost_calc/utils.py): support audio token cost tracking in generic cost per token

enables vertex ai cost tracking to work with audio tokens

* fix(llm_cost_calc/utils.py): default to total prompt tokens if text tokens field not set

* refactor(llm_cost_calc/utils.py): move openai cost tracking to generic cost per token

more consistent behaviour across providers

* test: add unit test for gemini audio token cost calculation

* ci: bump ci config

* test: fix test
2025-03-26 17:26:25 -07:00
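
Editor's note: to illustrate the "generic cost per token" idea in #9535, a toy sketch of audio-aware cost math. The rates and field names are illustrative assumptions, not litellm's actual cost map; note the fallback to total prompt tokens when the text-token field is unset, mirroring the commit above.

```python
# toy sketch: rates and field names are made up for illustration
def generic_cost_per_token(usage: dict, text_rate: float, audio_rate: float) -> float:
    audio_tokens = usage.get("audio_tokens") or 0
    text_tokens = usage.get("text_tokens")
    if text_tokens is None:
        # default to total prompt tokens if the text-token field is unset
        text_tokens = usage.get("prompt_tokens", 0)
    return text_tokens * text_rate + audio_tokens * audio_rate

cost = generic_cost_per_token(
    {"prompt_tokens": 1_000, "audio_tokens": 400},
    text_rate=0.15 / 1e6,   # $/token, illustrative only
    audio_rate=1.00 / 1e6,  # $/token, illustrative only
)
```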
Krrish Dholakia
2f01caa919 fix(litellm_logging.py): always log the api base
Fixes an issue where the api base was missing from spend logs after a refactor
2025-03-24 13:45:39 -07:00
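
Editor's note: a sketch of where the api base surfaces for log consumers, using litellm's custom-callback interface; reading it out of `litellm_params` is an assumption based on litellm's callback docs.

```python
import litellm

def log_api_base(kwargs, completion_response, start_time, end_time):
    # with the fix above, api_base should be populated for spend logs;
    # the exact key layout inside kwargs is assumed for illustration
    litellm_params = kwargs.get("litellm_params") or {}
    print("api_base:", litellm_params.get("api_base"))

litellm.success_callback = [log_api_base]
```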
Ishaan Jaff
0eec6998e4 fix bug when unable to look up model info 2025-03-22 18:24:47 -07:00
Ishaan Jaff
b6cd32b69b _get_file_search_tool_call 2025-03-22 18:04:33 -07:00
Ishaan Jaff
dc1ba3255f _get_file_search_tool_call 2025-03-22 18:02:22 -07:00
Ishaan Jaff
93a2b00c93 fix StandardBuiltInToolsParams 2025-03-22 17:53:06 -07:00
Ishaan Jaff
8321159889 chat_completion_response_includes_annotations 2025-03-22 17:36:22 -07:00
Ishaan Jaff
c2b9e96b3d fix code quality check 2025-03-22 17:31:47 -07:00
Ishaan Jaff
9ff06e7773 test_openai_responses_api_web_search_cost_tracking 2025-03-22 17:24:29 -07:00
Ishaan Jaff
856e302c2b test_openai_responses_api_web_search_cost_tracking 2025-03-22 17:21:41 -07:00
Ishaan Jaff
3b65360009 get_cost_for_built_in_tools 2025-03-22 17:06:34 -07:00
Ishaan Jaff
cf01f49893 fixes for web search cost tracking 2025-03-22 16:56:32 -07:00
Ishaan Jaff
9a7bdc713e StandardBuiltInToolCostTracking 2025-03-22 16:06:47 -07:00
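
Editor's note: the built-in-tool commits above track flat, per-call fees rather than token counts. A toy sketch of that idea; the tier names mirror OpenAI's `search_context_size`, but every dollar figure here is made up, not litellm's cost map.

```python
# toy sketch of flat per-call pricing for built-in tools (illustrative values)
WEB_SEARCH_COST_PER_1K_CALLS = {"low": 25.0, "medium": 30.0, "high": 50.0}
FILE_SEARCH_COST_PER_1K_CALLS = 2.50

def get_cost_for_built_in_tools(web_search_options=None, file_search_calls=0):
    cost = 0.0
    if web_search_options is not None:
        tier = web_search_options.get("search_context_size", "medium")
        cost += WEB_SEARCH_COST_PER_1K_CALLS[tier] / 1000
    cost += file_search_calls * FILE_SEARCH_COST_PER_1K_CALLS / 1000
    return cost
```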
Ishaan Jaff
55b2cccf75 initialize_standard_built_in_tools_params 2025-03-22 16:03:53 -07:00
Ishaan Jaff
907b2c1c0a test_is_chunk_non_empty_with_annotations 2025-03-22 11:41:53 -07:00
Ishaan Jaff
69da0ed3b5 feat - add openai web search 2025-03-22 10:43:35 -07:00
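
Editor's note: a minimal sketch of the web-search feature landed above, assuming a search-enabled OpenAI model is available to your key.

```python
import litellm

response = litellm.completion(
    model="openai/gpt-4o-search-preview",
    messages=[{"role": "user", "content": "What happened in AI this week?"}],
    web_search_options={"search_context_size": "medium"},
)
# URL-citation annotations ride along on the message, which is what the
# annotation-related commits above surface in streaming and non-streaming paths
print(response.choices[0].message.content)
```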
Ishaan Jaff
2aceeaccd0 fix ModelParamHelper 2025-03-21 21:35:49 -07:00
Ishaan Jaff
32474d6862 _get_litellm_supported_transcription_kwargs 2025-03-21 19:20:13 -07:00
Hammad Saeed
f9c8fac854 fix(model_param_helper.py): update `_get_litellm_supported_transcription_kwargs()` to use proper annotations from `TranscriptionCreateParamsNonStreaming` & `TranscriptionCreateParamsStreaming` 2025-03-21 16:55:35 -07:00
Hammad Saeed
04fd43bad6 fix(model_param_helper.py): change `TranscriptionCreateParams.__annotations__` to `TranscriptionCreateParams.__dict__` to resolve a logging error // handle TypedDict 2025-03-21 16:30:46 -07:00
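
Editor's note on the two `model_param_helper.py` fixes: in recent `openai` SDKs, `TranscriptionCreateParams` is a union alias over the streaming and non-streaming TypedDicts, so reading `__annotations__` off the alias fails; the eventual fix reads the annotations off the concrete classes. A sketch, with the import path assumed from the SDK:

```python
from openai.types.audio.transcription_create_params import (
    TranscriptionCreateParamsNonStreaming,
    TranscriptionCreateParamsStreaming,
)

def get_supported_transcription_kwargs() -> set:
    keys: set = set()
    for params in (TranscriptionCreateParamsNonStreaming, TranscriptionCreateParamsStreaming):
        # __annotations__ holds the declared parameter names of a TypedDict
        keys.update(params.__annotations__.keys())
    return keys
```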
Ishaan Jaff
66b9a4a686 get_chat_completion_prompt 2025-03-19 20:50:15 -07:00
Ishaan Jaff
4db3ca70e4 get_custom_logger_for_prompt_management 2025-03-19 17:38:24 -07:00
Ishaan Jaff
40e1ccf339 CustomPromptManagement 2025-03-19 16:38:59 -07:00
Ishaan Jaff
98c5ffdb20 get_custom_loggers_for_type 2025-03-19 16:36:41 -07:00
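
Editor's note: the four commits above add a custom prompt-management hook. A sketch of one, assuming the hook signature implied by the method names; the import path and return shape are assumptions, not confirmed by these commits.

```python
import litellm
from litellm.integrations.custom_prompt_management import CustomPromptManagement

class LocalPromptManager(CustomPromptManagement):
    def get_chat_completion_prompt(
        self, model, messages, non_default_params,
        prompt_id, prompt_variables, dynamic_callback_params,
    ):
        # e.g. resolve prompt_id to a stored template and prepend it
        if prompt_id == "terse":
            messages = [{"role": "system", "content": "Answer in one sentence."}] + messages
        return model, messages, non_default_params

# picked up via get_custom_loggers_for_type / get_custom_logger_for_prompt_management
litellm.callbacks = [LocalPromptManager()]
```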
Ishaan Jaff
afd7d066bf fix code quality checks 2025-03-18 22:34:43 -07:00
Ishaan Jaff
a27782e657 _arize_otel_logger 2025-03-18 22:19:51 -07:00
Ishaan Jaff
43eedf26f7 Merge branch 'main' into litellm_arize_dynamic_logging 2025-03-18 22:13:35 -07:00
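
Editor's note: for the Arize work above, a sketch of enabling the Arize callback, with env-var names taken from litellm's Arize integration docs; the branch name suggests per-request (dynamic) credentials are the new piece here.

```python
import os
import litellm

os.environ["ARIZE_SPACE_KEY"] = "..."  # names per litellm's Arize docs
os.environ["ARIZE_API_KEY"] = "..."

litellm.callbacks = ["arize"]
response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "hi"}],
)
```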
Krrish Dholakia
c8eaf8e419 fix(response_metadata.py): log the litellm_model_name
makes it easier to track the model name sent to the provider
2025-03-18 17:46:33 -07:00
Krrish Dholakia
d29108bfd5 fix(litellm_logging.py): fix update 2025-03-18 17:18:58 -07:00
Ishaan Jaff
063db7ba91 fix arize logging 2025-03-18 15:51:58 -07:00
Krrish Dholakia
f0da181215 fix(get_llm_provider.py): Fixes https://github.com/BerriAI/litellm/issues/9291 2025-03-18 10:48:28 -07:00
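
Editor's note: the helper touched by the fix above resolves a model string to its provider. A minimal usage sketch; the 4-tuple return shape matches litellm's public helper.

```python
import litellm

# returns (model, custom_llm_provider, dynamic_api_key, api_base)
model, provider, _, _ = litellm.get_llm_provider(model="gemini/gemini-2.0-flash")
print(model, provider)  # "gemini-2.0-flash" "gemini"
```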
Krish Dholakia
8ee4eead7f Merge pull request #9333 from BerriAI/litellm_dev_03_17_2025_p2
fix(ollama/completions/transformation.py): pass prompt, untemplated on /completions request
2025-03-17 21:48:30 -07:00
Krrish Dholakia
4d56992407 fix(ollama/completions/transformation.py): pass prompt, untemplated on /completions request
Fixes https://github.com/BerriAI/litellm/issues/6900
2025-03-17 18:35:44 -07:00
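
Editor's note: a sketch of the code path fixed above, where a text-completion prompt should now be forwarded to ollama as-is; assumes a local ollama server with the model pulled.

```python
import litellm

response = litellm.text_completion(
    model="ollama/llama3",      # assumes `ollama pull llama3` was run
    prompt="Once upon a time",  # forwarded untemplated per the fix above
)
print(response.choices[0].text)
```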
Krrish Dholakia
301375bf84 fix(streaming_handler.py): support logging complete streaming response on cache hit 2025-03-17 18:10:39 -07:00
Krrish Dholakia
ba6369e359 fix(streaming_handler.py): emit deep copy of completed chunk 2025-03-17 17:26:21 -07:00
Krrish Dholakia
01fc7f4175 fix(logging_utils.py): revert change 2025-03-16 21:04:41 -07:00
Krrish Dholakia
85cf702deb fix(streaming_handler.py): raise stop iteration post-finish reason 2025-03-16 20:40:41 -07:00
Krrish Dholakia
08e73d66a1 fix(litellm_logging.py): remove unused import 2025-03-16 20:24:27 -07:00
Krrish Dholakia
7446038d26 fix(streaming_handler.py): pass complete streaming response on completion 2025-03-16 20:22:12 -07:00
Krrish Dholakia
4d3f4b31d1 fix(streaming_handler.py): return model response on finished chunk 2025-03-16 13:05:46 -07:00
Krrish Dholakia
82320a9b65 refactor(litellm_logging.py): delegate returning a complete response to the streaming_handler
Removes incorrect logic for calculating complete streaming response from litellm logging
2025-03-15 09:55:33 -07:00
Krrish Dholakia
424f51cc06 fix(utils.py): Prevents final chunk w/ usage from being ignored
Fixes https://github.com/BerriAI/litellm/issues/7112
2025-03-15 09:12:14 -07:00
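
Editor's note: the streaming-handler commits above all concern assembling a complete response (including the final usage chunk) from a stream. A consumer-side sketch using litellm's public `stream_chunk_builder` helper:

```python
import litellm

messages = [{"role": "user", "content": "hi"}]
chunks = []
for chunk in litellm.completion(
    model="gpt-4o-mini",
    messages=messages,
    stream=True,
    stream_options={"include_usage": True},  # final chunk carries usage
):
    chunks.append(chunk)

# rebuild a complete ModelResponse (incl. usage) from the streamed chunks
complete = litellm.stream_chunk_builder(chunks, messages=messages)
print(complete.usage)
```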
Krrish Dholakia
d818530265 fix(factory.py): reduce ollama pt LOC < 50 2025-03-14 21:10:05 -07:00
Krish Dholakia
7b189e3085 Merge pull request #9261 from briandevvn/fix_ollama_pt
Fix "system" role has become unacceptable in ollama
2025-03-14 20:13:28 -07:00
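
Editor's note: with the fix merged above (authored below by Brian Dev), a "system" message should again render into the ollama prompt. A minimal sketch:

```python
import litellm

response = litellm.completion(
    model="ollama/llama3",  # assumes a local ollama server
    messages=[
        {"role": "system", "content": "You are terse."},
        {"role": "user", "content": "Why is the sky blue?"},
    ],
)
print(response.choices[0].message.content)
```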
Krrish Dholakia
b75cd3b887 feat(endpoints.py): support adding credentials by model id
Allows user to reuse existing model credentials
2025-03-14 12:32:32 -07:00
Krrish Dholakia
913dc5b73b feat(endpoints.py): enable retrieving existing credentials by model name
Enables reusing existing credentials
2025-03-14 12:02:50 -07:00
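
Editor's note: a hypothetical proxy interaction for the two credential endpoints above; the route name and payload shape are assumptions for illustration, not confirmed by these commits.

```python
import requests

BASE = "http://localhost:4000"
HEADERS = {"Authorization": "Bearer sk-1234"}  # proxy admin key

# hypothetical route: fetch the credentials attached to an existing
# model entry, so they can be reused when adding a new deployment
resp = requests.get(f"{BASE}/credentials/by_model/my-gpt-4o", headers=HEADERS)
print(resp.json())
```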
Brian Dev
f53e365170 Support 'system' role in ollama 2025-03-15 00:55:18 +07:00
Ishaan Jaff
ceb8668e4a Merge pull request #9220 from BerriAI/litellm_qa_responses_api
[Fixes] Responses API - allow /responses and subpaths as LLM API route + Add exception mapping for responses API
2025-03-13 21:36:59 -07:00
Ishaan Jaff
a6e04aeffb exception_type 2025-03-13 20:09:32 -07:00
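
Editor's note on the Responses API entries above: a sketch of calling `/responses` through the SDK, where provider errors should now map onto litellm's standard exception types per PR #9220.

```python
import litellm

try:
    response = litellm.responses(model="openai/gpt-4o", input="Say hi")
    print(response)
except litellm.AuthenticationError as err:
    # exception mapping normalizes provider errors into litellm's types
    print("mapped:", err)
```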