Commit graph

86 commits

Author SHA1 Message Date
Prathamesh Saraf
c832b0fded
Added get responses API (#10234) 2025-04-23 07:07:00 -07:00
Krish Dholakia
217681eb5e
Litellm dev 04 22 2025 p1 (#10206)
* fix(openai.py): initial commit adding generic event type for openai responses api streaming

Ensures handling for undocumented event types - e.g. "response.reasoning_summary_part.added"

* fix(transformation.py): handle unknown openai response type

* fix(datadog_llm_observability.py): handle dict[str, any] -> dict[str, str] conversion

Fixes https://github.com/BerriAI/litellm/issues/9494

* test: add more unit testing

* test: add unit test

* fix(common_utils.py): fix message with content list

* test: update testing
2025-04-22 23:58:43 -07:00
Ishaan Jaff
868cdd0226
[Feat] Add Support for DELETE /v1/responses/{response_id} on OpenAI, Azure OpenAI (#10205)
* add transform_delete_response_api_request to base responses config

* add transform_delete_response_api_request

* add delete_response_api_handler

* fixes for deleting responses, response API

* add adelete_responses

* add async test_basic_openai_responses_delete_endpoint

* test_basic_openai_responses_delete_endpoint

* working delete for streaming on responses API

* fixes azure transformation

* TestAnthropicResponsesAPITest

* fix code check

* fix linting

* fixes for get_complete_url

* test_basic_openai_responses_streaming_delete_endpoint

* streaming fixes
2025-04-22 18:27:03 -07:00
Ishaan Jaff
d3e04eac7f
[Feat] Unified Responses API - Add Azure Responses API support (#10116)
* initial commit for azure responses api support

* update get complete url

* fixes for responses API

* working azure responses API

* working responses API

* test suite for responses API

* azure responses API test suite

* fix test with complete url

* fix test refactor

* test fix metadata checks

* fix code quality check
2025-04-17 16:47:59 -07:00
Krish Dholakia
8ddaf3dfbc
fix(o_series_transformation.py): correctly map o4 to openai o_series model (#10079)
Fixes https://github.com/BerriAI/litellm/issues/10066
2025-04-16 21:51:31 -07:00
Krish Dholakia
9b0f871129
Add /vllm/* and /mistral/* passthrough endpoints (adds support for Mistral OCR via passthrough)
* feat(llm_passthrough_endpoints.py): support mistral passthrough

Closes https://github.com/BerriAI/litellm/issues/9051

* feat(llm_passthrough_endpoints.py): initial commit for adding vllm passthrough route

* feat(vllm/common_utils.py): add new vllm model info route

make it possible to use vllm passthrough route via factory function

* fix(llm_passthrough_endpoints.py): add all methods to vllm passthrough route

* fix: fix linting error

* fix: fix linting error

* fix: fix ruff check

* fix(proxy/_types.py): add new passthrough routes

* docs(config_settings.md): add mistral env vars to docs
2025-04-14 22:06:33 -07:00
Krish Dholakia
3ca82c22b6
Support CRUD endpoints for Managed Files (#9924)
* fix(openai.py): ensure openai file object shows up on logs

* fix(managed_files.py): return unified file id as b64 str

allows retrieve file id to work as expected

* fix(managed_files.py): apply decoded file id transformation

* fix: add unit test for file id + decode logic

* fix: initial commit for litellm_proxy support with CRUD Endpoints

* fix(managed_files.py): support retrieve file operation

* fix(managed_files.py): support for DELETE endpoint for files

* fix(managed_files.py): retrieve file content support

supports retrieve file content api from openai

* fix: fix linting error

* test: update tests

* fix: fix linting error

* fix(files/main.py): pass litellm params to azure route

* test: fix test
2025-04-11 21:48:27 -07:00
Krish Dholakia
6ba3c4a4f8
VertexAI non-jsonl file storage support (#9781)
* test: add initial e2e test

* fix(vertex_ai/files): initial commit adding sync file create support

* refactor: initial commit of vertex ai non-jsonl files reaching gcp endpoint

* fix(vertex_ai/files/transformation.py): initial working commit of non-jsonl file call reaching backend endpoint

* fix(vertex_ai/files/transformation.py): working e2e non-jsonl file upload

* test: working e2e jsonl call

* test: unit testing for jsonl file creation

* fix(vertex_ai/transformation.py): reset file pointer after read

allow multiple reads on same file object

* fix: fix linting errors

* fix: fix ruff linting errors

* fix: fix import

* fix: fix linting error

* fix: fix linting error

* fix(vertex_ai/files/transformation.py): fix linting error

* test: update test

* test: update tests

* fix: fix linting errors

* fix: fix test

* fix: fix linting error
2025-04-09 14:01:48 -07:00
Krish Dholakia
34bdf36eab
Add inference providers support for Hugging Face (#8258) (#9738) (#9773)
* Add inference providers support for Hugging Face (#8258)

* add first version of inference providers for huggingface

* temporarily skipping tests

* Add documentation

* Fix titles

* remove max_retries from params and clean up

* add suggestions

* use llm http handler

* update doc

* add suggestions

* run formatters

* add tests

* revert

* revert

* rename file

* set maxsize for lru cache

* fix embeddings

* fix inference url

* fix tests following breaking change in main

* use ChatCompletionRequest

* fix tests and lint

* [Hugging Face] Remove outdated chat completion tests and fix embedding tests (#9749)

* remove or fix tests

* fix link in doc

* fix(config_settings.md): document hf api key

---------

Co-authored-by: célina <hanouticelina@gmail.com>
2025-04-05 10:50:15 -07:00
Krish Dholakia
053b0e741f
Add Google AI Studio /v1/files upload API support (#9645)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 16s
Helm unit test / unit-test (push) Successful in 23s
* test: fix import for test

* fix: fix bad error string

* docs: cleanup files docs

* fix(files/main.py): cleanup error string

* style: initial commit with a provider/config pattern for files api

google ai studio files api onboarding

* fix: test

* feat(gemini/files/transformation.py): support gemini files api response transformation

* fix(gemini/files/transformation.py): return file id as gemini uri

allows id to be passed in to chat completion request, just like openai

* feat(llm_http_handler.py): support async route for files api on llm_http_handler

* fix: fix linting errors

* fix: fix model info check

* fix: fix ruff errors

* fix: fix linting errors

* Revert "fix: fix linting errors"

This reverts commit 926a5a527f.

* fix: fix linting errors

* test: fix test

* test: fix tests
2025-04-02 08:56:58 -07:00
Krish Dholakia
9b7ebb6a7d
build(pyproject.toml): add new dev dependencies - for type checking (#9631)
* build(pyproject.toml): add new dev dependencies - for type checking

* build: reformat files to fit black

* ci: reformat to fit black

* ci(test-litellm.yml): make tests run clear

* build(pyproject.toml): add ruff

* fix: fix ruff checks

* build(mypy/): fix mypy linting errors

* fix(hashicorp_secret_manager.py): fix passing cert for tls auth

* build(mypy/): resolve all mypy errors

* test: update test

* fix: fix black formatting

* build(pre-commit-config.yaml): use poetry run black

* fix(proxy_server.py): fix linting error

* fix: fix ruff safe representation error
2025-03-29 11:02:13 -07:00
Krish Dholakia
c0845fec1f
Add OpenAI gpt-4o-transcribe support (#9517)
* refactor: introduce new transformation config for gpt-4o-transcribe models

* refactor: expose new transformation configs for audio transcription

* ci: fix config yml

* feat(openai/transcriptions): support provider config transformation on openai audio transcriptions

allows gpt-4o and whisper audio transformation to work as expected

* refactor: migrate fireworks ai + deepgram to new transform request pattern

* feat(openai/): working support for gpt-4o-audio-transcribe

* build(model_prices_and_context_window.json): add gpt-4o-transcribe to model cost map

* build(model_prices_and_context_window.json): specify what endpoints are supported for `/audio/transcriptions`

* fix(get_supported_openai_params.py): fix return

* refactor(deepgram/): migrate unit test to deepgram handler

* refactor: cleanup unused imports

* fix(get_supported_openai_params.py): fix linting error

* test: update test
2025-03-26 23:10:25 -07:00
Krish Dholakia
4351c77253
Support Gemini audio token cost tracking + fix openai audio input token cost tracking (#9535)
* fix(vertex_and_google_ai_studio_gemini.py): log gemini audio tokens in usage object

enables accurate cost tracking

* refactor(vertex_ai/cost_calculator.py): refactor 128k+ token cost calculation to only run if model info has it

Google has moved away from this for gemini-2.0 models

* refactor(vertex_ai/cost_calculator.py): migrate to usage object for more flexible data passthrough

* fix(llm_cost_calc/utils.py): support audio token cost tracking in generic cost per token

enables vertex ai cost tracking to work with audio tokens

* fix(llm_cost_calc/utils.py): default to total prompt tokens if text tokens field not set

* refactor(llm_cost_calc/utils.py): move openai cost tracking to generic cost per token

more consistent behaviour across providers

* test: add unit test for gemini audio token cost calculation

* ci: bump ci config

* test: fix test
2025-03-26 17:26:25 -07:00
Ishaan Jaff
55115bf520 transform_responses_api_request 2025-03-20 12:28:55 -07:00
Ishaan Jaff
bc174adcd0 add should_fake_stream 2025-03-20 09:54:26 -07:00
Krrish Dholakia
fe24b9d90b feat(azure/gpt_transformation.py): add azure audio model support
Closes https://github.com/BerriAI/litellm/issues/6305
2025-03-19 22:57:49 -07:00
Ishaan Jaff
9203910ab6 fix import hashlib 2025-03-19 21:08:19 -07:00
Ishaan Jaff
65083ca8da get_openai_client_cache_key 2025-03-18 18:35:50 -07:00
Ishaan Jaff
3daef0d740 fix common utils 2025-03-18 17:59:46 -07:00
Ishaan Jaff
a45830dac3 use common caching logic for openai/azure clients 2025-03-18 17:57:03 -07:00
Ishaan Jaff
f73e9047dc use common logic for re-using openai clients 2025-03-18 17:56:32 -07:00
Ishaan Jaff
842625a6f0 :test_completion_azure_ad_toke 2025-03-18 12:25:32 -07:00
Ishaan Jaff
a0c5fb81b8 fix logic for intializing openai clients 2025-03-18 10:23:30 -07:00
Ishaan Jaff
6e351136d7 handle _get_async_http_client for OpenAI 2025-03-18 08:56:08 -07:00
Krish Dholakia
cff1c1f7d8
Merge branch 'main' into litellm_dev_03_12_2025_p1 2025-03-12 22:14:02 -07:00
Krrish Dholakia
88e9edf7db refactor: update method signature 2025-03-12 15:23:38 -07:00
Krish Dholakia
2d957a0ed9
Merge branch 'main' into litellm_dev_03_10_2025_p3 2025-03-12 14:56:01 -07:00
Ishaan Jaff
2460f3cbab test_validate_environment 2025-03-12 12:57:40 -07:00
Ishaan Jaff
39d391d8e7 Optional[Dict] 2025-03-12 12:29:13 -07:00
Ishaan Jaff
4ff6e41c15 ResponsesAPIStreamEvents 2025-03-11 23:42:35 -07:00
Ishaan Jaff
278b6fb5f6 add debug logging 2025-03-11 23:13:10 -07:00
Ishaan Jaff
8fa313ab07 add async streaming support 2025-03-11 20:00:42 -07:00
Ishaan Jaff
f32968409e working basic openai response api request 2025-03-11 17:37:19 -07:00
Krrish Dholakia
cbc2e84044 refactor(azure.py): refactor to have client init work across all endpoints 2025-03-11 17:27:24 -07:00
Ishaan Jaff
b3999b4c75 transform_response_api_response 2025-03-11 17:03:31 -07:00
Ishaan Jaff
dff8308e92 add transform_response_api_response 2025-03-11 16:53:18 -07:00
Ishaan Jaff
0f8de3d0a5 add transform_request for OpenAI responses API 2025-03-11 16:33:26 -07:00
Ishaan Jaff
980354b78b add validate_environment, get_complete_url 2025-03-11 16:12:36 -07:00
Ishaan Jaff
401a52e694 working transform 2025-03-11 15:24:42 -07:00
Ishaan Jaff
368f1de2e1 add OpenAIResponsesAPIConfig 2025-03-11 15:10:34 -07:00
Krrish Dholakia
5f87dc229a feat(openai.py): bubble all error information back to client 2025-03-10 15:27:43 -07:00
Krrish Dholakia
c1ec82fbd5 refactor: instrument body param to bubble up on exception 2025-03-10 15:21:04 -07:00
Krish Dholakia
f6535ae6ad
Support format param for specifying image type (#9019)
* fix(transformation.py): support a 'format' parameter for image's

allow user to specify mime type

* fix: pass mimetype via 'format' param

* feat(gemini/chat/transformation.py): support 'format' param for gemini

* fix(factory.py): support 'format' param on sync bedrock converse calls

* feat(bedrock/converse_transformation.py): support 'format' param for bedrock async calls

* refactor(factory.py): move to supporting 'format' param in base helper

ensures consistency in param support

* feat(gpt_transformation.py): filter out 'format' param

don't send invalid param to openai

* fix(gpt_transformation.py): fix translation

* fix: fix translation error
2025-03-05 19:52:53 -08:00
Krish Dholakia
5e386c28b2
Litellm dev 03 04 2025 p3 (#8997)
* fix(core_helpers.py): handle litellm_metadata instead of 'metadata'

* feat(batches/): ensure batches logs are written to db

makes batches response dict compatible

* fix(cost_calculator.py): handle batch response being a dictionary

* fix(batches/main.py): modify retrieve endpoints to use @client decorator

enables logging to work on retrieve call

* fix(batches/main.py): fix retrieve batch response type to be 'dict' compatible

* fix(spend_tracking_utils.py): send unique uuid for retrieve batch call type

create batch and retrieve batch share the same id

* fix(spend_tracking_utils.py): prevent duplicate retrieve batch calls from being double counted

* refactor(batches/): refactor cost tracking for batches - do it on retrieve, and within the established litellm_logging pipeline

ensures cost is always logged to db

* fix: fix linting errors

* fix: fix linting error
2025-03-04 21:58:03 -08:00
Krrish Dholakia
4418e6dd14 build: merge branch 2025-03-02 08:31:57 -08:00
Ishaan Jaff
6231052b18
[Bug]: Deepseek error on proxy after upgrading to 1.61.13-stable (#8860)
* fix deepseek error

* test_deepseek_provider_async_completion

* fix get_complete_url
2025-02-26 21:11:06 -08:00
Krish Dholakia
017c482d7b
fix(o_series_transformation.py): fix optional param check for o-serie… (#8787)
* fix(o_series_transformation.py): fix optional param check for o-series models

o3-mini and o-1 do not support parallel tool calling

* fix(utils.py): support 'drop_params' for 'thinking' param across models

allows switching to older claude versions (or non-anthropic models) and param to be safely dropped

* fix: fix passing thinking param in optional params

allows dropping thinking_param where not applicable

* test: update old model

* fix(utils.py): fix linting errors

* fix(main.py): add param to acompletion
2025-02-26 12:26:55 -08:00
Ishaan Jaff
f9cee4c46b
(Bug Fix) Using LiteLLM Python SDK with model=litellm_proxy/ for embedding, image_generation, transcription, speech, rerank (#8815)
* test_litellm_gateway_from_sdk

* fix embedding check for openai

* test litellm proxy provider

* fix image generation openai compatible models

* fix litellm.transcription

* test_litellm_gateway_from_sdk_rerank

* docs litellm python sdk

* docs litellm python sdk with proxy

* test_litellm_gateway_from_sdk_rerank

* ci/cd run again

* test_litellm_gateway_from_sdk_image_generation

* test_litellm_gateway_from_sdk_embedding

* test_litellm_gateway_from_sdk_embedding
2025-02-25 16:22:37 -08:00
Krish Dholakia
0c0f6c23e2
feat(openai/o_series_transformation.py): support native streaming for all openai o-series models (#8552)
o1 now supports streaming
2025-02-14 20:04:19 -08:00
Krish Dholakia
e26d7df91b
Litellm dev 02 10 2025 p2 (#8443)
* Fixed issue #8246 (#8250)

* Fixed issue #8246

* Added unit tests for discard() and for remove_callback_from_list_by_object()

* fix(openai.py): support dynamic passing of organization param to openai

handles scenario where client-side org id is passed to openai

---------

Co-authored-by: Erez Hadad <erezh@il.ibm.com>
2025-02-10 17:53:46 -08:00