Commit graph

38 commits

Author SHA1 Message Date
Jean Carlo de Souza
29eea8fe9b
Merge d3b75abc86 into b82af5b826 2025-04-24 01:02:01 -07:00
Krish Dholakia
6ba3c4a4f8
VertexAI non-jsonl file storage support (#9781)
* test: add initial e2e test

* fix(vertex_ai/files): initial commit adding sync file create support

* refactor: initial commit of vertex ai non-jsonl files reaching gcp endpoint

* fix(vertex_ai/files/transformation.py): initial working commit of non-jsonl file call reaching backend endpoint

* fix(vertex_ai/files/transformation.py): working e2e non-jsonl file upload

* test: working e2e jsonl call

* test: unit testing for jsonl file creation

* fix(vertex_ai/transformation.py): reset file pointer after read

allow multiple reads on same file object (see the sketch after this commit)

* fix: fix linting errors

* fix: fix ruff linting errors

* fix: fix import

* fix: fix linting error

* fix: fix linting error

* fix(vertex_ai/files/transformation.py): fix linting error

* test: update test

* test: update tests

* fix: fix linting errors

* fix: fix test

* fix: fix linting error
2025-04-09 14:01:48 -07:00
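The file-pointer reset in this PR is a standard pattern. A minimal sketch, assuming a generic file-like object; `extract_file_data` is a hypothetical helper, not LiteLLM's actual function:

```python
from typing import BinaryIO


def extract_file_data(file: BinaryIO) -> bytes:
    """Read a file object's contents, then rewind it so the same object
    can be read again by a later consumer (e.g. the HTTP upload call)."""
    data = file.read()  # moves the pointer to EOF; a second read() would return b""
    file.seek(0)        # reset the pointer to allow multiple reads
    return data
```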
Krish Dholakia
053b0e741f
Add Google AI Studio /v1/files upload API support (#9645)
* test: fix import for test

* fix: fix bad error string

* docs: cleanup files docs

* fix(files/main.py): cleanup error string

* style: initial commit with a provider/config pattern for files api

google ai studio files api onboarding

* fix: test

* feat(gemini/files/transformation.py): support gemini files api response transformation

* fix(gemini/files/transformation.py): return file id as gemini uri

allows the id to be passed in to a chat completion request, just like openai (see the sketch after this commit)

* feat(llm_http_handler.py): support async route for files api on llm_http_handler

* fix: fix linting errors

* fix: fix model info check

* fix: fix ruff errors

* fix: fix linting errors

* Revert "fix: fix linting errors"

This reverts commit 926a5a527f.

* fix: fix linting errors

* test: fix test

* test: fix tests
2025-04-02 08:56:58 -07:00
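Per the bullets above, the file id returned for Google AI Studio is the Gemini file URI, so it can be referenced in a later chat completion just as with OpenAI. A hedged usage sketch; the `purpose` value and message content shape are assumptions:

```python
import litellm

# Upload a file via the Google AI Studio files API.
file_obj = litellm.create_file(
    file=open("notes.txt", "rb"),
    purpose="user_data",            # assumed purpose value
    custom_llm_provider="gemini",
)

# Pass the returned id (the Gemini URI) into a chat completion.
response = litellm.completion(
    model="gemini/gemini-2.0-flash",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Summarize this file."},
            {"type": "file", "file": {"file_id": file_obj.id}},  # assumed shape
        ],
    }],
)
```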
Krish Dholakia
9b7ebb6a7d
build(pyproject.toml): add new dev dependencies - for type checking (#9631)
* build(pyproject.toml): add new dev dependencies - for type checking

* build: reformat files to fit black

* ci: reformat to fit black

* ci(test-litellm.yml): make tests run clear

* build(pyproject.toml): add ruff

* fix: fix ruff checks

* build(mypy/): fix mypy linting errors

* fix(hashicorp_secret_manager.py): fix passing cert for tls auth

* build(mypy/): resolve all mypy errors

* test: update test

* fix: fix black formatting

* build(pre-commit-config.yaml): use poetry run black

* fix(proxy_server.py): fix linting error

* fix: fix ruff safe representation error
2025-03-29 11:02:13 -07:00
Ishaan Jaff
9203910ab6 fix import hashlib 2025-03-19 21:08:19 -07:00
Ishaan Jaff
65083ca8da get_openai_client_cache_key 2025-03-18 18:35:50 -07:00
Ishaan Jaff
a45830dac3 use common caching logic for openai/azure clients 2025-03-18 17:57:03 -07:00
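The two entries above consolidate client caching for OpenAI and Azure, and the `fix import hashlib` entry above them suggests the key is hashed. A speculative sketch of what a `get_openai_client_cache_key` might hash over; the field list is an assumption, not LiteLLM's actual signature:

```python
import hashlib
from typing import Optional


def get_openai_client_cache_key(
    api_key: Optional[str],
    api_base: Optional[str],
    api_version: Optional[str],
    timeout: Optional[float],
) -> str:
    """Build a deterministic key from client settings so identical
    OpenAI/Azure configurations reuse a single cached client."""
    raw = f"{api_key}:{api_base}:{api_version}:{timeout}"
    return hashlib.sha256(raw.encode()).hexdigest()
```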
Ishaan Jaff
a0c5fb81b8 fix logic for initializing openai clients 2025-03-18 10:23:30 -07:00
Ishaan Jaff
6e351136d7 handle _get_async_http_client for OpenAI 2025-03-18 08:56:08 -07:00
Ishaan Jaff
39d391d8e7 Optional[Dict] 2025-03-12 12:29:13 -07:00
Krrish Dholakia
5f87dc229a feat(openai.py): bubble all error information back to client 2025-03-10 15:27:43 -07:00
Krrish Dholakia
c1ec82fbd5 refactor: instrument body param to bubble up on exception 2025-03-10 15:21:04 -07:00
Krish Dholakia
5e386c28b2
Litellm dev 03 04 2025 p3 (#8997)
* fix(core_helpers.py): handle litellm_metadata instead of 'metadata'

* feat(batches/): ensure batches logs are written to db

makes batches response dict compatible

* fix(cost_calculator.py): handle batch response being a dictionary

* fix(batches/main.py): modify retrieve endpoints to use @client decorator

enables logging to work on retrieve call

* fix(batches/main.py): fix retrieve batch response type to be 'dict' compatible

* fix(spend_tracking_utils.py): send unique uuid for retrieve batch call type

create batch and retrieve batch share the same id

* fix(spend_tracking_utils.py): prevent duplicate retrieve batch calls from being double counted (see the sketch after this commit)

* refactor(batches/): refactor cost tracking for batches - do it on retrieve, and within the established litellm_logging pipeline

ensures cost is always logged to db

* fix: fix linting errors

* fix: fix linting error
2025-03-04 21:58:03 -08:00
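Create-batch and retrieve-batch share the same batch id, so the spend-tracking fix above keys retrieve calls with a fresh uuid. A minimal sketch of the idea; the key format and helper name are assumptions:

```python
import uuid


def get_spend_log_id(batch_id: str, call_type: str) -> str:
    """Give each retrieve-batch call its own spend-log id so it never
    collides with (or double counts against) the create-batch record."""
    if call_type == "retrieve_batch":
        return f"{batch_id}-retrieve-{uuid.uuid4()}"
    return batch_id
```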
Krish Dholakia
e26d7df91b
Litellm dev 02 10 2025 p2 (#8443)
* Fixed issue #8246 (#8250)

* Fixed issue #8246

* Added unit tests for discard() and for remove_callback_from_list_by_object()

* fix(openai.py): support dynamic passing of organization param to openai

handles scenario where client-side org id is passed to openai

---------

Co-authored-by: Erez Hadad <erezh@il.ibm.com>
2025-02-10 17:53:46 -08:00
Krish Dholakia
23f458d2da
Improved O3 + Azure O3 support (#8181)
* fix: support azure o3 model family for fake streaming workaround (#8162)

* fix: support azure o3 model family for fake streaming workaround

* refactor: rename helper to is_o_series_model for clarity (sketched after this commit)

* update function calling parameters for o3 models (#8178)

* refactor(o1_transformation.py): refactor o1 config to be o series config, expand o series model check to o3

ensures max_tokens is correctly translated for o3

* feat(openai/): refactor o1 files to be 'o_series' files

expands naming to cover o3

* fix(azure/chat/o1_handler.py): azure openai is an instance of openai - was causing resets

* test(test_azure_o_series.py): assert stream faked for azure o3 mini

Resolves https://github.com/BerriAI/litellm/pull/8162

* fix(o1_transformation.py): fix o1 transformation logic to handle explicit o1_series routing

* docs(azure.md): update doc with `o_series/` model name

---------

Co-authored-by: byrongrogan <47910641+byrongrogan@users.noreply.github.com>
Co-authored-by: Low Jian Sheng <15527690+lowjiansheng@users.noreply.github.com>
2025-02-01 09:52:28 -08:00
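A hedged sketch of the renamed helper; the real check in LiteLLM may parse the model string differently:

```python
def is_o_series_model(model: str) -> bool:
    """True for OpenAI o-series models (o1, o3, ...), which share quirks
    such as the fake-streaming workaround on Azure."""
    model = model.lower()
    return any(
        model.startswith(prefix) or f"/{prefix}" in model
        for prefix in ("o1", "o3")
    )


assert is_o_series_model("azure/o3-mini")
assert not is_o_series_model("gpt-4o")
```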
Ishaan Jaff
2cf0daa31c
(Fixes) OpenAI Streaming Token Counting + Fixes usage tracking when litellm.turn_off_message_logging=True (#8156)
* working streaming usage tracking

* fix test_async_chat_openai_stream_options

* fix await asyncio.sleep(1)

* test_async_chat_azure

* fix s3 logging

* fix get_stream_options

* fix get_stream_options

* fix streaming handler

* test_stream_token_counting_with_redaction

* fix codeql concern
2025-01-31 15:06:37 -08:00
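Context for the streaming usage work above: OpenAI only returns token usage on a stream when the request opts in via `stream_options`. A usage sketch:

```python
import litellm

response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
    stream_options={"include_usage": True},  # final chunk carries usage
)

usage = None
for chunk in response:
    # only the last chunk of the stream carries a usage block
    usage = getattr(chunk, "usage", None) or usage
print(usage)
```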
Ishaan Jaff
b6f2e659b9
(Feat) Add `x-litellm-overhead-duration-ms` and `x-litellm-response-duration-ms` in response from LiteLLM (#7899)
* add track_llm_api_timing

* add track_llm_api_timing

* test_litellm_overhead

* use ResponseMetadata class for setting hidden params and response overhead

* instrument http handler

* fix track_llm_api_timing

* track_llm_api_timing

* emit response overhead on hidden params (see the sketch after this commit)

* fix resp metadata

* fix make_sync_openai_embedding_request

* test_aaaaatext_completion_endpoint fixes

* _get_value_from_hidden_params

* set_hidden_params

* test_litellm_overhead

* test_litellm_overhead

* test_litellm_overhead

* fix import

* test_litellm_overhead_stream

* add LiteLLMLoggingObject

* use diff folder for testing

* use diff folder for overhead testing

* test litellm overhead

* use typing

* clear typing

* test_litellm_overhead

* fix async_streaming

* update_response_metadata

* move test file

* apply metadata to the response object
2025-01-21 20:27:55 -08:00
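The overhead durations above are exposed both as response headers and on the response's hidden params. A hedged sketch of reading them; the `_hidden_params`/`additional_headers` keys are assumptions beyond the header names in the PR title:

```python
import litellm

response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)

hidden = getattr(response, "_hidden_params", {}) or {}
headers = hidden.get("additional_headers", {}) or {}
print(headers.get("x-litellm-overhead-duration-ms"))
print(headers.get("x-litellm-response-duration-ms"))
```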
Krish Dholakia
4b23420a20
Litellm dev 01 20 2025 p1 (#7884)
* fix(initial-test-to-return-api-timeout-value-in-openai-timeout-exception): makes it easier for the user to debug why a request timed out

* feat(openai.py): return timeout value + time taken on openai timeout errors

helps debug timeout errors

* fix(utils.py): fix num retries extraction logic when num_retries = 0

* fix(config_settings.md): litellm_logging.py

support printing payload to console if 'LITELLM_PRINT_STANDARD_LOGGING_PAYLOAD' is true

Enables easier debugging (usage sketch after this commit)

* test(test_auth_checks.py): remove common-checks UserAPIKeyAuth enforcement check

* fix(litellm_logging.py): fix linting error
2025-01-20 21:45:48 -08:00
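A usage sketch for the debug flag above; whether LiteLLM requires the exact string "True" or any truthy value is an assumption:

```python
import os

# Print the standard logging payload to the console for each call,
# making it easier to debug what LiteLLM would log.
os.environ["LITELLM_PRINT_STANDARD_LOGGING_PAYLOAD"] = "True"

import litellm

litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
```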
Krish Dholakia
1bea338597
LiteLLM Minor Fixes & Improvements (2024/16/01) (#7826)
* fix(lm_studio/chat/transformation.py): Fix https://github.com/BerriAI/litellm/issues/7811

* fix(router.py): fix mock timeout check

* fix: drop model name from fallback args since it causes a conflict with the model=model that is provided later on. (#7806)

This error happens if you provide multiple fallback models to the completion function with a model name defined in each one (illustrated in the sketch after this commit).

* fix(router.py): remove mock_timeout before sending to request

prevents reuse in fallbacks

* test: update test

* test: revert test change - wrong pr

---------

Co-authored-by: Dudu Lasry <david1542@users.noreply.github.com>
2025-01-17 20:59:21 -08:00
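A hedged illustration of the fallback conflict fixed above; the fallbacks schema shown is an assumption about the shape that triggered the bug:

```python
import litellm

# Each fallback dict carries its own "model" key. Before this fix, that key
# was forwarded alongside the explicit model=... argument on the retry,
# producing a "got multiple values for keyword argument 'model'"-style error.
response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
    fallbacks=[
        {"model": "gpt-3.5-turbo"},
        {"model": "claude-3-haiku-20240307"},
    ],
)
```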
Jean Souza
01c7feb87d feat(openai-modify-assistant): add support for modifying OpenAI assistants via PATCH request
- Implemented `modify_assistant` function to update assistant details such as name, instructions, and tools.
- Added `local_tests` to verify the modification functionality.
2025-01-05 15:35:11 -03:00
Krish Dholakia
d43d83f9ef
feat(router.py): support request prioritization for text completion calls (#7540)
* feat(router.py): support request prioritization for text completion calls

* fix(internal_user_endpoints.py): fix sql query to return all keys, including null team id keys on `/user/info`

Fixes https://github.com/BerriAI/litellm/issues/7485

* fix: fix linting errors

* fix: fix linting error

* test(test_router_helper_utils.py): add direct test for '_schedule_factory'

Fixes code qa test
2025-01-03 19:35:44 -08:00
Krish Dholakia
6843f3a2bb
Revert "fix: add missing parameters order, limit, before, and after in get_as…" (#7542)
This reverts commit 4b0505dffd.
2025-01-03 16:32:12 -08:00
Jean Carlo de Souza
4b0505dffd
fix: add missing parameters order, limit, before, and after in get_assistants method for openai (#7537)
- Ensured that `before` and `after` parameters are only passed when provided to avoid AttributeError.
- Implemented safe access using default values for `before` and `after` to prevent missing attribute issues.
- Added consistent handling of `order` and `limit` to improve flexibility and robustness in API calls.
2025-01-03 14:41:54 -08:00
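A minimal sketch of the "only pass when provided" pattern this fix applies; the helper name is hypothetical:

```python
from typing import Optional


def build_list_params(
    order: Optional[str] = None,
    limit: Optional[int] = None,
    before: Optional[str] = None,
    after: Optional[str] = None,
) -> dict:
    """Collect only the pagination params the caller actually set, so the
    underlying SDK never receives an explicit None it might reject."""
    params = {}
    for name, value in (
        ("order", order),
        ("limit", limit),
        ("before", before),
        ("after", after),
    ):
        if value is not None:
            params[name] = value
    return params
```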
Krish Dholakia
0120176541
Litellm dev 12 30 2024 p2 (#7495)
* test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model

* fix(base_llm_unit_tests.py): handle azure o1 preview response format tests

skip as o1 on azure doesn't support tool calling yet

* fix: initial commit of azure o1 handler using openai caller

simplifies calling + allows fake streaming logic already implemented for openai to just work

* feat(azure/o1_handler.py): fake o1 streaming for azure o1 models

azure does not currently support streaming for o1

* feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info

enables user to toggle on when azure allows o1 streaming, without needing to bump versions (see the sketch after this commit)

* style(router.py): remove 'give feedback/get help' messaging when router is used

Prevents noisy messaging

Closes https://github.com/BerriAI/litellm/issues/5942

* fix(types/utils.py): handle none logprobs

Fixes https://github.com/BerriAI/litellm/issues/328

* fix(exception_mapping_utils.py): fix error str unbound error

* refactor(azure_ai/): move to openai_like chat completion handler

allows for easy swapping of api base url's (e.g. ai.services.com)

Fixes https://github.com/BerriAI/litellm/issues/7275

* refactor(azure_ai/): move to base llm http handler

* fix(azure_ai/): handle differing api endpoints

* fix(azure_ai/): make sure all unit tests are passing

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting error

* fix: fix linting errors

* fix(azure_ai/transformation.py): handle extra body param

* fix(azure_ai/transformation.py): fix max retries param handling

* fix: fix test

* test(test_azure_o1.py): fix test

* fix(llm_http_handler.py): support handling azure ai unprocessable entity error

* fix(llm_http_handler.py): handle sync invalid param error for azure ai

* fix(azure_ai/): streaming support with base_llm_http_handler

* fix(llm_http_handler.py): working sync stream calls with unprocessable entity handling for azure ai

* fix: fix linting errors

* fix(llm_http_handler.py): fix linting error

* fix(azure_ai/): handle cohere tool call invalid index param error
2025-01-01 18:57:29 -08:00
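A hedged sketch of the `should_fake_stream` toggle described above: fake the stream unless model info marks native streaming as supported. The key name follows the commit message; the surrounding logic is an assumption:

```python
def should_fake_stream(model_info: dict) -> bool:
    """Fake streaming (one non-streamed call replayed as chunks) unless the
    model is flagged as natively streaming-capable."""
    return not model_info.get("supports_native_streaming", False)


# Once Azure ships o1 streaming, flipping the flag in model info re-enables
# native streaming without bumping the library version.
assert should_fake_stream({"supports_native_streaming": False})
assert not should_fake_stream({"supports_native_streaming": True})
```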
Krish Dholakia
347779b813
Litellm dev 12 30 2024 p1 (#7480)
* test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model

* fix(base_llm_unit_tests.py): handle azure o1 preview response format tests

skip as o1 on azure doesn't support tool calling yet

* fix: initial commit of azure o1 handler using openai caller

simplifies calling + allows fake streaming logic already implemented for openai to just work

* feat(azure/o1_handler.py): fake o1 streaming for azure o1 models

azure does not currently support streaming for o1

* feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info

enables user to toggle on when azure allows o1 streaming without needing to bump versions

* style(router.py): remove 'give feedback/get help' messaging when router is used

Prevents noisy messaging

Closes https://github.com/BerriAI/litellm/issues/5942

* test: fix azure o1 test

* test: fix tests

* fix: fix test
2024-12-30 21:52:52 -08:00
Ishaan Jaff
1e06ee3162
(Refactor) - Reuse litellm.completion/litellm.embedding etc. for health checks (#7455)
* add mode: realtime

* add _realtime_health_check

* test_realtime_health_check

* azure _realtime_health_check

* _realtime_health_check

* Realtime Models

* fix code quality

* delete OAI / Azure custom health check code

* simplest version of ahealth check

* update tests

* working health check post refactor

* working aspeech health check

* fix realtime health checks

* test_audio_transcription_health_check

* use get_audio_file_for_health_check

* test_text_completion_health_check

* ahealth_check

* simplify health check code

* update ahealth_check

* fix import

* fix unused imports

* fix ahealth_check

* fix local testing

* test_async_realtime_health_check
2024-12-28 18:38:54 -08:00
Ishaan Jaff
4e65722a00
(Bug Fix) Add health check support for realtime models (#7453)
* add mode: realtime

* add _realtime_health_check

* test_realtime_health_check

* azure _realtime_health_check

* _realtime_health_check

* Realtime Models

* fix code quality
2024-12-28 18:15:00 -08:00
Krish Dholakia
760328b6ad
Litellm dev 12 25 2024 p2 (#7420)
* test: add new test image embedding to base llm unit tests

Addresses https://github.com/BerriAI/litellm/issues/6515

* fix(bedrock/embed/multimodal-embeddings): strip data prefix from image urls for bedrock multimodal embeddings

Fix https://github.com/BerriAI/litellm/issues/6515

* feat: initial commit for fireworks ai audio transcription support

Relevant issue: https://github.com/BerriAI/litellm/issues/7134

* test: initial fireworks ai test

* feat(fireworks_ai/): implemented fireworks ai audio transcription config

* fix(utils.py): register fireworks ai audio transcription config, in config manager

* fix(utils.py): add fireworks ai param translation to 'get_optional_params_transcription'

* refactor(fireworks_ai/): define text completion route with model name handling

moves model name handling to specific fireworks routes, as required by their api

* refactor(fireworks_ai/chat): define transform_request - allows fixing the model name if the accounts/ prefix is missing (see the sketch after this commit)

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting errors

* fix(handler.py): fix linting errors

* fix(main.py): fix tgai text completion route

* refactor(together_ai/completion): refactors together ai text completion route to just use provider transform request

* refactor: move test_fine_tuning_api out of local_testing

reduces local testing ci/cd time
2024-12-25 18:35:34 -08:00
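A hedged sketch of the Fireworks AI model-name fix noted above: the API expects fully qualified ids like `accounts/fireworks/models/<model>`, so a bare name gets the prefix prepended (the default account path is an assumption):

```python
def fix_fireworks_model_name(model: str) -> str:
    """Qualify a bare model name the way the Fireworks AI API expects."""
    if model.startswith("accounts/"):
        return model  # already fully qualified
    return f"accounts/fireworks/models/{model}"


assert (
    fix_fireworks_model_name("llama-v3p1-8b-instruct")
    == "accounts/fireworks/models/llama-v3p1-8b-instruct"
)
```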
Ishaan Jaff
81be0b4090
(Feat) add "/v1/batches/{batch_id:path}/cancel" endpoint (#7406)
* use 1 file for azure batches handling

* add cancel_batch endpoint

* add a cancel batch on open ai

* add cancel_batch endpoint

* add cancel batches to test

* remove unused imports

* test_batches_operations

* update test_batches_operations
2024-12-24 20:23:50 -08:00
Krish Dholakia
404bf2974b
Litellm dev 2024 12 20 p1 (#7335)
* fix(utils.py): e2e azure tts cost tracking working

moves tts response obj to include hidden params (allows for litellm call id, etc. to be sent in response headers); fixes spend_tracking_utils logging payload to account for the non-base-model use-case

Fixes https://github.com/BerriAI/litellm/issues/7223

* fix: fix linting errors

* build(model_prices_and_context_window.json): add bedrock llama 3.3

Closes https://github.com/BerriAI/litellm/issues/7329

* fix(openai.py): fix return type for sync openai httpx response

* test: update test

* fix(spend_tracking_utils.py): fix if check

* fix(spend_tracking_utils.py): fix if check

* test: improve debugging for test

* fix: fix import
2024-12-20 21:22:31 -08:00
Ishaan Jaff
c7f14e936a
(code quality) run ruff rule to ban unused imports (#7313)
* remove unused imports

* fix AmazonConverseConfig

* fix test

* fix import

* ruff check fixes

* test fixes

* fix testing

* fix imports
2024-12-19 12:33:42 -08:00
Krish Dholakia
62b00cf28d
o1 - add image param handling (#7312)
* fix(openai.py): fix returning o1 non-streaming requests

fixes issue where fake stream was always true for o1

* build(model_prices_and_context_window.json): add 'supports_vision' for o1 models

* fix: add internal server error exception mapping

* fix(base_llm_unit_tests.py): drop temperature from test

* test: mark prompt caching as a flaky test
2024-12-19 11:22:25 -08:00
Krish Dholakia
5253f639cd
fix(health.md): add rerank model health check information (#7295)
* fix(health.md): add rerank model health check information

* build(model_prices_and_context_window.json): add gemini 2.0 for google ai studio - pricing + commercial rate limits

* build(model_prices_and_context_window.json): add gemini-2.0 supports audio output = true

* docs(team_model_add.md): clarify allowing teams to add models is an enterprise feature

* fix(o1_transformation.py): add support for 'n', 'response_format' and 'stop' params for o1 and 'stream_options' param for o1-mini

* build(model_prices_and_context_window.json): add 'supports_system_message' to supporting openai models

needed as o1-preview and o1-mini models don't support 'system message'

* fix(o1_transformation.py): translate the system message based on whether the o1 model supports it (see the sketch after this commit)

* fix(o1_transformation.py): return 'stream' param support if o1-mini/o1-preview

o1 currently doesn't support streaming, but the other model versions do

Fixes https://github.com/BerriAI/litellm/issues/7292

* fix(o1_transformation.py): return tool calling/response_format in supported params if model map says so

Fixes https://github.com/BerriAI/litellm/issues/7292

* fix: fix linting errors

* fix: update '_transform_messages'

* fix(o1_transformation.py): fix provider passed for supported param checks

* test(base_llm_unit_tests.py): skip test if api takes >5s to respond

* fix(utils.py): return false in 'supports_factory' if can't find value

* fix(o1_transformation.py): always return stream + stream_options as supported params + handle stream options being passed in for azure o1

* feat(openai.py): support stream faking natively in openai handler

Allows o1 calls to be faked for just the "o1" model, allows native streaming for o1-mini, o1-preview

 Fixes https://github.com/BerriAI/litellm/issues/7292

* fix(openai.py): use inference param instead of original optional param
2024-12-18 19:18:10 -08:00
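A minimal sketch of the system-message translation referenced above: when the target model (o1-preview, o1-mini) rejects system messages, resend the content as a user message. The exact rewrite LiteLLM performs may differ:

```python
def translate_system_messages(
    messages: list, supports_system_message: bool
) -> list:
    """Downgrade system messages to user messages for models that reject
    the system role, instead of dropping them."""
    if supports_system_message:
        return messages
    return [
        {**m, "role": "user"} if m.get("role") == "system" else m
        for m in messages
    ]
```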
Ishaan Jaff
7a5dd29fe0
(fix) unable to pass input_type parameter to Voyage AI embedding model (#7276)
* VoyageEmbeddingConfig

* fix voyage logic to get params

* add voyage embedding transformation

* add get_provider_embedding_config

* use BaseEmbeddingConfig

* voyage clean up

* use llm http handler for embedding transformations

* test_voyage_ai_embedding_extra_params

* add voyage async

* test_voyage_ai_embedding_extra_params

* add async for llm http handler

* update BaseLLMEmbeddingTest

* test_voyage_ai_embedding_extra_params

* fix linting

* fix get_provider_embedding_config

* fix anthropic text test

* update location of base/chat/transformation

* fix import path

* fix IBMWatsonXAIConfig
2024-12-17 19:23:49 -08:00
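A usage sketch for the fix above, passing Voyage AI's `input_type` through `litellm.embedding`; the model id is illustrative:

```python
import litellm

response = litellm.embedding(
    model="voyage/voyage-3",
    input=["What is the capital of France?"],
    input_type="query",  # Voyage-specific param this PR passes through
)
print(response.data[0])
```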
Krish Dholakia
516c2a6a70
Litellm remove circular imports (#7232)
* fix(utils.py): initial commit to remove circular imports - moves llmproviders to utils.py

* fix(router.py): fix 'litellm.EmbeddingResponse' import from router.py

* refactor: fix litellm.ModelResponse import on pass through endpoints

* refactor(litellm_logging.py): fix circular import for custom callbacks literal

* fix(factory.py): fix circular imports inside prompt factory

* fix(cost_calculator.py): fix circular import for 'litellm.Usage'

* fix(proxy_server.py): fix potential circular import with `litellm.Router'

* fix(proxy/utils.py): fix potential circular import in `litellm.Router`

* fix: remove circular imports in 'auth_checks' and 'guardrails/'

* fix(prompt_injection_detection.py): fix router import

* fix(vertex_passthrough_logging_handler.py): fix potential circular imports in vertex pass through

* fix(anthropic_pass_through_logging_handler.py): fix potential circular imports

* fix(slack_alerting.py-+-ollama_chat.py): fix modelresponse import

* fix(base.py): fix potential circular import

* fix(handler.py): fix potential circular ref in codestral + cohere handlers

* fix(azure.py): fix potential circular imports

* fix(gpt_transformation.py): fix modelresponse import

* fix(litellm_logging.py): add logging base class - simplify typing

makes it easy for other files to type check the logging obj without introducing circular imports (see the sketch after this commit)

* fix(azure_ai/embed): fix potential circular import on handler.py

* fix(databricks/): fix potential circular imports in databricks/

* fix(vertex_ai/): fix potential circular imports on vertex ai embeddings

* fix(vertex_ai/image_gen): fix import

* fix(watsonx-+-bedrock): cleanup imports

* refactor(anthropic-pass-through-+-petals): cleanup imports

* refactor(huggingface/): cleanup imports

* fix(ollama-+-clarifai): cleanup circular imports

* fix(openai_like/): fix import

* fix(openai_like/): fix embedding handler

cleanup imports

* refactor(openai.py): cleanup imports

* fix(sagemaker/transformation.py): fix import

* ci(config.yml): add circular import test to ci/cd
2024-12-14 16:28:34 -08:00
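The logging base class above enables the standard circular-import-safe typing pattern; a hedged sketch (the import path is an assumption based on the files this PR touches):

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by type checkers, so runtime import order can't form a cycle.
    from litellm.litellm_core_utils.litellm_logging import Logging


def log_success(logging_obj: "Logging") -> None:
    """Callers get type checking against the logging object without
    importing litellm_logging at runtime."""
    logging_obj.success_handler()  # illustrative call on the logging obj
```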
Ishaan Jaff
b5d55688e5
(Refactor) Code Quality improvement - remove /prompt_templates/, base_aws_llm.py from /llms folder (#7164)
* fix move base_aws_llm

* fix import

* update enforce llms folder style

* move prompt_templates

* update prompt_templates location

* fix imports

* fix imports

* fix imports

* fix imports

* fix checks
2024-12-11 00:02:46 -08:00
Krish Dholakia
350cfc36f7
Litellm merge pr (#7161)
* build: merge branch

* test: fix openai naming

* fix(main.py): fix openai renaming

* style: ignore function length for config factory

* fix(sagemaker/): fix routing logic

* fix: fix imports

* fix: fix override
2024-12-10 22:49:26 -08:00
Ishaan Jaff
bfb6891eb7
rename llms/OpenAI/ -> llms/openai/ (#7154)
* rename OpenAI -> openai

* fix file rename

* fix rename changes

* fix organization of openai/transcription

* fix import OA fine tuning API

* fix openai ft handler

* fix handler import
2024-12-10 20:14:07 -08:00
Renamed from litellm/llms/OpenAI/openai.py