Commit graph

1298 commits

Author SHA1 Message Date
Krish Dholakia
6b8b49451f
Fix azure max retries error (#8340)
* fix(azure.py): ensure max_retries=0 is respected

Fixes https://github.com/BerriAI/litellm/issues/6129

* fix(test_openai.py): add unit test to ensure openai sdk calls always respect max_retries = 0

* test(test_azure_openai.py): add unit testing for azure_text/ route

* fix(azure.py): fix passing max retries on streaming

* fix(azure.py): fix azure max retries on async completion + streaming

* fix(completion/handler.py): fix azure text async completion + streaming

* test(test_azure_openai.py): ensure azure openai max retries always respected

* test(test_azure_o_series.py): add testing to ensure max retries always respected

* Added gemini providers for 2.0-flash and 2.0-flash lite (#8321)

* Update model_prices_and_context_window.json

added gemini providers for 2.0-flash and 2.0-flash lite

* Update model_prices_and_context_window.json

fixed URL

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* Convert tool use arguments to string before counting tokens (#6989)

In at least some cases, `messages["tool_calls"]["function"]["arguments"]` arrives as a dict rather than a string. It must be serialized to a string before it can be tokenized correctly; if it is already a string, the conversion is a no-op. (A short sketch follows this entry.)

* build(model_prices_and_context_window.json): add gemini 2.0 flash lite pricing

* build(model_prices_and_context_window.json): add gemini commercial rate limits

* fix(utils.py): fix linting error

* refactor(utils.py): refactor to maintain function size

---------

Co-authored-by: Bardia Khosravi <bardiakhosravi95@gmail.com>
Co-authored-by: Josh Morrow <josh@jcmorrow.com>
2025-02-06 23:20:48 -08:00
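
A minimal sketch of the arguments-stringification described in the entry above; the helper name and the use of `json.dumps` are assumptions, not the exact code from the PR:

```python
import json

def stringify_tool_call_arguments(tool_call: dict) -> dict:
    # If 'arguments' arrived as a dict, serialize it so the token counter gets a string.
    # When it is already a string, this is a no-op.
    args = tool_call["function"]["arguments"]
    if isinstance(args, dict):
        tool_call = {
            **tool_call,
            "function": {**tool_call["function"], "arguments": json.dumps(args)},
        }
    return tool_call

call = {"function": {"name": "get_weather", "arguments": {"city": "Paris"}}}
print(stringify_tool_call_arguments(call)["function"]["arguments"])  # '{"city": "Paris"}'
```
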
Ishaan Jaff
818792228c
(Refactor) - migrate bedrock invoke to BaseLLMHTTPHandler class (#8290)
* initial transform for invoke

* invoke transform_response

* working - able to make request

* working get_complete_url

* working - invoke now runs on llm_http_handler

* fix unused imports

* track litellm overhead ms

* working stream request

* sign_request transform

* sign_request update

* use has_async_custom_stream_wrapper property

* use get_async_custom_stream_wrapper in base llm http handler

* fix make_call in invoke handler

* fix invoke with streaming get_async_custom_stream_wrapper

* working bedrock async streaming with invoke

* fix make call handler for bedrock

* test_all_model_configs

* fix test_bedrock_custom_prompt_template

* sync streaming for bedrock invoke

* fix _add_stream_param_to_request_body

* test_async_text_completion_bedrock

* fix transform_request

* fix get_supported_openai_params

* fix test supports tool choice

* fix test_supports_tool_choice

* add unit test coverage for bedrock invoke transform

* fix location of transformation files

* update import loc

* fix bedrock invoke unit tests

* fix import for max completion tokens
2025-02-05 18:58:55 -08:00
Krish Dholakia
3c813b3a87
Fix deepseek calling - refactor to use base_llm_http_handler (#8266)
* refactor(deepseek/): move deepseek to base llm http handler

Fixes https://github.com/BerriAI/litellm/issues/8128#issuecomment-2635430457

* fix(gpt_transformation.py): support stream parsing for gpt-like calls

* test(test_deepseek_completion.py): add async streaming test

* fix(gpt_transformation.py): fix import

* fix(gpt_transformation.py): return full api base and content type
2025-02-04 22:30:00 -08:00
Krish Dholakia
97b8de17ab
LiteLLM Minor Fixes & Improvements (01/16/2025) - p2 (#7828)
* fix(vertex_ai/gemini/transformation.py): handle 'http://' image urls

* test: add base test for `http:` url's

* fix(factory.py/get_image_details): follow redirects

allows http calls to work

* fix(codestral/): fix stream chunk parsing on last chunk of stream

* Azure ad token provider (#6917)

* Update azure.py

Added optional parameter azure ad token provider

* Added parameter to main.py

* Found token provider arg location

* Fixed embeddings

* Fixed ad token provider

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* fix: fix linting errors

* fix(main.py): leave out o1 route for azure ad token provider, for now

get v0 out for sync azure gpt route to begin with

* test: skip http:// test for fireworks ai

model does not support it

* refactor: cleanup dead code

* fix: revert http:// url passthrough for gemini

google ai studio raises errors

* test: fix test

---------

Co-authored-by: bahtman <anton@baht.dk>
2025-02-02 23:17:50 -08:00
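
A sketch of the Azure AD token provider support from the entry above, assuming the kwarg is named `azure_ad_token_provider` and using the standard `azure-identity` helpers; the deployment and endpoint are placeholders:

```python
import litellm
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

# Exchange local / managed-identity credentials for a bearer token provider
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

response = litellm.completion(
    model="azure/gpt-4o",                              # placeholder deployment
    api_base="https://my-resource.openai.azure.com",   # placeholder endpoint
    api_version="2024-08-01-preview",
    azure_ad_token_provider=token_provider,            # kwarg name assumed from the PR title
    messages=[{"role": "user", "content": "hi"}],
)
```
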
Krish Dholakia
1105e35538
Complete o3 model support (#8183)
* fix(o_series_transformation.py): add 'reasoning_effort' as o series model param

Closes https://github.com/BerriAI/litellm/issues/8182

* fix(main.py): ensure `reasoning_effort` is a mapped openai param

* refactor(azure/): rename o1_[x] files to o_series_[x]

* refactor(base_llm_unit_tests.py): refactor testing for o series reasoning effort

* test(test_azure_o_series.py): have azure o series tests correctly inherit from base o series model tests

* feat(base_utils.py): support translating 'developer' role to 'system' role for non-openai providers

Makes it easy to switch from openai to anthropic

* fix: fix linting errors

* fix(base_llm_unit_tests.py): fix test

* fix(main.py): add missing param
2025-02-02 22:36:37 -08:00
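
A minimal sketch of the `reasoning_effort` mapping from the entry above; the model name is a placeholder for any o-series model:

```python
import litellm

response = litellm.completion(
    model="o3-mini",  # placeholder o-series model
    messages=[{"role": "user", "content": "Outline a 3-step refactor plan."}],
    reasoning_effort="low",  # now recognized as a mapped OpenAI param for o-series models
)
print(response.choices[0].message.content)
```
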
Krish Dholakia
e4566d7b1c
fix(main.py): fix passing openrouter specific params (#8184)
* fix(main.py): fix passing openrouter specific params

Fixes https://github.com/BerriAI/litellm/issues/8130

* test(test_get_model_info.py): add check for region name w/ cris model

Resolves https://github.com/BerriAI/litellm/issues/8115
2025-02-02 22:23:14 -08:00
Krish Dholakia
23f458d2da
Improved O3 + Azure O3 support (#8181)
* fix: support azure o3 model family for fake streaming workaround (#8162)

* fix: support azure o3 model family for fake streaming workaround

* refactor: rename helper to is_o_series_model for clarity

* update function calling parameters for o3 models (#8178)

* refactor(o1_transformation.py): refactor o1 config to be o series config, expand o series model check to o3

ensures max_tokens is correctly translated for o3

* feat(openai/): refactor o1 files to be 'o_series' files

expands naming to cover o3

* fix(azure/chat/o1_handler.py): azure openai is an instance of openai - was causing resets

* test(test_azure_o_series.py): assert stream faked for azure o3 mini

Resolves https://github.com/BerriAI/litellm/pull/8162

* fix(o1_transformation.py): fix o1 transformation logic to handle explicit o1_series routing

* docs(azure.md): update doc with `o_series/` model name

---------

Co-authored-by: byrongrogan <47910641+byrongrogan@users.noreply.github.com>
Co-authored-by: Low Jian Sheng <15527690+lowjiansheng@users.noreply.github.com>
2025-02-01 09:52:28 -08:00
Krish Dholakia
dad24f2b52
Litellm dev 01 29 2025 p2 (#8102)
* docs: cleanup doc

* feat(bedrock/): initial commit adding bedrock/converse_like/<model> route support

allows routing to a converse like endpoint

Resolves https://github.com/BerriAI/litellm/issues/8085

* feat(bedrock/chat/converse_transformation.py): make converse config base config compatible

enables new 'converse_like' route

* feat(converse_transformation.py): enables using the proxy with converse like api endpoint

Resolves https://github.com/BerriAI/litellm/issues/8085
2025-01-29 20:53:37 -08:00
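
A rough sketch of the new `bedrock/converse_like/<model>` route from the entry above; the model name and `api_base` are placeholders for any Converse-compatible endpoint:

```python
import litellm

response = litellm.completion(
    model="bedrock/converse_like/my-model",       # placeholder model behind a Converse-style API
    api_base="https://example.com/v1/converse",   # placeholder Converse-compatible endpoint
    messages=[{"role": "user", "content": "hello"}],
)
```
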
Krish Dholakia
d9eb8f42ff
Litellm dev 01 27 2025 p3 (#8047)
* docs(reliability.md): add doc on disabling fallbacks per request

* feat(litellm_pre_call_utils.py): support reading request timeout from request headers - new `x-litellm-timeout` param

Allows setting dynamic model timeouts from vercel's AI sdk

* test(test_proxy_server.py): add simple unit test for reading request timeout

* test(test_fallbacks.py): add e2e test to confirm timeout passed in request headers is correctly read

* feat(main.py): support passing metadata to openai in preview

Resolves https://github.com/BerriAI/litellm/issues/6022#issuecomment-2616119371

* fix(main.py): fix passing openai metadata

* docs(request_headers.md): document new request headers

* build: Merge branch 'main' into litellm_dev_01_27_2025_p3

* test: loosen test
2025-01-28 18:01:27 -08:00
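
A sketch of the new `x-litellm-timeout` request header from the entry above, sent through the OpenAI SDK pointed at a LiteLLM proxy; the proxy URL and key are placeholders:

```python
import openai

client = openai.OpenAI(base_url="http://localhost:4000", api_key="sk-1234")  # LiteLLM proxy

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hi"}],
    extra_headers={"x-litellm-timeout": "5"},  # per-request timeout in seconds, read by the proxy
)
```
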
Krish Dholakia
6bafdbc546
Litellm dev 01 25 2025 p4 (#8006)
* feat(main.py): use asyncio.sleep for mock_timeout=true on async request

adds unit testing to ensure proxy does not fail if specific OpenAI requests hang (e.g. recent o1 outage)

* fix(streaming_handler.py): fix deepseek r1 return reasoning content on streaming

Fixes https://github.com/BerriAI/litellm/issues/7942

* Revert "fix(streaming_handler.py): fix deepseek r1 return reasoning content on streaming"

This reverts commit 7a052a64e3.

* fix(deepseek-r-1): return reasoning_content as a top-level param

ensures compatibility with existing tools that use it

* fix: fix linting error
2025-01-26 08:01:05 -08:00
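
A sketch of reading the top-level `reasoning_content` field mentioned in the entry above; the DeepSeek model name is a placeholder:

```python
import litellm

response = litellm.completion(
    model="deepseek/deepseek-reasoner",  # placeholder R1-style model
    messages=[{"role": "user", "content": "What is 17 * 23?"}],
)

message = response.choices[0].message
print(message.reasoning_content)  # reasoning surfaced as a top-level param on the message
print(message.content)            # final answer
```
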
Krish Dholakia
4857a1e9ba
refactor: cleanup dead codeblock (#7936)
* refactor: cleanup dead codeblock

* fix(main.py): add extra headers to headers

* fix: remove dead codeblock
2025-01-24 21:48:23 -08:00
Krish Dholakia
8ca3229b26
Ensure base_model cost tracking works across all endpoints (#7989)
* test(test_completion_cost.py): add sdk test to ensure base model is used for cost tracking

* test(test_completion_cost.py): add sdk test to ensure custom pricing works

* fix(main.py): add base model cost tracking support for embedding calls

Enables base model cost tracking for embedding calls when base model set as a litellm_param

* fix(litellm_logging.py): update logging object with litellm params - including base model, if given

ensures base model param is always tracked

* fix(main.py): fix linting errors
2025-01-24 21:05:26 -08:00
Krish Dholakia
1e011b66d3
Ollama ssl verify = False + Spend Logs reliability fixes (#7931)
* fix(http_handler.py): support passing ssl verify dynamically and using the correct httpx client based on passed ssl verify param

Fixes https://github.com/BerriAI/litellm/issues/6499

* feat(llm_http_handler.py): support passing `ssl_verify=False` dynamically in call args

Closes https://github.com/BerriAI/litellm/issues/6499

* fix(proxy/utils.py): prevent bad logs from breaking all cost tracking + reset list regardless of success/failure

prevents malformed logs from causing all spend tracking to break since they're constantly retried

* test(test_proxy_utils.py): add test to ensure bad log is dropped

* test(test_proxy_utils.py): ensure in-memory spend logs reset after bad log error

* test(test_user_api_key_auth.py): add unit test to ensure end user id as str works

* fix(auth_utils.py): ensure extracted end user id is always a str

prevents db cost tracking errors

* test(test_auth_utils.py): ensure get end user id from request body always returns a string

* test: update tests

* test: skip bedrock test- behaviour now supported

* test: fix testing

* refactor(spend_tracking_utils.py): reduce size of get_logging_payload

* test: fix test

* bump: version 1.59.4 → 1.59.5

* Revert "bump: version 1.59.4 → 1.59.5"

This reverts commit 1182b46b2e.

* fix(utils.py): fix spend logs retry logic

* fix(spend_tracking_utils.py): fix get tags

* fix(spend_tracking_utils.py): fix end user id spend tracking on pass-through endpoints
2025-01-23 23:05:41 -08:00
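
A sketch of passing `ssl_verify=False` dynamically, as described in the entry above; the model and `api_base` are placeholders for a self-hosted endpoint with a self-signed certificate:

```python
import litellm

response = litellm.completion(
    model="ollama/llama3",                       # placeholder self-hosted model
    api_base="https://my-internal-host:11434",   # endpoint with a self-signed cert
    messages=[{"role": "user", "content": "hi"}],
    ssl_verify=False,                            # picked up per-call by the httpx client
)
```
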
Krish Dholakia
27560bd5ad
Litellm dev 01 22 2025 p4 (#7932)
* feat(main.py): add new 'provider_specific_header' param

allows passing extra header for specific provider

* fix(litellm_pre_call_utils.py): add unit test for pre call utils

* test(test_bedrock_completion.py): skip test now that bedrock supports this
2025-01-22 21:52:07 -08:00
Krish Dholakia
866fffb50d
Litellm dev 01 21 2025 p1 (#7898)
* fix(utils.py): don't pass 'anthropic-beta' header to vertex - will cause request to fail

* fix(utils.py): add flag to allow user to disable filtering invalid headers

ensure user can control behaviour

* style(utils.py): cleanup message

* test(test_utils.py): add unit test to cover invalid header filtering

* fix(proxy_server.py): fix custom openapi schema generation

* fix(utils.py): pass extra headers if set

* fix(main.py): fix image variation to use 'client' param
2025-01-21 20:36:11 -08:00
Ishaan Jaff
f1335362cf
(core sdk fix) - fix fallbacks stuck in infinite loop (#7751)
* test_acompletion_fallbacks_basic

* use common run_async_function

* fix completion_with_fallbacks

* fix completion with fallbacks

* fix fallback utils

* test_acompletion_fallbacks_basic

* test_completion_fallbacks_sync

* huggingface/mistralai/Mistral-7B-Instruct-v0.3
2025-01-13 19:34:34 -08:00
Ishaan Jaff
c8cedbed20 fix img gen cost 2025-01-12 16:31:04 -08:00
Krish Dholakia
ad2f66b3e3
[BETA] Add OpenAI /images/variations + Topaz API support (#7700)
* feat(main.py): initial commit for `/image/variations` endpoint support

* refactor(base_llm/): introduce new base llm base config for image variation endpoints

* refactor(openai/image_variations/transformation.py): implement openai image variation transformation handler

* fix: test

* feat(openai/): working openai `/image/variation` endpoint calls via sdk

* feat(topaz/): topaz sync image variation call support

Addresses https://github.com/BerriAI/litellm/issues/7593

* fix(topaz/transformation.py): fix linting errors

* fix(openai/image_variations/handler.py): fix passing json data

* fix(main.py): image_variation/

support async image variation route - `aimage_variation`

* fix(test_get_model_info.py): fix test

* fix: cleanup unused imports

* feat(openai/): add async `/image/variations` endpoint support

* feat(topaz/): support async `/image/variations` calls

* fix: test

* fix(utils.py): fix get_model_info_helper for no model info w/ provider config

handles situation where model info is not known but provider config exists

* test(test_router_fallbacks.py): mark flaky test

* fix: fix unused imports

* test: bump otel load test perf threshold - accounts for current load tests hitting same server
2025-01-11 23:27:46 -08:00
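
A rough sketch of the image-variation route from the entry above; the sync function name, parameters, and response shape are assumed to mirror `aimage_variation` and OpenAI's `images.variations` API:

```python
import litellm

with open("original.png", "rb") as image_file:
    response = litellm.image_variation(   # async counterpart: litellm.aimage_variation
        model="openai/dall-e-2",          # placeholder; Topaz is also supported per the entry
        image=image_file,
        n=1,
        size="512x512",
    )
print(response.data[0])
```
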
Krish Dholakia
becd4bc748
Litellm dev 01 11 2025 p3 (#7702)
* fix(__init__.py): fix init to exclude pricing-only model cost values from real model names

prevents bad health checks on wildcard routes

* fix(get_llm_provider.py): fix to handle calling bedrock_converse models
2025-01-11 20:06:54 -08:00
Krish Dholakia
27892acdfc
Litellm dev 01 10 2025 p3 (#7682)
* feat(langfuse.py): log the used prompt when prompt management used

* test: fix test

* docs(self_serve.md): add doc on restricting personal key creation on ui

* feat(s3.py): support s3 logging with team alias prefixes (if available)

New preview feature

* fix(main.py): remove old if block - simplify to just await if coroutine returned

fixes lm_studio async embedding error

* fix(langfuse.py): handle get prompt check
2025-01-10 21:56:42 -08:00
Krish Dholakia
c10ae8879e
fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini p… (#7660)
* fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini process url

* refactor(router.py): refactor '_prompt_management_factory' to use logging obj get_chat_completion logic

deduplicates code

* fix(litellm_logging.py): update 'get_chat_completion_prompt' to update logging object messages

* docs(prompt_management.md): update prompt management to be in beta

given feedback - this still needs to be revised (e.g. passing in the user message, not ignoring it)

* refactor(prompt_management_base.py): introduce base class for prompt management

allows consistent behaviour across prompt management integrations

* feat(prompt_management_base.py): support adding client message to template message + refactor langfuse prompt management to use prompt management base

* fix(litellm_logging.py): log prompt id + prompt variables to langfuse if set

allows tracking what prompt was used for what purpose

* feat(litellm_logging.py): log prompt management metadata in standard logging payload + use in langfuse

allows logging prompt id / prompt variables to langfuse

* test: fix test

* fix(router.py): cleanup unused imports

* fix: fix linting error

* fix: fix trace param typing

* fix: fix linting errors

* fix: fix code qa check
2025-01-10 07:31:59 -08:00
Krish Dholakia
865e6d5bda
fix(main.py): fix lm_studio/ embedding routing (#7658)
* fix(main.py): fix lm_studio/ embedding routing

adds the mapping + updates docs with example

* docs(self_serve.md): update doc to show how to auto-add sso users to teams

* fix(streaming_handler.py): simplify async iterator check, to just check if streaming response is an async iterable
2025-01-09 23:03:24 -08:00
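
A sketch of the `lm_studio/` embedding routing fixed above; the model name and local server URL are placeholders:

```python
import litellm

response = litellm.embedding(
    model="lm_studio/nomic-embed-text",    # placeholder model loaded in LM Studio
    api_base="http://localhost:1234/v1",   # LM Studio's OpenAI-compatible server
    input=["embedding routing test"],
)
print(response)
```
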
Krish Dholakia
4e69711411
Litellm dev 01 07 2025 p1 (#7618)
* fix(main.py): pass custom llm provider on litellm logging provider update

* fix(cost_calculator.py): don't append provider name to return model if existing llm provider

Fixes https://github.com/BerriAI/litellm/issues/7607

* fix(prometheus_services.py): fix prometheus system health error logging

Fixes https://github.com/BerriAI/litellm/issues/7611
2025-01-07 21:22:31 -08:00
Ishaan Jaff
2ca0977921
aiohttp_openai/ fixes - allow using aiohttp_openai/gpt-4o (#7598)
* fixes for get_complete_url

* update aiohttp tests

* fix event loop for aiohttp

* ci/cd run again

* test_aiohttp_openai
2025-01-06 21:39:11 -08:00
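
A sketch of the `aiohttp_openai/` route exercised above: the same OpenAI-compatible call, served over the aiohttp transport:

```python
import asyncio
import litellm

async def main():
    response = await litellm.acompletion(
        model="aiohttp_openai/gpt-4o",
        messages=[{"role": "user", "content": "hi"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())
```
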
Krish Dholakia
fef7839e8a
Litellm dev 01 06 2025 p1 (#7594)
* fix(custom_logger.py): expose new 'async_get_chat_completion_prompt' event hook

* fix(custom_logger.py): langfuse_prompt_management.py

remove 'headers' from custom logger 'async_get_chat_completion_prompt' and 'get_chat_completion_prompt' event hooks

* feat(router.py): expose new function for prompt management based routing

* feat(router.py): partial working router prompt factory logic

allows load balanced model to be used for model name w/ langfuse prompt management call

* feat(router.py): fix prompt management with load balanced model group

* feat(langfuse_prompt_management.py): support reading in openai params from langfuse

enables user to define optional params on langfuse vs. client code

* test(test_Router.py): add unit test for router based langfuse prompt management

* fix: fix linting errors
2025-01-06 21:26:21 -08:00
Krish Dholakia
f6698e871f
Fix langfuse prompt management on proxy (#7535)
* fix(types/utils.py): support langfuse + humanloop routes on llm router

* fix(main.py): remove acompletion elif block

just await if coroutine returned
2025-01-03 12:42:37 -08:00
Ishaan Jaff
d861aa8ff3
(perf) use aiohttp for custom_openai (#7514)
* use aiohttp handler

* BaseLLMAIOHTTPHandler

* use CustomOpenAIChatConfig

* CustomOpenAIChatConfig

* CustomOpenAIChatConfig

* fix linting

* AiohttpOpenAIChatConfig

* fix order

* aiohttp_openai
2025-01-02 22:15:17 -08:00
Krish Dholakia
45b93f2721
Litellm dev 01 01 2025 p3 (#7503)
* fix(utils.py): add new validate tool choice helper function

Prevents https://github.com/BerriAI/litellm/issues/7483

* fix(main.py): add tool choice validation on .completion()

prevents user error like - https://github.com/BerriAI/litellm/issues/7483

* fix(utils.py): fix return val of tool choice validation logic
2025-01-01 22:12:15 -08:00
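
The validation added above guards against malformed `tool_choice` values; a well-formed call looks roughly like this (the tool definition is illustrative):

```python
import litellm

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "parameters": {"type": "object", "properties": {"city": {"type": "string"}}},
    },
}]

response = litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Weather in Paris?"}],
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "get_weather"}},  # must reference a defined tool
)
```
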
Krish Dholakia
0120176541
Litellm dev 12 30 2024 p2 (#7495)
* test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model

* fix(base_llm_unit_tests.py): handle azure o1 preview response format tests

skip as o1 on azure doesn't support tool calling yet

* fix: initial commit of azure o1 handler using openai caller

simplifies calling + allows fake streaming logic already implemented for openai to just work

* feat(azure/o1_handler.py): fake o1 streaming for azure o1 models

azure does not currently support streaming for o1

* feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info

enables user to toggle on when azure allows o1 streaming without needing to bump versions

* style(router.py): remove 'give feedback/get help' messaging when router is used

Prevents noisy messaging

Closes https://github.com/BerriAI/litellm/issues/5942

* fix(types/utils.py): handle none logprobs

Fixes https://github.com/BerriAI/litellm/issues/328

* fix(exception_mapping_utils.py): fix error str unbound error

* refactor(azure_ai/): move to openai_like chat completion handler

allows for easy swapping of api base url's (e.g. ai.services.com)

Fixes https://github.com/BerriAI/litellm/issues/7275

* refactor(azure_ai/): move to base llm http handler

* fix(azure_ai/): handle differing api endpoints

* fix(azure_ai/): make sure all unit tests are passing

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting error

* fix: fix linting errors

* fix(azure_ai/transformation.py): handle extra body param

* fix(azure_ai/transformation.py): fix max retries param handling

* fix: fix test

* test(test_azure_o1.py): fix test

* fix(llm_http_handler.py): support handling azure ai unprocessable entity error

* fix(llm_http_handler.py): handle sync invalid param error for azure ai

* fix(azure_ai/): streaming support with base_llm_http_handler

* fix(llm_http_handler.py): working sync stream calls with unprocessable entity handling for azure ai

* fix: fix linting errors

* fix(llm_http_handler.py): fix linting error

* fix(azure_ai/): handle cohere tool call invalid index param error
2025-01-01 18:57:29 -08:00
Krish Dholakia
41e5b3aa8d
HumanLoop integration for Prompt Management (#7479)
* feat(humanloop.py): initial commit for humanloop prompt management integration

Closes https://github.com/BerriAI/litellm/issues/213

* feat(humanloop.py): working e2e humanloop prompt management integration

Closes https://github.com/BerriAI/litellm/issues/213

* fix(humanloop.py): fix linting errors

* fix: fix linting error

* fix: fix test

* test: handle filenotfound error
2024-12-30 22:26:03 -08:00
Krish Dholakia
347779b813
Litellm dev 12 30 2024 p1 (#7480)
* test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model

* fix(base_llm_unit_tests.py): handle azure o1 preview response format tests

skip as o1 on azure doesn't support tool calling yet

* fix: initial commit of azure o1 handler using openai caller

simplifies calling + allows fake streaming logic already implemented for openai to just work

* feat(azure/o1_handler.py): fake o1 streaming for azure o1 models

azure does not currently support streaming for o1

* feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info

enables user to toggle on when azure allows o1 streaming without needing to bump versions

* style(router.py): remove 'give feedback/get help' messaging when router is used

Prevents noisy messaging

Closes https://github.com/BerriAI/litellm/issues/5942

* test: fix azure o1 test

* test: fix tests

* fix: fix test
2024-12-30 21:52:52 -08:00
Ishaan Jaff
a003af6c04
(fix) litellm.amoderation - support using model=openai/omni-moderation-latest, model=omni-moderation-latest, model=None (#7475)
* test_moderation_endpoint

* fix litellm.amoderation
2024-12-30 09:42:51 -08:00
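
A sketch of the three model forms now accepted by `litellm.amoderation`, per the entry above; the response shape is assumed to mirror OpenAI's moderation response:

```python
import asyncio
import litellm

async def main():
    # all three forms from the entry above resolve to the same moderation call
    for model in ("openai/omni-moderation-latest", "omni-moderation-latest", None):
        result = await litellm.amoderation(input="I want to hurt them.", model=model)
        print(model, result.results[0].flagged)

asyncio.run(main())
```
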
Krish Dholakia
cfb6890b9f
Litellm dev 12 28 2024 p2 (#7458)
* docs(sidebar.js): docs for support model access groups for wildcard routes

* feat(key_management_endpoints.py): add check if user is premium_user when adding model access group for wildcard route

* refactor(docs/): make control model access a root-level doc in proxy sidebar

easier to discover how to control model access on litellm

* docs: more cleanup

* feat(fireworks_ai/): add document inlining support

Enables user to call non-vision models with images/pdfs/etc.

* test(test_fireworks_ai_translation.py): add unit testing for fireworks ai transform inline helper util

* docs(docs/): add document inlining details to fireworks ai docs

* feat(fireworks_ai/): allow user to dynamically disable auto add transform inline

allows client-side disabling of this feature for proxy users

* feat(fireworks_ai/): return 'supports_vision' and 'supports_pdf_input' true on all fireworks ai models

now true as fireworks ai supports document inlining

* test: fix tests

* fix(router.py): add unit testing for _is_model_access_group_for_wildcard_route
2024-12-28 19:38:06 -08:00
Krish Dholakia
5af438ed89
Litellm dev 12 28 2024 p3 (#7464)
* feat(deepgram/): initial e2e support for deepgram stt

Uses deepgram's `/listen` endpoint to transcribe speech to text

Closes https://github.com/BerriAI/litellm/issues/4875

* fix: fix linting errors

* test: fix test
2024-12-28 19:18:58 -08:00
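
A sketch of the Deepgram speech-to-text support from the entry above; the model name is a placeholder and the file path is illustrative:

```python
import litellm

with open("sample.mp3", "rb") as audio_file:
    transcript = litellm.transcription(
        model="deepgram/nova-2",  # placeholder Deepgram model, served via the /listen endpoint
        file=audio_file,
    )
print(transcript.text)
```
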
Ishaan Jaff
4d648ee335 fix ahealth_check 2024-12-28 19:16:28 -08:00
Ishaan Jaff
1e06ee3162
(Refactor) - Re use litellm.completion/litellm.embedding etc for health checks (#7455)
* add mode: realtime

* add _realtime_health_check

* test_realtime_health_check

* azure _realtime_health_check

* _realtime_health_check

* Realtime Models

* fix code quality

* delete OAI / Azure custom health check code

* simplest version of ahealth check

* update tests

* working health check post refactor

* working aspeech health check

* fix realtime health checks

* test_audio_transcription_health_check

* use get_audio_file_for_health_check

* test_text_completion_health_check

* ahealth_check

* simplify health check code

* update ahealth_check

* fix import

* fix unused imports

* fix ahealth_check

* fix local testing

* test_async_realtime_health_check
2024-12-28 18:38:54 -08:00
Ishaan Jaff
4e65722a00
(Bug Fix) Add health check support for realtime models (#7453)
* add mode: realtime

* add _realtime_health_check

* test_realtime_health_check

* azure _realtime_health_check

* _realtime_health_check

* Realtime Models

* fix code quality
2024-12-28 18:15:00 -08:00
Krish Dholakia
67b39bacf7
LiteLLM Minor Fixes & Improvements (12/27/2024) - p1 (#7448)
* feat(main.py): mock_response() - support 'litellm.ContextWindowExceededError' in mock response

enables quicker router/fallback/proxy debugging of context window errors

* feat(exception_mapping_utils.py): extract special litellm errors from error str if calling `litellm_proxy/` as provider

Closes https://github.com/BerriAI/litellm/issues/7259

* fix(user_api_key_auth.py): specify 'Received Proxy Server Request' is span kind server

Closes https://github.com/BerriAI/litellm/issues/7298
2024-12-27 19:04:39 -08:00
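
A sketch of the mocked context-window error from the entry above, assuming the error is requested by passing its name as the `mock_response` string:

```python
import litellm

try:
    litellm.completion(
        model="gpt-4o",
        messages=[{"role": "user", "content": "hi"}],
        mock_response="litellm.ContextWindowExceededError",  # assumed string trigger
    )
except litellm.ContextWindowExceededError as e:
    print("mocked context window error:", e)  # lets you exercise router/fallback handling locally
```
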
Krish Dholakia
760328b6ad
Litellm dev 12 25 2025 p2 (#7420)
* test: add new test image embedding to base llm unit tests

Addresses https://github.com/BerriAI/litellm/issues/6515

* fix(bedrock/embed/multimodal-embeddings): strip data prefix from image urls for bedrock multimodal embeddings

Fix https://github.com/BerriAI/litellm/issues/6515

* feat: initial commit for fireworks ai audio transcription support

Relevant issue: https://github.com/BerriAI/litellm/issues/7134

* test: initial fireworks ai test

* feat(fireworks_ai/): implemented fireworks ai audio transcription config

* fix(utils.py): register fireworks ai audio transcription config, in config manager

* fix(utils.py): add fireworks ai param translation to 'get_optional_params_transcription'

* refactor(fireworks_ai/): define text completion route with model name handling

moves model name handling to specific fireworks routes, as required by their api

* refactor(fireworks_ai/chat): define transform_request - allows fixing model if accounts/ is missing

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting errors

* fix(handler.py): fix linting errors

* fix(main.py): fix tgai text completion route

* refactor(together_ai/completion): refactors together ai text completion route to just use provider transform request

* refactor: move test_fine_tuning_api out of local_testing

reduces local testing ci/cd time
2024-12-25 18:35:34 -08:00
Ishaan Jaff
08a4c72692
(feat) /batches - track user_api_key_alias, user_api_key_team_alias etc for /batch requests (#7401)
* run azure testing on ci/cd

* update docs on azure batches endpoints

* add input azure.jsonl

* refactor - use separate file for batches endpoints

* fixes for passing custom llm provider to /batch endpoints

* pass custom llm provider to files endpoints

* update azure batches doc

* add info for azure batches api

* update batches endpoints

* use simple helper for raising proxy exception

* update config.yml

* fix imports

* add type hints to get_litellm_params

* update get_litellm_params

* update get_litellm_params

* update get slp

* QOL - stop double-logging create batch operations on custom loggers

* re use slp from og event

* _create_standard_logging_object_for_completed_batch

* fix linting errors

* reduce num changes in PR

* update BATCH_STATUS_POLL_MAX_ATTEMPTS
2024-12-24 17:44:28 -08:00
Ishaan Jaff
05b0d2026f
(feat) Add cost tracking for /batches requests OpenAI (#7384)
* add basic logging for `create_batch`

* add create_batch as a call type

* add basic dd logging for batches

* basic batch creation logging on DD

* batch endpoints add cost calc

* fix batches_async_logging

* separate folder for batches testing

* new job for batches tests

* test batches logging

* fix validation logic

* add vertex_batch_completions.jsonl

* test test_async_create_batch

* test_async_create_batch

* update tests

* test_completion_with_no_model

* remove dead code

* update load_vertex_ai_credentials

* test_avertex_batch_prediction

* update get async httpx client

* fix get_async_httpx_client

* update test_avertex_batch_prediction

* fix batches testing config.yaml

* add google deps

* fix vertex files handler
2024-12-23 17:47:26 -08:00
Krish Dholakia
48316520f4
LiteLLM Minor Fixes & Improvements (12/23/2024) - P2 (#7386)
* fix(main.py): support 'mock_timeout=true' param

allows mock requests on proxy to have a time delay, for testing

* fix(main.py): ensure mock timeouts raise litellm.Timeout error

triggers retry/fallbacks

* fix: fix fallback + mock timeout testing

* fix(router.py): always return remaining tpm/rpm limits, if limits are known

allows for rate limit headers to be guaranteed

* docs(timeout.md): add docs on mock timeout = true

* fix(main.py): fix linting errors

* test: fix test
2024-12-23 17:41:27 -08:00
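
A sketch of the `mock_timeout=true` behaviour from the entry above: the request sleeps instead of calling a provider, then raises `litellm.Timeout` so retries/fallbacks fire:

```python
import litellm

try:
    litellm.completion(
        model="gpt-4o",
        messages=[{"role": "user", "content": "hi"}],
        mock_timeout=True,  # simulate a hung upstream request
        timeout=1,          # seconds before litellm.Timeout is raised
    )
except litellm.Timeout:
    print("mock timeout raised as expected")
```
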
Krish Dholakia
3671829e39
Complete 'requests' library removal (#7350)
* refactor: initial commit moving watsonx_text to base_llm_http_handler + clarifying new provider directory structure

* refactor(watsonx/completion/handler.py): move to using base llm http handler

removes 'requests' library usage

* fix(watsonx_text/transformation.py): fix result transformation

migrates to transformation.py, for usage with base llm http handler

* fix(streaming_handler.py): migrate watsonx streaming to transformation.py

ensures streaming works with base llm http handler

* fix(streaming_handler.py): fix streaming linting errors and remove watsonx conditional logic

* fix(watsonx/): fix chat route post completion route refactor

* refactor(watsonx/embed): refactor watsonx to use base llm http handler for embedding calls as well

* refactor(base.py): remove requests library usage from litellm

* build(pyproject.toml): remove requests library usage

* fix: fix linting errors

* fix: fix linting errors

* fix(types/utils.py): fix validation errors for modelresponsestream

* fix(replicate/handler.py): fix linting errors

* fix(litellm_logging.py): handle modelresponsestream object

* fix(streaming_handler.py): fix modelresponsestream args

* fix: remove unused imports

* test: fix test

* fix: fix test

* test: fix test

* test: fix tests

* test: fix test

* test: fix patch target

* test: fix test
2024-12-22 07:21:25 -08:00
Ishaan Jaff
6107f9f3f3
[Bug fix ]: Triton /infer handler incompatible with batch responses (#7337)
* migrate triton to base llm http handler

* clean up triton handler.py

* use transform functions for triton

* add TritonConfig

* get openai params for triton

* use triton embedding config

* test_completion_triton_generate_api

* test_completion_triton_infer_api

* fix TritonConfig doc string

* use TritonResponseIterator

* fix triton embeddings

* docs triton chat usage
2024-12-20 20:59:40 -08:00
Krish Dholakia
27a4d08604
Litellm dev 2024 12 19 p3 (#7322)
* fix(utils.py): remove unsupported optional params (if drop_params=True) before passing into map openai params

Fixes https://github.com/BerriAI/litellm/issues/7242

* test: new test for langfuse prompt management hook

Addresses https://github.com/BerriAI/litellm/issues/3893#issuecomment-2549080296

* feat(main.py): add 'get_chat_completion_prompt' customlogger hook

allows for langfuse prompt management

Addresses https://github.com/BerriAI/litellm/issues/3893#issuecomment-2549080296

* feat(langfuse_prompt_management.py): working e2e langfuse prompt management

works with `langfuse/` route

* feat(main.py): initial tracing for dynamic langfuse params

allows admin to specify langfuse keys by model in model_list

* feat(main.py): support passing langfuse credentials dynamically

* fix(langfuse_prompt_management.py): create langfuse client based on dynamic callback params

allows dynamic langfuse params to work

* fix: fix linting errors

* docs(prompt_management.md): refactor docs for sdk + proxy prompt management tutorial

* docs(prompt_management.md): cleanup doc

* docs: cleanup topnav

* docs(prompt_management.md): update docs to be easier to use

* fix: remove unused imports

* docs(prompt_management.md): add architectural overview doc

* fix(litellm_logging.py): fix dynamic param passing

* fix(langfuse_prompt_management.py): fix linting errors

* fix: fix linting errors

* fix: use typing_extensions for typealias to ensure python3.8 compatibility

* test: use stream_options in test to account for tiktoken diff

* fix: improve import error message, and check run test earlier
2024-12-20 13:30:16 -08:00
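
A sketch of the `drop_params=True` behaviour from the entry above: unsupported optional params are stripped before provider param mapping instead of raising an error; which params get dropped depends on the provider:

```python
import litellm

response = litellm.completion(
    model="my-provider/my-model",             # placeholder for a provider lacking response_format
    messages=[{"role": "user", "content": "hi"}],
    response_format={"type": "json_object"},  # dropped silently for providers that don't support it
    drop_params=True,
)
```
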
Krish Dholakia
6a45ee1ef7
fix(hosted_vllm/transformation.py): return fake api key, if none give… (#7301)
* fix(hosted_vllm/transformation.py): return fake api key, if none given. Prevents httpx error

Fixes https://github.com/BerriAI/litellm/issues/7291

* test: fix test

* fix(main.py): add hosted_vllm/ support for embeddings endpoint

Closes https://github.com/BerriAI/litellm/issues/7290

* docs(vllm.md): add docs on vllm embeddings usage

* fix(__init__.py): fix sambanova model test

* fix(base_llm_unit_tests.py): skip pydantic obj test if model takes >5s to respond
2024-12-18 18:41:53 -08:00
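
A sketch of the `hosted_vllm/` embeddings support from the entry above; the model name and server URL are placeholders, and no api_key is needed since a fake one is now supplied automatically:

```python
import litellm

response = litellm.embedding(
    model="hosted_vllm/BAAI/bge-small-en-v1.5",  # placeholder model served by your vLLM instance
    api_base="http://localhost:8000/v1",         # placeholder vLLM OpenAI-compatible endpoint
    input=["hello from litellm"],
)
```
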
Ishaan Jaff
7a5dd29fe0
(fix) unable to pass input_type parameter to Voyage AI embedding mode (#7276)
* VoyageEmbeddingConfig

* fix voyage logic to get params

* add voyage embedding transformation

* add get_provider_embedding_config

* use BaseEmbeddingConfig

* voyage clean up

* use llm http handler for embedding transformations

* test_voyage_ai_embedding_extra_params

* add voyage async

* test_voyage_ai_embedding_extra_params

* add async for llm http handler

* update BaseLLMEmbeddingTest

* test_voyage_ai_embedding_extra_params

* fix linting

* fix get_provider_embedding_config

* fix anthropic text test

* update location of base/chat/transformation

* fix import path

* fix IBMWatsonXAIConfig
2024-12-17 19:23:49 -08:00
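
A sketch of passing the Voyage-specific `input_type` parameter through `litellm.embedding`, per the entry above; the model name is a placeholder:

```python
import litellm

response = litellm.embedding(
    model="voyage/voyage-3",     # placeholder Voyage model
    input=["what is litellm?"],
    input_type="query",          # Voyage-specific extra param, previously not passed through
)
```
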
Krish Dholakia
019ddc32d6
Litellm dev 12 17 2024 p2 (#7277)
* fix(openai/transcription/handler.py): call 'log_pre_api_call' on async calls

* fix(openai/transcriptions/handler.py): call 'logging.pre_call' on sync whisper calls as well

* fix(proxy_cli.py): remove default proxy_cli timeout param

gets passed in as a dynamic request timeout and overrides config values

* fix(langfuse.py): pass litellm httpx client - contains ssl certs (#7052)

Fixes https://github.com/BerriAI/litellm/issues/7046
2024-12-17 14:05:14 -08:00
Krish Dholakia
b82add11ba
LITELLM: Remove requests library usage (#7235)
* fix(generic_api_callback.py): remove requests lib usage

* fix(budget_manager.py): remove requests lib usgae

* fix(main.py): cleanup requests lib usage

* fix(utils.py): remove requests lib usage

* fix(argilla.py): fix argilla test

* fix(athina.py): replace 'requests' lib usage with litellm module

* fix(greenscale.py): replace 'requests' lib usage with httpx

* fix: remove unused 'requests' lib import + replace usage in some places

* fix(prompt_layer.py): remove 'requests' lib usage from prompt layer

* fix(ollama_chat.py): remove 'requests' lib usage

* fix(baseten.py): replace 'requests' lib usage

* fix(codestral/): replace 'requests' lib usage

* fix(predibase/): replace 'requests' lib usage

* refactor: cleanup unused 'requests' lib imports

* fix(oobabooga.py): cleanup 'requests' lib usage

* fix(invoke_handler.py): remove unused 'requests' lib usage

* refactor: cleanup unused 'requests' lib import

* fix: fix linting errors

* refactor(ollama/): move ollama to using base llm http handler

removes 'requests' lib dep for ollama integration

* fix(ollama_chat.py): fix linting errors

* fix(ollama/completion/transformation.py): convert non-jpeg/png image to jpeg/png before passing to ollama
2024-12-17 12:50:04 -08:00
Krish Dholakia
ec36353b41
fix(main.py): fix retries being multiplied when using openai sdk (#7221)
* fix(main.py): fix retries being multiplied when using openai sdk

Closes https://github.com/BerriAI/litellm/pull/7130

* docs(prompt_management.md): add langfuse prompt management doc

* feat(team_endpoints.py): allow teams to add their own models

Enables teams to call their own finetuned models via the proxy

* test: add better enforcement check testing for `/model/new` now that teams can add their own models

* docs(team_model_add.md): tutorial for allowing teams to add their own models

* test: fix test
2024-12-14 11:56:55 -08:00