litellm-mirror

mirror of https://github.com/BerriAI/litellm.git synced 2025-04-26 03:04:13 +00:00

Author	SHA1	Message	Date
Krish Dholakia	4b88635372	fix(fireworks_ai/): fix global disable flag with transform messages helper (#7847 ) fixes issue where .get() = none was preventing global disable flag from being picked up	2025-01-20 20:16:11 -08:00
Krish Dholakia	c4ff0b6487	refactor: make bedrock image transformation requests async (#7840 ) * refactor: initial commit for using separate sync vs. async transformation routes for bedrock ensures no blocking calls e.g. when converting image url to b64 * perf(converse_transformation.py): make bedrock converse transformation async asyncify's the bedrock message transformation - useful for handling image urls for bedrock * fix(converse_handler.py): fix logging for async streaming * style: cleanup unused imports	2025-01-17 20:14:15 -08:00
Krish Dholakia	71c41f8f33	QA: ensure all bedrock regional models have same `supported_` as base + Anthropic nested pydantic object support (#7844 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 13s Details * build: ensure all regional bedrock models have same supported values as base bedrock model prevents drift * test(base_llm_unit_tests.py): add testing for nested pydantic objects * fix(test_utils.py): add test_get_potential_model_names * fix(anthropic/chat/transformation.py): support nested pydantic objects Fixes https://github.com/BerriAI/litellm/issues/7755	2025-01-17 19:49:12 -08:00
Ishaan Jaff	c8febaca2e	test_watsonx_token_in_env_var	2025-01-16 22:28:37 -08:00
Ishaan Jaff	b492551d3d	(fix) IBM Watsonx using ZenApiKey (#7821 ) * ibm watsonx fix * test ZenAPIKey * fix zenapikey	2025-01-16 22:02:36 -08:00
Ishaan Jaff	5458a2ff33	fireworks ai use llama-v3p1-8b-instruct	2025-01-16 21:28:44 -08:00
Ishaan Jaff	2f38e72026	test commit on main	2025-01-16 20:52:55 -08:00
Krish Dholakia	fe60a38c8e	Litellm dev 01 2025 p4 (#7776 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 13s Details * fix(gemini/): support gemini 'frequency_penalty' and 'presence_penalty' Closes https://github.com/BerriAI/litellm/issues/7748 * feat(proxy_server.py): new env var to disable prisma health check on startup * test: fix test	2025-01-14 21:49:25 -08:00
Ishaan Jaff	30bb4c4cdd	(fix) `BaseAWSLLM` - cache IAM role credentials when used (#7775 ) * fix base aws llm * fix auth with aws role * test aws base llm * fix base aws llm init * run ci/cd again * fix get_credentials * ci/cd run again * _auth_with_aws_role	2025-01-14 20:16:22 -08:00
Krish Dholakia	35919d9fec	Litellm dev 01 13 2025 p2 (#7758 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 12s Details * fix(factory.py): fix bedrock document url check Make check more generic - if starts with 'text' or 'application' assume it's a document and let it go through Fixes https://github.com/BerriAI/litellm/issues/7746 * feat(key_management_endpoints.py): support writing new key alias to aws secret manager - on key rotation adds rotation endpoint to aws key management hook - allows for rotated litellm virtual keys with new key alias to be written to it * feat(key_management_event_hooks.py): support rotating keys and updating secret manager * refactor(base_secret_manager.py): support rotate secret at the base level since it's just an abstraction function, it's easy to implement at the base manager level * style: cleanup unused imports	2025-01-14 17:04:01 -08:00
Krish Dholakia	7b27cfb0ae	Support temporary budget increases on keys (#7754 ) * fix(gpt_transformation.py): fix response_format translation check for 4o models Fixes https://github.com/BerriAI/litellm/issues/7616 * feat(key_management_endpoints.py): support 'temp_budget_increase' and 'temp_budget_expiry' fields Allow proxy admin to grant temporary budget increases to keys * fix(proxy/_types.py): enforce temp_budget_increase and temp_budget_expiry are always passed together * feat(user_api_key_auth.py): initial working temp budget increase logic ensures key budget exceeded error checks for temp budget in key metadata * feat(proxy_server.py): return the key max budget and key spend in the response headers Allows clientside user to know their remaining limits * test: add unit testing for new proxy utils Ensures new key budget is correctly handled * docs(temporary_budget_increase.md): add doc on temporary budget increase * fix(utils.py): remove 3.5 from response_format check for now not all azure 3.5 models support response_format * fix(user_api_key_auth.py): return valid user api key auth object on all paths	2025-01-14 17:03:11 -08:00
Ishaan Jaff	970e9c7507	huggingface/mistralai/Mistral-7B-Instruct-v0.3	2025-01-13 18:42:36 -08:00
Krish Dholakia	c10ae8879e	fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini p… (#7660 ) * fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini process url * refactor(router.py): refactor '_prompt_management_factory' to use logging obj get_chat_completion logic deduplicates code * fix(litellm_logging.py): update 'get_chat_completion_prompt' to update logging object messages * docs(prompt_management.md): update prompt management to be in beta given feedback - this still needs to be revised (e.g. passing in user message, not ignoring) * refactor(prompt_management_base.py): introduce base class for prompt management allows consistent behaviour across prompt management integrations * feat(prompt_management_base.py): support adding client message to template message + refactor langfuse prompt management to use prompt management base * fix(litellm_logging.py): log prompt id + prompt variables to langfuse if set allows tracking what prompt was used for what purpose * feat(litellm_logging.py): log prompt management metadata in standard logging payload + use in langfuse allows logging prompt id / prompt variables to langfuse * test: fix test * fix(router.py): cleanup unused imports * fix: fix linting error * fix: fix trace param typing * fix: fix linting errors * fix: fix code qa check	2025-01-10 07:31:59 -08:00
Ishaan Jaff	a85de46ef7	(proxy - RPS) - Get 2K RPS at 4 instances, minor fix `aiohttp_openai/` (#7659 ) * speed up transform_response * use 2 workers * undo changes to uvicorn * ci/cd run again	2025-01-09 17:24:18 -08:00
Krish Dholakia	0ffc5379ea	Litellm dev 01 07 2025 p2 (#7622 ) * build(ui/): update ui * fix: drop unsupported non-whitespace characters for real when calling… (#7484) * fix: drop unsupported non-whitespace characters for real when calling anthropic with stop sequences * test: add parameterized test for _map_stop_sequences method in AnthropicConfig --------- Co-authored-by: Wolfram Ravenwolf <52386626+WolframRavenwolf@users.noreply.github.com>	2025-01-08 16:56:39 -08:00
Krish Dholakia	e8ed40a27b	Litellm dev 01 01 2025 p2 (#7615 ) * fix(utils.py): prevent double logging when passing 'fallbacks=' to .completion() Fixes https://github.com/BerriAI/litellm/issues/7477 * fix(utils.py): fix vertex anthropic check * fix(utils.py): ensure supported params is always set Fixes https://github.com/BerriAI/litellm/issues/7470 * test(test_optional_params.py): add unit testing to prevent mistranslation Fixes https://github.com/BerriAI/litellm/issues/7470 * fix: fix linting error * test: cleanup	2025-01-07 21:40:33 -08:00
Ishaan Jaff	55139b8fd6	update tests All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 34s Details	2025-01-06 22:36:00 -08:00
Ishaan Jaff	2ca0977921	`aiohttp_openai/` fixes - allow using `aiohttp_openai/gpt-4o` (#7598 ) * fixes for get_complete_url * update aiohttp tests * fix event loop for aiohtto * ci/cd run again * test_aiohttp_openai	2025-01-06 21:39:11 -08:00
Ishaan Jaff	616211daee	ci/cd run again	2025-01-05 14:11:27 -08:00
Krish Dholakia	ce97e7e054	fix(groq/chat/transformation.py): fix groq response_format transformation (#7565 ) Fixes https://github.com/BerriAI/litellm/issues/4804	2025-01-04 19:39:04 -08:00
Krish Dholakia	f770dd0c95	Support checking provider-specific `/models` endpoints for available models based on key (#7538 ) * test(test_utils.py): initial test for valid models Addresses https://github.com/BerriAI/litellm/issues/7525 * fix: test * feat(fireworks_ai/transformation.py): support retrieving valid models from fireworks ai endpoint * refactor(fireworks_ai/): support checking model info on `/v1/models` route * docs(set_keys.md): update docs to clarify check llm provider api usage * fix(watsonx/common_utils.py): support 'WATSONX_ZENAPIKEY' for iam auth * fix(watsonx): read in watsonx token from env var * fix: fix linting errors * fix(utils.py): fix provider config check * style: cleanup unused imports	2025-01-03 19:29:59 -08:00
Ishaan Jaff	23104d9a14	test_aiohttp_openai	2025-01-03 15:12:56 -08:00
Ishaan Jaff	d861aa8ff3	(perf) use `aiohttp` for `custom_openai` (#7514 ) * use aiohttp handler * BaseLLMAIOHTTPHandler * use CustomOpenAIChatConfig * CustomOpenAIChatConfig * CustomOpenAIChatConfig * fix linting * AiohttpOpenAIChatConfig * fix order * aiohttp_openai	2025-01-02 22:15:17 -08:00
Krish Dholakia	0120176541	Litellm dev 12 30 2024 p2 (#7495 ) * test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model * fix(base_llm_unit_tests.py): handle azure o1 preview response format tests skip as o1 on azure doesn't support tool calling yet * fix: initial commit of azure o1 handler using openai caller simplifies calling + allows fake streaming logic alr. implemented for openai to just work * feat(azure/o1_handler.py): fake o1 streaming for azure o1 models azure does not currently support streaming for o1 * feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info enables user to toggle on when azure allows o1 streaming without needing to bump versions * style(router.py): remove 'give feedback/get help' messaging when router is used Prevents noisy messaging Closes https://github.com/BerriAI/litellm/issues/5942 * fix(types/utils.py): handle none logprobs Fixes https://github.com/BerriAI/litellm/issues/328 * fix(exception_mapping_utils.py): fix error str unbound error * refactor(azure_ai/): move to openai_like chat completion handler allows for easy swapping of api base url's (e.g. ai.services.com) Fixes https://github.com/BerriAI/litellm/issues/7275 * refactor(azure_ai/): move to base llm http handler * fix(azure_ai/): handle differing api endpoints * fix(azure_ai/): make sure all unit tests are passing * fix: fix linting errors * fix: fix linting errors * fix: fix linting error * fix: fix linting errors * fix(azure_ai/transformation.py): handle extra body param * fix(azure_ai/transformation.py): fix max retries param handling * fix: fix test * test(test_azure_o1.py): fix test * fix(llm_http_handler.py): support handling azure ai unprocessable entity error * fix(llm_http_handler.py): handle sync invalid param error for azure ai * fix(azure_ai/): streaming support with base_llm_http_handler * fix(llm_http_handler.py): working sync stream calls with unprocessable entity handling for azure ai * fix: fix linting errors * fix(llm_http_handler.py): fix linting error * fix(azure_ai/): handle cohere tool call invalid index param error	2025-01-01 18:57:29 -08:00
Ishaan Jaff	38bfefa6ef	(Feat) - LiteLLM Use `UsernamePasswordCredential` for Azure OpenAI (#7496 ) * add get_azure_ad_token_from_username_password * docs azure use username / password for auth * update doc * get_azure_ad_token_from_username_password * test test_get_azure_ad_token_from_username_password	2025-01-01 14:11:27 -08:00
Krish Dholakia	347779b813	Litellm dev 12 30 2024 p1 (#7480 ) * test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model * fix(base_llm_unit_tests.py): handle azure o1 preview response format tests skip as o1 on azure doesn't support tool calling yet * fix: initial commit of azure o1 handler using openai caller simplifies calling + allows fake streaming logic alr. implemented for openai to just work * feat(azure/o1_handler.py): fake o1 streaming for azure o1 models azure does not currently support streaming for o1 * feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info enables user to toggle on when azure allows o1 streaming without needing to bump versions * style(router.py): remove 'give feedback/get help' messaging when router is used Prevents noisy messaging Closes https://github.com/BerriAI/litellm/issues/5942 * test: fix azure o1 test * test: fix tests * fix: fix test	2024-12-30 21:52:52 -08:00
Ishaan Jaff	83879d2a3d	test_rerank_response_assertions (#7476 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 42s Details	2024-12-30 10:12:56 -08:00
Krish Dholakia	31ace870a2	Litellm dev 12 28 2024 p1 (#7463 ) * refactor(utils.py): migrate amazon titan config to base config * refactor(utils.py): refactor bedrock meta invoke model translation to use base config * refactor(utils.py): move bedrock ai21 to base config * refactor(utils.py): move bedrock cohere to base config * refactor(utils.py): move bedrock mistral to use base config * refactor(utils.py): move all provider optional param translations to using a config * docs(clientside_auth.md): clarify how to pass vertex region to litellm proxy * fix(utils.py): handle scenario where custom llm provider is none / empty * fix: fix get config * test(test_otel_load_tests.py): widen perf margin * fix(utils.py): fix get provider config check to handle custom llm's * fix(utils.py): fix check	2024-12-28 20:26:00 -08:00
Krish Dholakia	cfb6890b9f	Litellm dev 12 28 2024 p2 (#7458 ) * docs(sidebar.js): docs for support model access groups for wildcard routes * feat(key_management_endpoints.py): add check if user is premium_user when adding model access group for wildcard route * refactor(docs/): make control model access a root-level doc in proxy sidebar easier to discover how to control model access on litellm * docs: more cleanup * feat(fireworks_ai/): add document inlining support Enables user to call non-vision models with images/pdfs/etc. * test(test_fireworks_ai_translation.py): add unit testing for fireworks ai transform inline helper util * docs(docs/): add document inlining details to fireworks ai docs * feat(fireworks_ai/): allow user to dynamically disable auto add transform inline allows client-side disabling of this feature for proxy users * feat(fireworks_ai/): return 'supports_vision' and 'supports_pdf_input' true on all fireworks ai models now true as fireworks ai supports document inlining * test: fix tests * fix(router.py): add unit testing for _is_model_access_group_for_wildcard_route	2024-12-28 19:38:06 -08:00
Krish Dholakia	5af438ed89	Litellm dev 12 28 2024 p3 (#7464 ) * feat(deepgram/): initial e2e support for deepgram stt Uses deepgram's `/listen` endpoint to transcribe speech to text Closes https://github.com/BerriAI/litellm/issues/4875 * fix: fix linting errors * test: fix test	2024-12-28 19:18:58 -08:00
Krish Dholakia	760328b6ad	Litellm dev 12 25 2025 p2 (#7420 ) * test: add new test image embedding to base llm unit tests Addresses https://github.com/BerriAI/litellm/issues/6515 * fix(bedrock/embed/multimodal-embeddings): strip data prefix from image urls for bedrock multimodal embeddings Fix https://github.com/BerriAI/litellm/issues/6515 * feat: initial commit for fireworks ai audio transcription support Relevant issue: https://github.com/BerriAI/litellm/issues/7134 * test: initial fireworks ai test * feat(fireworks_ai/): implemented fireworks ai audio transcription config * fix(utils.py): register fireworks ai audio transcription config, in config manager * fix(utils.py): add fireworks ai param translation to 'get_optional_params_transcription' * refactor(fireworks_ai/): define text completion route with model name handling moves model name handling to specific fireworks routes, as required by their api * refactor(fireworks_ai/chat): define transform_Request - allows fixing model if accounts/ is missing * fix: fix linting errors * fix: fix linting errors * fix: fix linting errors * fix: fix linting errors * fix(handler.py): fix linting errors * fix(main.py): fix tgai text completion route * refactor(together_ai/completion): refactors together ai text completion route to just use provider transform request * refactor: move test_fine_tuning_api out of local_testing reduces local testing ci/cd time	2024-12-25 18:35:34 -08:00
Krish Dholakia	9237357bcc	Litellm dev 12 25 2024 p1 (#7411 ) * test(test_watsonx.py): e2e unit test for watsonx custom header covers https://github.com/BerriAI/litellm/issues/7408 * fix(common_utils.py): handle auth token already present in headers (watsonx + openai-like base handler) Fixes https://github.com/BerriAI/litellm/issues/7408 * fix(watsonx/chat): fix chat route Fixes https://github.com/BerriAI/litellm/issues/7408 * fix(huggingface/chat/handler.py): fix huggingface async completion calls * Correct handling of max_retries=0 to disable AzureOpenAI retries (#7379) * test: fix test --------- Co-authored-by: Minh Duc <phamminhduc0711@gmail.com>	2024-12-25 17:36:30 -08:00
Krish Dholakia	3ac54483a7	Litellm dev 12 24 2024 p3 (#7403 ) * fix(model_prices_and_context_window.json): specify meta llama is a bedrock converse model route Fixes https://github.com/BerriAI/litellm/issues/7385 * test(test_get_model_info.py): enforce all new bedrock chat models added have the bedrock_converse route Prevents https://github.com/BerriAI/litellm/issues/7385 and https://github.com/BerriAI/litellm/discussions/7325 * fix(get_supported_openai_params.py): use vertex gemini config by default for vertex ai route Fixes https://github.com/BerriAI/litellm/issues/7378 * refactor(vertex_ai/gemini/): rename vertexaiconfig to vertexaibaseconfig - make it clear vertexaiconfig = vertexgemini config * build(model_prices_and_context_window.json): add gpt-4o-audio-preview-2024-12-17 Closes https://github.com/BerriAI/litellm/issues/7367 * test: fix test * test: fix o1 tests * fix: handle llm api errors * fix: fix linting errors	2024-12-24 18:07:53 -08:00
Krrish Dholakia	fe43403359	test: override openai o1 prompt caching test - openai backend not caching right now	2024-12-24 16:49:48 -08:00
Krish Dholakia	c3edfc2c92	LiteLLM Minor Fixes & Improvements (12/23/2024) - p3 (#7394 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 35s Details * build(model_prices_and_context_window.json): add gemini-1.5-flash context caching * fix(context_caching/transformation.py): just use last identified cache point Fixes https://github.com/BerriAI/litellm/issues/6738 * fix(context_caching/transformation.py): pick first contiguous block - handles system message error from google Fixes https://github.com/BerriAI/litellm/issues/6738 * fix(vertex_ai/gemini/): track context caching tokens * refactor(gemini/): place transformation.py inside `chat/` folder make it easy for user to know we support the equivalent endpoint * fix: fix import * refactor(vertex_ai/): move vertex_ai cost calc inside vertex_ai/ folder make it easier to see cost calculation logic * fix: fix linting errors * fix: fix circular import * feat(gemini/cost_calculator.py): support gemini context caching cost calculation generifies anthropic's cost calculation function and uses it across anthropic + gemini * build(model_prices_and_context_window.json): add cost tracking for gemini-1.5-flash-002 w/ context caching Closes https://github.com/BerriAI/litellm/issues/6891 * docs(gemini.md): add gemini context caching architecture diagram make it easier for user to understand how context caching works * docs(gemini.md): link to relevant gemini context caching code * docs(gemini/context_caching): add readme in github, make it easy for dev to know context caching is supported + where to go for code * fix(llm_cost_calc/utils.py): handle gemini 128k token diff cost calc scenario * fix(deepseek/cost_calculator.py): support deepseek context caching cost calculation * test: fix test	2024-12-23 22:02:52 -08:00
Krish Dholakia	3671829e39	Complete 'requests' library removal (#7350 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 12s Details * refactor: initial commit moving watsonx_text to base_llm_http_handler + clarifying new provider directory structure * refactor(watsonx/completion/handler.py): move to using base llm http handler removes 'requests' library usage * fix(watsonx_text/transformation.py): fix result transformation migrates to transformation.py, for usage with base llm http handler * fix(streaming_handler.py): migrate watsonx streaming to transformation.py ensures streaming works with base llm http handler * fix(streaming_handler.py): fix streaming linting errors and remove watsonx conditional logic * fix(watsonx/): fix chat route post completion route refactor * refactor(watsonx/embed): refactor watsonx to use base llm http handler for embedding calls as well * refactor(base.py): remove requests library usage from litellm * build(pyproject.toml): remove requests library usage * fix: fix linting errors * fix: fix linting errors * fix(types/utils.py): fix validation errors for modelresponsestream * fix(replicate/handler.py): fix linting errors * fix(litellm_logging.py): handle modelresponsestream object * fix(streaming_handler.py): fix modelresponsestream args * fix: remove unused imports * test: fix test * fix: fix test * test: fix test * test: fix tests * test: fix test * test: fix patch target * test: fix test	2024-12-22 07:21:25 -08:00
Ishaan Jaff	6107f9f3f3	[Bug fix ]: Triton /infer handler incompatible with batch responses (#7337 ) * migrate triton to base llm http handler * clean up triton handler.py * use transform functions for triton * add TritonConfig * get openai params for triton * use triton embedding config * test_completion_triton_generate_api * test_completion_triton_infer_api * fix TritonConfig doc string * use TritonResponseIterator * fix triton embeddings * docs triton chat usage	2024-12-20 20:59:40 -08:00
Krish Dholakia	27a4d08604	Litellm dev 2024 12 19 p3 (#7322 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 13s Details * fix(utils.py): remove unsupported optional params (if drop_params=True) before passing into map openai params Fixes https://github.com/BerriAI/litellm/issues/7242 * test: new test for langfuse prompt management hook Addresses https://github.com/BerriAI/litellm/issues/3893#issuecomment-2549080296 * feat(main.py): add 'get_chat_completion_prompt' customlogger hook allows for langfuse prompt management Addresses https://github.com/BerriAI/litellm/issues/3893#issuecomment-2549080296 * feat(langfuse_prompt_management.py): working e2e langfuse prompt management works with `langfuse/` route * feat(main.py): initial tracing for dynamic langfuse params allows admin to specify langfuse keys by model in model_list * feat(main.py): support passing langfuse credentials dynamically * fix(langfuse_prompt_management.py): create langfuse client based on dynamic callback params allows dynamic langfuse params to work * fix: fix linting errors * docs(prompt_management.md): refactor docs for sdk + proxy prompt management tutorial * docs(prompt_management.md): cleanup doc * docs: cleanup topnav * docs(prompt_management.md): update docs to be easier to use * fix: remove unused imports * docs(prompt_management.md): add architectural overview doc * fix(litellm_logging.py): fix dynamic param passing * fix(langfuse_prompt_management.py): fix linting errors * fix: fix linting errors * fix: use typing_extensions for typealias to ensure python3.8 compatibility * test: use stream_options in test to account for tiktoken diff * fix: improve import error message, and check run test earlier	2024-12-20 13:30:16 -08:00
Krish Dholakia	4c7a3931b7	Litellm dev 12 19 2024 p2 (#7315 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 46s Details * fix(proxy_server.py): only update k,v pair if v is not empty/null Fixes https://github.com/BerriAI/litellm/issues/6787 * test(test_router.py): cleanup duplicate calls * test: add new test stream options drop params test * test: update optional params / stream options test to test for vertex ai mistral route specifically Addresses https://github.com/BerriAI/litellm/issues/7309 * fix(proxy_server.py): fix linting errors * fix: fix linting errors	2024-12-19 20:28:16 -08:00
Ishaan Jaff	617ac63d14	(feat) add infinity rerank models (#7321 ) * Support Infinity Reranker (custom reranking models) (#7247) * Support Infinity Reranker * Clean code * Included transformation.py * Clean code * Added Infinity reranker test * Clean code --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> * transform_rerank_response * update handler.py * infinity rerank updates * ci/cd run again * add infinity unit tests * docs add instruction on how to add a new provider for rerank --------- Co-authored-by: Hao Shan <53949959+haoshan98@users.noreply.github.com>	2024-12-19 18:30:28 -08:00
Ishaan Jaff	5f15b0aa20	(code refactor) - Add `BaseRerankConfig`. Use `BaseRerankConfig` for `cohere/rerank` and `azure_ai/rerank` (#7319 ) * add base rerank config * working sync cohere rerank * update rerank types * update base rerank config * remove old rerank * add new cohere handler.py * add cohere rerank transform * add get_provider_rerank_config * add rerank to base llm http handler * add rerank utils * add arerank to llm http handler.py * add AzureAIRerankConfig * updates rerank config * update test rerank * fix unused imports * update get_provider_rerank_config * test_basic_rerank_caching * fix unused import * test rerank	2024-12-19 17:03:34 -08:00
Ishaan Jaff	a790d43116	[Bug Fix]: ImportError: cannot import name 'T' from 're' (#7314 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 11s Details * fix unused imports * add test for python 3.12 * re introduce error - as a test * update config for ci/cd * fix python 13 install * bump pyyaml * bump numpy * fix embedding requests * bump pillow dep * bump version * bump pydantic * bump tiktoken * fix import * fix python 3.13 import * fix unused imports in tests/*	2024-12-19 13:09:30 -08:00
Krish Dholakia	62b00cf28d	o1 - add image param handling (#7312 ) * fix(openai.py): fix returning o1 non-streaming requests fixes issue where fake stream always true for o1 * build(model_prices_and_context_window.json): add 'supports_vision' for o1 models * fix: add internal server error exception mapping * fix(base_llm_unit_tests.py): drop temperature from test * test: mark prompt caching as a flaky test	2024-12-19 11:22:25 -08:00
Krish Dholakia	5253f639cd	fix(health.md): add rerank model health check information (#7295 ) * fix(health.md): add rerank model health check information * build(model_prices_and_context_window.json): add gemini 2.0 for google ai studio - pricing + commercial rate limits * build(model_prices_and_context_window.json): add gemini-2.0 supports audio output = true * docs(team_model_add.md): clarify allowing teams to add models is an enterprise feature * fix(o1_transformation.py): add support for 'n', 'response_format' and 'stop' params for o1 and 'stream_options' param for o1-mini * build(model_prices_and_context_window.json): add 'supports_system_message' to supporting openai models needed as o1-preview, and o1-mini models don't support 'system message * fix(o1_transformation.py): translate system message based on if o1 model supports it * fix(o1_transformation.py): return 'stream' param support if o1-mini/o1-preview o1 currently doesn't support streaming, but the other model versions do Fixes https://github.com/BerriAI/litellm/issues/7292 * fix(o1_transformation.py): return tool calling/response_format in supported params if model map says so Fixes https://github.com/BerriAI/litellm/issues/7292 * fix: fix linting errors * fix: update '_transform_messages' * fix(o1_transformation.py): fix provider passed for supported param checks * test(base_llm_unit_tests.py): skip test if api takes >5s to respond * fix(utils.py): return false in 'supports_factory' if can't find value * fix(o1_transformation.py): always return stream + stream_options as supported params + handle stream options being passed in for azure o1 * feat(openai.py): support stream faking natively in openai handler Allows o1 calls to be faked for just the "o1" model, allows native streaming for o1-mini, o1-preview Fixes https://github.com/BerriAI/litellm/issues/7292 * fix(openai.py): use inference param instead of original optional param	2024-12-18 19:18:10 -08:00
Krish Dholakia	6a45ee1ef7	fix(hosted_vllm/transformation.py): return fake api key, if none give… (#7301 ) * fix(hosted_vllm/transformation.py): return fake api key, if none give. Prevents httpx error Fixes https://github.com/BerriAI/litellm/issues/7291 * test: fix test * fix(main.py): add hosted_vllm/ support for embeddings endpoint Closes https://github.com/BerriAI/litellm/issues/7290 * docs(vllm.md): add docs on vllm embeddings usage * fix(__init__.py): fix sambanova model test * fix(base_llm_unit_tests.py): skip pydantic obj test if model takes >5s to respond	2024-12-18 18:41:53 -08:00
Ishaan Jaff	7a5dd29fe0	(fix) unable to pass input_type parameter to Voyage AI embedding mode (#7276 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 46s Details * VoyageEmbeddingConfig * fix voyage logic to get params * add voyage embedding transformation * add get_provider_embedding_config * use BaseEmbeddingConfig * voyage clean up * use llm http handler for embedding transformations * test_voyage_ai_embedding_extra_params * add voyage async * test_voyage_ai_embedding_extra_params * add async for llm http handler * update BaseLLMEmbeddingTest * test_voyage_ai_embedding_extra_params * fix linting * fix get_provider_embedding_config * fix anthropic text test * update location of base/chat/transformation * fix import path * fix IBMWatsonXAIConfig	2024-12-17 19:23:49 -08:00
Krish Dholakia	179d2f56b7	LiteLLM Minor Fixes & Improvements (12/16/2024) - p1 (#7263 ) * fix(factory.py): skip empty text blocks for bedrock user messages Fixes https://github.com/BerriAI/litellm/issues/7169 * Add support for Gemini 2.0 GoogleSearch tool (#7257) * Add support for google_search tool in gemini 2.0 * Add/modify tests * Fix grounding check * Remove 2.0 grounding test; exclude experimental model in VERTEX_MODELS_TO_NOT_TEST * Swap order of tools * DFix formatting * fix(get_api_base.py): return api base in streaming response Fixes https://github.com/BerriAI/litellm/issues/7249 Closes https://github.com/BerriAI/litellm/pull/7250 * fix(cost_calculator.py): only set base model to model if not none Fixes https://github.com/BerriAI/litellm/issues/7223 * fix(cost_calculator.py): enforce stricter order when picking model for cost calculation * fix(cost_calculator.py): fix '_select_model_name_for_cost_calc' to return model name with region name prefix if provided * fix(utils.py): fix 'get_model_info()' to handle edge case where model name starts with custom llm provider AND custom llm provider is given * fix(cost_calculator.py): handle `custom_llm_provider-` scenario * fix(cost_calculator.py): e2e working tts cost tracking ensures initial message is passed in, to cost calculator * fix(factory.py): suppress linting errors * fix(cost_calculator.py): strip llm provider from model name after selecting cost calc model * fix(litellm_logging.py): store initial request in 'input' field + accept base_model to be passed in litellm_params directly * test: handle none env var value in flaky test * fix(litellm_logging.py): fix linting errors --------- Co-authored-by: Sam B <samlingx@gmail.com>	2024-12-17 15:33:36 -08:00
Krish Dholakia	cd5bdfcb7a	docs(input.md): document 'extra_headers' param support (#7268 ) * docs(input.md): document 'extra_headers' param support * fix: #7239 to move Nova topK parameter to `additionalModelRequestFields` (#7240) Co-authored-by: Ryan Hoium <rhoium> --------- Co-authored-by: ryanh-ai <3118399+ryanh-ai@users.noreply.github.com>	2024-12-17 07:19:14 -08:00
Ishaan Jaff	8060c5c698	Litellm add router to base llm testing (#7202 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 1m15s Details * code qa add litellm router to base llm testing * test_image_url * fix img url * fix add router to base llm class * fix base llm testing * add test scenario * fix test_json_response_format * fixes base llm testing * fix base llm testing * fix test image url	2024-12-13 19:16:28 -08:00
Krish Dholakia	7643b3087c	Litellm dev 12 11 2024 v2 (#7215 ) * feat(bedrock/): add bedrock converse top k param Closes https://github.com/BerriAI/litellm/issues/7087 * Fix bedrock empty content error (#7177) * add resolver * handle empty content on bedrock with default content * use existing default message, tests * Update tests/llm_translation/test_bedrock_completion.py * fix tests * Revert "add resolver" This reverts commit `c717e376ee`. * fallback to empty --------- Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com> * fix(factory.py): handle empty content blocks in messages Fixes https://github.com/BerriAI/litellm/issues/7169 * feat(router.py): add stripped model check to model fallback search if model_name="openai/gpt-3.5-turbo" and fallback=[{"gpt-3.5-turbo"..}] the fallback should just work as expected * fix: fix linting error * fix(factory.py): fix linting error * fix(factory.py): in base case still support skip empty text blocks --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>	2024-12-13 12:49:57 -08:00

... 2 3 4 5 6

290 commits