litellm-mirror

mirror of https://github.com/BerriAI/litellm.git synced 2025-04-25 18:54:30 +00:00

Author	SHA1	Message	Date
Krrish Dholakia	cb6e9fbe40	refactor: complete migration	2025-03-24 19:52:45 -07:00
Krrish Dholakia	dd2c980d5b	fix(utils.py): Prevents final chunk w/ usage from being ignored Fixes https://github.com/BerriAI/litellm/issues/7112	2025-03-15 09:12:14 -07:00
Krrish Dholakia	0af6cde994	fix(invoke_handler.py): support cache token tracking on converse streaming	2025-03-13 16:10:13 -07:00
Krrish Dholakia	92d85555fe	fix(invoke_handler.py): fix converse chunk parsing to only return empty dict on tool use Fixes https://github.com/BerriAI/litellm/issues/9127	2025-03-11 22:04:17 -07:00
Krish Dholakia	744e10b0f0	Litellm dev 03 05 2025 p3 (#9023 ) * fix(invoke_handler.py): fix converse streaming - return signature + ensure consistency with anthropic api response * build(model_prices_and_context_window.json): fix anthropic api claude-3-7 max output tokens with beta header this is 128k Resolves https://github.com/BerriAI/litellm/issues/8964 * feat(handler.py): handle new anthropic 'thinking_delta' block on streaming Fixes https://github.com/BerriAI/litellm/issues/8825	2025-03-05 22:31:39 -08:00
Krish Dholakia	a65bfab697	Fix calling claude via invoke route + response_format support for claude on invoke route (#8908 ) * fix(anthropic_claude3_transformation.py): fix amazon anthropic claude 3 tool calling transformation on invoke route move to using anthropic config as base * fix(utils.py): expose anthropic config via providerconfigmanager * fix(llm_http_handler.py): support json mode on async completion calls * fix(invoke_handler/make_call): support json mode for anthropic called via bedrock invoke * fix(anthropic/): handle 'response_format: {"type": "text"}` + migrate amazon claude 3 invoke config to inherit from anthropic config Prevents error when passing in 'response_format: {"type": "text"} * test: fix test * fix(utils.py): fix base invoke provider check * fix(anthropic_claude3_transformation.py): don't pass 'stream' param * fix: fix linting errors * fix(converse_transformation.py): handle response_format type=text for converse	2025-02-28 17:56:26 -08:00
Krish Dholakia	ab7c4d1a0e	Litellm dev bedrock anthropic 3 7 v2 (#8843 ) * feat(bedrock/converse/transformation.py): support claude-3-7-sonnet reasoning_Content transformation Closes https://github.com/BerriAI/litellm/issues/8777 * fix(bedrock/): support returning `reasoning_content` on streaming for claude-3-7 Resolves https://github.com/BerriAI/litellm/issues/8777 * feat(bedrock/): unify converse reasoning content blocks for consistency across anthropic and bedrock * fix(anthropic/chat/transformation.py): handle deepseek-style 'reasoning_content' extraction within transformation.py simpler logic * feat(bedrock/): fix streaming to return blocks in consistent format * fix: fix linting error * test: fix test * feat(factory.py): fix bedrock thinking block translation on tool calling allows passing the thinking blocks back to bedrock for tool calling * fix(types/utils.py): don't exclude provider_specific_fields on model dump ensures consistent responses * fix: fix linting errors * fix(convert_dict_to_response.py): pass reasoning_content on root * fix: test * fix(streaming_handler.py): add helper util for setting model id * fix(streaming_handler.py): fix setting model id on model response stream chunk * fix(streaming_handler.py): fix linting error * fix(streaming_handler.py): fix linting error * fix(types/utils.py): add provider_specific_fields to model stream response * fix(streaming_handler.py): copy provider specific fields and add them to the root of the streaming response * fix(streaming_handler.py): fix check * fix: fix test * fix(types/utils.py): ensure messages content is always openai compatible * fix(types/utils.py): fix delta object to always be openai compatible only introduce new params if variable exists * test: fix bedrock nova tests * test: skip flaky test * test: skip flaky test in ci/cd	2025-02-26 16:05:33 -08:00
Krrish Dholakia	9470f57e86	build: extract <think>..</think> block for amazon deepseek r1 and put in reasoning_content	2025-02-19 21:10:38 -08:00
Krish Dholakia	58141df65d	Litellm dev 02 13 2025 p2 (#8525 ) * fix(azure/chat/gpt_transformation.py): add 'prediction' as a support azure param Closes https://github.com/BerriAI/litellm/issues/8500 * build(model_prices_and_context_window.json): add new 'gemini-2.0-pro-exp-02-05' model * style: cleanup invalid json trailing commma * feat(utils.py): support passing 'tokenizer_config' to register_prompt_template enables passing complete tokenizer config of model to litellm Allows calling deepseek on bedrock with the correct prompt template * fix(utils.py): fix register_prompt_template for custom model names * test(test_prompt_factory.py): fix test * test(test_completion.py): add e2e test for bedrock invoke deepseek ft model * feat(base_invoke_transformation.py): support hf_model_name param for bedrock invoke calls enables proxy admin to set base model for ft bedrock deepseek model * feat(bedrock/invoke): support deepseek_r1 route for bedrock makes it easy to apply the right chat template to that call * feat(constants.py): store deepseek r1 chat template - allow user to get correct response from deepseek r1 without extra work * test(test_completion.py): add e2e mock test for bedrock deepseek * docs(bedrock.md): document new deepseek_r1 route for bedrock allows us to use the right config * fix(exception_mapping_utils.py): catch read operation timeout	2025-02-13 20:28:42 -08:00
Ishaan Jaff	12ac414839	(Feat) - Allow calling Nova models on `/bedrock/invoke/` (#8397 ) * add nova to BEDROCK_INVOKE_PROVIDERS_LITERAL * BedrockInvokeNovaRequest * nova + invoke config * add AmazonInvokeNovaConfig * AmazonInvokeNovaConfig * run transform_request for invoke/nova models * AmazonInvokeNovaConfig * rename invoke tests * fix linting error * TestBedrockInvokeNovaJson * TestBedrockInvokeNovaJson * add converse_chunk_parser * test_nova_invoke_remove_empty_system_messages * test_nova_invoke_streaming_chunk_parsing	2025-02-08 13:03:05 -08:00
Ishaan Jaff	b242c66a3b	(Feat) - Add `/bedrock/invoke` support for all Anthropic models (#8383 ) * use anthropic transformation for bedrock/invoke * use anthropic transforms for bedrock invoke claude * TestBedrockInvokeClaudeJson * add AmazonAnthropicClaudeStreamDecoder * pass bedrock_invoke_provider to make_call * fix _get_base_bedrock_model * fix get_bedrock_route * fix bedrock routing * fixes for bedrock invoke * test_all_model_configs * fix AWSEventStreamDecoder linting * fix code qa * test_bedrock_get_base_model * test_get_model_info_bedrock_models * test_bedrock_base_model_helper * test_bedrock_route_detection	2025-02-07 22:41:11 -08:00
Krish Dholakia	dfbbf0bde8	fix: dictionary changed size during iteration error (#8327 ) (#8341 ) Co-authored-by: Joey Feldberg <joeyfeldberg@users.noreply.github.com> Co-authored-by: Joey Feldberg <12495578+joeyfeldberg@users.noreply.github.com>	2025-02-07 16:20:28 -08:00
Ishaan Jaff	818792228c	(Refactor) - migrate bedrock invoke to `BaseLLMHTTPHandler` class (#8290 ) * initial transform for invoke * invoke transform_response * working - able to make request * working get_complete_url * working - invoke now runs on llm_http_handler * fix unused imports * track litellm overhead ms * working stream request * sign_request transform * sign_request update * use has_async_custom_stream_wrapper property * use get_async_custom_stream_wrapper in base llm http handler * fix make_call in invoke handler * fix invoke with streaming get_async_custom_stream_wrapper * working bedrock async streaming with invoke * fix make call handler for bedrock * test_all_model_configs * fix test_bedrock_custom_prompt_template * sync streaming for bedrock invoke * fix _add_stream_param_to_request_body * test_async_text_completion_bedrock * fix transform_request * fix get_supported_openai_params * fix test supports tool choice * fix test_supports_tool_choice * add unit test coverage for bedrock invoke transform * fix location of transformation files * update import loc * fix bedrock invoke unit tests * fix import for max completion tokens	2025-02-05 18:58:55 -08:00
Ishaan Jaff	795a71ce1e	fix linting error	2025-01-31 21:27:56 -08:00
Ishaan Jaff	9ff27809b2	(Feat) add bedrock/deepseek custom import models (#8132 ) * add support for using llama spec with bedrock * fix get_bedrock_invoke_provider * add support for using bedrock provider in mappings * working request * test_bedrock_custom_deepseek * test_bedrock_custom_deepseek * fix _get_model_id_for_llama_like_model * test_bedrock_custom_deepseek * doc DeepSeek-R1-Distill-Llama-70B * test_bedrock_custom_deepseek	2025-01-31 18:40:44 -08:00
Ishaan Jaff	b6f2e659b9	(Feat) Add x-litellm-overhead-duration-ms and "x-litellm-response-duration-ms" in response from LiteLLM (#7899 ) * add track_llm_api_timing * add track_llm_api_timing * test_litellm_overhead * use ResponseMetadata class for setting hidden params and response overhead * instrument http handler * fix track_llm_api_timing * track_llm_api_timing * emit response overhead on hidden params * fix resp metadata * fix make_sync_openai_embedding_request * test_aaaaatext_completion_endpoint fixes * _get_value_from_hidden_params * set_hidden_params * test_litellm_overhead * test_litellm_overhead * test_litellm_overhead * fix import * test_litellm_overhead_stream * add LiteLLMLoggingObject * use diff folder for testing * use diff folder for overhead testing * test litellm overhead * use typing * clear typing * test_litellm_overhead * fix async_streaming * update_response_metadata * move test file * pply metadata to the response objec	2025-01-21 20:27:55 -08:00
Krish Dholakia	c8aa876785	fix(proxy_server.py): fix get model info when litellm_model_id is set + move model analytics to free (#7886 ) * fix(proxy_server.py): fix get model info when litellm_model_id is set Fixes https://github.com/BerriAI/litellm/issues/7873 * test(test_models.py): add test to ensure get model info on specific deployment has same value as all model info Fixes https://github.com/BerriAI/litellm/issues/7873 * fix(usage.tsx): make model analytics free Fixes @iqballx's feedback * fix(fix(invoke_handler.py):-fix-bedrock-error-chunk-parsing): return correct bedrock status code and error message if chunk in stream Improves bedrock stream error handling * fix(proxy_server.py): fix linting errors * test(test_auth_checks.py): remove redundant test * fix(proxy_server.py): fix linting errors * test: fix flaky test * test: fix test	2025-01-21 08:19:07 -08:00
Krish Dholakia	39dabb2e89	Litellm dev 12 24 2024 p4 (#7407 ) * fix(invoke_handler.py): fix mock response iterator to handle tool calling returns tool call if returned by model response * fix(prometheus.py): add new 'tokens_by_tag' metric on prometheus allows tracking 'token usage' by task * feat(prometheus.py): add input + output token tracking by tag * feat(prometheus.py): add tag based deployment failure tracking allows admin to track failure by use-case	2024-12-24 20:24:06 -08:00
Krish Dholakia	3ac54483a7	Litellm dev 12 24 2024 p3 (#7403 ) * fix(model_prices_and_context_window.json): specify meta llama is a bedrock converse model route Fixes https://github.com/BerriAI/litellm/issues/7385 * test(test_get_model_info.py): enforce all new bedrock chat models added have the bedrock_converse route Prevents https://github.com/BerriAI/litellm/issues/7385 and https://github.com/BerriAI/litellm/discussions/7325 * fix(get_supported_openai_params.py): use vertex gemini config by default for vertex ai route Fixes https://github.com/BerriAI/litellm/issues/7378 * refactor(vertex_ai/gemini/): rename vertexaiconfig to vertexaibaseconfig - make it clear vertexaiconfig = vertexgemini config * build(model_prices_and_context_window.json): add gpt-4o-audio-preview-2024-12-17 Closes https://github.com/BerriAI/litellm/issues/7367 * test: fix test * test: fix o1 tests * fix: handle llm api errors * fix: fix linting errors	2024-12-24 18:07:53 -08:00
Ishaan Jaff	c7f14e936a	(code quality) run ruff rule to ban unused imports (#7313 ) * remove unused imports * fix AmazonConverseConfig * fix test * fix import * ruff check fixes * test fixes * fix testing * fix imports	2024-12-19 12:33:42 -08:00
Krish Dholakia	b82add11ba	LITELLM: Remove `requests` library usage (#7235 ) * fix(generic_api_callback.py): remove requests lib usage * fix(budget_manager.py): remove requests lib usgae * fix(main.py): cleanup requests lib usage * fix(utils.py): remove requests lib usage * fix(argilla.py): fix argilla test * fix(athina.py): replace 'requests' lib usage with litellm module * fix(greenscale.py): replace 'requests' lib usage with httpx * fix: remove unused 'requests' lib import + replace usage in some places * fix(prompt_layer.py): remove 'requests' lib usage from prompt layer * fix(ollama_chat.py): remove 'requests' lib usage * fix(baseten.py): replace 'requests' lib usage * fix(codestral/): replace 'requests' lib usage * fix(predibase/): replace 'requests' lib usage * refactor: cleanup unused 'requests' lib imports * fix(oobabooga.py): cleanup 'requests' lib usage * fix(invoke_handler.py): remove unused 'requests' lib usage * refactor: cleanup unused 'requests' lib import * fix: fix linting errors * refactor(ollama/): move ollama to using base llm http handler removes 'requests' lib dep for ollama integration * fix(ollama_chat.py): fix linting errors * fix(ollama/completion/transformation.py): convert non-jpeg/png image to jpeg/png before passing to ollama	2024-12-17 12:50:04 -08:00
Krish Dholakia	516c2a6a70	Litellm remove circular imports (#7232 ) * fix(utils.py): initial commit to remove circular imports - moves llmproviders to utils.py * fix(router.py): fix 'litellm.EmbeddingResponse' import from router.py ' * refactor: fix litellm.ModelResponse import on pass through endpoints * refactor(litellm_logging.py): fix circular import for custom callbacks literal * fix(factory.py): fix circular imports inside prompt factory * fix(cost_calculator.py): fix circular import for 'litellm.Usage' * fix(proxy_server.py): fix potential circular import with `litellm.Router' * fix(proxy/utils.py): fix potential circular import in `litellm.Router` * fix: remove circular imports in 'auth_checks' and 'guardrails/' * fix(prompt_injection_detection.py): fix router impor t * fix(vertex_passthrough_logging_handler.py): fix potential circular imports in vertex pass through * fix(anthropic_pass_through_logging_handler.py): fix potential circular imports * fix(slack_alerting.py-+-ollama_chat.py): fix modelresponse import * fix(base.py): fix potential circular import * fix(handler.py): fix potential circular ref in codestral + cohere handler's * fix(azure.py): fix potential circular imports * fix(gpt_transformation.py): fix modelresponse import * fix(litellm_logging.py): add logging base class - simplify typing makes it easy for other files to type check the logging obj without introducing circular imports * fix(azure_ai/embed): fix potential circular import on handler.py * fix(databricks/): fix potential circular imports in databricks/ * fix(vertex_ai/): fix potential circular imports on vertex ai embeddings * fix(vertex_ai/image_gen): fix import * fix(watsonx-+-bedrock): cleanup imports * refactor(anthropic-pass-through-+-petals): cleanup imports * refactor(huggingface/): cleanup imports * fix(ollama-+-clarifai): cleanup circular imports * fix(openai_like/): fix impor t * fix(openai_like/): fix embedding handler cleanup imports * refactor(openai.py): cleanup imports * fix(sagemaker/transformation.py): fix import * ci(config.yml): add circular import test to ci/cd	2024-12-14 16:28:34 -08:00
Ishaan Jaff	b5d55688e5	(Refactor) Code Quality improvement - remove `/prompt_templates/` , `base_aws_llm.py` from `/llms` folder (#7164 ) * fix move base_aws_llm * fix import * update enforce llms folder style * move prompt_templates * update prompt_templates location * fix imports * fix imports * fix imports * fix imports * fix checks	2024-12-11 00:02:46 -08:00
Krish Dholakia	6bb934c0ac	fix(key_management_endpoints.py): override metadata field value on up… (#7008 ) * fix(key_management_endpoints.py): override metadata field value on update allow user to override tags * feat(__init__.py): expose new disable_end_user_cost_tracking_prometheus_only metric allow disabling end user cost tracking on prometheus - fixes cardinality issue * fix(litellm_pre_call_utils.py): add key/team level enforced params Fixes https://github.com/BerriAI/litellm/issues/6652 * fix(key_management_endpoints.py): allow user to pass in `enforced_params` as a top level param on /key/generate and /key/update * docs(enterprise.md): add docs on enforcing required params for llm requests * Add support of Galadriel API (#7005) * fix(router.py): robust retry after handling set retry after time to 0 if >0 healthy deployments. handle base case = 1 deployment * test(test_router.py): fix test * feat(bedrock/): add support for 'nova' models also adds explicit 'converse/' route for simpler routing * fix: fix 'supports_pdf_input' return if model supports pdf input on get_model_info * feat(converse_transformation.py): support bedrock pdf input * docs(document_understanding.md): add document understanding to docs * fix(litellm_pre_call_utils.py): fix linting error * fix(init.py): fix passing of bedrock converse models * feat(bedrock/converse): support 'response_format={"type": "json_object"}' * fix(converse_handler.py): fix linting error * fix(base_llm_unit_tests.py): fix test * fix: fix test * test: fix test * test: fix test * test: remove duplicate test --------- Co-authored-by: h4n0 <4738254+h4n0@users.noreply.github.com>	2024-12-03 23:03:50 -08:00
Ishaan Jaff	610974b4fc	(code quality) add ruff check PLR0915 for `too-many-statements` (#6309 ) * ruff add PLR0915 * add noqa for PLR0915 * fix noqa * add # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * add # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915	2024-10-18 15:36:49 +05:30
Ishaan Jaff	4d1b4beb3d	(refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208 ) * use folder for caching * fix importing caching * fix clickhouse pyright * fix linting * fix correctly pass kwargs and args * fix test case for embedding * fix linting * fix embedding caching logic * fix refactor handle utils.py * fix test_embedding_caching_azure_individual_items_reordered	2024-10-14 16:34:01 +05:30
Krish Dholakia	d57be47b0f	Litellm ruff linting enforcement (#5992 ) * ci(config.yml): add a 'check_code_quality' step Addresses https://github.com/BerriAI/litellm/issues/5991 * ci(config.yml): check why circle ci doesn't pick up this test * ci(config.yml): fix to run 'check_code_quality' tests * fix(__init__.py): fix unprotected import * fix(__init__.py): don't remove unused imports * build(ruff.toml): update ruff.toml to ignore unused imports * fix: fix: ruff + pyright - fix linting + type-checking errors * fix: fix linting errors * fix(lago.py): fix module init error * fix: fix linting errors * ci(config.yml): cd into correct dir for checks * fix(proxy_server.py): fix linting error * fix(utils.py): fix bare except causes ruff linting errors * fix: ruff - fix remaining linting errors * fix(clickhouse.py): use standard logging object * fix(__init__.py): fix unprotected import * fix: ruff - fix linting errors * fix: fix linting errors * ci(config.yml): cleanup code qa step (formatting handled in local_testing) * fix(_health_endpoints.py): fix ruff linting errors * ci(config.yml): just use ruff in check_code_quality pipeline for now * build(custom_guardrail.py): include missing file * style(embedding_handler.py): fix ruff check	2024-10-01 19:44:20 -04:00
Ishaan Jaff	cd90807807	fix use converse for all llama3 models (#5729 )	2024-09-19 09:31:52 -07:00
Krish Dholakia	da77706c26	Litellm stable dev (#5711 ) * feat(aws_base_llm.py): prevents recreating boto3 credentials during high traffic Leads to 100ms perf boost in local testing * fix(base_aws_llm.py): fix credential caching check to see if token is set * refactor(bedrock/chat): separate converse api and invoke api + isolate converse api transformation logic Make it easier to see how requests are transformed for /converse * fix: fix imports * fix(bedrock/embed): fix reordering of headers * fix(base_aws_llm.py): fix get credential logic * fix(converse_handler.py): fix ai21 streaming response	2024-09-14 23:22:59 -07:00

29 commits