litellm-mirror

mirror of https://github.com/BerriAI/litellm.git synced 2025-04-26 03:04:13 +00:00

Krish Dholakia 36308a31be Gemini-2.5-flash - support reasoning cost calc + return reasoning content (#10141)	2025-04-19 09:20:52 -07:00
* build(model_prices_and_context_window.json): add vertex ai gemini-2.5-flash pricing
* build(model_prices_and_context_window.json): add gemini reasoning token pricing
* fix(vertex_and_google_ai_studio_gemini.py): support counting thinking tokens for gemini
  allows accurate cost calc
* fix(utils.py): add reasoning token cost calc to generic cost calc
  ensures gemini-2.5-flash cost calculation is accurate
* build(model_prices_and_context_window.json): mark gemini-2.5-flash as 'supports_reasoning'
* feat(gemini/): support 'thinking' + 'reasoning_effort' params + new unit tests
  allow controlling thinking effort for gemini-2.5-flash models
* test: update unit testing
* feat(vertex_and_google_ai_studio_gemini.py): return reasoning content if given in gemini response
* test: update model name
* fix: fix ruff check
* test(test_spend_management_endpoints.py): update tests to be less sensitive to new keys / updates to usage object
* fix(vertex_and_google_ai_studio_gemini.py): fix translation
audio_utils	(Refactor) - Re use litellm.completion/litellm.embedding etc for health checks (#7455)	2024-12-28 18:38:54 -08:00
llm_cost_calc	Gemini-2.5-flash - support reasoning cost calc + return reasoning content (#10141)	2025-04-19 09:20:52 -07:00
llm_response_utils	fix(cost_calculator.py): handle custom pricing at deployment level fo… (#9855)	2025-04-09 22:13:10 -07:00
prompt_templates	fix(factory.py): correct indentation for message index increment in ollama, This fixes bug #9822 (#9943)	2025-04-12 09:50:40 -07:00
specialty_caches	Fix team-based logging to langfuse + allow custom tokenizer on `/token_counter` endpoint (#7493)	2024-12-31 23:18:41 -08:00
tokenizers	Code Quality Improvement - remove `tokenizers/` from /llms (#7163)	2024-12-10 23:50:15 -08:00
asyncify.py	(core sdk fix) - fix fallbacks stuck in infinite loop (#7751)	2025-01-13 19:34:34 -08:00
core_helpers.py	build(pyproject.toml): add new dev dependencies - for type checking (#9631)	2025-03-29 11:02:13 -07:00
credential_accessor.py	fix(router.py): support reusable credentials via passthrough router (#9758)	2025-04-04 18:40:14 -07:00
dd_tracing.py	(Bug fix) - don't log messages in `model_parameters` in StandardLoggingPayload (#8932)	2025-03-01 13:39:45 -08:00
default_encoding.py	build(pyproject.toml): add new dev dependencies - for type checking (#9631)	2025-03-29 11:02:13 -07:00
dot_notation_indexing.py	feat(handle_jwt.py): initial commit adding custom RBAC support on jwt… (#8037)	2025-01-28 16:27:06 -08:00
duration_parser.py	(Bug Fix + Better Observability) - BudgetResetJob: (#8562)	2025-02-15 16:13:08 -08:00
exception_mapping_utils.py	exception_type	2025-03-13 20:09:32 -07:00
fallback_utils.py	LiteLLM Minor Fixes & Improvements (2024/16/01) (#7826)	2025-01-17 20:59:21 -08:00
get_litellm_params.py	VertexAI non-jsonl file storage support (#9781)	2025-04-09 14:01:48 -07:00
get_llm_provider_logic.py	Squashed commit of the following: (#9709)	2025-04-02 21:24:54 -07:00
get_model_cost_map.py	Litellm UI qa 04 12 2025 p1 (#9955)	2025-04-12 19:30:48 -07:00
get_supported_openai_params.py	fix(cost_calculator.py): handle custom pricing at deployment level fo… (#9855)	2025-04-09 22:13:10 -07:00
health_check_utils.py	(Refactor) - Re use litellm.completion/litellm.embedding etc for health checks (#7455)	2024-12-28 18:38:54 -08:00
initialize_dynamic_callback_params.py	Fix team-based logging to langfuse + allow custom tokenizer on `/token_counter` endpoint (#7493)	2024-12-31 23:18:41 -08:00
json_validation_rule.py	feat(vertex_ai_anthropic.py): support response_schema for vertex ai anthropic calls	2024-07-18 16:57:38 -07:00
litellm_logging.py	[UI] Allow setting prompt `cache_control_injection_points` (#10000)	2025-04-14 21:17:42 -07:00
llm_request_utils.py	Revert "test_completion_mistral_api_mistral_large_function_call"	2025-01-17 07:20:46 -08:00
logging_callback_manager.py	get_custom_loggers_for_type	2025-03-19 16:36:41 -07:00
logging_utils.py	fix(streaming_handler.py): emit deep copy of completed chunk	2025-03-17 17:26:21 -07:00
mock_functions.py	Ensure base_model cost tracking works across all endpoints (#7989)	2025-01-24 21:05:26 -08:00
model_param_helper.py	build(pyproject.toml): add new dev dependencies - for type checking (#9631)	2025-03-29 11:02:13 -07:00
README.md	(QOL improvement) Provider budget routing - allow using 1s, 1d, 1mo, 2mo etc (#6885)	2024-11-23 16:59:46 -08:00
realtime_streaming.py	Realtime API Cost tracking (#9795)	2025-04-07 16:43:12 -07:00
redact_messages.py	build(pyproject.toml): add new dev dependencies - for type checking (#9631)	2025-03-29 11:02:13 -07:00
response_header_helpers.py	fix(utils.py): guarantee openai-compatible headers always exist in response	2024-09-28 21:08:15 -07:00
rules.py	Litellm dev 11 07 2024 (#6649)	2024-11-08 19:34:22 +05:30
safe_json_dumps.py	Add recursion depth to convert_anyof_null_to_nullable, constants.py. Fix recursive_detector.py raise error state	2025-03-28 13:11:19 -07:00
sensitive_data_masker.py	build(pyproject.toml): add new dev dependencies - for type checking (#9631)	2025-03-29 11:02:13 -07:00
streaming_chunk_builder_utils.py	fix(stream_chunk_builder_utils.py): don't set index on modelresponse (#10063)	2025-04-16 10:11:47 -07:00
streaming_handler.py	Add inference providers support for Hugging Face (#8258) (#9738) (#9773)	2025-04-05 10:50:15 -07:00
thread_pool_executor.py	(Fixes) OpenAI Streaming Token Counting + Fixes usage track when `litellm.turn_off_message_logging=True` (#8156)	2025-01-31 15:06:37 -08:00
token_counter.py	Squashed commit of the following: (#9709)	2025-04-02 21:24:54 -07:00

README.md

Folder Contents

This folder contains general-purpose utilities that are used in multiple places in the codebase.

Core files:

streaming_handler.py: the core streaming logic and streaming-related helper utilities.
core_helpers.py: code used in types/, e.g. map_finish_reason.
exception_mapping_utils.py: utilities for mapping exceptions to OpenAI-compatible error types.
default_encoding.py: code for loading the default encoding (tiktoken).
get_llm_provider_logic.py: code for inferring the LLM provider from a given model name.
duration_parser.py: code for parsing durations, e.g. "1d", "1mo", "10s".
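
To illustrate the kind of input the duration parser handles, here is a minimal sketch of turning strings such as "10s", "1d", or "1mo" into seconds. It is not the code in duration_parser.py: the function name parse_duration_to_seconds, the unit table, and the choice to treat one month as 30 days are all assumptions made for this example, and the real module may handle months and other units differently.

```python
import re

# Hypothetical unit table for this sketch. Treating "mo" as a fixed
# 30 days is a simplifying assumption, not litellm's documented behavior.
_UNIT_SECONDS = {
    "s": 1,
    "m": 60,
    "h": 3600,
    "d": 86400,
    "mo": 30 * 86400,
}

# A whole number followed by exactly one known unit suffix.
_DURATION_RE = re.compile(r"^(\d+)(s|m|h|d|mo)$")


def parse_duration_to_seconds(duration: str) -> int:
    """Convert a duration string such as "10s" or "1mo" to seconds."""
    match = _DURATION_RE.match(duration.strip().lower())
    if match is None:
        raise ValueError(f"Unsupported duration string: {duration!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_SECONDS[unit]


if __name__ == "__main__":
    for text in ("10s", "1d", "1mo"):
        print(text, parse_duration_to_seconds(text))
```

Rejecting unknown formats with a ValueError, rather than returning a default, keeps a mistyped budget window from being silently read as some other duration.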