(feat) add cost tracking for OpenAI prompt caching (#6055)

* add cache_read_input_token_cost for prompt caching models

* add prompt caching for latest models

* add openai cost calculator

* add openai prompt caching test

* fix lint check

* add note on how usage._cache_read_input_tokens is used

* fix cost calc whisper openai

* use output_cost_per_second

* add input_cost_per_second
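
The bullets above describe pricing cached prompt tokens at a discounted cache_read_input_token_cost while the remaining prompt tokens use the normal input rate. A minimal sketch of that idea follows; the function name, example prices, and argument names are illustrative assumptions, not litellm's actual cost calculator.

def estimate_openai_cost(
    prompt_tokens: int,
    cached_tokens: int,
    completion_tokens: int,
    input_cost_per_token: float,
    output_cost_per_token: float,
    cache_read_input_token_cost: float,
) -> float:
    # Hypothetical sketch: cached prompt tokens are billed at the
    # discounted cache-read rate, the rest of the prompt at the normal
    # input rate, and completion tokens at the output rate.
    non_cached_prompt_tokens = prompt_tokens - cached_tokens
    cost = non_cached_prompt_tokens * input_cost_per_token
    cost += cached_tokens * cache_read_input_token_cost
    cost += completion_tokens * output_cost_per_token
    return cost

# Example: a 10k-token prompt where 8k tokens are served from the cache
# (per-token prices below are made up for illustration).
print(estimate_openai_cost(
    prompt_tokens=10_000,
    cached_tokens=8_000,
    completion_tokens=500,
    input_cost_per_token=2.5e-6,
    output_cost_per_token=10e-6,
    cache_read_input_token_cost=1.25e-6,
))
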
Author: Ishaan Jaff, 2024-10-05 14:20:15 +05:30, committed by GitHub
parent f6adb9f8f2
commit dee18cbf31
5 changed files with 202 additions and 32 deletions


@@ -5092,6 +5092,7 @@ def get_model_info(model: str, custom_llm_provider: Optional[str] = None) -> Mod
output_cost_per_character_above_128k_tokens=_model_info.get(
"output_cost_per_character_above_128k_tokens", None
),
output_cost_per_second=_model_info.get("output_cost_per_second", None),
output_vector_size=_model_info.get("output_vector_size", None),
litellm_provider=_model_info.get(
"litellm_provider", custom_llm_provider
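
The commit also mentions switching whisper cost calculation to output_cost_per_second and input_cost_per_second, which the hunk above exposes via get_model_info. A hedged sketch of how duration-based pricing could use those fields; the helper name and call pattern are assumptions, not litellm's actual implementation.

def estimate_audio_cost(
    duration_seconds: float,
    input_cost_per_second: float = 0.0,
    output_cost_per_second: float = 0.0,
) -> float:
    # Audio models such as whisper are billed by duration rather than
    # by tokens, so the per-second rates read from model info are
    # multiplied by the audio length.
    return duration_seconds * (input_cost_per_second + output_cost_per_second)
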