Krish Dholakia
7338b24a74
refactor(redis_cache.py): use a default cache value when writing to r… ( #6358 )
...
* refactor(redis_cache.py): use a default cache value when writing to redis
prevent redis from blowing up in high traffic
* refactor(redis_cache.py): refactor all cache writes to use self.get_ttl
ensures default ttl always used when writing to redis
Prevents redis db from blowing up in prod
2024-10-21 16:42:12 -07:00
Krish Dholakia
c58d542282
Litellm openai audio streaming ( #6325 )
...
* refactor(main.py): streaming_chunk_builder
use <100 lines of code
refactor each component into a separate function - easier to maintain + test
* fix(utils.py): handle choices being None
openai pydantic schema updated
* fix(main.py): fix linting error
* feat(streaming_chunk_builder_utils.py): update stream chunk builder to support rebuilding audio chunks from openai
* test(test_custom_callback_input.py): test message redaction works for audio output
* fix(streaming_chunk_builder_utils.py): return anthropic token usage info directly
* fix(stream_chunk_builder_utils.py): run validation check before entering chunk processor
* fix(main.py): fix import
2024-10-19 16:16:51 -07:00
Ishaan Jaff
979e8ea526
(refactor) get_cache_key
to be under 100 LOC function ( #6327 )
...
* refactor - use helpers for name space and hashing
* use openai to get the relevant supported params
* use helpers for getting cache key
* fix test caching
* use get/set helpers for preset cache keys
* make get_cache_key under 100 LOC
* fix _get_model_param_value
* fix _get_caching_group
* fix linting error
* add unit testing for get cache key
* test_generate_streaming_content
2024-10-19 15:21:11 +05:30
Krish Dholakia
38a9a106d2
LiteLLM Minor Fixes & Improvements (10/16/2024) ( #6265 )
...
* fix(caching_handler.py): handle positional arguments in add cache logic
Fixes https://github.com/BerriAI/litellm/issues/6264
* feat(litellm_pre_call_utils.py): allow forwarding openai org id to backend client
https://github.com/BerriAI/litellm/issues/6237
* docs(configs.md): add 'forward_openai_org_id' to docs
* fix(proxy_server.py): return model info if user_model is set
Fixes https://github.com/BerriAI/litellm/issues/6233
* fix(hosted_vllm/chat/transformation.py): don't set tools unless non-none
* fix(openai.py): improve debug log for openai 'str' error
Addresses https://github.com/BerriAI/litellm/issues/6272
* fix(proxy_server.py): fix linting error
* fix(proxy_server.py): fix linting errors
* test: skip WIP test
* docs(openai.md): add docs on passing openai org id from client to openai
2024-10-16 22:16:23 -07:00
Ishaan Jaff
97ba4eea7d
(refactor) sync caching - use LLMCachingHandler
class for get_cache ( #6249 )
...
* caching - use _sync_set_cache
* add sync _sync_add_streaming_response_to_cache
* use caching class for cache storage
* fix use _sync_get_cache
* fix circular import
* use _update_litellm_logging_obj_environment
* use one helper for _process_async_embedding_cached_response
* fix _is_call_type_supported_by_cache
* fix checking cache
* fix sync get cache
* fix use _combine_cached_embedding_response_with_api_result
* fix _update_litellm_logging_obj_environment
* adjust test_redis_cache_acompletion_stream_bedrock
2024-10-16 12:33:49 +05:30
Ishaan Jaff
cda0a993e2
fix importing Cache from litellm ( #6219 )
2024-10-15 08:47:23 +05:30
Ishaan Jaff
4d1b4beb3d
(refactor) caching use LLMCachingHandler for async_get_cache and set_cache ( #6208 )
...
* use folder for caching
* fix importing caching
* fix clickhouse pyright
* fix linting
* fix correctly pass kwargs and args
* fix test case for embedding
* fix linting
* fix embedding caching logic
* fix refactor handle utils.py
* fix test_embedding_caching_azure_individual_items_reordered
2024-10-14 16:34:01 +05:30
Krish Dholakia
2acb0c0675
Litellm Minor Fixes & Improvements (10/12/2024) ( #6179 )
...
* build(model_prices_and_context_window.json): add bedrock llama3.2 pricing
* build(model_prices_and_context_window.json): add bedrock cross region inference pricing
* Revert "(perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165 )"
This reverts commit 2a5624af47
.
* add azure/gpt-4o-2024-05-13 (#6174 )
* LiteLLM Minor Fixes & Improvements (10/10/2024) (#6158 )
* refactor(vertex_ai_partner_models/anthropic): refactor anthropic to use partner model logic
* fix(vertex_ai/): support passing custom api base to partner models
Fixes https://github.com/BerriAI/litellm/issues/4317
* fix(proxy_server.py): Fix prometheus premium user check logic
* docs(prometheus.md): update quick start docs
* fix(custom_llm.py): support passing dynamic api key + api base
* fix(realtime_api/main.py): Add request/response logging for realtime api endpoints
Closes https://github.com/BerriAI/litellm/issues/6081
* feat(openai/realtime): add openai realtime api logging
Closes https://github.com/BerriAI/litellm/issues/6081
* fix(realtime_streaming.py): fix linting errors
* fix(realtime_streaming.py): fix linting errors
* fix: fix linting errors
* fix pattern match router
* Add literalai in the sidebar observability category (#6163 )
* fix: add literalai in the sidebar
* fix: typo
* update (#6160 )
* Feat: Add Langtrace integration (#5341 )
* Feat: Add Langtrace integration
* add langtrace service name
* fix timestamps for traces
* add tests
* Discard Callback + use existing otel logger
* cleanup
* remove print statments
* remove callback
* add docs
* docs
* add logging docs
* format logging
* remove emoji and add litellm proxy example
* format logging
* format `logging.md`
* add langtrace docs to logging.md
* sync conflict
* docs fix
* (perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165 )
* fix move s3 to use customLogger
* add basic s3 logging test
* add s3 to custom logger compatible
* use batch logger for s3
* s3 set flush interval and batch size
* fix s3 logging
* add notes on s3 logging
* fix s3 logging
* add basic s3 logging test
* fix s3 type errors
* add test for sync logging on s3
* fix: fix to debug log
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Ali Waleed <ali@scale3labs.com>
* docs(custom_llm_server.md): update doc on passing custom params
* fix(pass_through_endpoints.py): don't require headers
Fixes https://github.com/BerriAI/litellm/issues/6128
* feat(utils.py): add support for caching rerank endpoints
Closes https://github.com/BerriAI/litellm/issues/6144
* feat(litellm_logging.py'): add response headers for failed requests
Closes https://github.com/BerriAI/litellm/issues/6159
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Ali Waleed <ali@scale3labs.com>
2024-10-12 11:48:34 -07:00
Krish Dholakia
6005450c8f
LiteLLM Minor Fixes & Improvements (10/09/2024) ( #6139 )
...
* fix(utils.py): don't return 'none' response headers
Fixes https://github.com/BerriAI/litellm/issues/6123
* fix(vertex_and_google_ai_studio_gemini.py): support parsing out additional properties and strict value for tool calls
Fixes https://github.com/BerriAI/litellm/issues/6136
* fix(cost_calculator.py): set default character value to none
Fixes https://github.com/BerriAI/litellm/issues/6133#issuecomment-2403290196
* fix(google.py): fix cost per token / cost per char conversion
Fixes https://github.com/BerriAI/litellm/issues/6133#issuecomment-2403370287
* build(model_prices_and_context_window.json): update gemini pricing
Fixes https://github.com/BerriAI/litellm/issues/6133
* build(model_prices_and_context_window.json): update gemini pricing
* fix(litellm_logging.py): fix streaming caching logging when 'turn_off_message_logging' enabled
Stores unredacted response in cache
* build(model_prices_and_context_window.json): update gemini-1.5-flash pricing
* fix(cost_calculator.py): fix default prompt_character count logic
Fixes error in gemini cost calculation
* fix(cost_calculator.py): fix cost calc for tts models
2024-10-10 00:42:11 -07:00
Ishaan Jaff
2c8bba293f
(bug fix) TTL not being set for embedding caching requests ( #6095 )
...
* fix ttl for cache pipeline settings
* add test for caching
* add test for setting ttls on redis caching
2024-10-07 15:53:18 +05:30
Krrish Dholakia
3560f0ef2c
refactor: move all testing to top-level of repo
...
Closes https://github.com/BerriAI/litellm/issues/486
2024-09-28 21:08:14 -07:00