Ishaan Jaff
4d1b4beb3d
(refactor) caching: use LLMCachingHandler for async_get_cache and set_cache (#6208)
* use folder for caching
* fix importing caching
* fix clickhouse pyright
* fix linting
* fix correctly pass kwargs and args
* fix test case for embedding
* fix linting
* fix embedding caching logic
* fix refactor: handle utils.py
* fix test_embedding_caching_azure_individual_items_reordered
2024-10-14 16:34:01 +05:30
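The refactor above centralizes cache reads and writes behind a single handler. A minimal sketch of that shape, assuming nothing about litellm's actual LLMCachingHandler beyond the method names in the title (the backend and key logic here are illustrative):

```python
# Minimal sketch: one handler owns cache key construction plus async get/set,
# so call sites stop duplicating that logic. Names are illustrative.
import asyncio
from typing import Any, Optional


class InMemoryBackend:
    def __init__(self) -> None:
        self._store: dict[str, Any] = {}

    async def get(self, key: str) -> Optional[Any]:
        return self._store.get(key)

    async def set(self, key: str, value: Any) -> None:
        self._store[key] = value


class CachingHandler:
    def __init__(self, backend: InMemoryBackend) -> None:
        self.backend = backend

    def get_cache_key(self, *args: Any, **kwargs: Any) -> str:
        # Passing args/kwargs through correctly matters here (one of the
        # bullets above fixed exactly that).
        return f"{args}:{sorted(kwargs.items())}"

    async def async_get_cache(self, *args: Any, **kwargs: Any) -> Optional[Any]:
        return await self.backend.get(self.get_cache_key(*args, **kwargs))

    async def async_set_cache(self, value: Any, *args: Any, **kwargs: Any) -> None:
        await self.backend.set(self.get_cache_key(*args, **kwargs), value)


async def main() -> None:
    handler = CachingHandler(InMemoryBackend())
    await handler.async_set_cache("cached response", model="gpt-4o", input="hi")
    print(await handler.async_get_cache(model="gpt-4o", input="hi"))


asyncio.run(main())
```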
Krish Dholakia
2acb0c0675
Litellm Minor Fixes & Improvements (10/12/2024) (#6179)
* build(model_prices_and_context_window.json): add bedrock llama3.2 pricing
* build(model_prices_and_context_window.json): add bedrock cross region inference pricing
* Revert "(perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165 )"
This reverts commit 2a5624af47.
* add azure/gpt-4o-2024-05-13 (#6174)
* LiteLLM Minor Fixes & Improvements (10/10/2024) (#6158)
* refactor(vertex_ai_partner_models/anthropic): refactor anthropic to use partner model logic
* fix(vertex_ai/): support passing custom api base to partner models
Fixes https://github.com/BerriAI/litellm/issues/4317
* fix(proxy_server.py): Fix prometheus premium user check logic
* docs(prometheus.md): update quick start docs
* fix(custom_llm.py): support passing dynamic api key + api base
* fix(realtime_api/main.py): Add request/response logging for realtime api endpoints
Closes https://github.com/BerriAI/litellm/issues/6081
* feat(openai/realtime): add openai realtime api logging
Closes https://github.com/BerriAI/litellm/issues/6081
* fix(realtime_streaming.py): fix linting errors
* fix(realtime_streaming.py): fix linting errors
* fix: fix linting errors
* fix pattern match router
* Add literalai in the sidebar observability category (#6163)
* fix: add literalai in the sidebar
* fix: typo
* update (#6160)
* Feat: Add Langtrace integration (#5341)
* Feat: Add Langtrace integration
* add langtrace service name
* fix timestamps for traces
* add tests
* Discard Callback + use existing otel logger
* cleanup
* remove print statements
* remove callback
* add docs
* docs
* add logging docs
* format logging
* remove emoji and add litellm proxy example
* format logging
* format `logging.md`
* add langtrace docs to logging.md
* sync conflict
* docs fix
* (perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165)
* fix move s3 to use customLogger
* add basic s3 logging test
* add s3 to custom logger compatible
* use batch logger for s3
* s3 set flush interval and batch size
* fix s3 logging
* add notes on s3 logging
* fix s3 logging
* add basic s3 logging test
* fix s3 type errors
* add test for sync logging on s3
* fix: fix to debug log
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Ali Waleed <ali@scale3labs.com>
* docs(custom_llm_server.md): update doc on passing custom params
* fix(pass_through_endpoints.py): don't require headers
Fixes https://github.com/BerriAI/litellm/issues/6128
* feat(utils.py): add support for caching rerank endpoints
Closes https://github.com/BerriAI/litellm/issues/6144
* feat(litellm_logging.py): add response headers for failed requests
Closes https://github.com/BerriAI/litellm/issues/6159
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Ali Waleed <ali@scale3labs.com>
2024-10-12 11:48:34 -07:00
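The s3 bullets in the entry above describe moving per-request log writes to an async batch logger with a configurable flush interval and batch size. A generic sketch of that pattern, with a stub in place of the real S3 upload:

```python
# Generic sketch of batch logging: buffer per-request payloads and flush either
# when the batch is full or on a timer. The upload is a stub, not a real S3 call.
import asyncio


class BatchLogger:
    def __init__(self, flush_interval: float = 10.0, batch_size: int = 100) -> None:
        self.flush_interval = flush_interval  # seconds between timed flushes
        self.batch_size = batch_size          # size-triggered flush threshold
        self.queue: list[dict] = []

    async def log(self, payload: dict) -> None:
        self.queue.append(payload)
        if len(self.queue) >= self.batch_size:
            await self.flush()

    async def flush(self) -> None:
        if not self.queue:
            return
        batch, self.queue = self.queue, []
        await self.upload(batch)

    async def upload(self, batch: list[dict]) -> None:
        print(f"uploading {len(batch)} log objects")  # stand-in for the S3 put

    async def run_periodic_flush(self) -> None:
        while True:  # timer-triggered flush path
            await asyncio.sleep(self.flush_interval)
            await self.flush()


async def main() -> None:
    logger = BatchLogger(batch_size=2)
    await logger.log({"request_id": 1})
    await logger.log({"request_id": 2})  # hits batch_size, triggers a flush


asyncio.run(main())
```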
Ishaan Jaff
2c8bba293f
(bug fix) TTL not being set for embedding caching requests (#6095)
* fix ttl for cache pipeline settings
* add test for caching
* add test for setting ttls on redis caching
2024-10-07 15:53:18 +05:30
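The fix above concerns an expiry not being attached to pipelined cache writes. A sketch of the corrected shape, assuming redis-py's asyncio client (keys and values are made up):

```python
# Sketch of the fix described above: attach the TTL to every SET queued in a
# redis pipeline, instead of writing the keys without an expiry.
import asyncio

import redis.asyncio as redis


async def set_cache_pipeline(
    client: redis.Redis, items: dict[str, str], ttl: int
) -> None:
    pipe = client.pipeline()
    for key, value in items.items():
        # ex=ttl attaches the expiry to each write; omitting it is the kind of
        # bug this commit fixed for embedding cache entries.
        pipe.set(key, value, ex=ttl)
    await pipe.execute()


async def main() -> None:
    client = redis.Redis(host="localhost", port=6379)
    await set_cache_pipeline(client, {"embedding:abc": "[0.1, 0.2]"}, ttl=600)
    print(await client.ttl("embedding:abc"))  # ~600, not -1 (no expiry)
    await client.aclose()  # redis-py >= 5


asyncio.run(main())
```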
Krish Dholakia
fac3b2ee42
Add pyright to ci/cd + Fix remaining type-checking errors (#6082)
* fix: fix type-checking errors
* fix: fix additional type-checking errors
* fix: additional type-checking error fixes
* fix: fix additional type-checking errors
* fix: additional type-check fixes
* fix: fix all type-checking errors + add pyright to ci/cd
* fix: fix incorrect import
* ci(config.yml): use mypy on ci/cd
* fix: fix type-checking errors in utils.py
* fix: fix all type-checking errors on main.py
* fix: fix mypy linting errors
* fix(anthropic/cost_calculator.py): fix linting errors
* fix: fix mypy linting errors
* fix: fix linting errors
2024-10-05 17:04:00 -04:00
Krish Dholakia
5c33d1c9af
Litellm Minor Fixes & Improvements (10/03/2024) (#6049)
* fix(proxy_server.py): remove spendlog fixes from proxy startup logic
Moves https://github.com/BerriAI/litellm/pull/4794 to `/db_scripts` and cleans up some caching-related debug info (easier to trace debug logs)
* fix(langfuse_endpoints.py): Fixes https://github.com/BerriAI/litellm/issues/6041
* fix(azure.py): fix health checks for azure audio transcription models
Fixes https://github.com/BerriAI/litellm/issues/5999
* Feat: Add Literal AI Integration (#5653)
* feat: add Literal AI integration
* update readme
* Update README.md
* fix: address comments
* fix: remove literalai sdk
* fix: use HTTPHandler
* chore: add test
* fix: add asyncio lock
* fix(literal_ai.py): fix linting errors
* fix(literal_ai.py): fix linting errors
* refactor: cleanup
---------
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
2024-10-03 18:02:28 -04:00
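Among the Literal AI bullets above are "use HTTPHandler" and "add asyncio lock". The lock pattern, guarding a shared log buffer against concurrent writers, looks roughly like this sketch (names are illustrative, not the integration's actual code):

```python
# Sketch of serializing concurrent writers to a shared buffer with an
# asyncio.Lock, as an async logging callback typically needs.
import asyncio


class AsyncLogBuffer:
    def __init__(self) -> None:
        self._events: list[dict] = []
        self._lock = asyncio.Lock()

    async def add(self, event: dict) -> None:
        async with self._lock:  # serialize concurrent writers
            self._events.append(event)

    async def flush(self) -> list[dict]:
        async with self._lock:
            batch, self._events = self._events, []  # swap under the lock
        return batch


async def main() -> None:
    buf = AsyncLogBuffer()
    await asyncio.gather(*(buf.add({"i": i}) for i in range(5)))
    print(len(await buf.flush()))  # 5


asyncio.run(main())
```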
Krish Dholakia
d57be47b0f
Litellm ruff linting enforcement (#5992)
* ci(config.yml): add a 'check_code_quality' step
Addresses https://github.com/BerriAI/litellm/issues/5991
* ci(config.yml): check why circle ci doesn't pick up this test
* ci(config.yml): fix to run 'check_code_quality' tests
* fix(__init__.py): fix unprotected import
* fix(__init__.py): don't remove unused imports
* build(ruff.toml): update ruff.toml to ignore unused imports
* fix: ruff + pyright - fix linting + type-checking errors
* fix: fix linting errors
* fix(lago.py): fix module init error
* fix: fix linting errors
* ci(config.yml): cd into correct dir for checks
* fix(proxy_server.py): fix linting error
* fix(utils.py): fix bare except
causes ruff linting errors
* fix: ruff - fix remaining linting errors
* fix(clickhouse.py): use standard logging object
* fix(__init__.py): fix unprotected import
* fix: ruff - fix linting errors
* fix: fix linting errors
* ci(config.yml): cleanup code qa step (formatting handled in local_testing)
* fix(_health_endpoints.py): fix ruff linting errors
* ci(config.yml): just use ruff in check_code_quality pipeline for now
* build(custom_guardrail.py): include missing file
* style(embedding_handler.py): fix ruff check
2024-10-01 19:44:20 -04:00
Krrish Dholakia
575b7911b2
fix(caching.py): cleanup print_stack()
2024-09-28 21:08:15 -07:00
Krrish Dholakia
81d6c5e5a5
fix(router.py): skip setting model_group response headers for now
current implementation increases redis cache calls by 3x
2024-09-28 21:08:15 -07:00
Krrish Dholakia
efc06d4a03
fix(batch_redis_get.py): handle custom namespace
Fix https://github.com/BerriAI/litellm/issues/5917
2024-09-28 21:08:14 -07:00
Ishaan Jaff
7500855654
fix redis async_set_cache_pipeline when an empty list is passed to it (#5962)
2024-09-28 13:32:00 -07:00
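A sketch of the guard this fix describes: return early on an empty batch instead of executing an empty pipeline (the function name mirrors the commit title; the rest is illustrative):

```python
# Guard sketch: an empty batch should never open a pipeline or hit the network.
import asyncio
from typing import Optional

import redis.asyncio as redis


async def async_set_cache_pipeline(
    client: redis.Redis,
    cache_list: list[tuple[str, str]],
    ttl: Optional[int] = None,
) -> None:
    if not cache_list:
        return  # nothing to write; skip the round trip entirely
    pipe = client.pipeline()
    for key, value in cache_list:
        pipe.set(key, value, ex=ttl)
    await pipe.execute()


# Because the empty case returns early, this runs even with no Redis server:
asyncio.run(async_set_cache_pipeline(redis.Redis(), []))
```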
Ishaan Jaff
f4613a100d
[Perf Proxy] parallel request limiter - use one cache update call (#5932)
* fix parallel request limiter - use one cache update call
* ci/cd run again
* run ci/cd again
* use docker username password
* fix config.yml
* fix config
* fix config
* fix config.yml
* ci/cd run again
* use correct typing for batch set cache
* fix async_set_cache_pipeline
* fix only check user id tpm / rpm limits when limits set
* fix test_openai_azure_embedding_with_oidc_and_cf
2024-09-27 17:24:46 -07:00
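The limiter change above collapses several per-counter cache writes into one call. A sketch of that idea with a single redis pipeline updating request and token counters together (key names and window are made up):

```python
# Sketch: update all of a key's limit counters in one pipelined call instead of
# one round trip per counter. Key names and window are illustrative.
import asyncio

import redis.asyncio as redis


async def increment_limits(
    client: redis.Redis, user_id: str, tokens: int, window: int = 60
) -> list:
    pipe = client.pipeline()
    pipe.incrby(f"{user_id}:rpm", 1)       # request counter
    pipe.incrby(f"{user_id}:tpm", tokens)  # token counter
    pipe.expire(f"{user_id}:rpm", window)
    pipe.expire(f"{user_id}:tpm", window)
    return await pipe.execute()  # single network call for all four commands


async def main() -> None:
    client = redis.Redis(host="localhost", port=6379)
    print(await increment_limits(client, "user-123", tokens=250))
    await client.aclose()  # redis-py >= 5


asyncio.run(main())
```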
Ishaan Jaff
7cbcf538c6
[Feat] Improve OTEL Tracking - Require all Redis Cache reads to be logged on OTEL (#5881)
* fix use previous internal usage caching logic
* fix test_dual_cache_uses_redis
* redis track event_metadata in service logging
* show otel error on _get_parent_otel_span_from_kwargs
* track parent otel span on internal usage cache
* update_request_status
* fix internal usage cache
* fix linting
* fix test internal usage cache
* fix linting error
* show event metadata in redis set
* fix test_get_team_redis
* fix test_get_team_redis
* test_proxy_logging_setup
2024-09-25 10:57:08 -07:00
Ishaan Jaff
2000e8cde9
[Perf Fix] Don't always read from Redis by Default (#5877)
* fix use previous internal usage caching logic
* fix test_dual_cache_uses_redis
2024-09-24 21:34:18 -07:00
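A sketch of the read-path change described above, assuming a dual in-memory + Redis cache where the remote read is opt-in; this is illustrative, not litellm's actual DualCache:

```python
# Illustrative dual-cache read: serve from memory; only consult the slower
# Redis layer when the caller opts in.
from typing import Any, Optional


class DualCacheSketch:
    def __init__(self, always_read_redis: bool = False) -> None:
        self.memory: dict[str, Any] = {}
        self.redis: dict[str, Any] = {}  # stand-in for a real Redis client
        self.always_read_redis = always_read_redis

    def get(self, key: str) -> Optional[Any]:
        value = self.memory.get(key)
        if value is not None:
            return value
        # With the old default, every in-memory miss went out to Redis; the
        # fix makes that remote read opt-in.
        if self.always_read_redis:
            value = self.redis.get(key)
            if value is not None:
                self.memory[key] = value  # backfill the fast layer
        return value


cache = DualCacheSketch(always_read_redis=False)
cache.redis["k"] = "v"
print(cache.get("k"))  # None: no remote read unless explicitly enabled
```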
Krish Dholakia
8039b95aaf
LiteLLM Minor Fixes & Improvements (09/21/2024) (#5819)
* fix(router.py): fix error message
* Litellm disable keys (#5814)
* build(schema.prisma): allow blocking/unblocking keys
Fixes https://github.com/BerriAI/litellm/issues/5328
* fix(key_management_endpoints.py): fix pop
* feat(auth_checks.py): allow admin to enable/disable virtual keys
Closes https://github.com/BerriAI/litellm/issues/5328
* docs(vertex.md): add auth section for vertex ai
Addresses - https://github.com/BerriAI/litellm/issues/5768#issuecomment-2365284223
* build(model_prices_and_context_window.json): show which models support prompt_caching
Closes https://github.com/BerriAI/litellm/issues/5776
* fix(router.py): allow setting default priority for requests
* fix(router.py): add 'retry-after' header for concurrent request limit errors
Fixes https://github.com/BerriAI/litellm/issues/5783
* fix(router.py): correctly raise and use retry-after header from azure+openai
Fixes https://github.com/BerriAI/litellm/issues/5783
* fix(user_api_key_auth.py): fix valid token being none
* fix(auth_checks.py): fix model dump for cache management object
* fix(user_api_key_auth.py): pass prisma_client to obj
* test(test_otel.py): update test for new key check
* test: fix test
2024-09-21 18:51:53 -07:00
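Two bullets above add a 'retry-after' header for concurrent-request-limit errors and reuse the header returned by Azure/OpenAI. A small sketch of parsing the seconds form of that header (the helper name is made up):

```python
# Sketch: parse the seconds form of Retry-After from response headers and fall
# back to a default when the header is missing or malformed.
from typing import Mapping


def get_retry_after(headers: Mapping[str, str], default: int = 1) -> int:
    raw = headers.get("retry-after") or headers.get("Retry-After")
    try:
        return max(0, int(raw))
    except (TypeError, ValueError):
        return default  # missing (None -> TypeError) or non-numeric header


print(get_retry_after({"retry-after": "7"}))  # 7
print(get_retry_after({}))                    # 1 (fallback)
```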
Krish Dholakia
98c34a7e27
LiteLLM Minor Fixes and Improvements (11/09/2024) (#5634)
* fix(caching.py): set ttl for async_increment cache
fixes issue where ttl for redis client was not being set on increment_cache
Fixes https://github.com/BerriAI/litellm/issues/5609
* fix(caching.py): fix increment cache w/ ttl for sync increment cache on redis
Fixes https://github.com/BerriAI/litellm/issues/5609
* fix(router.py): support adding retry policy + allowed fails policy via config.yaml
* fix(router.py): don't cooldown single deployments
No point, as there's no other deployment to loadbalance with.
* fix(user_api_key_auth.py): support setting allowed email domains on jwt tokens
Closes https://github.com/BerriAI/litellm/issues/5605
* docs(token_auth.md): add user upsert + allowed email domain to jwt auth docs
* fix(litellm_pre_call_utils.py): fix dynamic key logging when team id is set
Fixes issue where key logging would not be set if team metadata was not none
* fix(secret_managers/main.py): load environment variables correctly
Fixes issue where os.environ/ was not being loaded correctly
* test(test_router.py): fix test
* feat(spend_tracking_utils.py): support logging additional usage params - e.g. prompt caching values for deepseek
* test: fix tests
* test: fix test
* test: fix test
* test: fix test
* test: fix test
2024-09-11 22:36:06 -07:00
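The first two bullets above attach a TTL to increment operations. The standard shape is pairing INCRBY with EXPIRE in one pipeline, sketched here with redis-py's asyncio client (the helper name is illustrative):

```python
# Sketch: pair INCRBY with EXPIRE in one pipeline so counter keys carry a TTL
# instead of living forever (the symptom behind issue #5609).
import asyncio
from typing import Optional

import redis.asyncio as redis


async def async_increment(
    client: redis.Redis, key: str, value: int, ttl: Optional[int] = None
) -> int:
    pipe = client.pipeline()
    pipe.incrby(key, value)
    if ttl is not None:
        pipe.expire(key, ttl)  # the missing piece the fix above adds
    results = await pipe.execute()
    return results[0]  # the post-increment counter value


async def main() -> None:
    client = redis.Redis(host="localhost", port=6379)
    print(await async_increment(client, "spend:key-1", 5, ttl=120))
    await client.aclose()  # redis-py >= 5


asyncio.run(main())
```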
Ishaan Jaff
421b857714
pass llm provider when creating async httpx clients
2024-09-10 11:51:42 -07:00
Ishaan Jaff
d4b9a1307d
rename get_async_httpx_client
2024-09-10 10:38:01 -07:00
Ishaan Jaff
81ee1653af
use correct type hints for audio transcriptions
2024-09-05 09:12:27 -07:00
Krish Dholakia
1e7e538261
LiteLLM Minor fixes + improvements (08/04/2024) (#5505)
* Minor IAM AWS OIDC Improvements (#5246)
* AWS IAM: Temporary tokens are valid across all regions after being issued, so it is wasteful to request one for each region.
* AWS IAM: Include an inline policy, to help reduce misuse of overly permissive IAM roles.
* (test_bedrock_completion.py): Ensure we are testing cross AWS region OIDC flow.
* fix(router.py): log rejected requests
Fixes https://github.com/BerriAI/litellm/issues/5498
* refactor: don't use verbose_logger.exception if exception is raised
User might already have handling for this. But alerting systems in prod will raise this as an unhandled error.
* fix(datadog.py): support setting datadog source as an env var
Fixes https://github.com/BerriAI/litellm/issues/5508
* docs(logging.md): add dd_source to datadog docs
* fix(proxy_server.py): expose `/customer/list` endpoint for showing all customers
* (bedrock): Fix usage with Cloudflare AI Gateway, and proxies in general. (#5509)
* feat(anthropic.py): support 'cache_control' param for content when it is a string
* Revert "(bedrock): Fix usage with Cloudflare AI Gateway, and proxies in gener…" (#5519 )
This reverts commit 3fac0349c2.
* refactor: ci/cd run again
---------
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
2024-09-04 22:16:55 -07:00
Ishaan Jaff
ca5a117544
dual cache: use always_read_redis=True by default
2024-09-04 08:01:55 -07:00
Ishaan Jaff
fd122cb759
fix always read redis
2024-09-02 21:08:32 -07:00
Ishaan Jaff
15296b4fb7
fix allow qdrant api key to be optional
2024-08-30 11:13:23 -07:00
Krrish Dholakia
0eea01dae9
feat(vertex_ai_context_caching.py): check gemini cache if key already exists
2024-08-26 22:19:01 -07:00
Ishaan Jaff
feb354d3bc
fix should_use_cache
2024-08-24 09:37:41 -07:00
Ishaan Jaff
3c1da2e823
feat - allow setting cache mode
2024-08-24 09:03:59 -07:00
Krrish Dholakia
e2d7539690
feat(caching.py): redis cluster support
Closes https://github.com/BerriAI/litellm/issues/4358
2024-08-21 15:01:52 -07:00
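On the client side, redis cluster support mostly means constructing a cluster-aware client from startup nodes instead of a single-host client. A sketch with redis-py's sync cluster client (hosts and ports are placeholders):

```python
# Sketch of pointing a cache at a Redis Cluster with redis-py (>= 4.x).
from redis.cluster import ClusterNode, RedisCluster

startup_nodes = [
    ClusterNode("redis-node-1", 7000),
    ClusterNode("redis-node-2", 7001),
    ClusterNode("redis-node-3", 7002),
]

# The cluster client discovers the full topology from the startup nodes and
# routes each key to the right shard transparently.
client = RedisCluster(startup_nodes=startup_nodes, decode_responses=True)
client.set("litellm:cache:test", "ok", ex=60)
print(client.get("litellm:cache:test"))
```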
Ishaan Jaff
e7ecb2fe3a
fix qdrant litellm on proxy
2024-08-21 12:52:29 -07:00
Ishaan Jaff
c6dfd2d276
fixes for using qdrant with litellm proxy
2024-08-21 12:36:41 -07:00
Ishaan Jaff
428a74be07
fix qdrant url
2024-08-21 12:09:09 -07:00
Ishaan Jaff
7d0196191f
Merge pull request #5018 from haadirakhangi/main
Qdrant Semantic Caching
2024-08-21 08:50:43 -07:00
Haadi Rakhangi
7f1c3f5edf
implemented REST API and added support for cloud and local Qdrant clusters
2024-08-19 20:46:30 +05:30
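A semantic cache backed by Qdrant's REST API, as the entries above describe, boils down to a vector search with a similarity threshold. A sketch using httpx (collection name, threshold, and the toy query vector are assumptions):

```python
# Sketch of a semantic-cache lookup over Qdrant's REST API: search for a close
# neighbor of the prompt embedding and treat a high-similarity hit as cached.
import httpx

QDRANT_URL = "http://localhost:6333"   # local cluster; cloud works the same way
COLLECTION = "litellm-semantic-cache"  # made-up collection name


def semantic_cache_lookup(query_vector: list[float], threshold: float = 0.9):
    resp = httpx.post(
        f"{QDRANT_URL}/collections/{COLLECTION}/points/search",
        json={
            "vector": query_vector,
            "limit": 1,
            "score_threshold": threshold,  # only near-duplicates count as hits
            "with_payload": True,          # payload carries the cached response
        },
    )
    resp.raise_for_status()
    hits = resp.json()["result"]
    return hits[0]["payload"] if hits else None


print(semantic_cache_lookup([0.1, 0.2, 0.3]))
```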
Krrish Dholakia
61f4b71ef7
refactor: replace .error() with .exception() logging for better debugging on sentry
2024-08-16 09:22:47 -07:00
prd-tuong-nguyen
3445174ebe
feat: hash prompt when caching
2024-08-08 16:19:14 +07:00
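Hashing the prompt keeps cache keys a fixed, storage-safe length no matter how large the input is. A sketch of that idea (key prefix and field choices are illustrative):

```python
# Sketch: a SHA-256 digest of the deterministically serialized prompt makes a
# compact cache key, so identical prompts always hash identically.
import hashlib
import json


def get_prompt_cache_key(model: str, messages: list[dict]) -> str:
    serialized = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    digest = hashlib.sha256(serialized.encode("utf-8")).hexdigest()
    return f"litellm:cache:{digest}"


print(get_prompt_cache_key("gpt-4o", [{"role": "user", "content": "hello"}]))
```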
Ishaan Jaff
467c506e33
caching use file_checksum
2024-08-06 13:03:14 -07:00
Krrish Dholakia
a9fdfb5a99
fix(__init__.py): rename feature_flag
2024-08-05 11:23:20 -07:00
Krrish Dholakia
3c4c78a71f
feat(caching.py): enable caching on provider-specific optional params
Closes https://github.com/BerriAI/litellm/issues/5049
2024-08-05 11:18:59 -07:00
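The point of caching on provider-specific optional params is that two calls differing only in such a param must not collide on one cached response, so the params belong in the cache key. A sketch (the param names are just examples):

```python
# Sketch: fold provider-specific optional params into the cache key so that
# e.g. two calls differing only in 'top_k' get distinct entries.
import hashlib
import json
from typing import Any


def cache_key_with_optional_params(model: str, prompt: str, **optional_params: Any) -> str:
    payload = {"model": model, "prompt": prompt, **optional_params}
    serialized = json.dumps(payload, sort_keys=True, default=str)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


# Different provider-specific settings now yield different keys:
print(cache_key_with_optional_params("claude-3", "hi", top_k=40))
print(cache_key_with_optional_params("claude-3", "hi", top_k=10))
```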
Ishaan Jaff
b6b19dc128
use file name when getting cache key
2024-08-02 14:52:08 -07:00
Haadi Rakhangi
851db5ecea
qdrant semantic caching added
2024-08-02 21:07:19 +05:30
Krrish Dholakia
31445ab20a
fix(caching.py): support /completion caching by default
updates supported call types in redis cache to cover text_completion caching
2024-07-29 08:19:30 -07:00
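From the user's side, this change surfaces as /completion (text_completion) calls being cacheable. A sketch of opting call types into caching via litellm's Cache; the exact supported_call_types values shown are an assumption, so check the litellm caching docs:

```python
# Sketch of enabling caching for text_completion-style calls alongside chat
# completions. The call-type strings below are assumptions from the commit
# message, not a verified list.
import litellm
from litellm import Cache

litellm.cache = Cache(
    type="local",  # in-memory; "redis" is the other common choice
    supported_call_types=[
        "completion",
        "acompletion",
        "text_completion",   # /completions-style calls now cacheable
        "atext_completion",
    ],
)
```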
Ishaan Jaff
19fb5cc11c
use common helpers for writing to otel
2024-07-27 11:40:39 -07:00
Ishaan Jaff
2a89486948
move _get_parent_otel_span_from_kwargs to otel.py
2024-07-27 11:12:13 -07:00
Ishaan Jaff
677db38f8b
add docstring to explain what delete cache does
2024-07-13 12:25:31 -07:00
Ishaan Jaff
0099bf7859
de-ref unused cache items
2024-07-12 16:38:36 -07:00
Krrish Dholakia
3f83e8a8d4
fix(caching.py): fix async redis health check
2024-07-06 09:14:29 -07:00
Ishaan Jaff
511dd18e4b
remove debug print statement
2024-06-27 20:58:29 -07:00
Ishaan Jaff
e899359427
ci/cd add debugging for cache eviction
2024-06-25 08:14:09 -07:00
Ishaan Jaff
05fe43f495
fix default ttl for InMemoryCache
2024-06-24 21:21:38 -07:00
Ishaan Jaff
fa57d2e823
feat use custom eviction policy
2024-06-24 20:28:03 -07:00
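The two InMemoryCache entries above concern a default TTL and a configurable eviction policy. A compact illustrative stand-in (not litellm's actual InMemoryCache) combining both:

```python
# Sketch: in-memory cache with a default TTL applied when the caller passes
# none, plus a simple eviction policy that bounds memory.
import time
from typing import Any, Optional


class InMemoryCacheSketch:
    def __init__(self, default_ttl: int = 600, max_size: int = 1024) -> None:
        self.default_ttl = default_ttl
        self.max_size = max_size
        self.store: dict[str, tuple[Any, float]] = {}  # key -> (value, expiry)

    def set(self, key: str, value: Any, ttl: Optional[int] = None) -> None:
        if len(self.store) >= self.max_size:
            self.evict()
        expiry = time.time() + (ttl if ttl is not None else self.default_ttl)
        self.store[key] = (value, expiry)

    def get(self, key: str) -> Optional[Any]:
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expiry = entry
        if time.time() > expiry:
            del self.store[key]  # lazily expire on read
            return None
        return value

    def evict(self) -> None:
        # Example policy: drop expired entries first, then the oldest-expiring.
        now = time.time()
        for k in [k for k, (_, exp) in self.store.items() if exp < now]:
            del self.store[k]
        if len(self.store) >= self.max_size:
            oldest = min(self.store, key=lambda k: self.store[k][1])
            del self.store[oldest]


cache = InMemoryCacheSketch(default_ttl=2, max_size=2)
cache.set("a", 1)
print(cache.get("a"))  # 1
```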
Ishaan Jaff
effc7579ac
fix install on python 3.8
2024-06-24 17:27:14 -07:00
Ishaan Jaff
b13a93d9bc
cleanup InMemoryCache
2024-06-24 17:24:59 -07:00