Ishaan Jaff
4d253e473a
[Feat] Improve OTEL Tracking - Require all Redis Cache reads to be logged on OTEL (#5881)
...
* fix use previous internal usage caching logic
* fix test_dual_cache_uses_redis
* redis track event_metadata in service logging
* show otel error on _get_parent_otel_span_from_kwargs
* track parent otel span on internal usage cache
* update_request_status
* fix internal usage cache
* fix linting
* fix test internal usage cache
* fix linting error
* show event metadata in redis set
* fix test_get_team_redis
* fix test_get_team_redis
* test_proxy_logging_setup
2024-09-25 10:57:08 -07:00
Krish Dholakia
7107375ec3
LiteLLM Minor Fixes and Improvements (09/14/2024) (#5697)
...
* fix(health_check.py): hide sensitive keys from health check debug information
* fix(route_llm_request.py): fix proxy model not found error message to indicate how to resolve issue
* fix(vertex_llm_base.py): fix exception message to not log credentials
2024-09-14 10:32:39 -07:00
Ishaan Jaff
bfb5136489
refactor vertex endpoints to pass through all routes
2024-08-21 17:08:42 -07:00
Ishaan Jaff
94e74b9ede
only write model tpm/rpm tracking when user set it
2024-08-18 09:58:09 -07:00
Ishaan Jaff
8578301116
fix async_pre_call_hook in parallel request limiter
2024-08-17 12:42:28 -07:00
Ishaan Jaff
db8f789318
Merge pull request #5259 from BerriAI/litellm_return_remaining_tokens_in_header
...
[Feat] return `x-litellm-key-remaining-requests-{model}`: 1, `x-litellm-key-remaining-tokens-{model}: None` in response headers
2024-08-17 12:41:16 -07:00
Ishaan Jaff
9f6630912d
feat return remaining tokens for model for api key
2024-08-17 12:35:10 -07:00
Ishaan Jaff
a62277a6aa
feat - use common helper for getting model group
2024-08-17 10:46:04 -07:00
Ishaan Jaff
03196742d2
add litellm-key-remaining-tokens on prometheus
2024-08-17 10:02:20 -07:00
Ishaan Jaff
8ae626b31f
feat add settings for rpm/tpm limits for a model
2024-08-17 09:16:01 -07:00
Ishaan Jaff
824ea32452
track rpm/tpm usage per key+model
2024-08-16 18:28:58 -07:00
Krrish Dholakia
2874b94fb1
refactor: replace .error() with .exception() logging for better debugging on sentry
2024-08-16 09:22:47 -07:00
Krrish Dholakia
e6bc7e938a
fix(utils.py): handle scenario where model="azure/*" and custom_llm_provider="azure"
...
Fixes https://github.com/BerriAI/litellm/issues/4912
2024-08-02 17:48:53 -07:00
Ishaan Jaff
bda2ac1af5
fix raise better error when crossing tpm / rpm limits
2024-07-26 17:35:08 -07:00
Krrish Dholakia
af2055c2b7
feat(aporio_ai.py): support aporio ai prompt injection for chat completion requests
...
Closes https://github.com/BerriAI/litellm/issues/2950
2024-07-17 16:38:47 -07:00
Krrish Dholakia
17635450cd
feat(proxy_server.py): return 'retry-after' param for rate limited requests
...
Closes https://github.com/BerriAI/litellm/issues/4695
2024-07-13 17:15:20 -07:00
Krrish Dholakia
1d6643df22
feat(pass_through_endpoint.py): support enforcing key rpm limits on pass through endpoints
...
Closes https://github.com/BerriAI/litellm/issues/4698
2024-07-13 13:29:44 -07:00
Krrish Dholakia
77328e4a28
fix(parallel_request_limiter.py): use redis cache, if available for rate limiting across instances
...
Fixes https://github.com/BerriAI/litellm/issues/4148
2024-06-12 10:35:48 -07:00
Krrish Dholakia
56fd0c60d1
fix(parallel_request_limiter.py): fix user+team tpm/rpm limit check
...
Closes https://github.com/BerriAI/litellm/issues/3788
2024-05-27 08:48:23 -07:00
Ishaan Jaff
eb58440ebf
feat - add end user rate limiting
2024-05-22 14:01:57 -07:00
Krrish Dholakia
3f339cb694
fix(parallel_request_limiter.py): fix max parallel request limiter on retries
2024-05-15 20:16:11 -07:00
Krrish Dholakia
737bb3e444
fix(proxy_server.py): fix tpm/rpm limiting for jwt auth
...
fixes tpm/rpm limiting for jwt auth and implements unit tests for jwt auth
2024-03-28 21:19:34 -07:00
Krrish Dholakia
d2f47ee45b
fix(parallel_request_limiter.py): handle metadata being none
2024-03-14 10:02:41 -07:00
Krrish Dholakia
c963e2761b
feat(proxy_server.py): retry if virtual key is rate limited
...
currently for chat completions
2024-03-05 19:00:03 -08:00
Krrish Dholakia
f72b84f6e0
fix(parallel_request_limiter.py): handle none scenario
2024-02-26 20:09:06 -08:00
Krrish Dholakia
7fff5119de
fix(parallel_request_limiter.py): fix team rate limit enforcement
2024-02-26 18:06:13 -08:00
Krrish Dholakia
5213fd2e1e
feat(parallel_request_limiter.py): enforce team based tpm / rpm limits
2024-02-26 16:20:41 -08:00
ishaan-jaff
5ec69a0ca5
(fix) failing parallel_Request_limiter test
2024-02-22 19:16:22 -08:00
ishaan-jaff
b728ded300
(fix) don't double check curr date and time
2024-02-22 18:50:02 -08:00
ishaan-jaff
74d66d5ac5
(feat) tpm/rpm limit by User
2024-02-22 18:44:03 -08:00
Krrish Dholakia
07aa05bf17
fix(test_parallel_request_limiter.py): use mock responses for streaming
2024-02-08 21:45:38 -08:00
ishaan-jaff
c8b2f0fd5d
(fix) parallel_request_limiter debug
2024-02-06 12:43:28 -08:00
Krrish Dholakia
dbf2b0b2c8
fix(utils.py): override default success callbacks with dynamic callbacks if set
2024-02-02 06:21:43 -08:00
Krrish Dholakia
c91ab81fde
fix(test_parallel_request_limiter): increase time limit for waiting for success logging event to happen
2024-01-30 13:26:17 -08:00
Krrish Dholakia
e957f41ab7
fix(utils.py): add metadata to logging obj on setup, if exists
2024-01-19 17:29:47 -08:00
Krrish Dholakia
f73a4ae7c2
fix(parallel_request_limiter.py): handle tpm/rpm limits being null
2024-01-19 10:22:27 -08:00
Krrish Dholakia
34c3b33b37
test(test_parallel_request_limiter.py): unit testing for tpm/rpm rate limits
2024-01-18 15:28:28 -08:00
Krrish Dholakia
13b013b28d
feat(parallel_request_limiter.py): add support for tpm/rpm limits
2024-01-18 13:52:15 -08:00
Krrish Dholakia
44553bcc3a
fix(parallel_request_limiter.py): decrement count for failed llm calls
...
https://github.com/BerriAI/litellm/issues/1477
2024-01-18 12:42:14 -08:00
Krrish Dholakia
79978c44ba
refactor: add black formatting
2023-12-25 14:11:20 +05:30
Krrish Dholakia
018405b956
fix(proxy/utils.py): return different exceptions if key is invalid vs. expired
...
https://github.com/BerriAI/litellm/issues/1230
2023-12-25 10:29:44 +05:30
Krrish Dholakia
72e8c84914
build(test_streaming.py): fix linting issues
2023-12-25 07:34:54 +05:30
Krrish Dholakia
1da7d35218
feat(proxy_server.py): enable infinite retries on rate limited requests
2023-12-15 20:03:41 -08:00
Krrish Dholakia
3fbeca134f
fix(custom_logger.py): enable pre_call hooks to modify incoming data to proxy
2023-12-13 16:20:37 -08:00
Krrish Dholakia
8eb7dc6393
fix(proxy_server.py): support for streaming
2023-12-09 16:23:04 -08:00
Krrish Dholakia
9c6584a376
fix(proxy_server.py): enable pre+post-call hooks and max parallel request limits
2023-12-08 17:11:30 -08:00