Ishaan Jaff
c7f14e936a
(code quality) run ruff rule to ban unused imports ( #7313 )
...
* remove unused imports
* fix AmazonConverseConfig
* fix test
* fix import
* ruff check fixes
* test fixes
* fix testing
* fix imports
2024-12-19 12:33:42 -08:00
Ishaan Jaff
2fb2801eb4
(Refactor) Code Quality improvement - stop redefining LiteLLMBase ( #7147 )
...
* fix stop redefining LiteLLMBase
* use better name for base pydantic obj
2024-12-10 15:49:01 -08:00
Ishaan Jaff
441adad3ae
(router_strategy/) ensure all async functions use async cache methods ( #6489 )
...
* fix router strat
* use async set / get cache in router_strategy
* add coverage for router strategy
* fix imports
* fix batch_get_cache
* use async methods for least busy
* fix least busy use async methods
* fix test_dual_cache_increment
* test async_get_available_deployment when routing_strategy="least-busy"
2024-10-29 21:07:17 +05:30
Krish Dholakia
4f8a3fd4cf
redis otel tracing + async support for latency routing ( #6452 )
...
* docs(exception_mapping.md): add missing exception types
Fixes https://github.com/Aider-AI/aider/issues/2120#issuecomment-2438971183
* fix(main.py): register custom model pricing with specific key
Ensure custom model pricing is registered to the specific model+provider key combination
* test: make testing more robust for custom pricing
* fix(redis_cache.py): instrument otel logging for sync redis calls
ensures complete coverage for all redis cache calls
* refactor: pass parent_otel_span for redis caching calls in router
allows for more observability into what calls are causing latency issues
* test: update tests with new params
* refactor: ensure e2e otel tracing for router
* refactor(router.py): add more otel tracing across router
catch all latency issues for router requests
* fix: fix linting error
* fix(router.py): fix linting error
* fix: fix test
* test: fix tests
* fix(dual_cache.py): pass ttl to redis cache
* fix: fix param
2024-10-28 21:52:12 -07:00
Ishaan Jaff
610974b4fc
(code quality) add ruff check PLR0915 for too-many-statements ( #6309 )
...
* ruff add PLR0915
* add noqa for PLR0915
* fix noqa
* add # noqa: PLR0915
* # noqa: PLR0915
* # noqa: PLR0915
* # noqa: PLR0915
* add # noqa: PLR0915
* # noqa: PLR0915
* # noqa: PLR0915
* # noqa: PLR0915
* # noqa: PLR0915
2024-10-18 15:36:49 +05:30
Ishaan Jaff
4d1b4beb3d
(refactor) caching use LLMCachingHandler for async_get_cache and set_cache ( #6208 )
...
* use folder for caching
* fix importing caching
* fix clickhouse pyright
* fix linting
* fix correctly pass kwargs and args
* fix test case for embedding
* fix linting
* fix embedding caching logic
* fix refactor handle utils.py
* fix test_embedding_caching_azure_individual_items_reordered
2024-10-14 16:34:01 +05:30
Krish Dholakia
d57be47b0f
Litellm ruff linting enforcement ( #5992 )
...
* ci(config.yml): add a 'check_code_quality' step
Addresses https://github.com/BerriAI/litellm/issues/5991
* ci(config.yml): check why circle ci doesn't pick up this test
* ci(config.yml): fix to run 'check_code_quality' tests
* fix(__init__.py): fix unprotected import
* fix(__init__.py): don't remove unused imports
* build(ruff.toml): update ruff.toml to ignore unused imports
* fix: ruff + pyright - fix linting + type-checking errors
* fix: fix linting errors
* fix(lago.py): fix module init error
* fix: fix linting errors
* ci(config.yml): cd into correct dir for checks
* fix(proxy_server.py): fix linting error
* fix(utils.py): fix bare except
causes ruff linting errors
* fix: ruff - fix remaining linting errors
* fix(clickhouse.py): use standard logging object
* fix(__init__.py): fix unprotected import
* fix: ruff - fix linting errors
* fix: fix linting errors
* ci(config.yml): cleanup code qa step (formatting handled in local_testing)
* fix(_health_endpoints.py): fix ruff linting errors
* ci(config.yml): just use ruff in check_code_quality pipeline for now
* build(custom_guardrail.py): include missing file
* style(embedding_handler.py): fix ruff check
2024-10-01 19:44:20 -04:00
Krrish Dholakia
61f4b71ef7
refactor: replace .error() with .exception() logging for better debugging on sentry
2024-08-16 09:22:47 -07:00
Krrish Dholakia
6cca5612d2
refactor: replace 'traceback.print_exc()' with logging library
...
allows error logs to be in json format for otel logging
2024-06-06 13:47:43 -07:00
Krrish Dholakia
f19d7327ca
fix(lowest_latency.py): set default none value for time_to_first_token in sync log success event
2024-05-21 18:42:15 -07:00
Krrish Dholakia
2b3da449c8
feat(lowest_latency.py): route by time to first token, for streaming requests (if available)
...
Closes https://github.com/BerriAI/litellm/issues/3574
2024-05-21 13:08:17 -07:00
Krrish Dholakia
f5d73547c7
fix(lowest_latency.py): allow ttl to be a float
2024-05-15 09:59:21 -07:00
Rahul Kataria
d57ecf3371
Remove duplicate code in router_strategy
2024-05-12 18:05:57 +05:30
Krrish Dholakia
4a3b084961
feat(bedrock_httpx.py): moves to using httpx client for bedrock cohere calls
2024-05-11 13:43:08 -07:00
Krrish Dholakia
6575143460
feat(proxy_server.py): return litellm version in response headers
2024-05-08 16:00:08 -07:00
Krrish Dholakia
0b72904608
fix(lowest_latency.py): fix the size of the latency list to 10 by default (can be modified)
2024-05-03 09:00:32 -07:00
Krrish Dholakia
90cdfef1c1
fix(lowest_latency.py): allow setting a buffer for getting values within a certain latency threshold
...
if an endpoint is slow, its completion time might not be updated until the call completes. The buffer prevents us from overloading those endpoints in a simple way.
2024-04-30 12:00:26 -07:00
Ishaan Jaff
4cb4a7f06d
fix - lowest latency routing
2024-04-29 16:02:57 -07:00
Ishaan Jaff
3b0aa05378
fix lowest latency - routing
2024-04-29 15:51:52 -07:00
Ishaan Jaff
bf92a0b31c
fix debugging lowest latency router
2024-04-25 19:34:28 -07:00
Ishaan Jaff
737af2b458
fix better debugging for latency
2024-04-25 11:35:08 -07:00
Ishaan Jaff
787735bb5a
fix
2024-04-25 11:25:03 -07:00
Ishaan Jaff
984259d420
temp - show better debug logs for lowest latency
2024-04-25 11:22:52 -07:00
Ishaan Jaff
92f21cba30
fix - increase default penalty for lowest latency
2024-04-25 07:54:25 -07:00
Ishaan Jaff
212369498e
fix - set latency stats in kwargs
2024-04-24 20:13:45 -07:00
Ishaan Jaff
bf6abed808
feat - penalize timeout errors
2024-04-24 16:35:00 -07:00
Krish Dholakia
9119858f4a
Merge pull request #2798 from CLARKBENHAM/main
...
add test for rate limits - Router isn't coroutine safe
2024-04-06 08:47:40 -07:00
Krrish Dholakia
2236f283fe
fix(router.py): handle id being passed in as int
2024-04-04 14:23:10 -07:00
CLARKBENHAM
18749e7051
undo black formatting
2024-04-02 19:53:48 -07:00
CLARKBENHAM
164898a213
fix lowest latency tests
2024-04-02 19:10:40 -07:00
Krrish Dholakia
fccacaf91b
fix(lowest_latency.py): consistent time calc
2024-02-14 15:03:35 -08:00
stephenleo
37c83e0023
fix latency calc (lower better)
2024-02-11 17:06:46 +08:00
Krrish Dholakia
31917176ff
fix(lowest_latency.py): fix merge issue
2024-01-10 21:37:46 +05:30
Krish Dholakia
298e937586
Merge branch 'main' into litellm_latency_routing_updates
2024-01-10 21:33:54 +05:30
Krrish Dholakia
fe632c08a4
fix(router.py): allow user to control the latency routing time window
2024-01-10 20:56:52 +05:30
Krrish Dholakia
bb04a340a5
fix(lowest_latency.py): add back tpm/rpm checks, configurable time window
2024-01-10 20:52:01 +05:30
Krrish Dholakia
a35f4272f4
refactor(lowest_latency.py): fix linting error
2024-01-09 09:51:43 +05:30
Krrish Dholakia
a5147f9e06
feat(lowest_latency.py): support expanded time window for latency based routing
...
uses a 1-hour average of latency per deployment to determine which one to route to
https://github.com/BerriAI/litellm/issues/1361
2024-01-09 09:38:04 +05:30
Krrish Dholakia
027218c3f0
test(test_lowest_latency_routing.py): add more tests
2023-12-30 17:41:42 +05:30
Krrish Dholakia
f2d0d5584a
fix(router.py): fix latency based routing
2023-12-30 17:25:40 +05:30