litellm-mirror

mirror of https://github.com/BerriAI/litellm.git synced 2025-04-26 19:24:27 +00:00

Author	SHA1	Message	Date
Krish Dholakia	2ec1ea9c31	Revert "Fix latency redis"	2025-03-19 18:11:22 -07:00
Emerson Gomes	fc31e20e04	Handle empty valid_deployments in LowestLatencyLoggingHandler	2025-03-19 19:56:57 -05:00
Emerson Gomes	8a3dba52ad	Fix TTFT prioritization for streaming in LowestLatencyLoggingHandler	2025-03-18 14:58:55 -05:00
Emerson Gomes	6b1ecf196d	fix redis serialization issue with Redis + lowest latency strategy	2025-03-17 19:19:20 -05:00
Ishaan Jaff	62a1cdec47	(code quality) run ruff rule to ban unused imports (#7313) * remove unused imports * fix AmazonConverseConfig * fix test * fix import * ruff check fixes * test fixes * fix testing * fix imports	2024-12-19 12:33:42 -08:00
Ishaan Jaff	6a9225fac2	(Refactor) Code Quality improvement - stop redefining LiteLLMBase (#7147) * fix stop redefining LiteLLMBase * use better name for base pydantic obj	2024-12-10 15:49:01 -08:00
Ishaan Jaff	5986f7457e	(router_strategy/) ensure all async functions use async cache methods (#6489) * fix router strat * use async set / get cache in router_strategy * add coverage for router strategy * fix imports * fix batch_get_cache * use async methods for least busy * fix least busy use async methods * fix test_dual_cache_increment * test async_get_available_deployment when routing_strategy="least-busy"	2024-10-29 21:07:17 +05:30
Krish Dholakia	e712a2090b	redis otel tracing + async support for latency routing (#6452) * docs(exception_mapping.md): add missing exception types Fixes https://github.com/Aider-AI/aider/issues/2120#issuecomment-2438971183 * fix(main.py): register custom model pricing with specific key Ensure custom model pricing is registered to the specific model+provider key combination * test: make testing more robust for custom pricing * fix(redis_cache.py): instrument otel logging for sync redis calls ensures complete coverage for all redis cache calls * refactor: pass parent_otel_span for redis caching calls in router allows for more observability into what calls are causing latency issues * test: update tests with new params * refactor: ensure e2e otel tracing for router * refactor(router.py): add more otel tracing acrosss router catch all latency issues for router requests * fix: fix linting error * fix(router.py): fix linting error * fix: fix test * test: fix tests * fix(dual_cache.py): pass ttl to redis cache * fix: fix param	2024-10-28 21:52:12 -07:00
Ishaan Jaff	0c5a47c404	(code quality) add ruff check PLR0915 for `too-many-statements` (#6309) * ruff add PLR0915 * add noqa for PLR0915 * fix noqa * add # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * add # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915	2024-10-18 15:36:49 +05:30
Ishaan Jaff	ba56e37244	(refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208) * use folder for caching * fix importing caching * fix clickhouse pyright * fix linting * fix correctly pass kwargs and args * fix test case for embedding * fix linting * fix embedding caching logic * fix refactor handle utils.py * fix test_embedding_caching_azure_individual_items_reordered	2024-10-14 16:34:01 +05:30
Krish Dholakia	94a05ca5d0	Litellm ruff linting enforcement (#5992) * ci(config.yml): add a 'check_code_quality' step Addresses https://github.com/BerriAI/litellm/issues/5991 * ci(config.yml): check why circle ci doesn't pick up this test * ci(config.yml): fix to run 'check_code_quality' tests * fix(__init__.py): fix unprotected import * fix(__init__.py): don't remove unused imports * build(ruff.toml): update ruff.toml to ignore unused imports * fix: fix: ruff + pyright - fix linting + type-checking errors * fix: fix linting errors * fix(lago.py): fix module init error * fix: fix linting errors * ci(config.yml): cd into correct dir for checks * fix(proxy_server.py): fix linting error * fix(utils.py): fix bare except causes ruff linting errors * fix: ruff - fix remaining linting errors * fix(clickhouse.py): use standard logging object * fix(__init__.py): fix unprotected import * fix: ruff - fix linting errors * fix: fix linting errors * ci(config.yml): cleanup code qa step (formatting handled in local_testing) * fix(_health_endpoints.py): fix ruff linting errors * ci(config.yml): just use ruff in check_code_quality pipeline for now * build(custom_guardrail.py): include missing file * style(embedding_handler.py): fix ruff check	2024-10-01 19:44:20 -04:00
Krrish Dholakia	2874b94fb1	refactor: replace .error() with .exception() logging for better debugging on sentry	2024-08-16 09:22:47 -07:00
Krrish Dholakia	e391e30285	refactor: replace 'traceback.print_exc()' with logging library allows error logs to be in json format for otel logging	2024-06-06 13:47:43 -07:00
Krrish Dholakia	bcc07afd04	fix(lowest_latency.py): set default none value for time_to_first_token in sync log success event	2024-05-21 18:42:15 -07:00
Krrish Dholakia	f007bf7e21	feat(lowest_latency.py): route by time to first token, for streaming requests (if available) Closes https://github.com/BerriAI/litellm/issues/3574	2024-05-21 13:08:17 -07:00
Krrish Dholakia	84db63e3dd	fix(lowest_latency.py): allow ttl to be a float	2024-05-15 09:59:21 -07:00
Rahul Kataria	be4450106d	Remove duplicate code in router_strategy	2024-05-12 18:05:57 +05:30
Krrish Dholakia	926b86af87	feat(bedrock_httpx.py): moves to using httpx client for bedrock cohere calls	2024-05-11 13:43:08 -07:00
Krrish Dholakia	5f93cae3ff	feat(proxy_server.py): return litellm version in response headers	2024-05-08 16:00:08 -07:00
Krrish Dholakia	cb88ed4df8	fix(lowest_latency.py): fix the size of the latency list to 10 by default (can be modified)	2024-05-03 09:00:32 -07:00
Krrish Dholakia	7ae28bfcc9	fix(lowest_latency.py): allow setting a buffer for getting values within a certain latency threshold if an endpoint is slow - it's completion time might not be updated till the call is completed. This prevents us from overloading those endpoints, in a simple way.	2024-04-30 12:00:26 -07:00
Ishaan Jaff	d4a0530d02	fix - lowest latency routing	2024-04-29 16:02:57 -07:00
Ishaan Jaff	2a49580b5b	fix lowest latency - routing	2024-04-29 15:51:52 -07:00
Ishaan Jaff	7306072d33	fix debugging lowest latency router	2024-04-25 19:34:28 -07:00
Ishaan Jaff	3ab5e687f6	fix better debugging for latency	2024-04-25 11:35:08 -07:00
Ishaan Jaff	4931514330	fix	2024-04-25 11:25:03 -07:00
Ishaan Jaff	3b9d6dfc47	temp - show better debug logs for lowest latency	2024-04-25 11:22:52 -07:00
Ishaan Jaff	a26ecbad97	fix - increase default penalty for lowest latency	2024-04-25 07:54:25 -07:00
Ishaan Jaff	5dae1cf303	fix - set latency stats in kwargs	2024-04-24 20:13:45 -07:00
Ishaan Jaff	654c736d29	feat - penalize timeout errors	2024-04-24 16:35:00 -07:00
Krish Dholakia	b8d285d120	Merge pull request #2798 from CLARKBENHAM/main add test for rate limits - Router isn't coroutine safe	2024-04-06 08:47:40 -07:00
Krrish Dholakia	48a5948081	fix(router.py): handle id being passed in as int	2024-04-04 14:23:10 -07:00
CLARKBENHAM	1c93ebf05a	undo black formating	2024-04-02 19:53:48 -07:00
CLARKBENHAM	2dd0c32612	fix lowest latency tests	2024-04-02 19:10:40 -07:00
Krrish Dholakia	afaee375e6	fix(lowest_latency.py): consistent time calc	2024-02-14 15:03:35 -08:00
stephenleo	a6f24acb8b	fix latency calc (lower better)	2024-02-11 17:06:46 +08:00
Krrish Dholakia	ae9b8f50e0	fix(lowest_latency.py): fix merge issue	2024-01-10 21:37:46 +05:30
Krish Dholakia	e635ca2151	Merge branch 'main' into litellm_latency_routing_updates	2024-01-10 21:33:54 +05:30
Krrish Dholakia	7df19b2f7c	fix(router.py): allow user to control the latency routing time window	2024-01-10 20:56:52 +05:30
Krrish Dholakia	f288b12411	fix(lowest_latency.py): add back tpm/rpm checks, configurable time window	2024-01-10 20:52:01 +05:30
Krrish Dholakia	b5ec5eb10b	refactor(lowest_latency.py): fix linting error	2024-01-09 09:51:43 +05:30
Krrish Dholakia	fb9ebfbedd	feat(lowest_latency.py): support expanded time window for latency based routing uses a 1hr avg. of latency for deployments, to determine which to route to https://github.com/BerriAI/litellm/issues/1361	2024-01-09 09:38:04 +05:30
Krrish Dholakia	d3dee9b20c	test(test_lowest_latency_routing.py): add more tests	2023-12-30 17:41:42 +05:30
Krrish Dholakia	25ee96271e	fix(router.py): fix latency based routing	2023-12-30 17:25:40 +05:30

44 commits