litellm

Author	SHA1	Message	Date
Krrish Dholakia	3560f0ef2c	refactor: move all testing to top-level of repo Closes https://github.com/BerriAI/litellm/issues/486	2024-09-28 21:08:14 -07:00
Krrish Dholakia	94db4ec830	test: mark flaky tests	2024-08-30 07:53:04 -07:00
Ishaan Jaff	bcf7f3e437	handle flaky pytests	2024-08-27 22:44:49 -07:00
Krrish Dholakia	cd73ea245a	test(test_lowest_latency_routing.py): use mock responses	2024-06-20 21:05:23 -07:00
Krrish Dholakia	6024f9e45e	test(test_lowest_latency_routing.py): add time.sleep to reduce test flakiness	2024-06-07 11:28:33 -07:00
Krrish Dholakia	2b3da449c8	feat(lowest_latency.py): route by time to first token, for streaming requests (if available) Closes https://github.com/BerriAI/litellm/issues/3574	2024-05-21 13:08:17 -07:00
Ishaan Jaff	fdf7a4d8c8	fix - test_lowest_latency_routing_first_pick	2024-05-15 14:24:13 -07:00
Krrish Dholakia	0b72904608	fix(lowest_latency.py): fix the size of the latency list to 10 by default (can be modified)	2024-05-03 09:00:32 -07:00
Krrish Dholakia	90cdfef1c1	fix(lowest_latency.py): allow setting a buffer for getting values within a certain latency threshold if an endpoint is slow - it's completion time might not be updated till the call is completed. This prevents us from overloading those endpoints, in a simple way.	2024-04-30 12:00:26 -07:00
Ishaan Jaff	5247d7b6a5	test - lowest latency router	2024-04-29 15:51:01 -07:00
Ishaan Jaff	2e6fc91a75	test - lowest latency logger	2024-04-24 16:35:43 -07:00
Krish Dholakia	9119858f4a	Merge pull request #2798 from CLARKBENHAM/main: add test for rate limits - Router isn't coroutine safe	2024-04-06 08:47:40 -07:00
Krrish Dholakia	2236f283fe	fix(router.py): handle id being passed in as int	2024-04-04 14:23:10 -07:00
CLARKBENHAM	44cb0f352a	formating	2024-04-02 19:56:07 -07:00
CLARKBENHAM	164898a213	fix lowest latency tests	2024-04-02 19:10:40 -07:00
CLARKBENHAM	29573b0967	param both tests to include failure (also fix prev)	2024-04-02 18:53:42 -07:00
CLARKBENHAM	4f95966475	tests showing error	2024-04-02 18:45:05 -07:00
Krrish Dholakia	6a8d518e44	test(test_lowest_latency_routing.py): use the correct cache key	2024-01-10 22:15:01 +05:30
Krrish Dholakia	9a829ff956	refactor: cleanup duplicates	2024-01-10 21:42:20 +05:30
Krish Dholakia	298e937586	Merge branch 'main' into litellm_latency_routing_updates	2024-01-10 21:33:54 +05:30
Krrish Dholakia	fe632c08a4	fix(router.py): allow user to control the latency routing time window	2024-01-10 20:56:52 +05:30
Krrish Dholakia	bb04a340a5	fix(lowest_latency.py): add back tpm/rpm checks, configurable time window	2024-01-10 20:52:01 +05:30
Krrish Dholakia	a5147f9e06	feat(lowest_latency.py): support expanded time window for latency based routing; uses a 1hr avg. of latency for deployments, to determine which to route to (https://github.com/BerriAI/litellm/issues/1361)	2024-01-09 09:38:04 +05:30
Krrish Dholakia	027218c3f0	test(test_lowest_latency_routing.py): add more tests	2023-12-30 17:41:42 +05:30
Krrish Dholakia	f2d0d5584a	fix(router.py): fix latency based routing	2023-12-30 17:25:40 +05:30

25 commits