27 commits

Author SHA1 Message Date
Krish Dholakia
695f48a8f1
fix(lowest_tpm_rpm_routing.py): fix parallel rate limit check (#6577)
* fix(lowest_tpm_rpm_routing.py): fix parallel rate limit check

* fix(lowest_tpm_rpm_v2.py): return headers in correct format

* test: update test

* build(deps): bump cookie and express in /docs/my-website (#6566)

Bumps [cookie](https://github.com/jshttp/cookie) and [express](https://github.com/expressjs/express). These dependencies needed to be updated together.

Updates `cookie` from 0.6.0 to 0.7.1
- [Release notes](https://github.com/jshttp/cookie/releases)
- [Commits](https://github.com/jshttp/cookie/compare/v0.6.0...v0.7.1)

Updates `express` from 4.20.0 to 4.21.1
- [Release notes](https://github.com/expressjs/express/releases)
- [Changelog](https://github.com/expressjs/express/blob/4.21.1/History.md)
- [Commits](https://github.com/expressjs/express/compare/4.20.0...4.21.1)

---
updated-dependencies:
- dependency-name: cookie
  dependency-type: indirect
- dependency-name: express
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* docs(virtual_keys.md): update Dockerfile reference (#6554)

Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>

* (proxy fix) - call connect on prisma client when running setup (#6534)

* critical fix - call connect on prisma client when running setup

* fix test_proxy_server_prisma_setup

* fix test_proxy_server_prisma_setup

* Add 3.5 haiku (#6588)

* feat: add claude-3-5-haiku-20241022 entries

* feat: add claude-3-5-haiku-20241022 and vertex_ai/claude-3-5-haiku@20241022 models

* add missing entries, remove vision

* remove image token costs

* Litellm perf improvements 3 (#6573)

* perf: move writing key to cache, to background task

* perf(litellm_pre_call_utils.py): add otel tracing for pre-call utils

adds 200ms on calls with pgdb connected

* fix(litellm_pre_call_utils.py): rename call_type to actual call used

* perf(proxy_server.py): remove db logic from _get_config_from_file

was causing db calls to occur on every llm request, if team_id was set on key

* fix(auth_checks.py): add check for reducing db calls if user/team id does not exist in db

reduces latency/call by ~100ms

* fix(proxy_server.py): minor fix on existing_settings not incl alerting

* fix(exception_mapping_utils.py): map databricks exception string

* fix(auth_checks.py): fix auth check logic

* test: correctly mark flaky test

* fix(utils.py): handle auth token error for tokenizers.from_pretrained

* build: fix map

* build: fix map

* build: fix json for model map

* test: remove eol model

* fix(proxy_server.py): fix db config loading logic

* fix(proxy_server.py): fix order of config / db updates, to ensure fields not overwritten

* test: skip test if required env var is missing

* test: fix test

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: paul-gauthier <69695708+paul-gauthier@users.noreply.github.com>
2024-11-05 22:03:44 +05:30
Krish Dholakia
dd602753c0
Litellm fix router testing (#5748)
* test: fix testing - azure changed content policy error logic

* test: fix tests to use mock responses

* test(test_image_generation.py): handle api instability

* test(test_image_generation.py): handle azure api instability

* fix(utils.py): fix unbound variable error

* fix(utils.py): fix unbound variable error

* test: refactor test to use mock response

* test: mark flaky azure tests
2024-09-17 18:02:23 -07:00
Krrish Dholakia
25e49a59b2 test: skip flaky test 2024-09-04 08:20:57 -07:00
Ishaan Jaff
bcf7f3e437 handle flaky pytests 2024-08-27 22:44:49 -07:00
Ishaan Jaff
f1ffa82062 fix use provider specific routing 2024-08-07 14:37:20 -07:00
Ishaan Jaff
05858cb249 test proxy all model 2024-07-25 18:54:30 -07:00
Ishaan Jaff
8fcd61825b test - /moderations 2024-06-21 16:08:52 -07:00
Ishaan Jaff
d409ffbaa9 fix test_chat_completion_different_deployments 2024-06-17 23:04:48 -07:00
Ishaan Jaff
cb386fda20 test - making mistral embedding request on proxy 2024-06-12 15:10:20 -07:00
Krrish Dholakia
79287a7584 feat(scheduler.py): add request prioritization scheduler
allow user to set priority for a request
2024-05-31 18:51:13 -07:00
Ishaan Jaff
d4288b134b fix - use csv list for batch completions 2024-05-11 14:24:48 -07:00
Ishaan Jaff
e1f94fcbbb test batch completions on litellm proxy 2024-05-11 13:32:30 -07:00
Krrish Dholakia
b014a72f7a test(test_openai_endpoints.py): change key 2024-05-06 11:19:47 -07:00
Krrish Dholakia
e31eb6a6b2 test(test_openai_endpoints.py): add unit testing for streaming + logprobs on docker image 2024-04-23 19:26:19 -07:00
Krrish Dholakia
1507b23e30 test(test_openai_endpoints.py): make test stricter 2024-04-20 12:11:54 -07:00
Krrish Dholakia
01a1a8f731 fix(caching.py): dual cache async_batch_get_cache fix + testing
this fixes a bug in usage-based-routing-v2 which was caused by how the result was being returned from dual cache async_batch_get_cache. it also adds unit testing for that function (and its sync equivalent)
2024-04-19 15:03:25 -07:00
Ishaan Jaff
6df5337e65 test - wildcard openai models on proxy 2024-04-15 14:05:26 -07:00
Krrish Dholakia
ea1574c160 test(test_openai_endpoints.py): add concurrency testing for user defined rate limits on proxy 2024-04-12 18:56:13 -07:00
Krrish Dholakia
bdfb74f8a5 test(test_openai_endpoints.py): add local test, for proxy concurrency 2024-04-11 17:16:23 -07:00
Krrish Dholakia
591bd9fcdd test(test_openai_endpoints.py): add response header test - make sure it's always <4kb 2024-04-05 12:49:29 -07:00
Krrish Dholakia
c0204310ee fix(main.py): fix translation to text_completions format for async text completion calls 2024-03-30 09:02:51 -07:00
Ishaan Jaff
75614bc29c (fix) img gen endpoints unstable 2024-03-28 12:43:46 -07:00
Ishaan Jaff
2495587e7c (fix) test img gen proxy 2024-03-28 12:33:08 -07:00
Krrish Dholakia
20def3805c test(test_openai_endpoints.py): fix test 2024-02-23 22:57:20 -08:00
Krrish Dholakia
83d43809a7 fix(proxy_server.py): fix proxy server team id bug 2024-02-13 22:33:56 -08:00
Krrish Dholakia
f8870fb48e fix(utils.py): fix proxy streaming spend tracking 2024-01-23 15:59:03 -08:00
Krrish Dholakia
f5ced089d6 test(tests/): add unit testing for proxy server endpoints 2024-01-19 14:54:29 -08:00