Ishaan Jaff
152b44075b
(Bug fix) - Using include_usage for /completions requests + unit testing (#8484)
* pass stream options (#8419)
* test_completion_streaming_usage_metrics
* test_text_completion_include_usage
---------
Co-authored-by: Kaushik Deka <55996465+Kaushikdkrikhanu@users.noreply.github.com>
2025-02-11 20:29:04 -08:00
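For context, a minimal sketch of the behavior this fix covers, assuming a LiteLLM proxy at http://localhost:4000 with a hypothetical key: with stream_options set, the final streamed chunk of a /completions request should carry token usage.

```python
# Sketch only: base_url and api_key are hypothetical placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

stream = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt="Say hello",
    stream=True,
    stream_options={"include_usage": True},  # the option this fix forwards
)
for chunk in stream:
    if chunk.usage is not None:  # usage arrives on the final chunk
        print(chunk.usage.prompt_tokens, chunk.usage.completion_tokens)
```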
Krish Dholakia
26a4958be5
Litellm dev 01 25 2025 p2 (#8003)
* fix(base_utils.py): support nested json schema passed in for anthropic calls
* refactor(base_utils.py): refactor ref parsing to prevent infinite loop
* test(test_openai_endpoints.py): refactor anthropic test to use bedrock
* fix(langfuse_prompt_management.py): add unit test for sync langfuse calls
Resolves https://github.com/BerriAI/litellm/issues/7938#issuecomment-2613293757
2025-01-25 16:50:57 -08:00
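A sketch of the cycle-guard idea behind the ref-parsing refactor (illustrative helper, not the actual base_utils.py code): track which $refs have already been expanded so a self-referential schema terminates instead of recursing forever.

```python
# Illustrative only: a visited set stops $ref expansion from looping
# on recursive schemas (e.g. a node whose children are nodes).
def unpack_refs(schema: dict, defs: dict, seen: set | None = None) -> dict:
    seen = set() if seen is None else seen
    if "$ref" in schema:
        ref = schema["$ref"]
        if ref in seen:  # already expanding this ref: cut the cycle
            return {}
        seen.add(ref)
        schema = defs.get(ref.split("/")[-1], {})
    return {
        k: unpack_refs(v, defs, seen) if isinstance(v, dict) else v
        for k, v in schema.items()
    }
```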
Ishaan Jaff
536a448968
test_chat_completion_ratelimit add retry on test
2025-01-23 18:10:31 -08:00
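A sketch of what "add retry on test" typically looks like, assuming a rerun plugin such as pytest-rerunfailures; the exact marker this suite uses may differ.

```python
# Hypothetical marker; requires a rerun plugin (e.g. pytest-rerunfailures).
import pytest

@pytest.mark.flaky(reruns=3, reruns_delay=1)  # absorb transient 429s
def test_chat_completion_ratelimit():
    ...
```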
Ishaan Jaff
9efb076037
ci/cd run again
2024-12-27 14:53:10 -08:00
Krish Dholakia
3c591167e0
fix(lowest_tpm_rpm_routing.py): fix parallel rate limit check (#6577)
* fix(lowest_tpm_rpm_routing.py): fix parallel rate limit check
* fix(lowest_tpm_rpm_v2.py): return headers in correct format
* test: update test
* build(deps): bump cookie and express in /docs/my-website (#6566)
Bumps [cookie](https://github.com/jshttp/cookie) and [express](https://github.com/expressjs/express). These dependencies needed to be updated together.
Updates `cookie` from 0.6.0 to 0.7.1
- [Release notes](https://github.com/jshttp/cookie/releases)
- [Commits](https://github.com/jshttp/cookie/compare/v0.6.0...v0.7.1)
Updates `express` from 4.20.0 to 4.21.1
- [Release notes](https://github.com/expressjs/express/releases)
- [Changelog](https://github.com/expressjs/express/blob/4.21.1/History.md)
- [Commits](https://github.com/expressjs/express/compare/4.20.0...4.21.1)
---
updated-dependencies:
- dependency-name: cookie
  dependency-type: indirect
- dependency-name: express
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* docs(virtual_keys.md): update Dockerfile reference (#6554)
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
* (proxy fix) - call connect on prisma client when running setup (#6534)
* critical fix - call connect on prisma client when running setup
* fix test_proxy_server_prisma_setup
* fix test_proxy_server_prisma_setup
* Add 3.5 haiku (#6588)
* feat: add claude-3-5-haiku-20241022 entries
* feat: add claude-3-5-haiku-20241022 and vertex_ai/claude-3-5-haiku@20241022 models
* add missing entries, remove vision
* remove image token costs
* Litellm perf improvements 3 (#6573)
* perf: move writing key to cache, to background task
* perf(litellm_pre_call_utils.py): add otel tracing for pre-call utils
adds 200ms on calls with pgdb connected
* fix(litellm_pre_call_utils.py): rename call_type to actual call used
* perf(proxy_server.py): remove db logic from _get_config_from_file
was causing db calls to occur on every llm request, if team_id was set on key
* fix(auth_checks.py): add check for reducing db calls if user/team id does not exist in db
reduces latency/call by ~100ms
* fix(proxy_server.py): minor fix on existing_settings not incl alerting
* fix(exception_mapping_utils.py): map databricks exception string
* fix(auth_checks.py): fix auth check logic
* test: correctly mark flaky test
* fix(utils.py): handle auth token error for tokenizers.from_pretrained
* build: fix map
* build: fix map
* build: fix json for model map
* test: remove eol model
* fix(proxy_server.py): fix db config loading logic
* fix(proxy_server.py): fix order of config / db updates, to ensure fields not overwritten
* test: skip test if required env var is missing
* test: fix test
---------
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: paul-gauthier <69695708+paul-gauthier@users.noreply.github.com>
2024-11-05 22:03:44 +05:30
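A sketch of the shape of the parallel rate-limit check from the first bullet, under assumed names (not the router's real methods): read usage for every candidate deployment in one batched cache call rather than one round trip each.

```python
# Assumed helper names; the point is one batched read, not N sequential gets.
async def pick_deployment(deployments: list, cache, tpm_limit: int):
    keys = [f"{d}:tpm" for d in deployments]
    usages = await cache.async_batch_get_cache(keys)  # single round trip
    for deployment, used in zip(deployments, usages):
        if (used or 0) < tpm_limit:
            return deployment
    raise RuntimeError("all deployments are over their TPM limit")
```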
Krish Dholakia
a878a61f81
Litellm fix router testing (#5748)
* test: fix testing - azure changed content policy error logic
* test: fix tests to use mock responses
* test(test_image_generation.py): handle api instability
* test(test_image_generation.py): handle azure api instability
* fix(utils.py): fix unbound variable error
* fix(utils.py): fix unbound variable error
* test: refactor test to use mock response
* test: mark flaky azure tests
2024-09-17 18:02:23 -07:00
Krrish Dholakia
656f59258e
test: skip flaky test
2024-09-04 08:20:57 -07:00
Ishaan Jaff
0f999524c9
handle flaky pytests
2024-08-27 22:44:49 -07:00
Ishaan Jaff
31e4fca748
fix: use provider-specific routing
2024-08-07 14:37:20 -07:00
Ishaan Jaff
3b6ad998e5
test proxy all models
2024-07-25 18:54:30 -07:00
Ishaan Jaff
be97b4e6b5
test - /moderations
2024-06-21 16:08:52 -07:00
Ishaan Jaff
876f0d3f1e
fix test_chat_completion_different_deployments
2024-06-17 23:04:48 -07:00
Ishaan Jaff
9f2dc76a58
test - making mistral embedding request on proxy
2024-06-12 15:10:20 -07:00
Krrish Dholakia
8ff137bce3
feat(scheduler.py): add request prioritization scheduler
allow user to set priority for a request
2024-05-31 18:51:13 -07:00
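A minimal sketch of request prioritization with a priority queue (illustrative; the real scheduler.py handles polling, defaults, and queue backends): lower numbers get served first.

```python
import asyncio

async def main() -> None:
    q: asyncio.PriorityQueue = asyncio.PriorityQueue()
    await q.put((2, "background-batch-job"))
    await q.put((0, "interactive-user"))  # lower number = higher priority

    while not q.empty():
        priority, request_id = await q.get()
        print(f"serving {request_id} (priority {priority})")

asyncio.run(main())
```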
Ishaan Jaff
a844423e00
fix - use csv list for batch completions
2024-05-11 14:24:48 -07:00
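A sketch of the csv-list behavior, with a hypothetical proxy URL and key: one request whose "model" is a comma-separated list fans out to each listed model; the list-shaped response is an assumption here.

```python
# Hypothetical endpoint/key; "model" as a csv list fans out to each model.
import requests

resp = requests.post(
    "http://localhost:4000/chat/completions",
    headers={"Authorization": "Bearer sk-1234"},
    json={
        "model": "gpt-3.5-turbo,gpt-4",  # csv list of deployments
        "messages": [{"role": "user", "content": "hi"}],
    },
)
for completion in resp.json():  # assumed: one entry per model
    print(completion["model"], completion["choices"][0]["message"]["content"])
```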
Ishaan Jaff
0332f4f49b
test batch completions on litellm proxy
2024-05-11 13:32:30 -07:00
Krrish Dholakia
fba36714af
test(test_openai_endpoints.py): change key
2024-05-06 11:19:47 -07:00
Krrish Dholakia
264ec2de02
test(test_openai_endpoints.py): add unit testing for streaming + logprobs on docker image
2024-04-23 19:26:19 -07:00
Krrish Dholakia
1ea507419d
test(test_openai_endpoints.py): make test stricter
2024-04-20 12:11:54 -07:00
Krrish Dholakia
5da934099f
fix(caching.py): dual cache async_batch_get_cache fix + testing
This fixes a bug in usage-based-routing-v2 that was caused by how the result was returned from the dual cache's async_batch_get_cache. It also adds unit testing for that function (and its sync equivalent).
2024-04-19 15:03:25 -07:00
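A sketch of the batch-read contract the fix restores, under assumed structures (a plain dict for the in-memory layer, a Redis-like async client behind it): results must come back aligned 1:1 with the requested keys, with misses filled from the slower layer.

```python
# Illustrative shape only, not the library's DualCache implementation.
async def async_batch_get_cache(keys: list, memory: dict, redis) -> list:
    results = [memory.get(k) for k in keys]
    misses = [i for i, v in enumerate(results) if v is None]
    if misses:
        redis_values = await redis.mget([keys[i] for i in misses])
        for i, value in zip(misses, redis_values):
            results[i] = value
            if value is not None:
                memory[keys[i]] = value  # warm the in-memory layer
    return results  # aligned with `keys`: the property the bug broke
```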
Ishaan Jaff
e33fd644f2
test - wildcard openai models on proxy
2024-04-15 14:05:26 -07:00
Krrish Dholakia
c177407f7b
test(test_openai_endpoints.py): add concurrency testing for user defined rate limits on proxy
2024-04-12 18:56:13 -07:00
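A sketch of a concurrency check against a user-level rate limit, with a hypothetical endpoint and key: fire N parallel requests and expect the overflow to come back as HTTP 429.

```python
import asyncio
import aiohttp

async def one_call(session: aiohttp.ClientSession) -> int:
    async with session.post(
        "http://localhost:4000/chat/completions",
        headers={"Authorization": "Bearer sk-user-key"},
        json={"model": "gpt-3.5-turbo",
              "messages": [{"role": "user", "content": "hi"}]},
    ) as resp:
        return resp.status

async def main(n: int = 20) -> None:
    async with aiohttp.ClientSession() as session:
        statuses = await asyncio.gather(*(one_call(session) for _ in range(n)))
    print(statuses.count(200), "allowed /", statuses.count(429), "rate limited")

asyncio.run(main())
```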
Krrish Dholakia
200317ce06
test(test_openai_endpoints.py): add local test, for proxy concurrency
2024-04-11 17:16:23 -07:00
Krrish Dholakia
549ae397c6
test(test_openai_endpoints.py): add response header test - make sure it's always <4kb
2024-04-05 12:49:29 -07:00
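A sketch of the <4kb header assertion (route and size accounting assumed): serialize the response headers and check their total size, since oversized headers can be dropped or rejected by intermediaries.

```python
import requests

resp = requests.get("http://localhost:4000/health/readiness")  # assumed route
# Approximate wire size: "Key: Value\r\n" per header.
header_bytes = sum(len(k) + len(v) + 4 for k, v in resp.headers.items())
assert header_bytes < 4096, f"headers too large: {header_bytes} bytes"
```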
Krrish Dholakia
5c199e4e4e
fix(main.py): fix translation to text_completions format for async text completion calls
2024-03-30 09:02:51 -07:00
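For reference, the async call path this fix covers, sketched with litellm's public atext_completion (model name illustrative): the response should use the text_completions shape (choices[0].text), not the chat shape.

```python
import asyncio
import litellm

async def main() -> None:
    resp = await litellm.atext_completion(
        model="gpt-3.5-turbo-instruct",
        prompt="Say hello",
    )
    print(resp.choices[0].text)  # text_completions field, not .message

asyncio.run(main())
```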
Ishaan Jaff
ce9fd8b351
(fix) img gen endpoints unstable
2024-03-28 12:43:46 -07:00
Ishaan Jaff
5643f1c919
(fix) test img gen proxy
2024-03-28 12:33:08 -07:00
Krrish Dholakia
41f4f7f0df
test(test_openai_endpoints.py): fix test
2024-02-23 22:57:20 -08:00
Krrish Dholakia
b501af8f43
fix(proxy_server.py): fix proxy server team id bug
2024-02-13 22:33:56 -08:00
Krrish Dholakia
344e232549
fix(utils.py): fix proxy streaming spend tracking
2024-01-23 15:59:03 -08:00
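A sketch of the streaming-spend idea (the helper name is hypothetical; the chunk builder and cost functions are litellm's public utilities): usage is only known once a stream ends, so chunks are rebuilt into a full response before cost is computed.

```python
import litellm

def track_stream_spend(chunks: list, messages: list) -> float:
    # Rebuild one non-streaming response from the collected chunks...
    full_response = litellm.stream_chunk_builder(chunks, messages=messages)
    # ...then price it once, instead of guessing per-chunk.
    return litellm.completion_cost(completion_response=full_response)
```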
Krrish Dholakia
beca1eaae7
test(tests/): add unit testing for proxy server endpoints
2024-01-19 14:54:29 -08:00