Krish Dholakia
|
136693cac4
|
LiteLLM Minor Fixes & Improvements (11/05/2024) (#6590)
* fix(pattern_matching_router.py): update model name using correct function
* fix(langfuse.py): metadata deepcopy can cause unhandled error (#6563)
Co-authored-by: seva <seva@inita.com>
* fix(stream_chunk_builder_utils.py): correctly set prompt tokens + log correct streaming usage
Closes https://github.com/BerriAI/litellm/issues/6488
* build(deps): bump cookie and express in /docs/my-website (#6566)
Bumps [cookie](https://github.com/jshttp/cookie) and [express](https://github.com/expressjs/express). These dependencies needed to be updated together.
Updates `cookie` from 0.6.0 to 0.7.1
- [Release notes](https://github.com/jshttp/cookie/releases)
- [Commits](https://github.com/jshttp/cookie/compare/v0.6.0...v0.7.1)
Updates `express` from 4.20.0 to 4.21.1
- [Release notes](https://github.com/expressjs/express/releases)
- [Changelog](https://github.com/expressjs/express/blob/4.21.1/History.md)
- [Commits](https://github.com/expressjs/express/compare/4.20.0...4.21.1)
---
updated-dependencies:
- dependency-name: cookie
dependency-type: indirect
- dependency-name: express
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* docs(virtual_keys.md): update Dockerfile reference (#6554)
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
* (proxy fix) - call connect on prisma client when running setup (#6534)
* critical fix - call connect on prisma client when running setup
* fix test_proxy_server_prisma_setup
* fix test_proxy_server_prisma_setup
* Add 3.5 haiku (#6588)
* feat: add claude-3-5-haiku-20241022 entries
* feat: add claude-3-5-haiku-20241022 and vertex_ai/claude-3-5-haiku@20241022 models
* add missing entries, remove vision
* remove image token costs
* Litellm perf improvements 3 (#6573)
* perf: move writing key to cache, to background task
* perf(litellm_pre_call_utils.py): add otel tracing for pre-call utils
adds 200ms on calls with pgdb connected
* fix(litellm_pre_call_utils.py): rename call_type to actual call used
* perf(proxy_server.py): remove db logic from _get_config_from_file
was causing db calls to occur on every llm request, if team_id was set on key
* fix(auth_checks.py): add check for reducing db calls if user/team id does not exist in db
reduces latency/call by ~100ms
* fix(proxy_server.py): minor fix on existing_settings not incl alerting
* fix(exception_mapping_utils.py): map databricks exception string
* fix(auth_checks.py): fix auth check logic
* test: correctly mark flaky test
* fix(utils.py): handle auth token error for tokenizers.from_pretrained
* build: fix map
* build: fix map
* build: fix json for model map
* fix ImageObject conversion (#6584)
* (fix) litellm.text_completion raises a non-blocking error on simple usage (#6546)
* unit test test_huggingface_text_completion_logprobs
* fix return TextCompletionHandler convert_chat_to_text_completion
* fix hf rest api
* fix test_huggingface_text_completion_logprobs
* fix linting errors
* fix import LiteLLMResponseObjectHandler
* fix test for LiteLLMResponseObjectHandler
* fix test text completion
* fix allow using 15 seconds for premium license check
* testing fix bedrock deprecated cohere.command-text-v14
* (feat) add `Predicted Outputs` for OpenAI (#6594)
* bump openai to openai==1.54.0
* add 'prediction' param
* testing fix bedrock deprecated cohere.command-text-v14
* test test_openai_prediction_param.py
* test_openai_prediction_param_with_caching
* doc Predicted Outputs
* doc Predicted Output
* (fix) Vertex Improve Performance when using `image_url` (#6593)
* fix transformation vertex
* test test_process_gemini_image
* test_image_completion_request
* testing fix - bedrock has deprecated cohere.command-text-v14
* fix vertex pdf
* bump: version 1.51.5 → 1.52.0
* fix(lowest_tpm_rpm_routing.py): fix parallel rate limit check (#6577)
* fix(lowest_tpm_rpm_routing.py): fix parallel rate limit check
* fix(lowest_tpm_rpm_v2.py): return headers in correct format
* test: update test
* test: remove eol model
* fix(proxy_server.py): fix db config loading logic
* fix(proxy_server.py): fix order of config / db updates, to ensure fields not overwritten
* test: skip test if required env var is missing
* test: fix test
---------
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: paul-gauthier <69695708+paul-gauthier@users.noreply.github.com>
* test: mark flaky test
* test: handle anthropic api instability
* test(test_proxy_utils.py): add testing for db config update logic
* Update setuptools in docker and fastapi to latest version, in order to upgrade starlette version (#6597)
* Update setuptools in docker and fastapi to latest version, in order to upgrade starlette version
---------
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: paul-gauthier <69695708+paul-gauthier@users.noreply.github.com>
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: Jacob Hagstedt <wcgs@novonordisk.com>
* fix(langfuse.py): fix linting errors
* fix: fix linting errors
* fix: fix casting error
* fix: fix typing error
* fix: add more tests
* fix(utils.py): fix return_processed_chunk_logic
* Revert "Update setuptools in docker and fastapi to latest version, in order t…" (#6615)
This reverts commit 1a7f7bdfb7.
* docs fix clarify team_id on team based logging
* doc fix team based logging with langfuse
* fix flake8 checks
* test: bump sleep time
* refactor: replace claude-instant-1.2 with haiku in testing
* fix(proxy_server.py): move to using sl payload in track_cost_callback
* fix(proxy_server.py): fix linting errors
* fix(proxy_server.py): fallback to kwargs(response_cost) if given
* test: remove claude-instant-1 from tests
* test: fix claude test
* build: remove lint.yml
---------
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Co-authored-by: Vsevolod Karvetskiy <56288164+karvetskiy@users.noreply.github.com>
Co-authored-by: seva <seva@inita.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: paul-gauthier <69695708+paul-gauthier@users.noreply.github.com>
Co-authored-by: Jacob Hagstedt P Suorra <Jacobh2@users.noreply.github.com>
Co-authored-by: Jacob Hagstedt <wcgs@novonordisk.com>
|
2024-11-07 04:17:05 +05:30 |
|
Krish Dholakia
|
d57be47b0f
|
Litellm ruff linting enforcement (#5992)
* ci(config.yml): add a 'check_code_quality' step
Addresses https://github.com/BerriAI/litellm/issues/5991
* ci(config.yml): check why circle ci doesn't pick up this test
* ci(config.yml): fix to run 'check_code_quality' tests
* fix(__init__.py): fix unprotected import
* fix(__init__.py): don't remove unused imports
* build(ruff.toml): update ruff.toml to ignore unused imports
* fix: ruff + pyright - fix linting + type-checking errors
* fix: fix linting errors
* fix(lago.py): fix module init error
* fix: fix linting errors
* ci(config.yml): cd into correct dir for checks
* fix(proxy_server.py): fix linting error
* fix(utils.py): fix bare except
causes ruff linting errors
* fix: ruff - fix remaining linting errors
* fix(clickhouse.py): use standard logging object
* fix(__init__.py): fix unprotected import
* fix: ruff - fix linting errors
* fix: fix linting errors
* ci(config.yml): cleanup code qa step (formatting handled in local_testing)
* fix(_health_endpoints.py): fix ruff linting errors
* ci(config.yml): just use ruff in check_code_quality pipeline for now
* build(custom_guardrail.py): include missing file
* style(embedding_handler.py): fix ruff check
|
2024-10-01 19:44:20 -04:00 |
|
Ishaan Jaff
|
4b7ceade64
|
mark test_key_info_spend_values_streaming as flaky
|
2024-08-29 14:39:53 -07:00 |
|
Krrish Dholakia
|
5703da9b42
|
fix(user_api_key_auth.py): Fixes https://github.com/BerriAI/litellm/issues/5111
|
2024-08-08 10:30:15 -07:00 |
|
Krrish Dholakia
|
c27ae34a39
|
fix(proxy_server.py): use consistent 400-status code error code for exceeded budget errors
standardizes error code for budget exceeded errors to status code 400
|
2024-06-11 16:10:58 -07:00 |
|
Ishaan Jaff
|
6166381579
|
fix migrate to LitellmUserRoles
|
2024-05-30 13:57:50 -07:00 |
|
Krrish Dholakia
|
95566dc249
|
fix(proxy_server.py): fix model list returned for /model/info when team has restricted access
|
2024-05-25 13:21:33 -07:00 |
|
Krrish Dholakia
|
25a2f00db6
|
fix(proxy_server.py): fix model check for /v1/models endpoint when team has restricted access
|
2024-05-25 13:02:03 -07:00 |
|
Ishaan Jaff
|
595a2a5b1b
|
fix test_key_model_list
|
2024-05-22 20:49:19 -07:00 |
|
Krrish Dholakia
|
f2766fddbf
|
fix(proxy_server.py): fix /v1/models bug where it would return empty list
handle 'all-team-models' being set for a given key
|
2024-05-07 13:43:15 -07:00 |
|
Ishaan Jaff
|
31c4f5f811
|
(ci/cd) add detailed debugging info
|
2024-05-06 11:40:16 -07:00 |
|
Krrish Dholakia
|
918367cc7b
|
test: skip hanging test
|
2024-05-05 00:27:38 -07:00 |
|
Krrish Dholakia
|
b9f5b3c1a0
|
build(test_keys.py): improve error message for test
|
2024-05-01 08:22:28 -07:00 |
|
Krrish Dholakia
|
5583197d63
|
fix(proxy_server.py): fix setting offset-aware datetime
|
2024-04-25 21:18:32 -07:00 |
|
Ishaan Jaff
|
7b24a74e77
|
test fix - test_key_delete_ui
|
2024-04-20 12:30:58 -07:00 |
|
Ishaan Jaff
|
a34f725db7
|
fix - test keys
|
2024-04-20 11:43:43 -07:00 |
|
Krrish Dholakia
|
70716b3373
|
fix(proxy_server.py): fix key create logic + add unit tests
|
2024-04-16 13:08:34 -07:00 |
|
Ishaan Jaff
|
e20d3f29b0
|
test - delete keys
|
2024-04-15 16:37:20 -07:00 |
|
Krrish Dholakia
|
af2eabba91
|
fix(proxy_server.py): fix /key/update endpoint to update key duration
also adds a test for this to our ci/cd
|
2024-03-29 21:47:10 -07:00 |
|
Krrish Dholakia
|
c81c9c2583
|
fix(proxy_server.py): fix model info check
|
2024-03-23 15:59:17 -07:00 |
|
Krrish Dholakia
|
066346b8c8
|
test: handle flaky tests locally - causing db timeout issues on circle ci
|
2024-03-19 20:39:53 -07:00 |
|
Krrish Dholakia
|
f588bff69b
|
fix(proxy_server.py): fix spend log update
|
2024-03-18 20:26:28 -07:00 |
|
Krrish Dholakia
|
b9854a99d2
|
test: increase time before checking budget reset - avoid deadlocking
|
2024-03-06 22:16:59 -08:00 |
|
Krrish Dholakia
|
ad55f4dbb5
|
feat(proxy_server.py): retry if virtual key is rate limited
currently for chat completions
|
2024-03-05 19:00:03 -08:00 |
|
Krrish Dholakia
|
2b595bfdc9
|
test(test_keys.py): add more duration for test
|
2024-03-05 15:42:19 -08:00 |
|
Krrish Dholakia
|
4f3d1ee11f
|
test(test_keys.py): extend time for test
|
2024-03-02 21:19:34 -08:00 |
|
Krrish Dholakia
|
37ad5efc61
|
fix(proxy/utils.py): fix resetting budget logic
|
2024-03-02 20:52:54 -08:00 |
|
Krrish Dholakia
|
f7a2d3faef
|
test: skip sagemaker test - aws account suspended
|
2024-02-28 15:00:53 -08:00 |
|
Krrish Dholakia
|
5265d9042d
|
test: testing fixes
|
2024-02-27 20:35:42 -08:00 |
|
Krrish Dholakia
|
4a0df3cb4f
|
fix(proxy_cli.py-&&-proxy_server.py): bump reset budget intervals and fix pool limits for prisma connections
|
2024-02-06 19:39:49 -08:00 |
|
Krrish Dholakia
|
b6adeec347
|
fix(proxy_server.py): prisma client fixes for high traffic
|
2024-02-06 17:30:36 -08:00 |
|
Krrish Dholakia
|
ef0171e063
|
feat(utils.py): support cost tracking for openai/azure image gen models
|
2024-02-03 17:09:54 -08:00 |
|
ishaan-jaff
|
419bddae93
|
(fix) show error in test_keys
|
2024-02-02 19:15:03 -08:00 |
|
Krrish Dholakia
|
bfe2faa454
|
test(test_keys.py): separate streaming key info test from normal completion key info test
|
2024-02-02 19:03:47 -08:00 |
|
ishaan-jaff
|
d7d1aa1266
|
(ci/cd) run again
|
2024-02-02 18:56:03 -08:00 |
|
Krrish Dholakia
|
92ebe132e8
|
test(test_keys.py): fix test
|
2024-01-31 11:26:49 -08:00 |
|
Krrish Dholakia
|
ecc6855e49
|
build(config.yml): add more logging for docker build and test
|
2024-01-31 10:58:25 -08:00 |
|
Krrish Dholakia
|
eea4325e60
|
test(test_keys.py): add more retries for checking if budget updated
|
2024-01-31 10:44:19 -08:00 |
|
Krrish Dholakia
|
70d5f9ef96
|
test(test_keys.py): reorder tests
|
2024-01-31 09:37:37 -08:00 |
|
Krrish Dholakia
|
f9acad87dc
|
feat(proxy_server.py): enable cache controls per key + no-store cache flag
|
2024-01-30 20:46:50 -08:00 |
|
ishaan-jaff
|
9c4d9c2f58
|
(test) /key/info
|
2024-01-26 19:26:55 -08:00 |
|
ishaan-jaff
|
2a60af7164
|
(test) key crossing budget
|
2024-01-26 14:43:16 -08:00 |
|
ishaan-jaff
|
55b95e87dd
|
(fix) SpendLogs Table
|
2024-01-26 13:23:51 -08:00 |
|
Krrish Dholakia
|
554f1a090d
|
test(test_keys.py): add delay for test check
|
2024-01-25 18:31:07 -08:00 |
|
Krrish Dholakia
|
09ec6d6458
|
fix(utils.py): fix sagemaker async logging for sync streaming
https://github.com/BerriAI/litellm/issues/1592
|
2024-01-25 12:49:45 -08:00 |
|
Krrish Dholakia
|
8e1157fc92
|
test(test_keys.py): reset proxy spend
|
2024-01-24 21:08:09 -08:00 |
|
Krrish Dholakia
|
bb7705b494
|
test(test_users.py): test budgets with resets
|
2024-01-24 15:30:30 -08:00 |
|
Krrish Dholakia
|
e471157d57
|
test(test_keys.py): add testing to make sure budget resets are working as expected
|
2024-01-24 15:13:11 -08:00 |
|
Krrish Dholakia
|
f0ada5c950
|
test(test_spend_logs): new endpoint test for /spend/logs
|
2024-01-24 11:20:31 -08:00 |
|
Krrish Dholakia
|
d6844f43c8
|
test(test_keys.py): use correct model name for token counting
|
2024-01-23 17:46:14 -08:00 |
|