litellm

Author	SHA1	Message	Date
Krish Dholakia	136693cac4	LiteLLM Minor Fixes & Improvements (11/05/2024) (#6590) * fix(pattern_matching_router.py): update model name using correct function * fix(langfuse.py): metadata deepcopy can cause unhandled error (#6563) Co-authored-by: seva <seva@inita.com> * fix(stream_chunk_builder_utils.py): correctly set prompt tokens + log correct streaming usage Closes https://github.com/BerriAI/litellm/issues/6488 * build(deps): bump cookie and express in /docs/my-website (#6566) Bumps [cookie](https://github.com/jshttp/cookie) and [express](https://github.com/expressjs/express). These dependencies needed to be updated together. Updates `cookie` from 0.6.0 to 0.7.1 - [Release notes](https://github.com/jshttp/cookie/releases) - [Commits](https://github.com/jshttp/cookie/compare/v0.6.0...v0.7.1) Updates `express` from 4.20.0 to 4.21.1 - [Release notes](https://github.com/expressjs/express/releases) - [Changelog](https://github.com/expressjs/express/blob/4.21.1/History.md) - [Commits](https://github.com/expressjs/express/compare/4.20.0...4.21.1) --- updated-dependencies: - dependency-name: cookie dependency-type: indirect - dependency-name: express dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * docs(virtual_keys.md): update Dockerfile reference (#6554) Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com> * (proxy fix) - call connect on prisma client when running setup (#6534) * critical fix - call connect on prisma client when running setup * fix test_proxy_server_prisma_setup * Add 3.5 haiku (#6588) * feat: add claude-3-5-haiku-20241022 entries * feat: add claude-3-5-haiku-20241022 and vertex_ai/claude-3-5-haiku@20241022 models * add missing entries, remove vision * remove image token costs * Litellm perf improvements 3 (#6573) * perf: move writing key to cache, to background task * perf(litellm_pre_call_utils.py): add otel tracing for pre-call utils adds 200ms on calls with pgdb connected * fix(litellm_pre_call_utils.py): rename call_type to actual call used * perf(proxy_server.py): remove db logic from _get_config_from_file was causing db calls to occur on every llm request, if team_id was set on key * fix(auth_checks.py): add check for reducing db calls if user/team id does not exist in db reduces latency/call by ~100ms * fix(proxy_server.py): minor fix on existing_settings not incl alerting * fix(exception_mapping_utils.py): map databricks exception string * fix(auth_checks.py): fix auth check logic * test: correctly mark flaky test * fix(utils.py): handle auth token error for tokenizers.from_pretrained * build: fix map * build: fix json for model map * fix ImageObject conversion (#6584) * (fix) litellm.text_completion raises a non-blocking error on simple usage (#6546) * unit test test_huggingface_text_completion_logprobs * fix return TextCompletionHandler convert_chat_to_text_completion * fix hf rest api * fix test_huggingface_text_completion_logprobs * fix linting errors * fix importLiteLLMResponseObjectHandler * fix test for LiteLLMResponseObjectHandler * fix test text completion * fix allow using 15 seconds for premium license check * testing fix bedrock deprecated cohere.command-text-v14 * (feat) add `Predicted Outputs` for OpenAI (#6594) * bump openai to openai==1.54.0 * add 'prediction' param * test test_openai_prediction_param.py * test_openai_prediction_param_with_caching * doc Predicted Outputs * doc Predicted Output * (fix) Vertex Improve Performance when using `image_url` (#6593) * fix transformation vertex * test test_process_gemini_image * test_image_completion_request * testing fix - bedrock has deprecated cohere.command-text-v14 * fix vertex pdf * bump: version 1.51.5 → 1.52.0 * fix(lowest_tpm_rpm_routing.py): fix parallel rate limit check (#6577) * fix(lowest_tpm_rpm_v2.py): return headers in correct format * test: update test * test: remove eol model * fix(proxy_server.py): fix db config loading logic * fix(proxy_server.py): fix order of config / db updates, to ensure fields not overwritten * test: skip test if required env var is missing * test: fix test * test: mark flaky test * test: handle anthropic api instability * test(test_proxy_utils.py): add testing for db config update logic * Update setuptools in docker and fastapi to latest version, in order to upgrade starlette version (#6597) Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com> Co-authored-by: Jacob Hagstedt <wcgs@novonordisk.com> * fix(langfuse.py): fix linting errors * fix: fix linting errors * fix: fix casting error * fix: fix typing error * fix: add more tests * fix(utils.py): fix return_processed_chunk_logic * Revert "Update setuptools in docker and fastapi to latest verison, in order t…" (#6615) This reverts commit `1a7f7bdfb7`. * docs fix clarify team_id on team based logging * doc fix team based logging with langfuse * fix flake8 checks * test: bump sleep time * refactor: replace claude-instant-1.2 with haiku in testing * fix(proxy_server.py): move to using sl payload in track_cost_callback * fix(proxy_server.py): fix linting errors * fix(proxy_server.py): fallback to kwargs(response_cost) if given * test: remove claude-instant-1 from tests * test: fix claude test * build: remove lint.yml --------- Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com> Co-authored-by: Vsevolod Karvetskiy <56288164+karvetskiy@users.noreply.github.com> Co-authored-by: seva <seva@inita.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Emmanuel Ferdman <emmanuelferdman@gmail.com> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: paul-gauthier <69695708+paul-gauthier@users.noreply.github.com> Co-authored-by: Jacob Hagstedt P Suorra <Jacobh2@users.noreply.github.com> Co-authored-by: Jacob Hagstedt <wcgs@novonordisk.com>	2024-11-07 04:17:05 +05:30
Krish Dholakia	d57be47b0f	Litellm ruff linting enforcement (#5992) * ci(config.yml): add a 'check_code_quality' step Addresses https://github.com/BerriAI/litellm/issues/5991 * ci(config.yml): check why circle ci doesn't pick up this test * ci(config.yml): fix to run 'check_code_quality' tests * fix(__init__.py): fix unprotected import * fix(__init__.py): don't remove unused imports * build(ruff.toml): update ruff.toml to ignore unused imports * fix: ruff + pyright - fix linting + type-checking errors * fix: fix linting errors * fix(lago.py): fix module init error * fix: fix linting errors * ci(config.yml): cd into correct dir for checks * fix(proxy_server.py): fix linting error * fix(utils.py): fix bare except causes ruff linting errors * fix: ruff - fix remaining linting errors * fix(clickhouse.py): use standard logging object * fix(__init__.py): fix unprotected import * fix: ruff - fix linting errors * fix: fix linting errors * ci(config.yml): cleanup code qa step (formatting handled in local_testing) * fix(_health_endpoints.py): fix ruff linting errors * ci(config.yml): just use ruff in check_code_quality pipeline for now * build(custom_guardrail.py): include missing file * style(embedding_handler.py): fix ruff check	2024-10-01 19:44:20 -04:00
Ishaan Jaff	4b7ceade64	mark test_key_info_spend_values_streaming as flaky	2024-08-29 14:39:53 -07:00
Krrish Dholakia	5703da9b42	fix(user_api_key_auth.py): Fixes https://github.com/BerriAI/litellm/issues/5111	2024-08-08 10:30:15 -07:00
Krrish Dholakia	c27ae34a39	fix(proxy_server.py): use consistent 400-status code error code for exceeded budget errors standardizes error code for budget exceeded errors to status code 400	2024-06-11 16:10:58 -07:00
Ishaan Jaff	6166381579	fix migrate to LitellmUserRoles	2024-05-30 13:57:50 -07:00
Krrish Dholakia	95566dc249	fix(proxy_server.py): fix model list returned for `/model/info` when team has restricted access	2024-05-25 13:21:33 -07:00
Krrish Dholakia	25a2f00db6	fix(proxy_server.py): fix model check for `/v1/models` endpoint when team has restricted access	2024-05-25 13:02:03 -07:00
Ishaan Jaff	595a2a5b1b	fix test_key_model_list	2024-05-22 20:49:19 -07:00
Krrish Dholakia	f2766fddbf	fix(proxy_server.py): fix `/v1/models` bug where it would return empty list handle 'all-team-models' being set for a given key	2024-05-07 13:43:15 -07:00
Ishaan Jaff	31c4f5f811	(ci/cd) add detailed debugging info	2024-05-06 11:40:16 -07:00
Krrish Dholakia	918367cc7b	test: skip hanging test	2024-05-05 00:27:38 -07:00
Krrish Dholakia	b9f5b3c1a0	build(test_keys.py): improve error message for test	2024-05-01 08:22:28 -07:00
Krrish Dholakia	5583197d63	fix(proxy_server.py): fix setting offset-aware datetime	2024-04-25 21:18:32 -07:00
Ishaan Jaff	7b24a74e77	test fix - test_key_delete_ui	2024-04-20 12:30:58 -07:00
Ishaan Jaff	a34f725db7	fix - test keys	2024-04-20 11:43:43 -07:00
Krrish Dholakia	70716b3373	fix(proxy_server.py): fix key create logic + add unit tests	2024-04-16 13:08:34 -07:00
Ishaan Jaff	e20d3f29b0	test - delete keys	2024-04-15 16:37:20 -07:00
Krrish Dholakia	af2eabba91	fix(proxy_server.py): fix `/key/update` endpoint to update key duration also adds a test for this to our ci/cd	2024-03-29 21:47:10 -07:00
Krrish Dholakia	c81c9c2583	fix(proxy_server.py): fix model info check	2024-03-23 15:59:17 -07:00
Krrish Dholakia	066346b8c8	test: handle flaky tests locally - causing db timeout issues on circle ci	2024-03-19 20:39:53 -07:00
Krrish Dholakia	f588bff69b	fix(proxy_server.py): fix spend log update	2024-03-18 20:26:28 -07:00
Krrish Dholakia	b9854a99d2	test: increase time before checking budget reset - avoid deadlocking	2024-03-06 22:16:59 -08:00
Krrish Dholakia	ad55f4dbb5	feat(proxy_server.py): retry if virtual key is rate limited currently for chat completions	2024-03-05 19:00:03 -08:00
Krrish Dholakia	2b595bfdc9	test(test_keys.py): add more duration for test	2024-03-05 15:42:19 -08:00
Krrish Dholakia	4f3d1ee11f	test(test_keys.py): extend time for test	2024-03-02 21:19:34 -08:00
Krrish Dholakia	37ad5efc61	fix(proxy/utils.py): fix resetting budget logic	2024-03-02 20:52:54 -08:00
Krrish Dholakia	f7a2d3faef	test: skip sagemaker test - aws account suspended	2024-02-28 15:00:53 -08:00
Krrish Dholakia	5265d9042d	test: testing fixes	2024-02-27 20:35:42 -08:00
Krrish Dholakia	4a0df3cb4f	fix(proxy_cli.py-&&-proxy_server.py): bump reset budget intervals and fix pool limits for prisma connections	2024-02-06 19:39:49 -08:00
Krrish Dholakia	b6adeec347	fix(proxy_server.py): prisma client fixes for high traffic	2024-02-06 17:30:36 -08:00
Krrish Dholakia	ef0171e063	feat(utils.py): support cost tracking for openai/azure image gen models	2024-02-03 17:09:54 -08:00
ishaan-jaff	419bddae93	(fix) show error in test_keys	2024-02-02 19:15:03 -08:00
Krrish Dholakia	bfe2faa454	test(test_keys.py): separate streaming key info test from normal completion key info test	2024-02-02 19:03:47 -08:00
ishaan-jaff	d7d1aa1266	(ci/cd) run again	2024-02-02 18:56:03 -08:00
Krrish Dholakia	92ebe132e8	test(test_keys.py): fix test	2024-01-31 11:26:49 -08:00
Krrish Dholakia	ecc6855e49	build(config.yml): add more logging for docker build and test	2024-01-31 10:58:25 -08:00
Krrish Dholakia	eea4325e60	test(test_keys.py): add more retries for checking if budget updated	2024-01-31 10:44:19 -08:00
Krrish Dholakia	70d5f9ef96	test(test_keys.py): reorder tests	2024-01-31 09:37:37 -08:00
Krrish Dholakia	f9acad87dc	feat(proxy_server.py): enable cache controls per key + no-store cache flag	2024-01-30 20:46:50 -08:00
ishaan-jaff	9c4d9c2f58	(test) /key/info	2024-01-26 19:26:55 -08:00
ishaan-jaff	2a60af7164	(test) key crossing budget	2024-01-26 14:43:16 -08:00
ishaan-jaff	55b95e87dd	(fix) SpendLogs Table	2024-01-26 13:23:51 -08:00
Krrish Dholakia	554f1a090d	test(test_keys.py): add delay for test check n	2024-01-25 18:31:07 -08:00
Krrish Dholakia	09ec6d6458	fix(utils.py): fix sagemaker async logging for sync streaming https://github.com/BerriAI/litellm/issues/1592	2024-01-25 12:49:45 -08:00
Krrish Dholakia	8e1157fc92	test(test_keys.py): reset proxy spend	2024-01-24 21:08:09 -08:00
Krrish Dholakia	bb7705b494	test(test_users.py): test budgets with resets	2024-01-24 15:30:30 -08:00
Krrish Dholakia	e471157d57	test(test_keys.py): add testing to make sure budget resets are working as expected	2024-01-24 15:13:11 -08:00
Krrish Dholakia	f0ada5c950	test(test_spend_logs): new endpoint test for /spend/logs	2024-01-24 11:20:31 -08:00
Krrish Dholakia	d6844f43c8	test(test_keys.py): use correct model name for token counting	2024-01-23 17:46:14 -08:00

53 commits