litellm

Author	SHA1	Message	Date
Krrish Dholakia	c29853552b	bump: version 1.49.6 → 1.49.7	2024-10-17 23:01:06 -07:00
Krrish Dholakia	6e7e96211c	docs(argilla.md): add sampling rate to argilla calls	2024-10-17 22:54:12 -07:00
Krrish Dholakia	4f5ff65882	docs(argilla.md): add doc on argilla logging	2024-10-17 22:51:55 -07:00
Krrish Dholakia	87f256bec1	docs(user_keys.md): add regex doc for clientside auth params	2024-10-17 22:42:29 -07:00
Krish Dholakia	f252350881	LiteLLM Minor Fixes & Improvements (10/17/2024) (#6293) * fix(ui_sso.py): fix faulty admin only check Fixes https://github.com/BerriAI/litellm/issues/6286 * refactor(sso_helper_utils.py): refactor /sso/callback to use helper utils, covered by unit testing Prevent future regressions * feat(prompt_factory): support 'ensure_alternating_roles' param Closes https://github.com/BerriAI/litellm/issues/6257 * fix(proxy/utils.py): add dailytagspend to expected views * feat(auth_utils.py): support setting regex for clientside auth credentials Fixes https://github.com/BerriAI/litellm/issues/6203 * build(cookbook): add tutorial for mlflow + langchain + litellm proxy tracing * feat(argilla.py): add argilla logging integration Closes https://github.com/BerriAI/litellm/issues/6201 * fix: fix linting errors * fix: fix ruff error * test: fix test * fix: update vertex ai assumption - parts not always guaranteed (#6296) * docs(configs.md): add argila env var to docs	2024-10-17 22:09:11 -07:00
Krrish Dholakia	5e381caf75	Revert "fix(ui_sso.py): fix faulty admin check" This reverts commit `22d95c99b5`.	2024-10-17 11:04:26 -07:00
Krrish Dholakia	22d95c99b5	fix(ui_sso.py): fix faulty admin check fix check to make sure admin can log into ui in 'admin_only' ui access mode Fixes https://github.com/BerriAI/litellm/issues/6286	2024-10-17 11:02:49 -07:00
Ishaan Jaff	f724f3131d	(testing) add unit tests for LLMCachingHandler Class (#6279) * add unit testing for test_async_set_cache * test test_async_log_cache_hit_on_callbacks * assert the correct response type is returned * test_convert_cached_result_to_model_response * unit testing for caching handler	2024-10-17 19:12:57 +05:30
Ishaan Jaff	202b5cc2cd	test_awesome_otel_with_message_logging_off	2024-10-17 16:43:25 +05:30
Ishaan Jaff	4554bf760c	(testing) add test coverage for LLM OTEL logging (#6227) * add test coverage for OTEL logging * test_async_otel_callback * test test_awesome_otel_with_message_logging_off * fix otel testing * add otel testing * otel testing * otel testing * otel testing coverage * otel add testing	2024-10-17 16:34:04 +05:30
Ishaan Jaff	5bada7cbce	fix otel tests	2024-10-17 16:32:56 +05:30
Ishaan Jaff	dd4f01a75e	Revert "(perf) move s3 logging to Batch logging + async [94% faster p… (#6275) * Revert "(perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165)" This reverts commit `2a5624af47`. * fix test s3 * add test_basic_s3_logging	2024-10-17 16:14:57 +05:30
Krrish Dholakia	81766e7049	bump: version 1.49.5 → 1.49.6	2024-10-16 22:19:08 -07:00
Krish Dholakia	38a9a106d2	LiteLLM Minor Fixes & Improvements (10/16/2024) (#6265) * fix(caching_handler.py): handle positional arguments in add cache logic Fixes https://github.com/BerriAI/litellm/issues/6264 * feat(litellm_pre_call_utils.py): allow forwarding openai org id to backend client https://github.com/BerriAI/litellm/issues/6237 * docs(configs.md): add 'forward_openai_org_id' to docs * fix(proxy_server.py): return model info if user_model is set Fixes https://github.com/BerriAI/litellm/issues/6233 * fix(hosted_vllm/chat/transformation.py): don't set tools unless non-none * fix(openai.py): improve debug log for openai 'str' error Addresses https://github.com/BerriAI/litellm/issues/6272 * fix(proxy_server.py): fix linting error * fix(proxy_server.py): fix linting errors * test: skip WIP test * docs(openai.md): add docs on passing openai org id from client to openai	2024-10-16 22:16:23 -07:00
yujonglee	43878bd2a0	remove ask mode (#6271)	2024-10-16 22:01:50 -07:00
Krish Dholakia	e22e8d24ef	Litellm router code coverage 3 (#6274) * refactor(router.py): move assistants api endpoints to using 1 pass-through factory function Reduces code, increases testing coverage * refactor(router.py): reduce _common_check_available_deployment function size make code more maintainable - reduce possible errors * test(router_code_coverage.py): include batch_utils + pattern matching in enforced 100% code coverage Improves reliability * fix(router.py): fix model id match model dump	2024-10-16 21:30:25 -07:00
Ishaan Jaff	891e9001b5	(testing) add router unit testing for `send_llm_exception_alert` , `router_cooldown_event_callback` , cooldown utils (#6258) * add router unit testing for send_llm_exception_alert * test router_cooldown_event_callback * test test_router_cooldown_event_callback_no_prometheus * test_router_cooldown_event_callback_no_deployment * test_router_cooldown_event_callback_no_deployment * add testing for test_should_run_cooldown_logic * test_increment_deployment_successes_for_current_minute_does_not_write_to_redis * test test_should_cooldown_deployment_allowed_fails_set_on_router * use helper for _is_allowed_fails_set_on_router * add complete testing for cooldown utils * move router unit tests * move router handle error * fix test_send_llm_exception_alert_no_logger	2024-10-16 23:19:51 +05:30
Ishaan Jaff	8530000b44	(testing) Router add testing coverage (#6253) * test: add more router code coverage * test: additional router testing coverage * fix: fix linting error * test: fix tests for ci/cd * test: fix test * test: handle flaky tests --------- Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>	2024-10-16 07:32:27 -07:00
Krish Dholakia	54ebdbf7ce	LiteLLM Minor Fixes & Improvements (10/15/2024) (#6242) * feat(litellm_pre_call_utils.py): support forwarding request headers to backend llm api * fix(litellm_pre_call_utils.py): handle custom litellm key header * test(router_code_coverage.py): check if all router functions are dire… (#6186) * test(router_code_coverage.py): check if all router functions are directly tested prevent regressions * docs(configs.md): document all environment variables (#6185) * docs: make it easier to find anthropic/openai prompt caching doc * aded codecov yml (#6207) * fix codecov.yaml * run ci/cd again * (refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208) * use folder for caching * fix importing caching * fix clickhouse pyright * fix linting * fix correctly pass kwargs and args * fix test case for embedding * fix linting * fix embedding caching logic * fix refactor handle utils.py * fix test_embedding_caching_azure_individual_items_reordered * (feat) prometheus have well defined latency buckets (#6211) * fix prometheus have well defined latency buckets * use a well define latency bucket * use types file for prometheus logging * add test for LATENCY_BUCKETS * fix prom testing * fix config.yml * (refactor caching) use LLMCachingHandler for caching streaming responses (#6210) * use folder for caching * fix importing caching * fix clickhouse pyright * fix linting * fix correctly pass kwargs and args * fix test case for embedding * fix linting * fix embedding caching logic * fix refactor handle utils.py * refactor async set stream cache * fix linting * bump (#6187) * update code cov yaml * fix config.yml * add caching component to code cov * fix config.yml ci/cd * add coverage for proxy auth * (refactor caching) use common `_retrieve_from_cache` helper (#6212) * use folder for caching * fix importing caching * fix clickhouse pyright * fix linting * fix correctly pass kwargs and args * fix test case for embedding * fix linting * fix embedding caching logic * fix refactor handle utils.py * refactor async set stream cache * fix linting * refactor - use _retrieve_from_cache * refactor use _convert_cached_result_to_model_response * fix linting errors * bump: version 1.49.2 → 1.49.3 * fix code cov components * test(test_router_helpers.py): add router component unit tests * test: add additional router tests * test: add more router testing * test: add more router testing + more mock functions * ci(router_code_coverage.py): fix check --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: yujonglee <yujonglee.dev@gmail.com> * bump: version 1.49.3 → 1.49.4 * (refactor) use helper function `_assemble_complete_response_from_streaming_chunks` to assemble complete responses in caching and logging callbacks (#6220) * (refactor) use _assemble_complete_response_from_streaming_chunks * add unit test for test_assemble_complete_response_from_streaming_chunks_1 * fix assemble complete_streaming_response * config add logging_testing * add logging_coverage in codecov * test test_assemble_complete_response_from_streaming_chunks_3 * add unit tests for _assemble_complete_response_from_streaming_chunks * fix remove unused / junk function * add test for streaming_chunks when error assembling * (refactor) OTEL - use safe_set_attribute for setting attributes (#6226) * otel - use safe_set_attribute for setting attributes * fix OTEL only use safe_set_attribute * (fix) prompt caching cost calculation OpenAI, Azure OpenAI (#6231) * fix prompt caching cost calculation * fix testing for prompt cache cost calc * fix(allowed_model_region): allow us as allowed region (#6234) * test(router_code_coverage.py): check if all router functions are dire… (#6186) * test(router_code_coverage.py): check if all router functions are directly tested prevent regressions * docs(configs.md): document all environment variables (#6185) * docs: make it easier to find anthropic/openai prompt caching doc * aded codecov yml (#6207) * fix codecov.yaml * run ci/cd again * (refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208) * use folder for caching * fix importing caching * fix clickhouse pyright * fix linting * fix correctly pass kwargs and args * fix test case for embedding * fix linting * fix embedding caching logic * fix refactor handle utils.py * fix test_embedding_caching_azure_individual_items_reordered * (feat) prometheus have well defined latency buckets (#6211) * fix prometheus have well defined latency buckets * use a well define latency bucket * use types file for prometheus logging * add test for LATENCY_BUCKETS * fix prom testing * fix config.yml * (refactor caching) use LLMCachingHandler for caching streaming responses (#6210) * use folder for caching * fix importing caching * fix clickhouse pyright * fix linting * fix correctly pass kwargs and args * fix test case for embedding * fix linting * fix embedding caching logic * fix refactor handle utils.py * refactor async set stream cache * fix linting * bump (#6187) * update code cov yaml * fix config.yml * add caching component to code cov * fix config.yml ci/cd * add coverage for proxy auth * (refactor caching) use common `_retrieve_from_cache` helper (#6212) * use folder for caching * fix importing caching * fix clickhouse pyright * fix linting * fix correctly pass kwargs and args * fix test case for embedding * fix linting * fix embedding caching logic * fix refactor handle utils.py * refactor async set stream cache * fix linting * refactor - use _retrieve_from_cache * refactor use _convert_cached_result_to_model_response * fix linting errors * bump: version 1.49.2 → 1.49.3 * fix code cov components * test(test_router_helpers.py): add router component unit tests * test: add additional router tests * test: add more router testing * test: add more router testing + more mock functions * ci(router_code_coverage.py): fix check --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: yujonglee <yujonglee.dev@gmail.com> * bump: version 1.49.3 → 1.49.4 * (refactor) use helper function `_assemble_complete_response_from_streaming_chunks` to assemble complete responses in caching and logging callbacks (#6220) * (refactor) use _assemble_complete_response_from_streaming_chunks * add unit test for test_assemble_complete_response_from_streaming_chunks_1 * fix assemble complete_streaming_response * config add logging_testing * add logging_coverage in codecov * test test_assemble_complete_response_from_streaming_chunks_3 * add unit tests for _assemble_complete_response_from_streaming_chunks * fix remove unused / junk function * add test for streaming_chunks when error assembling * (refactor) OTEL - use safe_set_attribute for setting attributes (#6226) * otel - use safe_set_attribute for setting attributes * fix OTEL only use safe_set_attribute * fix(allowed_model_region): allow us as allowed region --------- Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: yujonglee <yujonglee.dev@gmail.com> * fix(litellm_pre_call_utils.py): support 'us' region routing + fix header forwarding to filter on `x-` headers * docs(customer_routing.md): fix region-based routing example * feat(azure.py): handle empty arguments function call - azure Closes https://github.com/BerriAI/litellm/issues/6241 * feat(guardrails_ai.py): support guardrails ai integration Adds support for on-prem guardrails via guardrails ai * fix(proxy/utils.py): prevent sql injection attack Fixes https://huntr.com/bounties/a4f6d357-5b44-4e00-9cac-f1cc351211d2 * fix: fix linting errors * fix(litellm_pre_call_utils.py): don't log litellm api key in proxy server request headers * fix(litellm_pre_call_utils.py): don't forward stainless headers * docs(guardrails_ai.md): add guardrails ai quick start to docs * test: handle flaky test --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: yujonglee <yujonglee.dev@gmail.com> Co-authored-by: Marcus Elwin <marcus@elwin.com>	2024-10-16 07:32:06 -07:00
Ishaan Jaff	fc5b75d171	(router testing) Add testing coverage for `run_async_fallback` and `run_sync_fallback` (#6256) * add type hints for run_async_fallback * fix async fallback doc string * test run_async_fallback	2024-10-16 16:16:17 +05:30
Ishaan Jaff	d9a71650e3	(refactor) - caching use separate files for each cache class (#6251) * fix remove qdrant semantic caching to it's own folder * refactor use 1 file for s3 caching * fix use sep files for in mem and redis caching * fix refactor caching * add readme.md for caching folder	2024-10-16 13:17:21 +05:30
Ishaan Jaff	97ba4eea7d	(refactor) sync caching - use `LLMCachingHandler` class for get_cache (#6249) * caching - use _sync_set_cache * add sync _sync_add_streaming_response_to_cache * use caching class for cache storage * fix use _sync_get_cache * fix circular import * use _update_litellm_logging_obj_environment * use one helper for _process_async_embedding_cached_response * fix _is_call_type_supported_by_cache * fix checking cache * fix sync get cache * fix use _combine_cached_embedding_response_with_api_result * fix _update_litellm_logging_obj_environment * adjust test_redis_cache_acompletion_stream_bedrock	2024-10-16 12:33:49 +05:30
Ishaan Jaff	183bd5d873	(testing - litellm.Router ) add unit test coverage for pattern matching / wildcard routing (#6250) * add testing coverage for pattern match router * fix add_pattern * fix typo on router_cooldown_event_callback * add testing for pattern match router * fix add explanation for pattern match router	2024-10-16 11:58:05 +05:30
Ishaan Jaff	c390b07e29	fix RerankResponse make meta optional (#6248)	2024-10-16 11:47:44 +05:30
Ishaan Jaff	c1657efda2	bump: version 1.49.4 → 1.49.5	2024-10-16 11:46:33 +05:30
Ishaan Jaff	da6a7c3a55	(refactor) caching - use _sync_set_cache (#6224) * caching - use _sync_set_cache * add sync _sync_add_streaming_response_to_cache * use caching class for cache storage	2024-10-16 10:38:07 +05:30
dependabot[bot]	b8d4973661	Bump hono from 4.5.8 to 4.6.5 in /litellm-js/spend-logs (#6245) Bumps [hono](https://github.com/honojs/hono) from 4.5.8 to 4.6.5. - [Release notes](https://github.com/honojs/hono/releases) - [Commits](https://github.com/honojs/hono/compare/v4.5.8...v4.6.5) --- updated-dependencies: - dependency-name: hono dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-10-16 10:37:31 +05:30
Ishaan Jaff	6909d8e11b	fix arize handle optional params (#6243)	2024-10-16 08:33:40 +05:30
Ishaan Jaff	1994100028	(fix) prompt caching cost calculation OpenAI, Azure OpenAI (#6231) * fix prompt caching cost calculation * fix testing for prompt cache cost calc	2024-10-15 18:55:31 +05:30
Ishaan Jaff	846bb4cb91	(refactor) OTEL - use safe_set_attribute for setting attributes (#6226) * otel - use safe_set_attribute for setting attributes * fix OTEL only use safe_set_attribute	2024-10-15 13:39:29 +05:30
Ishaan Jaff	a69c670baa	(refactor) use helper function `_assemble_complete_response_from_streaming_chunks` to assemble complete responses in caching and logging callbacks (#6220) * (refactor) use _assemble_complete_response_from_streaming_chunks * add unit test for test_assemble_complete_response_from_streaming_chunks_1 * fix assemble complete_streaming_response * config add logging_testing * add logging_coverage in codecov * test test_assemble_complete_response_from_streaming_chunks_3 * add unit tests for _assemble_complete_response_from_streaming_chunks * fix remove unused / junk function * add test for streaming_chunks when error assembling	2024-10-15 12:45:12 +05:30
Krrish Dholakia	e9a46b992c	bump: version 1.49.3 → 1.49.4	2024-10-14 22:44:14 -07:00
Krish Dholakia	1eb435e50a	test(router_code_coverage.py): check if all router functions are dire… (#6186) * test(router_code_coverage.py): check if all router functions are directly tested prevent regressions * docs(configs.md): document all environment variables (#6185) * docs: make it easier to find anthropic/openai prompt caching doc * aded codecov yml (#6207) * fix codecov.yaml * run ci/cd again * (refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208) * use folder for caching * fix importing caching * fix clickhouse pyright * fix linting * fix correctly pass kwargs and args * fix test case for embedding * fix linting * fix embedding caching logic * fix refactor handle utils.py * fix test_embedding_caching_azure_individual_items_reordered * (feat) prometheus have well defined latency buckets (#6211) * fix prometheus have well defined latency buckets * use a well define latency bucket * use types file for prometheus logging * add test for LATENCY_BUCKETS * fix prom testing * fix config.yml * (refactor caching) use LLMCachingHandler for caching streaming responses (#6210) * use folder for caching * fix importing caching * fix clickhouse pyright * fix linting * fix correctly pass kwargs and args * fix test case for embedding * fix linting * fix embedding caching logic * fix refactor handle utils.py * refactor async set stream cache * fix linting * bump (#6187) * update code cov yaml * fix config.yml * add caching component to code cov * fix config.yml ci/cd * add coverage for proxy auth * (refactor caching) use common `_retrieve_from_cache` helper (#6212) * use folder for caching * fix importing caching * fix clickhouse pyright * fix linting * fix correctly pass kwargs and args * fix test case for embedding * fix linting * fix embedding caching logic * fix refactor handle utils.py * refactor async set stream cache * fix linting * refactor - use _retrieve_from_cache * refactor use _convert_cached_result_to_model_response * fix linting errors * bump: version 1.49.2 → 1.49.3 * fix code cov components * test(test_router_helpers.py): add router component unit tests * test: add additional router tests * test: add more router testing * test: add more router testing + more mock functions * ci(router_code_coverage.py): fix check --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: yujonglee <yujonglee.dev@gmail.com>	2024-10-14 22:44:00 -07:00
Krish Dholakia	39486e2003	Litellm dev 10 14 2024 (#6221) * fix(__init__.py): expose DualCache, RedisCache, InMemoryCache on root abstract internal file refactors from impacting users * feat(utils.py): handle invalid openai parallel tool calling response Fixes https://community.openai.com/t/model-tries-to-call-unknown-function-multi-tool-use-parallel/490653 * docs(bedrock.md): clarify all bedrock models are supported Closes https://github.com/BerriAI/litellm/issues/6168#issuecomment-2412082236	2024-10-14 22:11:14 -07:00
Ishaan Jaff	cda0a993e2	fix importing Cache from litellm (#6219)	2024-10-15 08:47:23 +05:30
Ishaan Jaff	d0a3052937	(refactor router.py ) - PR 3 - Ensure all functions under 100 lines (#6181) * add flake 8 check * split up litellm _acompletion * fix get model client * refactor use commong func to add metadata to kwargs * use common func to get timeout * re-use helper to _get_async_model_client * use _handle_mock_testing_rate_limit_error * fix docstring for _handle_mock_testing_rate_limit_error * fix function_with_retries * use helper for mock testing fallbacks * router - use 1 func for simple_shuffle * add doc string for simple_shuffle * use 1 function for filtering cooldown deployments * fix use common helper to _get_fallback_model_group_from_fallbacks	2024-10-14 21:27:54 +05:30
Ishaan Jaff	0761a03d05	fix code cov components	2024-10-14 20:35:41 +05:30
Ishaan Jaff	cc6f34e9d4	bump: version 1.49.2 → 1.49.3	2024-10-14 19:13:01 +05:30
Ishaan Jaff	4ebe6813c0	(refactor caching) use common `_retrieve_from_cache` helper (#6212) * use folder for caching * fix importing caching * fix clickhouse pyright * fix linting * fix correctly pass kwargs and args * fix test case for embedding * fix linting * fix embedding caching logic * fix refactor handle utils.py * refactor async set stream cache * fix linting * refactor - use _retrieve_from_cache * refactor use _convert_cached_result_to_model_response * fix linting errors	2024-10-14 19:12:41 +05:30
Ishaan Jaff	284deafd0d	add coverage for proxy auth	2024-10-14 18:50:10 +05:30
Ishaan Jaff	ef2b8b0991	fix config.yml ci/cd	2024-10-14 18:42:31 +05:30
Ishaan Jaff	045a029c54	add caching component to code cov	2024-10-14 18:40:24 +05:30
Ishaan Jaff	23b4ded35e	fix config.yml	2024-10-14 18:38:07 +05:30
Ishaan Jaff	fac1bb4deb	update code cov yaml	2024-10-14 18:31:22 +05:30
yujonglee	4132a97787	bump (#6187)	2024-10-14 18:22:54 +05:30
Ishaan Jaff	d1bef4ad81	(refactor caching) use LLMCachingHandler for caching streaming responses (#6210) * use folder for caching * fix importing caching * fix clickhouse pyright * fix linting * fix correctly pass kwargs and args * fix test case for embedding * fix linting * fix embedding caching logic * fix refactor handle utils.py * refactor async set stream cache * fix linting	2024-10-14 17:46:45 +05:30
Ishaan Jaff	78f3228e17	fix config.yml	2024-10-14 17:28:52 +05:30
Ishaan Jaff	c677fea7e8	fix prom testing	2024-10-14 17:23:40 +05:30
Ishaan Jaff	603299e3c8	(feat) prometheus have well defined latency buckets (#6211) * fix prometheus have well defined latency buckets * use a well define latency bucket * use types file for prometheus logging * add test for LATENCY_BUCKETS	2024-10-14 17:16:01 +05:30
Ishaan Jaff	4d1b4beb3d	(refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208) * use folder for caching * fix importing caching * fix clickhouse pyright * fix linting * fix correctly pass kwargs and args * fix test case for embedding * fix linting * fix embedding caching logic * fix refactor handle utils.py * fix test_embedding_caching_azure_individual_items_reordered	2024-10-14 16:34:01 +05:30

18098 commits