litellm

Author	SHA1	Message	Date
Ishaan Jaff	c047d51cc8	(feat) add `Predicted Outputs` for OpenAI (#6594) * bump openai to openai==1.54.0 * add 'prediction' param * testing fix bedrock deprecated cohere.command-text-v14 * test test_openai_prediction_param.py * test_openai_prediction_param_with_caching * doc Predicted Outputs * doc Predicted Output	2024-11-04 21:16:57 -08:00
Krish Dholakia	3a6ba0b955	Litellm perf improvements 3 (#6573 ) * perf: move writing key to cache, to background task * perf(litellm_pre_call_utils.py): add otel tracing for pre-call utils adds 200ms on calls with pgdb connected * fix(litellm_pre_call_utils.py'): rename call_type to actual call used * perf(proxy_server.py): remove db logic from _get_config_from_file was causing db calls to occur on every llm request, if team_id was set on key * fix(auth_checks.py): add check for reducing db calls if user/team id does not exist in db reduces latency/call by ~100ms * fix(proxy_server.py): minor fix on existing_settings not incl alerting * fix(exception_mapping_utils.py): map databricks exception string * fix(auth_checks.py): fix auth check logic * test: correctly mark flaky test * fix(utils.py): handle auth token error for tokenizers.from_pretrained	2024-11-05 03:51:26 +05:30
Ishaan Jaff	441adad3ae	(router_strategy/) ensure all async functions use async cache methods (#6489 ) * fix router strat * use async set / get cache in router_strategy * add coverage for router strategy * fix imports * fix batch_get_cache * use async methods for least busy * fix least busy use async methods * fix test_dual_cache_increment * test async_get_available_deployment when routing_strategy="least-busy"	2024-10-29 21:07:17 +05:30
Ishaan Jaff	c42ec81b8d	fix name of tests on config	2024-10-25 10:44:14 +04:00
Krish Dholakia	cc8dd80209	allow configuring httpx hooks for AsyncHTTPHandler (#6290 ) (#6415 ) * allow configuring httpx hooks for AsyncHTTPHandler (#6290) Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com> * Fixes and minor improvements for Helm Chart (#6402) * reckoner hack * fix default * add extracontainers option * revert chart * fix extracontainers * fix deployment * remove init container * update docs * add helm lint to deploy step * change name * (refactor) prometheus async_log_success_event to be under 100 LOC (#6416) * unit testig for prometheus * unit testing for success metrics * use 1 helper for _increment_token_metrics * use helper for _increment_remaining_budget_metrics * use _increment_remaining_budget_metrics * use _increment_top_level_request_and_spend_metrics * use helper for _set_latency_metrics * remove noqa violation * fix test prometheus * test prometheus * unit testing for all prometheus helper functions * fix prom unit tests * fix unit tests prometheus * fix unit test prom * (refactor) router - use static methods for client init utils (#6420) * use InitalizeOpenAISDKClient * use InitalizeOpenAISDKClient static method * fix # noqa: PLR0915 * (code cleanup) remove unused and undocumented logging integrations - litedebugger, berrispend (#6406) * code cleanup remove unused and undocumented code files * fix unused logging integrations cleanup * update chart version * add circleci tests --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Xingyao Wang <xingyao@all-hands.dev> * fix: fix linting error * fix(http_handler.py): fix linting error --------- Co-authored-by: Alejandro Rodríguez <alejorro70@gmail.com> Co-authored-by: Robert Brennan <accounts@rbren.io> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>	2024-10-24 22:00:24 -07:00
Ishaan Jaff	13e0b3f626	(feat) Support `audio`, `modalities` params (#6304 ) * add audio, modalities param * add test for gpt audio models * add get_supported_openai_params for GPT audio models * add supported params for audio * test_audio_output_from_model * bump openai to openai==1.52.0 * bump openai on pyproject * fix audio test * fix test mock_chat_response * handle audio for Message * fix handling audio for OAI compatible API endpoints * fix linting * fix mock dbrx test	2024-10-18 19:14:25 +05:30
Ishaan Jaff	a69c670baa	(refactor) use helper function `_assemble_complete_response_from_streaming_chunks` to assemble complete responses in caching and logging callbacks (#6220 ) * (refactor) use _assemble_complete_response_from_streaming_chunks * add unit test for test_assemble_complete_response_from_streaming_chunks_1 * fix assemble complete_streaming_response * config add logging_testing * add logging_coverage in codecov * test test_assemble_complete_response_from_streaming_chunks_3 * add unit tests for _assemble_complete_response_from_streaming_chunks * fix remove unused / junk function * add test for streaming_chunks when error assembling	2024-10-15 12:45:12 +05:30
Krish Dholakia	1eb435e50a	test(router_code_coverage.py): check if all router functions are dire… (#6186 ) * test(router_code_coverage.py): check if all router functions are directly tested prevent regressions * docs(configs.md): document all environment variables (#6185) * docs: make it easier to find anthropic/openai prompt caching doc * aded codecov yml (#6207) * fix codecov.yaml * run ci/cd again * (refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208) * use folder for caching * fix importing caching * fix clickhouse pyright * fix linting * fix correctly pass kwargs and args * fix test case for embedding * fix linting * fix embedding caching logic * fix refactor handle utils.py * fix test_embedding_caching_azure_individual_items_reordered * (feat) prometheus have well defined latency buckets (#6211) * fix prometheus have well defined latency buckets * use a well define latency bucket * use types file for prometheus logging * add test for LATENCY_BUCKETS * fix prom testing * fix config.yml * (refactor caching) use LLMCachingHandler for caching streaming responses (#6210) * use folder for caching * fix importing caching * fix clickhouse pyright * fix linting * fix correctly pass kwargs and args * fix test case for embedding * fix linting * fix embedding caching logic * fix refactor handle utils.py * refactor async set stream cache * fix linting * bump (#6187) * update code cov yaml * fix config.yml * add caching component to code cov * fix config.yml ci/cd * add coverage for proxy auth * (refactor caching) use common `_retrieve_from_cache` helper (#6212) * use folder for caching * fix importing caching * fix clickhouse pyright * fix linting * fix correctly pass kwargs and args * fix test case for embedding * fix linting * fix embedding caching logic * fix refactor handle utils.py * refactor async set stream cache * fix linting * refactor - use _retrieve_from_cache * refactor use _convert_cached_result_to_model_response * fix linting 
errors * bump: version 1.49.2 → 1.49.3 * fix code cov components * test(test_router_helpers.py): add router component unit tests * test: add additional router tests * test: add more router testing * test: add more router testing + more mock functions * ci(router_code_coverage.py): fix check --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: yujonglee <yujonglee.dev@gmail.com>	2024-10-14 22:44:00 -07:00
Ishaan Jaff	ef2b8b0991	fix config.yml ci/cd	2024-10-14 18:42:31 +05:30
Ishaan Jaff	23b4ded35e	fix config.yml	2024-10-14 18:38:07 +05:30
Ishaan Jaff	78f3228e17	fix config.yml	2024-10-14 17:28:52 +05:30
Ishaan Jaff	c677fea7e8	fix prom testing	2024-10-14 17:23:40 +05:30
Krish Dholakia	15b44c3221	docs(configs.md): document all environment variables (#6185)	2024-10-13 09:57:03 -07:00
Krish Dholakia	9f506860b5	ci(config.yml): add further testing coverage to codecov (#6184) Improve visibility into current testing	2024-10-12 22:37:46 -07:00
Krish Dholakia	d350733fa9	ci(config.yml): add local_testing tests to codecov coverage check (#6183) * ci(config.yml): add local_testing tests to codecov coverage check * fix(config.yml): wait for local_testing to complete before uploading coverage	2024-10-12 18:51:40 -07:00
Krrish Dholakia	a208c275d6	ci(config.yml): run all router tests	2024-10-12 16:01:57 -07:00
Krish Dholakia	fc13c023b7	build(config.yml): add codecov to repo (#6172 ) * build(config.yml): add codecov to repo ensures all commits have testing coverage * build(config.yml): fix ci config * build: fix .yml * build(config.yml): fix ci/cd * ci(config.yml): specify module to measure code coverage for * ci(config.yml): update config.yml version * ci: trigger new run * ci(config.yml): store combine * build(config.yml): check files before combine * ci(config.yml): fix check * ci(config.yml): add codecov coverage to ci/cd * ci(config.yml): add codecov to router tests * ci(config.yml): wait for router testing to complete before running codecov upload * ci(config.yml): handle multiple coverage.xml's * fix(router.py): cleanup print stack * ci(config.yml): fix config * ci(config.yml): fix config	2024-10-12 14:48:17 -07:00
Ishaan Jaff	d1760b1b04	(fix) clean up root repo - move entrypoint.sh and build_admin_ui to /docker (#6110) * fix move docker files to docker folders * move check file length * fix docker hub deploy * fix clean up root * fix circle ci config	2024-10-08 11:34:43 +05:30
Ishaan Jaff	59b247ab23	fix config.yml	2024-10-08 08:36:03 +05:30
Krish Dholakia	04e5963b65	Litellm expose disable schema update flag (#6085 ) * fix: enable new 'disable_prisma_schema_update' flag * build(config.yml): remove setup remote docker step * ci(config.yml): give container time to start up * ci(config.yml): update test * build(config.yml): actually start docker * build(config.yml): simplify grep check * fix(prisma_client.py): support reading disable_schema_update via env vars * ci(config.yml): add test to check if all general settings are documented * build(test_General_settings.py): check available dir * ci: check ../ repo path * build: check ./ * build: fix test	2024-10-05 21:26:51 -04:00
Krish Dholakia	fac3b2ee42	Add pyright to ci/cd + Fix remaining type-checking errors (#6082 ) * fix: fix type-checking errors * fix: fix additional type-checking errors * fix: additional type-checking error fixes * fix: fix additional type-checking errors * fix: additional type-check fixes * fix: fix all type-checking errors + add pyright to ci/cd * fix: fix incorrect import * ci(config.yml): use mypy on ci/cd * fix: fix type-checking errors in utils.py * fix: fix all type-checking errors on main.py * fix: fix mypy linting errors * fix(anthropic/cost_calculator.py): fix linting errors * fix: fix mypy linting errors * fix: fix linting errors	2024-10-05 17:04:00 -04:00
Ishaan Jaff	4e88fd65e1	(feat) openai prompt caching (non streaming) - add prompt_tokens_details in usage response (#6039 ) * add prompt_tokens_details in usage response * use _prompt_tokens_details as a param in Usage * fix linting errors * fix type error * fix ci/cd deps * bump deps for openai * bump deps openai * fix llm translation testing * fix llm translation embedding	2024-10-03 23:31:10 +05:30
Krish Dholakia	d57be47b0f	Litellm ruff linting enforcement (#5992 ) * ci(config.yml): add a 'check_code_quality' step Addresses https://github.com/BerriAI/litellm/issues/5991 * ci(config.yml): check why circle ci doesn't pick up this test * ci(config.yml): fix to run 'check_code_quality' tests * fix(__init__.py): fix unprotected import * fix(__init__.py): don't remove unused imports * build(ruff.toml): update ruff.toml to ignore unused imports * fix: fix: ruff + pyright - fix linting + type-checking errors * fix: fix linting errors * fix(lago.py): fix module init error * fix: fix linting errors * ci(config.yml): cd into correct dir for checks * fix(proxy_server.py): fix linting error * fix(utils.py): fix bare except causes ruff linting errors * fix: ruff - fix remaining linting errors * fix(clickhouse.py): use standard logging object * fix(__init__.py): fix unprotected import * fix: ruff - fix linting errors * fix: fix linting errors * ci(config.yml): cleanup code qa step (formatting handled in local_testing) * fix(_health_endpoints.py): fix ruff linting errors * ci(config.yml): just use ruff in check_code_quality pipeline for now * build(custom_guardrail.py): include missing file * style(embedding_handler.py): fix ruff check	2024-10-01 19:44:20 -04:00
Ishaan Jaff	045ecf3ffb	(feat proxy slack alerting) - allow opting in to getting key / internal user alerts (#5990 ) * define all slack alert types * use correct type hints for alert type * use correct defaults on slack alerting * add readme for slack alerting * fix linting error * update readme * docs all alert types * update slack alerting docs * fix slack alerting docs * handle new testing dir structure * fix config for testing * fix testing folder related imports * fix /tests import errors * fix import stream_chunk_testdata * docs alert types * fix test test_langfuse_trace_id * fix type checks for slack alerting * fix outage alerting test slack	2024-10-01 10:49:22 -07:00
Krrish Dholakia	c0cdc6e496	build(config.yml): fix build_and_test part of tests	2024-09-28 21:08:14 -07:00
Krrish Dholakia	3560f0ef2c	refactor: move all testing to top-level of repo Closes https://github.com/BerriAI/litellm/issues/486	2024-09-28 21:08:14 -07:00
Krish Dholakia	bd17424c4b	LiteLLM Minor Fixes & Improvements (09/26/2024) (#5925 ) (#5937 ) * LiteLLM Minor Fixes & Improvements (09/26/2024) (#5925) * fix(litellm_logging.py): don't initialize prometheus_logger if non premium user Prevents bad error messages in logs Fixes https://github.com/BerriAI/litellm/issues/5897 * Add Support for Custom Providers in Vision and Function Call Utils (#5688) * Add Support for Custom Providers in Vision and Function Call Utils Lookup * Remove parallel function call due to missing model info param * Add Unit Tests for Vision and Function Call Changes * fix-#5920: set header value to string to fix "'int' object has no att… (#5922) * LiteLLM Minor Fixes & Improvements (09/24/2024) (#5880) * LiteLLM Minor Fixes & Improvements (09/23/2024) (#5842) * feat(auth_utils.py): enable admin to allow client-side credentials to be passed Makes it easier for devs to experiment with finetuned fireworks ai models * feat(router.py): allow setting configurable_clientside_auth_params for a model Closes https://github.com/BerriAI/litellm/issues/5843 * build(model_prices_and_context_window.json): fix anthropic claude-3-5-sonnet max output token limit Fixes https://github.com/BerriAI/litellm/issues/5850 * fix(azure_ai/): support content list for azure ai Fixes https://github.com/BerriAI/litellm/issues/4237 * fix(litellm_logging.py): always set saved_cache_cost Set to 0 by default * fix(fireworks_ai/cost_calculator.py): add fireworks ai default pricing handles calling 405b+ size models * fix(slack_alerting.py): fix error alerting for failed spend tracking Fixes regression with slack alerting error monitoring * fix(vertex_and_google_ai_studio_gemini.py): handle gemini no candidates in streaming chunk error * docs(bedrock.md): add llama3-1 models * test: fix tests * fix(azure_ai/chat): fix transformation for azure ai calls * feat(azure_ai/embed): Add azure ai embeddings support Closes https://github.com/BerriAI/litellm/issues/5861 * fix(azure_ai/embed): 
enable async embedding * feat(azure_ai/embed): support azure ai multimodal embeddings * fix(azure_ai/embed): support async multi modal embeddings * feat(together_ai/embed): support together ai embedding calls * feat(rerank/main.py): log source documents for rerank endpoints to langfuse improves rerank endpoint logging * fix(langfuse.py): support logging `/audio/speech` input to langfuse * test(test_embedding.py): fix test * test(test_completion_cost.py): fix helper util * fix-#5920: set header value to string to fix "'int' object has no attribute 'encode'" --------- Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com> * Revert "fix-#5920: set header value to string to fix "'int' object has no att…" (#5926) This reverts commit `a554ae2695`. * build(model_prices_and_context_window.json): add azure ai cohere rerank model pricing Enables cost tracking for azure ai cohere rerank models * fix(litellm_logging.py): fix debug log to be clearer Closes https://github.com/BerriAI/litellm/issues/5909 * test(test_utils.py): fix test name * fix(azure_ai/cost_calculator.py): support cost tracking for azure ai rerank models * fix(azure_ai): fix azure ai base model cost tracking for rerank endpoints * fix(converse_handler.py): support new llama 3-2 models Fixes https://github.com/BerriAI/litellm/issues/5901 * fix(litellm_logging.py): ensure response is redacted for standard message logging Fixes https://github.com/BerriAI/litellm/issues/5890#issuecomment-2378242360 * fix(cost_calculator.py): use 'get_model_info' for cohere rerank cost calculation allows user to set custom cost for model * fix(config.yml): fix docker hub auht * build(config.yml): add docker auth to all tests * fix(db/create_views.py): fix linting error * fix(main.py): fix circular import * fix(azure_ai/__init__.py): fix circular import * fix(main.py): fix import * fix: fix linting errors * test: fix test * fix(proxy_server.py): pass premium user value on startup used for prometheus init --------- Co-authored-by: 
Cole Murray <colemurray.cs@gmail.com> Co-authored-by: bravomark <62681807+bravomark@users.noreply.github.com> * handle streaming for azure ai studio error * [Perf Proxy] parallel request limiter - use one cache update call (#5932) * fix parallel request limiter - use one cache update call * ci/cd run again * run ci/cd again * use docker username password * fix config.yml * fix config * fix config * fix config.yml * ci/cd run again * use correct typing for batch set cache * fix async_set_cache_pipeline * fix only check user id tpm / rpm limits when limits set * fix test_openai_azure_embedding_with_oidc_and_cf * test: fix test * test(test_rerank.py): fix test --------- Co-authored-by: Cole Murray <colemurray.cs@gmail.com> Co-authored-by: bravomark <62681807+bravomark@users.noreply.github.com> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>	2024-09-27 17:54:13 -07:00
Ishaan Jaff	f4613a100d	[Perf Proxy] parallel request limiter - use one cache update call (#5932 ) * fix parallel request limiter - use one cache update call * ci/cd run again * run ci/cd again * use docker username password * fix config.yml * fix config * fix config * fix config.yml * ci/cd run again * use correct typing for batch set cache * fix async_set_cache_pipeline * fix only check user id tpm / rpm limits when limits set * fix test_openai_azure_embedding_with_oidc_and_cf	2024-09-27 17:24:46 -07:00
Ishaan Jaff	d9e798ecda	[Testing-Proxy] Add E2E Admin UI testing (#5845 ) * add working ui e2e testing * ui test * ui playwright testing * install python on ui testing * add playwright testing * fix ui testing * fix ui testing * add redis vars for testing * fix playwright testing * fix playwright testing * rename ui testing * move e2e ui testing	2024-09-23 11:34:42 -07:00
Ishaan Jaff	1973ae8fb8	[Feat] Allow setting `supports_vision` for Custom OpenAI endpoints + Added testing (#5821 ) * add test for using images with custom openai endpoints * run all otel tests * update name of test * add custom openai model to test config * add test for setting supports_vision=True for model * fix test guardrails aporia * docs supports vison * fix yaml * fix yaml * docs supports vision * fix bedrock guardrail test * fix cohere rerank test * update model_group doc string * add better prints on test	2024-09-21 11:35:55 -07:00
Krish Dholakia	3933fba41f	LiteLLM Minor Fixes & Improvements (09/19/2024) (#5793 ) * fix(model_prices_and_context_window.json): add cost tracking for more vertex llama3.1 model 8b and 70b models * fix(proxy/utils.py): handle data being none on pre-call hooks * fix(proxy/): create views on initial proxy startup fixes base case, where user starts proxy for first time Fixes https://github.com/BerriAI/litellm/issues/5756 * build(config.yml): fix vertex version for test * feat(ui/): support enabling/disabling slack alerting Allows admin to turn on/off slack alerting through ui * feat(rerank/main.py): support langfuse logging * fix(proxy/utils.py): fix linting errors * fix(langfuse.py): log clean metadata * test(tests): replace deprecated openai model	2024-09-20 08:19:52 -07:00
Ishaan Jaff	85acdb9193	[Feat] Add `max_completion_tokens` param (#5691 ) * add max_completion_tokens * add max_completion_tokens * add max_completion_tokens support for OpenAI models * add max_completion_tokens param * add max_completion_tokens for bedrock converse models * add test for converse maxTokens * fix openai o1 param mapping test * move test optional params * add max_completion_tokens for anthropic api * fix conftest * add max_completion tokens for vertex ai partner models * add max_completion_tokens for fireworks ai * add max_completion_tokens for hf rest api * add test for param mapping * add param mapping for vertex, gemini + testing * predibase is the most unstable and unusable llm api in prod, can't handle our ci/cd * add max_completion_tokens to openai supported params * fix fireworks ai param mapping	2024-09-14 14:57:01 -07:00
Ishaan Jaff	ded40e4d41	bump openai to 1.45.0	2024-09-12 14:18:15 -07:00
Ishaan Jaff	bea34c9231	fix config.yml	2024-09-12 09:28:45 -07:00
Ishaan Jaff	9ca7de58d6	fix testing	2024-09-12 08:42:00 -07:00
Ishaan Jaff	e5a776dc07	make separate assistants testing pipeline	2024-09-12 08:30:21 -07:00
Ishaan Jaff	f880e2b958	fix respx	2024-09-12 08:26:31 -07:00
Ishaan Jaff	fbe92df87e	fix router tests	2024-09-12 08:24:37 -07:00
Ishaan Jaff	9c79c1c7b2	fix ci/cd tests	2024-09-12 08:23:56 -07:00
Ishaan Jaff	d944bd98b9	fix config.yml	2024-09-12 08:21:05 -07:00
Ishaan Jaff	d65ba87014	add litellm router testing	2024-09-12 08:19:34 -07:00
Ishaan Jaff	da29b070bb	print load test results	2024-09-11 20:53:52 -07:00
Ishaan Jaff	a08ad0ea70	add load tests to ci/cd	2024-09-11 20:50:57 -07:00
Ishaan Jaff	f339f9614a	fix requirements.txt	2024-09-11 13:35:37 -07:00
Ishaan Jaff	1415bdd6fa	fix testing + req.txt	2024-09-11 13:30:42 -07:00
Ishaan Jaff	a66f03f860	fix installing litellm	2024-09-11 12:45:39 -07:00
Ishaan Jaff	edc51f45ac	add error message on test	2024-09-05 15:46:13 -07:00
Ishaan Jaff	30137b0b72	run ci - cd again	2024-09-05 15:42:40 -07:00
Ishaan Jaff	05e02fbe76	run ci/cd again	2024-09-05 15:38:49 -07:00
Ishaan Jaff	004a71b8dc	use requirements txt	2024-09-05 15:34:17 -07:00

356 commits