Commit graph

95 commits

Krrish Dholakia
9860a4e819 docs(routing.md): cleanup docs
2025-03-14 14:34:15 -07:00
Ishaan Jaff
207f41cbea docs fix router default settings 2025-03-05 08:29:21 -08:00
Krrish Dholakia
d0413ec96b docs(routing.md): add section on weighted deployments 2025-02-17 17:02:06 -08:00
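The weighted-deployments entry above refers to sending traffic to deployments in proportion to a configured weight. A minimal sketch of that idea in plain Python (hypothetical field names, not litellm's actual implementation):

```python
import random

def pick_weighted(deployments):
    """Pick a deployment with probability proportional to its 'weight'."""
    weights = [d.get("weight", 1) for d in deployments]
    return random.choices(deployments, weights=weights, k=1)[0]

deployments = [
    {"id": "azure-gpt4", "weight": 9},   # should receive ~90% of traffic
    {"id": "openai-gpt4", "weight": 1},  # should receive ~10% of traffic
]
counts = {"azure-gpt4": 0, "openai-gpt4": 0}
for _ in range(10_000):
    counts[pick_weighted(deployments)["id"]] += 1
```

Over many requests, the traffic split converges on the 9:1 weight ratio.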
Krish Dholakia
70a9ea99f2
Control fallback prompts client-side (#7334)
* feat(router.py): support passing model-specific messages in fallbacks

* docs(routing.md): separate router timeouts into separate doc

allow for 1 fallbacks doc (across proxy/router)

* docs(routing.md): cleanup router docs

* docs(reliability.md): cleanup docs

* docs(reliability.md): cleaned up fallback doc

just have 1 doc across sdk/proxy

simplifies docs

* docs(reliability.md): add setting model-specific fallback prompts

* fix: fix linting errors

* test: skip test causing openai rate limit errors

* test: fix test

* test: run vertex test first to catch error
2024-12-20 19:09:53 -08:00
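The commit above adds model-specific messages in fallbacks: each fallback entry can override the prompt sent to that fallback model, and fallbacks are tried in order. A minimal sketch of that flow (hypothetical entry shape, for illustration only):

```python
def complete_with_fallbacks(model, messages, fallbacks, call_fn):
    """Try the primary model, then each fallback in order.

    `fallbacks` maps a model name to an ordered list of entries; an
    entry may carry its own `messages` override for that fallback model.
    """
    attempts = [{"model": model, "messages": messages}]
    for fb in fallbacks.get(model, []):
        attempts.append({
            "model": fb["model"],
            "messages": fb.get("messages", messages),  # model-specific prompt
        })
    last_err = None
    for attempt in attempts:
        try:
            return call_fn(attempt["model"], attempt["messages"])
        except Exception as err:
            last_err = err
    raise last_err

def fake_call(model, messages):
    # Stand-in for a real completion call; primary model always fails.
    if model == "gpt-4":
        raise RuntimeError("rate limited")
    return (model, messages)

fallbacks = {
    "gpt-4": [{"model": "claude-3",
               "messages": [{"role": "user", "content": "shorter prompt"}]}],
}
result = complete_with_fallbacks(
    "gpt-4", [{"role": "user", "content": "hi"}], fallbacks, fake_call
)
```

Here the primary call fails, so the request lands on `claude-3` with the fallback-specific prompt.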
Krish Dholakia
0c0498dd60
Litellm dev 12 07 2024 (#7086)
* fix(main.py): support passing max retries to azure/openai embedding integrations

Fixes https://github.com/BerriAI/litellm/issues/7003

* feat(team_endpoints.py): allow updating team model aliases

Closes https://github.com/BerriAI/litellm/issues/6956

* feat(router.py): allow specifying model id as fallback - skips any cooldown check

Allows a default model to be checked if all models are in cooldown

s/o @micahjsmith

* docs(reliability.md): add fallback to specific model to docs

* fix(utils.py): new 'is_prompt_caching_valid_prompt' helper util

Allows user to identify if messages/tools have prompt caching

Related issue: https://github.com/BerriAI/litellm/issues/6784

* feat(router.py): store model id for prompt caching valid prompt

Allows routing to that model id on subsequent requests

* fix(router.py): only cache if prompt is valid prompt caching prompt

prevents storing unnecessary items in cache

* feat(router.py): support routing prompt caching enabled models to previous deployments

Closes https://github.com/BerriAI/litellm/issues/6784

* test: fix linting errors

* feat(databricks/): convert basemodel to dict and exclude none values

allow passing pydantic message to databricks

* fix(utils.py): ensure all chat completion messages are dict

* (feat) Track `custom_llm_provider` in LiteLLMSpendLogs (#7081)

* add custom_llm_provider to SpendLogsPayload

* add custom_llm_provider to SpendLogs

* add custom llm provider to SpendLogs payload

* test_spend_logs_payload

* Add MLflow to the side bar (#7031)

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* (bug fix) SpendLogs update DB catch all possible DB errors for retrying  (#7082)

* catch DB_CONNECTION_ERROR_TYPES

* fix DB retry mechanism for SpendLog updates

* use DB_CONNECTION_ERROR_TYPES in auth checks

* fix exp back off for writing SpendLogs

* use _raise_failed_update_spend_exception to ensure errors print as NON blocking

* test_update_spend_logs_multiple_batches_with_failure

* (Feat) Add StructuredOutputs support for Fireworks.AI (#7085)

* fix model cost map fireworks ai "supports_response_schema": true,

* fix supports_response_schema

* fix map openai params fireworks ai

* test_map_response_format

* test_map_response_format

* added deepinfra/Meta-Llama-3.1-405B-Instruct (#7084)

* bump: version 1.53.9 → 1.54.0

* fix deepinfra

* litellm db fixes LiteLLM_UserTable (#7089)

* ci/cd queue new release

* fix llama-3.3-70b-versatile

* refactor - use consistent file naming convention `AI21/` -> `ai21`  (#7090)

* fix refactor - use consistent file naming convention

* ci/cd run again

* fix naming structure

* fix use consistent naming (#7092)

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com>
Co-authored-by: ali sayyah <ali.sayyah2@gmail.com>
2024-12-08 00:30:33 -08:00
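Several commits in the entry above deal with prompt-caching-aware routing: remember which deployment served a caching-eligible prompt, and route repeats of that prompt back to the same deployment so the provider-side cache can hit. A minimal sketch of the idea (the fingerprinting scheme here is an assumption, not litellm's actual cache key):

```python
import hashlib
import json

_prompt_cache_routes = {}  # prompt fingerprint -> deployment id

def _fingerprint(messages):
    """Stable hash of the message payload used as a cache-routing key."""
    payload = json.dumps(messages, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def record_deployment(messages, deployment_id):
    """Remember which deployment served a prompt-caching-eligible prompt."""
    _prompt_cache_routes[_fingerprint(messages)] = deployment_id

def preferred_deployment(messages):
    """Route a repeated prompt back to the deployment holding its cache."""
    return _prompt_cache_routes.get(_fingerprint(messages))

msgs = [{"role": "user", "content": "long cached prefix..."}]
record_deployment(msgs, "azure-eastus-1")
```

A later commit in the same entry notes the complementary fix: only store the route when the prompt actually qualifies for prompt caching, to avoid filling the cache with useless entries.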
Krish Dholakia
2d2931a215
LiteLLM Minor Fixes & Improvements (11/26/2024) (#6913)
* docs(config_settings.md): document all router_settings

* ci(config.yml): add router_settings doc test to ci/cd

* test: debug test on ci/cd

* test: debug ci/cd test

* test: fix test

* fix(team_endpoints.py): skip invalid team object. don't fail `/team/list` call

Causes downstream errors if ui just fails to load team list

* test(base_llm_unit_tests.py): add 'response_format={"type": "text"}' test to base_llm_unit_tests

adds complete coverage for all 'response_format' values to ci/cd

* feat(router.py): support wildcard routes in `get_router_model_info()`

Addresses https://github.com/BerriAI/litellm/issues/6914

* build(model_prices_and_context_window.json): add tpm/rpm limits for all gemini models

Allows for ratelimit tracking for gemini models even with wildcard routing enabled

Addresses https://github.com/BerriAI/litellm/issues/6914

* feat(router.py): add tpm/rpm tracking on success/failure to global_router

Addresses https://github.com/BerriAI/litellm/issues/6914

* feat(router.py): support wildcard routes on router.get_model_group_usage()

* fix(router.py): fix linting error

* fix(router.py): implement get_remaining_tokens_and_requests

Addresses https://github.com/BerriAI/litellm/issues/6914

* fix(router.py): fix linting errors

* test: fix test

* test: fix tests

* docs(config_settings.md): add missing dd env vars to docs

* fix(router.py): check if hidden params is dict
2024-11-28 00:01:38 +05:30
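The entry above adds tpm/rpm tracking on success/failure plus a `get_remaining_tokens_and_requests`-style lookup, enabling rate-limit accounting even for wildcard model groups. A minimal sketch of that bookkeeping (class and field names are illustrative, not litellm's internals):

```python
from collections import defaultdict

class UsageTracker:
    """Track per-model-group usage against tpm/rpm limits."""

    def __init__(self, limits):
        self.limits = limits  # group -> {"tpm": ..., "rpm": ...}
        self.used = defaultdict(lambda: {"tokens": 0, "requests": 0})

    def log_success(self, group, tokens):
        self.used[group]["tokens"] += tokens
        self.used[group]["requests"] += 1

    def remaining(self, group):
        limit = self.limits[group]
        used = self.used[group]
        return {
            "remaining_tokens": limit["tpm"] - used["tokens"],
            "remaining_requests": limit["rpm"] - used["requests"],
        }

# A wildcard group can carry limits just like a concrete model name.
tracker = UsageTracker({"gemini/*": {"tpm": 1_000_000, "rpm": 1_000}})
tracker.log_success("gemini/*", tokens=25_000)
```

In a real counter the usage would reset each minute (tpm/rpm are per-minute limits); that windowing is omitted here for brevity.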
Emmanuel Ferdman
9cf3dcbbf3
Update routing references (#6758)
* Update routing references

Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>

* Update routing references

Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>

---------

Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
2024-11-16 08:28:44 -08:00
Ishaan Jaff
bbf4db79c1 docs - show correct rpm -> tpm conversion for Azure 2024-09-27 17:18:55 -07:00
Krish Dholakia
7ed6938a3f
LiteLLM Minor Fixes & Improvements (09/20/2024) (#5807)
* fix(vertex_llm_base.py): Handle api_base = ""

Fixes https://github.com/BerriAI/litellm/issues/5798

* fix(o1_transformation.py): handle stream_options not being supported

https://github.com/BerriAI/litellm/issues/5803

* docs(routing.md): fix docs

Closes https://github.com/BerriAI/litellm/issues/5808

* perf(internal_user_endpoints.py): reduce db calls for getting team_alias for a key

Use the list gotten earlier in `/user/info` endpoint

Reduces UI keys tab load time to 800ms (prev. 28s+)

* feat(proxy_server.py): support CONFIG_FILE_PATH as env var

Closes https://github.com/BerriAI/litellm/issues/5744

* feat(get_llm_provider_logic.py): add `litellm_proxy/` as a known openai-compatible route

simplifies calling litellm proxy

Reduces confusion when calling models on litellm proxy from litellm sdk

* docs(litellm_proxy.md): cleanup docs

* fix(internal_user_endpoints.py): fix pydantic obj

* test(test_key_generate_prisma.py): fix test
2024-09-20 20:21:32 -07:00
Krish Dholakia
98c34a7e27
LiteLLM Minor Fixes and Improvements (11/09/2024) (#5634)
* fix(caching.py): set ttl for async_increment cache

fixes issue where ttl for redis client was not being set on increment_cache

Fixes https://github.com/BerriAI/litellm/issues/5609

* fix(caching.py): fix increment cache w/ ttl for sync increment cache on redis

Fixes https://github.com/BerriAI/litellm/issues/5609

* fix(router.py): support adding retry policy + allowed fails policy via config.yaml

* fix(router.py): don't cooldown single deployments

No point, as there's no other deployment to loadbalance with.

* fix(user_api_key_auth.py): support setting allowed email domains on jwt tokens

Closes https://github.com/BerriAI/litellm/issues/5605

* docs(token_auth.md): add user upsert + allowed email domain to jwt auth docs

* fix(litellm_pre_call_utils.py): fix dynamic key logging when team id is set

Fixes issue where key logging would not be set if team metadata was not none

* fix(secret_managers/main.py): load environment variables correctly

Fixes issue where os.environ/ was not being loaded correctly

* test(test_router.py): fix test

* feat(spend_tracking_utils.py): support logging additional usage params - e.g. prompt caching values for deepseek

* test: fix tests

* test: fix test

* test: fix test

* test: fix test

* test: fix test
2024-09-11 22:36:06 -07:00
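The "don't cooldown single deployments" fix above captures a simple invariant: cooling down a failing deployment only makes sense when there is another deployment to load-balance onto. A minimal sketch of cooldown tracking with that guard (hypothetical class, not litellm's implementation):

```python
import time

class CooldownTracker:
    """Cool down failing deployments, but never the only one in a group."""

    def __init__(self, cooldown_time=30.0):
        self.cooldown_time = cooldown_time
        self._cooled = {}  # deployment id -> cooldown expiry timestamp

    def maybe_cooldown(self, deployment_id, group_deployments):
        # No point cooling down: there's nothing else to route to.
        if len(group_deployments) <= 1:
            return False
        self._cooled[deployment_id] = time.monotonic() + self.cooldown_time
        return True

    def available(self, group_deployments):
        now = time.monotonic()
        return [d for d in group_deployments if self._cooled.get(d, 0) <= now]

tracker = CooldownTracker(cooldown_time=30.0)
solo = tracker.maybe_cooldown("only-deploy", ["only-deploy"])
multi = tracker.maybe_cooldown("deploy-a", ["deploy-a", "deploy-b"])
```

With one deployment the cooldown is refused; with two, `deploy-a` is benched and requests flow to `deploy-b` until the window expires.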
Krrish Dholakia
f5905e1000 docs(routing.md): add proxy loadbalancing tutorial 2024-09-03 07:38:19 -07:00
Krrish Dholakia
7f1531006c docs(routing.md): add weight-based shuffling to docs 2024-08-30 08:24:12 -07:00
Ishaan Jaff
bd7bf7f6b0 fix endpoint name on router 2024-08-16 12:46:43 -07:00
Ishaan Jaff
5bc8b59b11 docs - use consistent name for LiteLLM proxy server 2024-08-03 12:49:35 -07:00
Krrish Dholakia
051dfee421 docs(routing.md): add docs on how to disable cooldowns 2024-07-01 15:05:38 -07:00
Krrish Dholakia
8fbc34e7e9 docs(routing.md): add dynamic cooldowns to docs 2024-06-25 17:01:58 -07:00
Krrish Dholakia
86cb5aa031 docs(routing.md): add quickstart 2024-06-24 22:25:39 -07:00
Ishaan Jaff
91d9d59717 docs - routing 2024-06-20 14:32:52 -07:00
Krrish Dholakia
dca23aaa3e docs(routing.md): improve docs 2024-06-14 21:55:27 -07:00
Krrish Dholakia
a7a63f8392 docs(routing.md): update routing fallback docs with proxy examples 2024-06-14 21:53:45 -07:00
Krrish Dholakia
31a7328279 docs(routing.md): add content policy fallbacks to docs 2024-06-14 21:47:55 -07:00
Ishaan Jaff
9f0ae21ef5 docs - AllowedFailsPolicy 2024-06-01 17:56:57 -07:00
Krrish Dholakia
22b9933096 docs(routing.md): add default_fallbacks to routing.md docs 2024-05-13 21:28:15 -07:00
Ishaan Jaff
3a838934c9 docs - cooldown deployment 2024-05-13 12:50:59 -07:00
Ishaan Jaff
dac8c644fd docs - router show cooldown_time 2024-05-13 12:49:51 -07:00
Krrish Dholakia
0c87bb5adf docs(reliability.md): add region based routing to proxy + sdk docs 2024-05-11 11:34:12 -07:00
Krrish Dholakia
67b4aa28bd docs(routing.md): make clear lowest cost routing is async 2024-05-07 21:34:18 -07:00
Ishaan Jaff
d46544d2bc docs setup alerting on router 2024-05-07 18:26:45 -07:00
Ishaan Jaff
d5f93048cc docs - lowest cost routing 2024-05-07 13:15:30 -07:00
Ishaan Jaff
4c909194c7 docs - lowest-latency routing 2024-05-07 12:43:44 -07:00
Ishaan Jaff
bbf5d79069 docs - set retry policy 2024-05-04 17:52:01 -07:00
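The retry-policy entry above concerns configuring different retry counts per error type (for example, retry rate limits but not auth failures). A minimal sketch of that shape (class and names are illustrative, not litellm's actual `RetryPolicy`):

```python
class RetryPolicy:
    """Map exception type names to how many retries they deserve."""

    def __init__(self, **retries_by_error):
        self.retries_by_error = retries_by_error

    def num_retries(self, err):
        # Unknown error types default to no retries.
        return self.retries_by_error.get(type(err).__name__, 0)

class RateLimitError(Exception):
    """Transient: worth retrying."""

class AuthenticationError(Exception):
    """Permanent: retrying won't help."""

policy = RetryPolicy(RateLimitError=4, AuthenticationError=0)
```

The point of the per-type split is that transient errors (rate limits, timeouts) benefit from retries while deterministic failures (bad credentials, invalid requests) just waste time.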
Krrish Dholakia
6a2ddc2791 docs(routing.md): add docs on lowest latency routing buffer 2024-04-30 22:41:50 -07:00
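The lowest-latency-routing buffer above addresses a pitfall of always picking the single fastest deployment: all traffic piles onto one host. A buffer widens the candidate set to anything within a fraction of the best latency. A minimal sketch (hypothetical, not litellm's implementation):

```python
import random

def pick_lowest_latency(latencies, buffer=0.1):
    """Pick randomly among deployments whose average latency is within
    `buffer` (fractional) of the fastest, to spread load."""
    fastest = min(latencies.values())
    candidates = [d for d, avg in latencies.items()
                  if avg <= fastest * (1 + buffer)]
    return random.choice(candidates)

# deploy-a and deploy-b are near-tied; deploy-c is far slower.
latencies = {"deploy-a": 0.42, "deploy-b": 0.45, "deploy-c": 1.8}
```

With a 10% buffer, both near-tied deployments share traffic while the slow one is excluded.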
Krrish Dholakia
cef2d95bb4 docs(routing.md): add max parallel requests to router docs 2024-04-29 15:37:48 -07:00
Ishaan Jaff
fa83e2da06 docs - fix routing 2024-04-25 13:43:51 -07:00
Krrish Dholakia
c5d880b6fd docs(routing.md): add simple shuffle async support to docs 2024-04-20 15:02:22 -07:00
Krrish Dholakia
d999acd20d docs(routing.md): reorder routing strategies 2024-04-10 22:29:24 -07:00
Krrish Dholakia
83a7a9f0b7 docs(routing.md): add calling via proxy tutorial to router docs 2024-04-10 22:24:29 -07:00
Krrish Dholakia
9f517b2907 docs(routing.md): add async usage based routing to docs 2024-04-10 21:51:36 -07:00
CLARKBENHAM
e96d97d9e5 remove formating changes 2024-04-08 21:31:21 -07:00
CLARKBENHAM
6e20bb13b2 Revert "doc pre_call_check: enables router rate limits for concurrent calls"
This reverts commit 886c859519.
2024-04-08 21:27:38 -07:00
CLARKBENHAM
886c859519 doc pre_call_check: enables router rate limits for concurrent calls 2024-04-08 21:20:59 -07:00
Krrish Dholakia
a917fadf45 docs(routing.md): refactor docs to show how to use pre-call checks and fallback across model groups 2024-04-01 11:21:27 -07:00
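The pre-call-checks entry above describes filtering out deployments that cannot serve a request before routing, e.g. because the prompt exceeds their context window. A minimal sketch of that filter (field names are assumptions for illustration):

```python
def pre_call_filter(deployments, prompt_tokens):
    """Drop deployments whose context window can't fit the prompt,
    before any routing strategy runs."""
    eligible = [d for d in deployments
                if d["max_input_tokens"] >= prompt_tokens]
    if not eligible:
        raise ValueError("no deployment can fit this prompt")
    return eligible

deployments = [
    {"id": "gpt-3.5-turbo", "max_input_tokens": 16_385},
    {"id": "gpt-4-turbo", "max_input_tokens": 128_000},
]
eligible = pre_call_filter(deployments, prompt_tokens=40_000)
```

Combined with fallbacks across model groups (the refactor this commit documents), an oversized prompt is steered to a larger-context group instead of failing outright.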
Krrish Dholakia
9e9de7f6e2 docs(routing.md): add fallbacks being done in order 2024-03-24 12:13:19 -07:00
Krrish Dholakia
1c60fd0e78 docs(routing.md): add url 2024-03-23 20:03:42 -07:00
Krrish Dholakia
7c74ea8b77 docs(routing.md): add proxy example to pre-call checks in routing docs 2024-03-23 20:00:50 -07:00
Krrish Dholakia
e8e7964025 docs(routing.md): add pre-call checks to docs 2024-03-23 19:10:34 -07:00
Krrish Dholakia
47424b8c90 docs(routing.md): fix routing example on docs 2024-03-11 22:17:04 -07:00
ishaan-jaff
e46980c56c (docs) using litellm router 2024-03-11 21:18:10 -07:00
ishaan-jaff
2b2e62477b (docs) use correct base model for cost 2024-02-20 21:00:43 -08:00
ishaan-jaff
950c753429 (docs) on callbacks tracking api_key, base etc 2024-01-27 08:31:50 -08:00