Commit graph

606 commits

Krish Dholakia
713d762411 LiteLLM Minor Fixes and Improvements (09/13/2024) (#5689)
* refactor: cleanup unused variables + fix pyright errors

* feat(health_check.py): Closes https://github.com/BerriAI/litellm/issues/5686

* fix(o1_reasoning.py): add stricter check for o1 reasoning model

* refactor(mistral/): make it easier to see mistral transformation logic

* fix(openai.py): fix openai o1 model param mapping

Fixes https://github.com/BerriAI/litellm/issues/5685

* feat(main.py): infer finetuned gemini model from base model

Fixes https://github.com/BerriAI/litellm/issues/5678

* docs(vertex.md): update docs to call finetuned gemini models

* feat(proxy_server.py): allow admin to hide proxy model aliases

Closes https://github.com/BerriAI/litellm/issues/5692

* docs(load_balancing.md): add docs on hiding alias models from proxy config

* fix(base.py): don't raise NotImplementedError

* fix(user_api_key_auth.py): fix model max budget check

* fix(router.py): fix elif

* fix(user_api_key_auth.py): don't set team_id to empty str

* fix(team_endpoints.py): fix response type

* test(test_completion.py): handle predibase error

* test(test_proxy_server.py): fix test

* fix(o1_transformation.py): fix max_completion_tokens mapping

* test(test_image_generation.py): mark flaky test
2024-09-14 10:02:55 -07:00
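
A minimal sketch of the o1 param mapping touched in the commit above, assuming the standard litellm.completion entry point: o1-family models reject `max_tokens`, so litellm translates it to `max_completion_tokens` before the OpenAI call. The model name and prompt are illustrative.

```python
import litellm

# Hedged sketch: o1-family models don't accept `max_tokens`, so litellm
# maps it to `max_completion_tokens` when calling OpenAI.
response = litellm.completion(
    model="o1-preview",  # hypothetical o1-family model choice
    messages=[{"role": "user", "content": "Explain backpressure in one line."}],
    max_tokens=256,  # forwarded as max_completion_tokens=256
)
print(response.choices[0].message.content)
```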
Krish Dholakia
91c918fd70 LiteLLM Minor Fixes and Improvements (09/12/2024) (#5658)
* fix(factory.py): handle tool call content as list

Fixes https://github.com/BerriAI/litellm/issues/5652

* fix(factory.py): enforce stronger typing

* fix(router.py): return model alias in /v1/model/info and /v1/model_group/info

* fix(user_api_key_auth.py): move noisy warning message to debug

cleanup logs

* fix(types.py): cleanup pydantic v2 deprecated param

Fixes https://github.com/BerriAI/litellm/issues/5649

* docs(gemini.md): show how to pass inline data to gemini api

Fixes https://github.com/BerriAI/litellm/issues/5674
2024-09-12 23:04:06 -07:00
Ishaan Jaff
e7c22f63e7 [Fix-Router] Don't cooldown when only 1 deployment exists (#5673)
* fix get model list

* fix test custom callback router

* fix embedding fallback test

* fix router retry policy on AuthErrors

* fix router test

* add test for single deployments no cooldown test prod

* add test test_single_deployment_no_cooldowns_test_prod_mock_completion_calls
2024-09-12 19:14:58 -07:00
Ishaan Jaff
91dd3e11c4 [Feat-Perf] Use Batching + Squashing (#5645)
* use folder for slack alerting

* clean up slack alerting

* fix test alerting
2024-09-12 18:37:53 -07:00
Krish Dholakia
dec53961f7 LiteLLM Minor Fixes and Improvements (11/09/2024) (#5634)
* fix(caching.py): set ttl for async_increment cache

fixes issue where ttl for redis client was not being set on increment_cache

Fixes https://github.com/BerriAI/litellm/issues/5609

* fix(caching.py): fix increment cache w/ ttl for sync increment cache on redis

Fixes https://github.com/BerriAI/litellm/issues/5609

* fix(router.py): support adding retry policy + allowed fails policy via config.yaml

* fix(router.py): don't cooldown single deployments

No point, as there's no other deployment to loadbalance with.

* fix(user_api_key_auth.py): support setting allowed email domains on jwt tokens

Closes https://github.com/BerriAI/litellm/issues/5605

* docs(token_auth.md): add user upsert + allowed email domain to jwt auth docs

* fix(litellm_pre_call_utils.py): fix dynamic key logging when team id is set

Fixes issue where key logging would not be set if team metadata was not none

* fix(secret_managers/main.py): load environment variables correctly

Fixes issue where os.environ/ was not being loaded correctly

* test(test_router.py): fix test

* feat(spend_tracking_utils.py): support logging additional usage params - e.g. prompt caching values for deepseek

* test: fix tests

* test: fix test

* test: fix test

* test: fix test

* test: fix test
2024-09-11 22:36:06 -07:00
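
The retry and allowed-fails policies wired up in the commit above can also be set programmatically on the Router. A minimal sketch, assuming the `RetryPolicy` / `AllowedFailsPolicy` types live in `litellm.types.router` and a single hypothetical gpt-4o deployment:

```python
from litellm import Router
from litellm.types.router import AllowedFailsPolicy, RetryPolicy

router = Router(
    model_list=[
        # hypothetical single deployment
        {"model_name": "gpt-4o", "litellm_params": {"model": "gpt-4o"}},
    ],
    # per-exception retry counts
    retry_policy=RetryPolicy(
        AuthenticationErrorRetries=0,  # don't retry bad credentials
        TimeoutErrorRetries=2,
        RateLimitErrorRetries=3,
    ),
    # per-exception failures tolerated before a deployment is cooled down
    allowed_fails_policy=AllowedFailsPolicy(
        RateLimitErrorAllowedFails=100,
    ),
)
```

The same policies map onto `router_settings` in the proxy's config.yaml, which is what the commit adds support for.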
Krish Dholakia
52849e6422 LiteLLM Minor Fixes and Improvements (09/07/2024) (#5580)
* fix(litellm_logging.py): set completion_start_time_float to end_time_float if none

Fixes https://github.com/BerriAI/litellm/issues/5500

* feat(__init__.py): add new 'openai_text_completion_compatible_providers' list

Fixes https://github.com/BerriAI/litellm/issues/5558

Correctly routes Fireworks AI calls made via text completions

* fix: fix linting errors

* fix: fix linting errors

* fix(openai.py): fix exception raised

* fix(openai.py): fix error handling

* fix(_redis.py): allow all supported arguments for redis cluster (#5554)

* Revert "fix(_redis.py): allow all supported arguments for redis cluster (#5554)" (#5583)

This reverts commit f2191ef4cb.

* fix(router.py): return model alias w/ underlying deployment on router.get_model_list()

Fixes https://github.com/BerriAI/litellm/issues/5524#issuecomment-2336410666

* test: handle flaky tests

---------

Co-authored-by: Jonas Dittrich <58814480+Kakadus@users.noreply.github.com>
2024-09-09 18:54:17 -07:00
Elad Segal
98dfa69475 Properly use allowed_fails_policy when it has fields with a value of 0 (#5604) 2024-09-09 16:35:12 -07:00
Krrish Dholakia
84d57017e5 Revert "fix(router.py): return model alias w/ underlying deployment on router.get_model_list()"
This reverts commit 638896309c.
2024-09-07 18:04:56 -07:00
Krrish Dholakia
f62801f795 fix(router.py): return model alias w/ underlying deployment on router.get_model_list()
Fixes https://github.com/BerriAI/litellm/issues/5524#issuecomment-2336410666
2024-09-07 18:01:31 -07:00
Krish Dholakia
2cab33b061 LiteLLM Minor Fixes and Improvements (08/06/2024) (#5567)
* fix(utils.py): return citations for perplexity streaming

Fixes https://github.com/BerriAI/litellm/issues/5535

* fix(anthropic/chat.py): support fallbacks for anthropic streaming (#5542)

* fix(anthropic/chat.py): support fallbacks for anthropic streaming

Fixes https://github.com/BerriAI/litellm/issues/5512

* fix(anthropic/chat.py): use module level http client if none given (prevents early client closure)

* fix: fix linting errors

* fix(http_handler.py): fix raise_for_status error handling

* test: retry flaky test

* fix otel type

* fix(bedrock/embed): fix error raising

* test(test_openai_batches_and_files.py): skip azure batches test (for now) quota exceeded

* fix(test_router.py): skip azure batch route test (for now) - hit batch quota limits

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>

* All `model_group_alias` should show up in `/models`, `/model/info`, `/model_group/info` (#5539)

* fix(router.py): support returning model_alias model names in `/v1/models`

* fix(proxy_server.py): support returning model aliases on `/model/info`

* feat(router.py): support returning model group alias for `/model_group/info`

* fix(proxy_server.py): fix linting errors

* fix(proxy_server.py): fix linting errors

* build(model_prices_and_context_window.json): add amazon titan text premier pricing information

Closes https://github.com/BerriAI/litellm/issues/5560

* feat(litellm_logging.py): log standard logging response object for pass through endpoints. Allows bedrock /invoke agent calls to be correctly logged to langfuse + s3

* fix(success_handler.py): fix linting error

* fix(success_handler.py): fix linting errors

* fix(team_endpoints.py): Allows admin to update team member budgets

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-09-06 17:16:24 -07:00
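
A minimal sketch of the `model_group_alias` behavior described in the commit above, assuming a single underlying deployment; calls to the alias resolve to the real model group, and after this change the alias also shows up in `/models`, `/model/info`, and `/model_group/info`:

```python
from litellm import Router

router = Router(
    model_list=[
        {"model_name": "gpt-4o", "litellm_params": {"model": "gpt-4o"}},
    ],
    # "my-gpt" is a hypothetical alias; requests to it route to "gpt-4o"
    model_group_alias={"my-gpt": "gpt-4o"},
)

response = router.completion(
    model="my-gpt",
    messages=[{"role": "user", "content": "hi"}],
)
```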
Ishaan Jaff
7370a994f5 use correct type hints for audio transcriptions 2024-09-05 09:12:27 -07:00
Krish Dholakia
6fdee99632 LiteLLM Minor fixes + improvements (08/04/2024) (#5505)
* Minor IAM AWS OIDC Improvements (#5246)

* AWS IAM: Temporary tokens are valid across all regions after being issued, so it is wasteful to request one for each region.

* AWS IAM: Include an inline policy, to help reduce misuse of overly permissive IAM roles.

* (test_bedrock_completion.py): Ensure we are testing cross AWS region OIDC flow.

* fix(router.py): log rejected requests

Fixes https://github.com/BerriAI/litellm/issues/5498

* refactor: don't use verbose_logger.exception, if exception is raised

User might already have handling for this. But alerting systems in prod will raise this as an unhandled error.

* fix(datadog.py): support setting datadog source as an env var

Fixes https://github.com/BerriAI/litellm/issues/5508

* docs(logging.md): add dd_source to datadog docs

* fix(proxy_server.py): expose `/customer/list` endpoint for showing all customers

* (bedrock): Fix usage with Cloudflare AI Gateway, and proxies in general. (#5509)

* feat(anthropic.py): support 'cache_control' param for content when it is a string

* Revert "(bedrock): Fix usage with Cloudflare AI Gateway, and proxies in gener…" (#5519)

This reverts commit 3fac0349c2.

* refactor: ci/cd run again

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
2024-09-04 22:16:55 -07:00
Krish Dholakia
18da7adce9 feat(router.py): Support Loadbalancing batch azure api endpoints (#5469)
* feat(router.py): initial commit for loadbalancing azure batch api endpoints

Closes https://github.com/BerriAI/litellm/issues/5396

* fix(router.py): working `router.acreate_file()`

* feat(router.py): working router.acreate_batch endpoint

* feat(router.py): expose router.aretrieve_batch function

Makes it easy for the user to retrieve batch information

* feat(router.py): support 'router.alist_batches' endpoint

Adds support for getting all batches across all endpoints

* feat(router.py): working loadbalancing on `/v1/files`

* feat(proxy_server.py): working loadbalancing on `/v1/batches`

* feat(proxy_server.py): working loadbalancing on Retrieve + List batch
2024-09-02 21:32:55 -07:00
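
A minimal async sketch of the loadbalanced batch flow from the commit above, using the `router.acreate_file` / `router.acreate_batch` / `router.aretrieve_batch` entry points named in the commit; the deployment details and the OpenAI-style batch parameters are assumptions:

```python
import asyncio

from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "azure-batch",  # hypothetical model group
            "litellm_params": {
                "model": "azure/gpt-4o",
                "api_key": "os.environ/AZURE_API_KEY",
                "api_base": "https://my-endpoint.openai.azure.com",  # hypothetical
            },
        },
    ],
)

async def main():
    # upload the batch input file against the loadbalanced model group
    file_obj = await router.acreate_file(
        model="azure-batch", file=open("batch.jsonl", "rb"), purpose="batch"
    )
    # create the batch on whichever deployment the router picks
    batch = await router.acreate_batch(
        model="azure-batch",
        input_file_id=file_obj.id,
        endpoint="/v1/chat/completions",
        completion_window="24h",
    )
    # poll status via the same loadbalanced route
    print(await router.aretrieve_batch(model="azure-batch", batch_id=batch.id))

asyncio.run(main())
```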
Krish Dholakia
ca4e746545 LiteLLM minor fixes + improvements (31/08/2024) (#5464)
* fix(vertex_endpoints.py): fix vertex ai pass through endpoints

* test(test_streaming.py): skip model due to end of life

* feat(custom_logger.py): add special callback for model hitting tpm/rpm limits

Closes https://github.com/BerriAI/litellm/issues/4096
2024-09-01 13:31:42 -07:00
Krish Dholakia
321b0961b5 fix: Minor LiteLLM Fixes + Improvements (29/08/2024) (#5436)
* fix(model_checks.py): support returning wildcard models on `/v1/models`

Fixes https://github.com/BerriAI/litellm/issues/4903

* fix(bedrock_httpx.py): support calling bedrock via api_base

Closes https://github.com/BerriAI/litellm/pull/4587

* fix(litellm_logging.py): only leave last 4 char of gemini key unmasked

Fixes https://github.com/BerriAI/litellm/issues/5433

* feat(router.py): support setting 'weight' param for models on router

Closes https://github.com/BerriAI/litellm/issues/5410

* test(test_bedrock_completion.py): add unit test for custom api base

* fix(model_checks.py): handle no "/" in model
2024-08-29 22:40:25 -07:00
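
A minimal sketch of the 'weight' param added in the commit above, assuming two deployments in one model group; with simple-shuffle routing, the higher-weight deployment is picked proportionally more often:

```python
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-4o",
            # picked ~2x as often as the deployment below
            "litellm_params": {"model": "gpt-4o", "weight": 2},
        },
        {
            "model_name": "gpt-4o",
            # hypothetical azure deployment in the same model group
            "litellm_params": {"model": "azure/gpt-4o", "weight": 1},
        },
    ],
)
```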
Krrish Dholakia
6e4f0a95da fix(router.py): fix cooldown check 2024-08-28 16:38:42 -07:00
Ishaan Jaff
df4f1458e6 feat - add rerank on proxy 2024-08-27 17:36:40 -07:00
Krrish Dholakia
c558648180 fix(router.py): fix aembedding type hints
Fixes https://github.com/BerriAI/litellm/issues/5383
2024-08-27 14:29:18 -07:00
Krrish Dholakia
c795e9feeb fix(router.py): enable dynamic retry after in exception string
Updates cooldown logic to cooldown individual models

Closes https://github.com/BerriAI/litellm/issues/1339
2024-08-24 16:59:30 -07:00
Krrish Dholakia
27a5cd12e0 fix(utils.py): correctly re-raise the headers from an exception, if present
Fixes issue where retry after on router was not using azure / openai numbers
2024-08-24 12:30:30 -07:00
Krrish Dholakia
6415f92bbb fix(router.py): don't cooldown on apiconnectionerrors
Fixes issue where model would be in cooldown due to api connection errors
2024-08-24 09:53:05 -07:00
Krrish Dholakia
45048ee006 fix(router.py): fix linting error 2024-08-21 15:35:10 -07:00
Ishaan Jaff
528bb3f7ac test test_using_default_working_fallback 2024-08-20 13:32:55 -07:00
Ishaan Jaff
165e0e3ad1 fix run sync fallbacks 2024-08-20 12:55:36 -07:00
Ishaan Jaff
078fe97053 fix fallbacks don't recurse on the same fallback 2024-08-20 12:50:20 -07:00
Ishaan Jaff
fb16ff2335 fix don't retry errors when no healthy deployments available 2024-08-20 12:17:05 -07:00
Ishaan Jaff
5e2f962ba3 test + never retry on 404 errors 2024-08-20 11:59:43 -07:00
Ishaan Jaff
7171efc729 use model access groups for teams 2024-08-17 16:45:53 -07:00
Krrish Dholakia
2874b94fb1 refactor: replace .error() with .exception() logging for better debugging on sentry 2024-08-16 09:22:47 -07:00
Ishaan Jaff
25af3ffe5b v0 track fallback events 2024-08-10 13:31:00 -07:00
Krrish Dholakia
482acc7ee1 fix(router.py): fallback on 400-status code requests 2024-08-09 12:16:49 -07:00
Krrish Dholakia
07e5847e65 feat(router.py): allow using .acompletion() for request prioritization
allows /chat/completions endpoint to work for request prioritization calls
2024-08-07 16:43:12 -07:00
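
A minimal sketch of request prioritization through `.acompletion()` as enabled in the commit above; the `priority` kwarg and its queue semantics (lower value is served first) follow litellm's request-scheduler behavior, and the deployment is an assumption:

```python
import asyncio

from litellm import Router

router = Router(
    model_list=[
        {"model_name": "gpt-4o", "litellm_params": {"model": "gpt-4o"}},
    ],
)

async def main():
    # lower `priority` is dequeued first when the deployment is saturated
    response = await router.acompletion(
        model="gpt-4o",
        messages=[{"role": "user", "content": "hi"}],
        priority=0,
    )
    print(response)

asyncio.run(main())
```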
Ishaan Jaff
a0b2c107c4 fix getting provider_specific_deployment 2024-08-07 15:20:59 -07:00
Ishaan Jaff
31e4fca748 fix use provider specific routing 2024-08-07 14:37:20 -07:00
Ishaan Jaff
bb9493e5f7 router use provider specific wildcard routing 2024-08-07 14:12:10 -07:00
Ishaan Jaff
6a1a4eb822 add + test provider specific routing 2024-08-07 13:49:46 -07:00
Krrish Dholakia
0de640700d fix(router.py): add reason for fallback failure to client-side exception string
make it easier to debug why a fallback failed to occur
2024-08-07 13:02:47 -07:00
Ishaan Jaff
0dd8f50477 use router_cooldown_handler 2024-08-07 10:40:55 -07:00
Krrish Dholakia
fdb47e5479 fix: fix test to specify allowed_fails 2024-08-05 21:39:59 -07:00
Krrish Dholakia
934883999a fix(router.py): move deployment cooldown list message to error log, not client-side
don't show user all deployments
2024-08-03 12:49:39 -07:00
Krrish Dholakia
b0d2727bbf feat(router.py): add flag for mock testing loadbalancing for rate limit errors 2024-08-03 12:34:11 -07:00
Krrish Dholakia
dc58b9f33e fix(utils.py): fix linting errors 2024-07-30 18:38:10 -07:00
Krrish Dholakia
96ad9c877c fix(router.py): gracefully handle scenario where completion response doesn't have total tokens
Closes https://github.com/BerriAI/litellm/issues/4968
2024-07-30 15:14:03 -07:00
Krrish Dholakia
3a1eedfbf3 feat(ollama_chat.py): support ollama tool calling
Closes https://github.com/BerriAI/litellm/issues/4812
2024-07-26 21:51:54 -07:00
Krrish Dholakia
e39ff46222 docs(config.md): update wildcard docs 2024-07-26 08:59:53 -07:00
Ishaan Jaff
a46c463dee router support setting pass_through_all_models 2024-07-25 18:34:12 -07:00
Krrish Dholakia
1d33759bb1 fix(router.py): add support for diskcache to router 2024-07-25 14:30:46 -07:00
Ishaan Jaff
7888074012 fix - test router debug logs 2024-07-20 18:45:31 -07:00
Ishaan Jaff
fcee8bc61f router - use verbose logger when using litellm.Router 2024-07-20 17:36:25 -07:00
Ishaan Jaff
d1a4246d2b control using enable_tag_filtering 2024-07-18 19:39:04 -07:00