litellm

Author	SHA1	Message	Date
Krrish Dholakia	3c741b7beb	docs(docker_quick_start.md): update quick start with azure connection error	2024-09-16 07:31:32 -07:00
Krrish Dholakia	5fb270a559	build(model_prices_and_context_window.json): bump claude-3-5-sonnet max tokens	2024-09-15 13:57:41 -07:00
F1bos	b64b7a94ae	(models): Enable JSON Schema Support for Gemini 1.5 Flash Models (#5708) * Fixed gemini-1.5-flash pricing * (models): Added missing gemini experimental models + fixed pricing for gemini-1.5-pro-exp-0827 * Added gemini/gemini-1.5-flash-001 model * Updated supports_response_schema to true for gemini flash 1.5 models	2024-09-15 13:52:00 -07:00
Krish Dholakia	da77706c26	Litellm stable dev (#5711) * feat(aws_base_llm.py): prevents recreating boto3 credentials during high traffic Leads to 100ms perf boost in local testing * fix(base_aws_llm.py): fix credential caching check to see if token is set * refactor(bedrock/chat): separate converse api and invoke api + isolate converse api transformation logic Make it easier to see how requests are transformed for /converse * fix: fix imports * fix(bedrock/embed): fix reordering of headers * fix(base_aws_llm.py): fix get credential logic * fix(converse_handler.py): fix ai21 streaming response	2024-09-14 23:22:59 -07:00
Ishaan Jaff	2efdd2a6a4	mark test as flaky	2024-09-14 19:32:22 -07:00
Ishaan Jaff	0c33b8dd12	docs	2024-09-14 19:13:45 -07:00
Ishaan Jaff	c220fc0e92	docs max_completion_tokens	2024-09-14 19:12:12 -07:00
Ishaan Jaff	e447784650	bump: version 1.45.0 → 1.46.0	2024-09-14 18:49:24 -07:00
Ishaan Jaff	680d00ed11	[Feat-Prometheus] Add prometheus metric for tracking cooldown events (#5705) * add litellm_deployment_cooled_down * track num cooldowns on prometheus * track exception status * fix linting * docs prom metrics * cleanup premium user checks	2024-09-14 18:46:45 -07:00
Ishaan Jaff	c8eff2dc65	[Feat-Prometheus] Track exception status on `litellm_deployment_failure_responses` (#5706) * add litellm_deployment_cooled_down * track num cooldowns on prometheus * track exception status * fix linting * docs prom metrics * cleanup premium user checks * prom track deployment failure state * docs prometheus	2024-09-14 18:44:31 -07:00
Ishaan Jaff	b878a67a7c	fix otel load test %	2024-09-14 18:04:28 -07:00
Ishaan Jaff	c8d15544c8	[Fix] Router cooldown logic - use % thresholds instead of allowed fails to cooldown deployments (#5698) * move cooldown logic to its own helper * add new track deployment metrics folder * increment success, fails for deployment in current minute * fix cooldown logic * fix test_aaarouter_dynamic_cooldown_message_retry_time * fix test_single_deployment_no_cooldowns_test_prod_mock_completion_calls * clean up get from deployment test * fix _async_get_healthy_deployments * add mock InternalServerError * test deployment failing 25% requests * add test_high_traffic_cooldowns_one_bad_deployment * fix vertex load test * add test for rate limit error models in cool down * change default cooldown time * fix cooldown message time * fix cooldown on 429 error * fix doc string for _should_cooldown_deployment * fix sync cooldown logic router	2024-09-14 18:01:19 -07:00
Ishaan Jaff	7c2ddba6c6	sambanova support (#5547) (#5703) * add sambanova support * sambanova support * updated api endpoint for sambanova --------- Co-authored-by: Venu Anuganti <venu@venublog.com> Co-authored-by: Venu Anuganti <venu@vairmac2020>	2024-09-14 17:23:04 -07:00
Ishaan Jaff	85acdb9193	[Feat] Add `max_completion_tokens` param (#5691) * add max_completion_tokens * add max_completion_tokens * add max_completion_tokens support for OpenAI models * add max_completion_tokens param * add max_completion_tokens for bedrock converse models * add test for converse maxTokens * fix openai o1 param mapping test * move test optional params * add max_completion_tokens for anthropic api * fix conftest * add max_completion tokens for vertex ai partner models * add max_completion_tokens for fireworks ai * add max_completion_tokens for hf rest api * add test for param mapping * add param mapping for vertex, gemini + testing * predibase is the most unstable and unusable llm api in prod, can't handle our ci/cd * add max_completion_tokens to openai supported params * fix fireworks ai param mapping	2024-09-14 14:57:01 -07:00
Ahmet	415a3ede9e	Update model_prices_and_context_window.json (#5700) added audio_speech mode on the sample_spec for clarity.	2024-09-14 11:22:08 -07:00
Krish Dholakia	dad1ad2077	LiteLLM Minor Fixes and Improvements (09/14/2024) (#5697) * fix(health_check.py): hide sensitive keys from health check debug information * fix(route_llm_request.py): fix proxy model not found error message to indicate how to resolve issue * fix(vertex_llm_base.py): fix exception message to not log credentials	2024-09-14 10:32:39 -07:00
Krish Dholakia	60709a0753	LiteLLM Minor Fixes and Improvements (09/13/2024) (#5689) * refactor: cleanup unused variables + fix pyright errors * feat(health_check.py): Closes https://github.com/BerriAI/litellm/issues/5686 * fix(o1_reasoning.py): add stricter check for o-1 reasoning model * refactor(mistral/): make it easier to see mistral transformation logic * fix(openai.py): fix openai o-1 model param mapping Fixes https://github.com/BerriAI/litellm/issues/5685 * feat(main.py): infer finetuned gemini model from base model Fixes https://github.com/BerriAI/litellm/issues/5678 * docs(vertex.md): update docs to call finetuned gemini models * feat(proxy_server.py): allow admin to hide proxy model aliases Closes https://github.com/BerriAI/litellm/issues/5692 * docs(load_balancing.md): add docs on hiding alias models from proxy config * fix(base.py): don't raise notimplemented error * fix(user_api_key_auth.py): fix model max budget check * fix(router.py): fix elif * fix(user_api_key_auth.py): don't set team_id to empty str * fix(team_endpoints.py): fix response type * test(test_completion.py): handle predibase error * test(test_proxy_server.py): fix test * fix(o1_transformation.py): fix max_completion_token mapping * test(test_image_generation.py): mark flaky test	2024-09-14 10:02:55 -07:00
F1bos	db3af20d84	(models): Added missing gemini experimental models + fixed pricing for gemini-1.5-pro-exp-0827 (#5693) * Fixed gemini-1.5-flash pricing * (models): Added missing gemini experimental models + fixed pricing for gemini-1.5-pro-exp-0827	2024-09-14 08:41:48 -07:00
Ishaan Jaff	741c8e8a45	[Feat - Perf Improvement] DataDog Logger 91% lower latency (#5687) * fix refactor dd to be an instance of custom logger * migrate dd logger to be async * clean up dd logging * add datadog sync and async code * use batching for datadog logger * add doc string for dd logging * add clear doc string * fix doc string * allow debugging intake url * clean up requirements.txt * allow setting custom batch size on logger * fix dd logging to use compression * fix linting * add dd load test * fix dd load test * fix dd url * add test_datadog_logging_http_request * fix test_datadog_logging_http_request	2024-09-13 17:39:17 -07:00
Ishaan Jaff	cd8d7ca915	[Fix] Performance - use in memory cache when downloading images from a url (#5657) * fix use in memory cache when getting images * fix linting * fix load testing * fix load test size * fix load test size * trigger ci/cd again	2024-09-13 07:23:42 -07:00
Krrish Dholakia	cdd7cd4d69	build: bump from 1.44.28 -> 1.45.0	2024-09-12 23:10:29 -07:00
Krish Dholakia	4657a40ef1	LiteLLM Minor Fixes and Improvements (09/12/2024) (#5658) * fix(factory.py): handle tool call content as list Fixes https://github.com/BerriAI/litellm/issues/5652 * fix(factory.py): enforce stronger typing * fix(router.py): return model alias in /v1/model/info and /v1/model_group/info * fix(user_api_key_auth.py): move noisy warning message to debug cleanup logs * fix(types.py): cleanup pydantic v2 deprecated param Fixes https://github.com/BerriAI/litellm/issues/5649 * docs(gemini.md): show how to pass inline data to gemini api Fixes https://github.com/BerriAI/litellm/issues/5674	2024-09-12 23:04:06 -07:00
David Manouchehri	795047c37f	Add o1 models on OpenRouter. (#5676)	2024-09-12 22:16:10 -07:00
Krish Dholakia	00047de1c6	fix(user_dashboard.tsx): don't call /global/spend on startup (#5668) at 1m+ rows, query timeouts cause ui errors	2024-09-12 22:15:52 -07:00
Krish Dholakia	d94d47424f	fix(proxy/utils.py): auto-update if required view missing from db. raise warning for optional views. (#5675) Prevents missing optional views from blocking proxy startup.	2024-09-12 22:15:44 -07:00
Ishaan Jaff	fa01b5c7d9	bump: version 1.44.27 → 1.44.28	2024-09-12 19:17:34 -07:00
Ishaan Jaff	19a06d7842	[Fix-Router] Don't cooldown when only 1 deployment exists (#5673) * fix get model list * fix test custom callback router * fix embedding fallback test * fix router retry policy on AuthErrors * fix router test * add test for single deployments no cooldown test prod * add test test_single_deployment_no_cooldowns_test_prod_mock_completion_calls	2024-09-12 19:14:58 -07:00
Ishaan Jaff	13ba22d6fd	docs add o1 to docs	2024-09-12 19:06:13 -07:00
Ishaan Jaff	e7c9716841	[Feat-Perf] Use Batching + Squashing (#5645) * use folder for slack alerting * clean up slack alerting * fix test alerting	2024-09-12 18:37:53 -07:00
Ishaan Jaff	fe5e0bcd15	Merge pull request #5666 from BerriAI/litellm_add_openai_o1 [Feat] Add OpenAI O1 Family Param mapping / config	2024-09-12 16:15:53 -07:00
Ishaan Jaff	a1fe2701f2	Merge branch 'main' into litellm_add_openai_o1	2024-09-12 16:15:43 -07:00
Ishaan Jaff	bb38e9cbf8	fix gcs logging	2024-09-12 15:24:04 -07:00
Ishaan Jaff	46ce4995b8	fix type errors	2024-09-12 14:49:43 -07:00
Ishaan Jaff	0f24f339f3	fix handle user message	2024-09-12 14:34:32 -07:00
Ishaan Jaff	ded40e4d41	bump openai to 1.45.0	2024-09-12 14:18:15 -07:00
Ishaan Jaff	14dc7b3b54	fix linting	2024-09-12 14:15:18 -07:00
Ishaan Jaff	a5a0773b19	fix handle o1 not supporting system message	2024-09-12 14:09:13 -07:00
Ishaan Jaff	3490862795	bump: version 1.44.26 → 1.44.27	2024-09-12 13:41:05 -07:00
Ishaan Jaff	d2510a04a2	fix pricing	2024-09-12 13:41:01 -07:00
Ishaan Jaff	f5e9e9fc9a	add o1 reasoning tests	2024-09-12 13:40:15 -07:00
Krish Dholakia	c76d2c6ade	Refactor 'check_view_exists' logic (#5659) * fix(proxy/utils.py): comment out auto-upsert logic in check_view_exists Prevents proxy from failing on startup due to faulty logic * fix(db/migration_scripts/create_views.py): fix 'DailyTagSpend' quotation on check * fix(create_views.py): monthly global spend time period should be 30d not 20d * fix(schema.prisma): index on startTime and endUser for efficient UI querying	2024-09-12 13:39:50 -07:00
David Manouchehri	5c1a70be21	Fix token and remove dups. (#5662)	2024-09-12 13:33:35 -07:00
Ishaan Jaff	fed9c89cc7	add OpenAI o1 config	2024-09-12 13:22:59 -07:00
David Manouchehri	b4f97763f0	(models): Add o1 pricing. (#5661)	2024-09-12 11:47:04 -07:00
Ishaan Jaff	fab176fc20	Merge pull request #5660 from lowjiansheng/js-openai-o1 Add gpt o1 and o1 mini models	2024-09-12 11:35:06 -07:00
lowjiansheng	3afe70c1f2	gpt o1 and o1 mini	2024-09-13 02:27:57 +08:00
Ishaan Jaff	ead1e0c708	Merge pull request #5655 from BerriAI/litellm_testing_clean_up [Fix Ci/cd] Separate testing pipeline for litellm router	2024-09-12 11:05:26 -07:00
Ishaan Jaff	085e1751ad	mark test as flaky	2024-09-12 09:29:37 -07:00
Ishaan Jaff	bea34c9231	fix config.yml	2024-09-12 09:28:45 -07:00
Ishaan Jaff	90d096b639	ci/cd run again	2024-09-12 08:42:34 -07:00

17755 commits