17755 commits

Author SHA1 Message Date
Krrish Dholakia
3c741b7beb docs(docker_quick_start.md): update quick start with azure connection error 2024-09-16 07:31:32 -07:00
Krrish Dholakia
5fb270a559 build(model_prices_and_context_window.json): bump claude-3-5-sonnet max tokens 2024-09-15 13:57:41 -07:00
F1bos
b64b7a94ae
(models): Enable JSON Schema Support for Gemini 1.5 Flash Models (#5708)
* Fixed gemini-1.5-flash pricing

* (models): Added missing gemini experimental models + fixed pricing for gemini-1.5-pro-exp-0827

* Added gemini/gemini-1.5-flash-001 model

* Updated supports_response_schema to true for gemini flash 1.5 models
2024-09-15 13:52:00 -07:00
Krish Dholakia
da77706c26
Litellm stable dev (#5711)
* feat(aws_base_llm.py): prevents recreating boto3 credentials during high traffic

Leads to 100ms perf boost in local testing

* fix(base_aws_llm.py): fix credential caching check to see if token is set

* refactor(bedrock/chat): separate converse api and invoke api + isolate converse api transformation logic

Make it easier to see how requests are transformed for /converse

* fix: fix imports

* fix(bedrock/embed): fix reordering of headers

* fix(base_aws_llm.py): fix get credential logic

* fix(converse_handler.py): fix ai21 streaming response
2024-09-14 23:22:59 -07:00
Ishaan Jaff
2efdd2a6a4 mark test as flaky 2024-09-14 19:32:22 -07:00
Ishaan Jaff
0c33b8dd12 docs 2024-09-14 19:13:45 -07:00
Ishaan Jaff
c220fc0e92 docs max_completion_tokens 2024-09-14 19:12:12 -07:00
Ishaan Jaff
e447784650 bump: version 1.45.0 → 1.46.0 2024-09-14 18:49:24 -07:00
Ishaan Jaff
680d00ed11
[Feat-Prometheus] Add prometheus metric for tracking cooldown events (#5705)
* add litellm_deployment_cooled_down

* track num cooldowns on prometheus

* track exception status

* fix linting

* docs prom metrics

* cleanup premium user checks
2024-09-14 18:46:45 -07:00
Ishaan Jaff
c8eff2dc65
[Feat-Prometheus] Track exception status on litellm_deployment_failure_responses (#5706)
* add litellm_deployment_cooled_down

* track num cooldowns on prometheus

* track exception status

* fix linting

* docs prom metrics

* cleanup premium user checks

* prom track deployment failure state

* docs prometheus
2024-09-14 18:44:31 -07:00
Ishaan Jaff
b878a67a7c fix otel load test % 2024-09-14 18:04:28 -07:00
Ishaan Jaff
c8d15544c8
[Fix] Router cooldown logic - use % thresholds instead of allowed fails to cooldown deployments (#5698)
* move cooldown logic to its own helper

* add new track deployment metrics folder

* increment success, fails for deployment in current minute

* fix cooldown logic

* fix test_aaarouter_dynamic_cooldown_message_retry_time

* fix test_single_deployment_no_cooldowns_test_prod_mock_completion_calls

* clean up get from deployment test

* fix _async_get_healthy_deployments

* add mock InternalServerError

* test deployment failing 25% requests

* add test_high_traffic_cooldowns_one_bad_deployment

* fix vertex load test

* add test for rate limit error models in cool down

* change default cooldown time

* fix cooldown message time

* fix cooldown on 429 error

* fix doc string for _should_cooldown_deployment

* fix sync cooldown logic router
2024-09-14 18:01:19 -07:00
Ishaan Jaff
7c2ddba6c6
sambanova support (#5547) (#5703)
* add sambanova support

* sambanova support

* updated api endpoint for sambanova

---------

Co-authored-by: Venu Anuganti <venu@venublog.com>
Co-authored-by: Venu Anuganti <venu@vairmac2020>
2024-09-14 17:23:04 -07:00
Ishaan Jaff
85acdb9193
[Feat] Add max_completion_tokens param (#5691)
* add max_completion_tokens

* add max_completion_tokens

* add max_completion_tokens support for OpenAI models

* add max_completion_tokens param

* add max_completion_tokens for bedrock converse models

* add test for converse maxTokens

* fix openai o1 param mapping test

* move test optional params

* add max_completion_tokens for anthropic api

* fix conftest

* add max_completion tokens for vertex ai partner models

* add max_completion_tokens for fireworks ai

* add max_completion_tokens for hf rest api

* add test for param mapping

* add param mapping for vertex, gemini + testing

* predibase is the most unstable and unusable llm api in prod, can't handle our ci/cd

* add max_completion_tokens to openai supported params

* fix fireworks ai param mapping
2024-09-14 14:57:01 -07:00
Ahmet
415a3ede9e
Update model_prices_and_context_window.json (#5700)
added audio_speech mode on the sample_spec for clarity.
2024-09-14 11:22:08 -07:00
Krish Dholakia
dad1ad2077
LiteLLM Minor Fixes and Improvements (09/14/2024) (#5697)
* fix(health_check.py): hide sensitive keys from health check debug information

* fix(route_llm_request.py): fix proxy model not found error message to indicate how to resolve issue

* fix(vertex_llm_base.py): fix exception message to not log credentials
2024-09-14 10:32:39 -07:00
Krish Dholakia
60709a0753
LiteLLM Minor Fixes and Improvements (09/13/2024) (#5689)
* refactor: cleanup unused variables + fix pyright errors

* feat(health_check.py): Closes https://github.com/BerriAI/litellm/issues/5686

* fix(o1_reasoning.py): add stricter check for o-1 reasoning model

* refactor(mistral/): make it easier to see mistral transformation logic

* fix(openai.py): fix openai o-1 model param mapping

Fixes https://github.com/BerriAI/litellm/issues/5685

* feat(main.py): infer finetuned gemini model from base model

Fixes https://github.com/BerriAI/litellm/issues/5678

* docs(vertex.md): update docs to call finetuned gemini models

* feat(proxy_server.py): allow admin to hide proxy model aliases

Closes https://github.com/BerriAI/litellm/issues/5692

* docs(load_balancing.md): add docs on hiding alias models from proxy config

* fix(base.py): don't raise notimplemented error

* fix(user_api_key_auth.py): fix model max budget check

* fix(router.py): fix elif

* fix(user_api_key_auth.py): don't set team_id to empty str

* fix(team_endpoints.py): fix response type

* test(test_completion.py): handle predibase error

* test(test_proxy_server.py): fix test

* fix(o1_transformation.py): fix max_completion_token mapping

* test(test_image_generation.py): mark flaky test
2024-09-14 10:02:55 -07:00
F1bos
db3af20d84
(models): Added missing gemini experimental models + fixed pricing for gemini-1.5-pro-exp-0827 (#5693)
* Fixed gemini-1.5-flash pricing

* (models): Added missing gemini experimental models + fixed pricing for gemini-1.5-pro-exp-0827
2024-09-14 08:41:48 -07:00
Ishaan Jaff
741c8e8a45
[Feat - Perf Improvement] DataDog Logger 91% lower latency (#5687)
* fix refactor dd to be an instance of custom logger

* migrate dd logger to be async

* clean up dd logging

* add datadog sync and async code

* use batching for datadog logger

* add doc string for dd logging

* add clear doc string

* fix doc string

* allow debugging intake url

* clean up requirements.txt

* allow setting custom batch size on logger

* fix dd logging to use compression

* fix linting

* add dd load test

* fix dd load test

* fix dd url

* add test_datadog_logging_http_request

* fix test_datadog_logging_http_request
2024-09-13 17:39:17 -07:00
Ishaan Jaff
cd8d7ca915
[Fix] Performance - use in memory cache when downloading images from a url (#5657)
* fix use in memory cache when getting images

* fix linting

* fix load testing

* fix load test size

* fix load test size

* trigger ci/cd again
2024-09-13 07:23:42 -07:00
Krrish Dholakia
cdd7cd4d69 build: bump from 1.44.28 -> 1.45.0 2024-09-12 23:10:29 -07:00
Krish Dholakia
4657a40ef1
LiteLLM Minor Fixes and Improvements (09/12/2024) (#5658)
* fix(factory.py): handle tool call content as list

Fixes https://github.com/BerriAI/litellm/issues/5652

* fix(factory.py): enforce stronger typing

* fix(router.py): return model alias in /v1/model/info and /v1/model_group/info

* fix(user_api_key_auth.py): move noisy warning message to debug

cleanup logs

* fix(types.py): cleanup pydantic v2 deprecated param

Fixes https://github.com/BerriAI/litellm/issues/5649

* docs(gemini.md): show how to pass inline data to gemini api

Fixes https://github.com/BerriAI/litellm/issues/5674
2024-09-12 23:04:06 -07:00
David Manouchehri
795047c37f
Add o1 models on OpenRouter. (#5676) 2024-09-12 22:16:10 -07:00
Krish Dholakia
00047de1c6
fix(user_dashboard.tsx): don't call /global/spend on startup (#5668)
at 1m+ rows, query timeouts cause ui errors
2024-09-12 22:15:52 -07:00
Krish Dholakia
d94d47424f
fix(proxy/utils.py): auto-update if required view missing from db. raise warning for optional views. (#5675)
Prevents missing optional views from blocking proxy startup.
2024-09-12 22:15:44 -07:00
Ishaan Jaff
fa01b5c7d9 bump: version 1.44.27 → 1.44.28 2024-09-12 19:17:34 -07:00
Ishaan Jaff
19a06d7842
[Fix-Router] Don't cooldown when only 1 deployment exists (#5673)
* fix get model list

* fix test custom callback router

* fix embedding fallback test

* fix router retry policy on AuthErrors

* fix router test

* add test for single deployments no cooldown test prod

* add test test_single_deployment_no_cooldowns_test_prod_mock_completion_calls
2024-09-12 19:14:58 -07:00
Ishaan Jaff
13ba22d6fd docs add o1 to docs 2024-09-12 19:06:13 -07:00
Ishaan Jaff
e7c9716841
[Feat-Perf] Use Batching + Squashing (#5645)
* use folder for slack alerting

* clean up slack alerting

* fix test alerting
2024-09-12 18:37:53 -07:00
Ishaan Jaff
fe5e0bcd15
Merge pull request #5666 from BerriAI/litellm_add_openai_o1
[Feat] Add OpenAI O1 Family Param mapping / config
2024-09-12 16:15:53 -07:00
Ishaan Jaff
a1fe2701f2 Merge branch 'main' into litellm_add_openai_o1 2024-09-12 16:15:43 -07:00
Ishaan Jaff
bb38e9cbf8 fix gcs logging 2024-09-12 15:24:04 -07:00
Ishaan Jaff
46ce4995b8 fix type errors 2024-09-12 14:49:43 -07:00
Ishaan Jaff
0f24f339f3 fix handle user message 2024-09-12 14:34:32 -07:00
Ishaan Jaff
ded40e4d41 bump openai to 1.45.0 2024-09-12 14:18:15 -07:00
Ishaan Jaff
14dc7b3b54 fix linting 2024-09-12 14:15:18 -07:00
Ishaan Jaff
a5a0773b19 fix handle o1 not supporting system message 2024-09-12 14:09:13 -07:00
Ishaan Jaff
3490862795 bump: version 1.44.26 → 1.44.27 2024-09-12 13:41:05 -07:00
Ishaan Jaff
d2510a04a2 fix pricing 2024-09-12 13:41:01 -07:00
Ishaan Jaff
f5e9e9fc9a add o1 reasoning tests 2024-09-12 13:40:15 -07:00
Krish Dholakia
c76d2c6ade
Refactor 'check_view_exists' logic (#5659)
* fix(proxy/utils.py): comment out auto-upsert logic in check_view_exists

Prevents proxy from failing on startup due to faulty logic

* fix(db/migration_scripts/create_views.py): fix 'DailyTagSpend' quotation on check

* fix(create_views.py): monthly global spend time period should be 30d not 20d

* fix(schema.prisma): index on startTime and endUser for efficient UI querying
2024-09-12 13:39:50 -07:00
David Manouchehri
5c1a70be21
Fix token and remove dups. (#5662) 2024-09-12 13:33:35 -07:00
Ishaan Jaff
fed9c89cc7 add OpenAI o1 config 2024-09-12 13:22:59 -07:00
David Manouchehri
b4f97763f0
(models): Add o1 pricing. (#5661) 2024-09-12 11:47:04 -07:00
Ishaan Jaff
fab176fc20
Merge pull request #5660 from lowjiansheng/js-openai-o1
Add gpt o1 and o1 mini models
2024-09-12 11:35:06 -07:00
lowjiansheng
3afe70c1f2 gpt o1 and o1 mini 2024-09-13 02:27:57 +08:00
Ishaan Jaff
ead1e0c708
Merge pull request #5655 from BerriAI/litellm_testing_clean_up
[Fix Ci/cd] Separate testing pipeline for litellm router
2024-09-12 11:05:26 -07:00
Ishaan Jaff
085e1751ad mark test as flaky 2024-09-12 09:29:37 -07:00
Ishaan Jaff
bea34c9231 fix config.yml 2024-09-12 09:28:45 -07:00
Ishaan Jaff
90d096b639 ci/cd run again 2024-09-12 08:42:34 -07:00