Krrish Dholakia
3c741b7beb
docs(docker_quick_start.md): update quick start with azure connection error
2024-09-16 07:31:32 -07:00
Krrish Dholakia
5fb270a559
build(model_prices_and_context_window.json): bump claude-3-5-sonnet max tokens
2024-09-15 13:57:41 -07:00
F1bos
b64b7a94ae
(models): Enable JSON Schema Support for Gemini 1.5 Flash Models ( #5708 )
...
* Fixed gemini-1.5-flash pricing
* (models): Added missing gemini experimental models + fixed pricing for gemini-1.5-pro-exp-0827
* Added gemini/gemini-1.5-flash-001 model
* Updated supports_response_schema to true for gemini flash 1.5 models
2024-09-15 13:52:00 -07:00
Krish Dholakia
da77706c26
Litellm stable dev ( #5711 )
...
* feat(aws_base_llm.py): prevents recreating boto3 credentials during high traffic
Leads to 100ms perf boost in local testing
* fix(base_aws_llm.py): fix credential caching check to see if token is set
* refactor(bedrock/chat): separate converse api and invoke api + isolate converse api transformation logic
Make it easier to see how requests are transformed for /converse
* fix: fix imports
* fix(bedrock/embed): fix reordering of headers
* fix(base_aws_llm.py): fix get credential logic
* fix(converse_handler.py): fix ai21 streaming response
2024-09-14 23:22:59 -07:00
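The credential-caching change above (avoiding recreating boto3 credentials on every request) can be sketched as a small TTL cache; `fetch_fn`, `CredentialCache`, and the TTL value are illustrative, not LiteLLM's actual names.

```python
import time

class CredentialCache:
    """Cache AWS-style credentials so the expensive fetch (e.g. building a
    boto3 session) is not repeated on every request under high traffic."""

    def __init__(self, fetch_fn, ttl_seconds=3000):
        self._fetch_fn = fetch_fn    # stand-in for the real credential lookup
        self._ttl = ttl_seconds
        self._creds = None
        self._expires_at = 0.0

    def get(self):
        now = time.monotonic()
        # Reuse cached credentials while they are still within their TTL.
        if self._creds is not None and now < self._expires_at:
            return self._creds
        self._creds = self._fetch_fn()
        self._expires_at = now + self._ttl
        return self._creds

calls = []

def fetch():
    calls.append(1)
    return {"access_key": "AKIA...", "token": "tok"}

cache = CredentialCache(fetch, ttl_seconds=60)
cache.get()
cache.get()
print(len(calls))  # credentials fetched once, then reused
```

The follow-up fix in the same commit ("check to see if token is set") corresponds to also invalidating the cache when the session token is missing.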
Ishaan Jaff
2efdd2a6a4
mark test as flaky
2024-09-14 19:32:22 -07:00
Ishaan Jaff
0c33b8dd12
docs
2024-09-14 19:13:45 -07:00
Ishaan Jaff
c220fc0e92
docs max_completion_tokens
2024-09-14 19:12:12 -07:00
Ishaan Jaff
e447784650
bump: version 1.45.0 → 1.46.0
2024-09-14 18:49:24 -07:00
Ishaan Jaff
680d00ed11
[Feat-Prometheus] Add prometheus metric for tracking cooldown events ( #5705 )
...
* add litellm_deployment_cooled_down
* track num cooldowns on prometheus
* track exception status
* fix linting
* docs prom metrics
* cleanup premium user checks
2024-09-14 18:46:45 -07:00
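The `litellm_deployment_cooled_down` metric above is a labeled counter; in the real logger it would be a `prometheus_client.Counter`, but the sketch below uses a plain dict-backed counter to show the idea, with illustrative label names.

```python
from collections import Counter

# Stand-in for a Prometheus counter: one count per
# (model_id, exception_status) label pair.
litellm_deployment_cooled_down = Counter()

def on_cooldown(model_id, exception_status):
    # Increment the cooldown count for this deployment + exception status.
    litellm_deployment_cooled_down[(model_id, str(exception_status))] += 1

on_cooldown("deployment-1", 429)
on_cooldown("deployment-1", 429)
on_cooldown("deployment-2", 500)
print(litellm_deployment_cooled_down[("deployment-1", "429")])  # 2
```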
Ishaan Jaff
c8eff2dc65
[Feat-Prometheus] Track exception status on litellm_deployment_failure_responses ( #5706 )
...
* add litellm_deployment_cooled_down
* track num cooldowns on prometheus
* track exception status
* fix linting
* docs prom metrics
* cleanup premium user checks
* prom track deployment failure state
* docs prometheus
2024-09-14 18:44:31 -07:00
Ishaan Jaff
b878a67a7c
fix otel load test %
2024-09-14 18:04:28 -07:00
Ishaan Jaff
c8d15544c8
[Fix] Router cooldown logic - use % thresholds instead of allowed fails to cooldown deployments ( #5698 )
...
* move cooldown logic to its own helper
* add new track deployment metrics folder
* increment success, fails for deployment in current minute
* fix cooldown logic
* fix test_aaarouter_dynamic_cooldown_message_retry_time
* fix test_single_deployment_no_cooldowns_test_prod_mock_completion_calls
* clean up get from deployment test
* fix _async_get_healthy_deployments
* add mock InternalServerError
* test deployment failing 25% requests
* add test_high_traffic_cooldowns_one_bad_deployment
* fix vertex load test
* add test for rate limit error models in cool down
* change default cooldown time
* fix cooldown message time
* fix cooldown on 429 error
* fix doc string for _should_cooldown_deployment
* fix sync cooldown logic router
2024-09-14 18:01:19 -07:00
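The core change above, cooling down on a failure *percentage* rather than a fixed allowed-fails count, can be sketched as below; the function name echoes the commit's `_should_cooldown_deployment`, but the threshold and minimum-traffic values are illustrative.

```python
def should_cooldown_deployment(fails, total, threshold=0.25, min_requests=3):
    """Cool a deployment down only when its failure rate in the current
    minute crosses a threshold, instead of a fixed allowed-fails count."""
    if total < min_requests:
        return False  # not enough traffic this minute to judge reliably
    return (fails / total) >= threshold

print(should_cooldown_deployment(fails=2, total=100))   # False: 2% failure rate
print(should_cooldown_deployment(fails=30, total=100))  # True: 30% >= 25%
```

This is why the commit also adds per-minute success/failure tracking for each deployment: the rate is computed over the current minute's traffic.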
Ishaan Jaff
7c2ddba6c6
sambanova support ( #5547 ) ( #5703 )
...
* add sambanova support
* sambanova support
* updated api endpoint for sambanova
---------
Co-authored-by: Venu Anuganti <venu@venublog.com>
Co-authored-by: Venu Anuganti <venu@vairmac2020>
2024-09-14 17:23:04 -07:00
Ishaan Jaff
85acdb9193
[Feat] Add max_completion_tokens param ( #5691 )
...
* add max_completion_tokens
* add max_completion_tokens
* add max_completion_tokens support for OpenAI models
* add max_completion_tokens param
* add max_completion_tokens for bedrock converse models
* add test for converse maxTokens
* fix openai o1 param mapping test
* move test optional params
* add max_completion_tokens for anthropic api
* fix conftest
* add max_completion tokens for vertex ai partner models
* add max_completion_tokens for fireworks ai
* add max_completion_tokens for hf rest api
* add test for param mapping
* add param mapping for vertex, gemini + testing
* predibase is the most unstable and unusable llm api in prod, can't handle our ci/cd
* add max_completion_tokens to openai supported params
* fix fireworks ai param mapping
2024-09-14 14:57:01 -07:00
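The param mapping this commit adds for each provider can be sketched as a translation table from the OpenAI-style `max_completion_tokens` to each provider's native token-limit field; the table below is illustrative, not LiteLLM's actual mapping code.

```python
# Illustrative native parameter names per provider.
PARAM_NAME_BY_PROVIDER = {
    "openai": "max_completion_tokens",  # passed through as-is
    "bedrock_converse": "maxTokens",    # Converse API inference config field
    "anthropic": "max_tokens",
    "vertex_ai": "max_output_tokens",
}

def map_max_completion_tokens(provider, params):
    """Map the OpenAI-style max_completion_tokens to the provider's
    native token-limit parameter."""
    params = dict(params)  # don't mutate the caller's dict
    value = params.pop("max_completion_tokens", None)
    if value is not None:
        native = PARAM_NAME_BY_PROVIDER.get(provider, "max_tokens")
        params[native] = value
    return params

print(map_max_completion_tokens("bedrock_converse", {"max_completion_tokens": 256}))
# {'maxTokens': 256}
```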
Ahmet
415a3ede9e
Update model_prices_and_context_window.json ( #5700 )
...
added audio_speech mode on the sample_spec for clarity.
2024-09-14 11:22:08 -07:00
Krish Dholakia
dad1ad2077
LiteLLM Minor Fixes and Improvements (09/14/2024) ( #5697 )
...
* fix(health_check.py): hide sensitive keys from health check debug information
* fix(route_llm_request.py): fix proxy model not found error message to indicate how to resolve issue
* fix(vertex_llm_base.py): fix exception message to not log credentials
2024-09-14 10:32:39 -07:00
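Hiding sensitive keys from health-check debug output amounts to redacting matching fields before returning them; a minimal sketch, where the sensitive-key list and `redact` helper are assumptions for illustration:

```python
SENSITIVE_KEYS = ("api_key", "aws_secret_access_key", "credentials")

def redact(details):
    """Replace values of sensitive-looking fields in health-check
    debug output before it is returned to the caller."""
    return {
        k: ("*REDACTED*" if any(s in k.lower() for s in SENSITIVE_KEYS) else v)
        for k, v in details.items()
    }

print(redact({"model": "gpt-4o", "api_key": "sk-123"}))
# {'model': 'gpt-4o', 'api_key': '*REDACTED*'}
```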
Krish Dholakia
60709a0753
LiteLLM Minor Fixes and Improvements (09/13/2024) ( #5689 )
...
* refactor: cleanup unused variables + fix pyright errors
* feat(health_check.py): Closes https://github.com/BerriAI/litellm/issues/5686
* fix(o1_reasoning.py): add stricter check for o-1 reasoning model
* refactor(mistral/): make it easier to see mistral transformation logic
* fix(openai.py): fix openai o-1 model param mapping
Fixes https://github.com/BerriAI/litellm/issues/5685
* feat(main.py): infer finetuned gemini model from base model
Fixes https://github.com/BerriAI/litellm/issues/5678
* docs(vertex.md): update docs to call finetuned gemini models
* feat(proxy_server.py): allow admin to hide proxy model aliases
Closes https://github.com/BerriAI/litellm/issues/5692
* docs(load_balancing.md): add docs on hiding alias models from proxy config
* fix(base.py): don't raise notimplemented error
* fix(user_api_key_auth.py): fix model max budget check
* fix(router.py): fix elif
* fix(user_api_key_auth.py): don't set team_id to empty str
* fix(team_endpoints.py): fix response type
* test(test_completion.py): handle predibase error
* test(test_proxy_server.py): fix test
* fix(o1_transformation.py): fix max_completion_token mapping
* test(test_image_generation.py): mark flaky test
2024-09-14 10:02:55 -07:00
F1bos
db3af20d84
(models): Added missing gemini experimental models + fixed pricing for gemini-1.5-pro-exp-0827 ( #5693 )
...
* Fixed gemini-1.5-flash pricing
* (models): Added missing gemini experimental models + fixed pricing for gemini-1.5-pro-exp-0827
2024-09-14 08:41:48 -07:00
Ishaan Jaff
741c8e8a45
[Feat - Perf Improvement] DataDog Logger 91% lower latency ( #5687 )
...
* fix refactor dd to be an instance of custom logger
* migrate dd logger to be async
* clean up dd logging
* add datadog sync and async code
* use batching for datadog logger
* add doc string for dd logging
* add clear doc string
* fix doc string
* allow debugging intake url
* clean up requirements.txt
* allow setting custom batch size on logger
* fix dd logging to use compression
* fix linting
* add dd load test
* fix dd load test
* fix dd url
* add test_datadog_logging_http_request
* fix test_datadog_logging_http_request
2024-09-13 17:39:17 -07:00
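The latency win above comes from batching plus compression: buffer events and ship them in one compressed request instead of one HTTP call per event. A minimal async sketch, where `send` stands in for the real DataDog intake call and the batch size is illustrative:

```python
import asyncio
import gzip
import json

class BatchingLogger:
    """Buffer log events and flush them as one gzip-compressed payload."""

    def __init__(self, send, batch_size=512):
        self._send = send            # stand-in for the DataDog intake request
        self._batch_size = batch_size
        self._buffer = []

    async def log(self, event):
        self._buffer.append(event)
        if len(self._buffer) >= self._batch_size:
            await self.flush()

    async def flush(self):
        if not self._buffer:
            return
        # One compressed request per batch instead of one per event.
        payload = gzip.compress(json.dumps(self._buffer).encode())
        self._buffer = []
        await self._send(payload)

sent = []

async def fake_send(payload):
    sent.append(payload)

async def main():
    logger = BatchingLogger(fake_send, batch_size=3)
    for i in range(7):
        await logger.log({"event": i})
    await logger.flush()  # flush the final partial batch

asyncio.run(main())
print(len(sent))  # 3 batches: 3 + 3 + 1 events
```

The commit's "allow setting custom batch size on logger" bullet corresponds to the `batch_size` knob here.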
Ishaan Jaff
cd8d7ca915
[Fix] Performance - use in memory cache when downloading images from a url ( #5657 )
...
* fix use in memory cache when getting images
* fix linting
* fix load testing
* fix load test size
* fix load test size
* trigger ci/cd again
2024-09-13 07:23:42 -07:00
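The fix above avoids re-downloading the same image URL by serving repeats from an in-memory cache. A minimal sketch, where `download` stands in for the real HTTP fetch and the hash-keyed dict is an assumption for illustration:

```python
import hashlib

_image_cache = {}

def fetch_image(url, download):
    """Return image bytes for a URL, downloading at most once per URL."""
    key = hashlib.sha256(url.encode()).hexdigest()
    if key not in _image_cache:
        _image_cache[key] = download(url)  # only on cache miss
    return _image_cache[key]

downloads = []

def fake_download(url):
    downloads.append(url)
    return b"\x89PNG..."

fetch_image("https://example.com/a.png", fake_download)
fetch_image("https://example.com/a.png", fake_download)
print(len(downloads))  # 1: second call served from memory
```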
Krrish Dholakia
cdd7cd4d69
build: bump from 1.44.28 -> 1.45.0
2024-09-12 23:10:29 -07:00
Krish Dholakia
4657a40ef1
LiteLLM Minor Fixes and Improvements (09/12/2024) ( #5658 )
...
* fix(factory.py): handle tool call content as list
Fixes https://github.com/BerriAI/litellm/issues/5652
* fix(factory.py): enforce stronger typing
* fix(router.py): return model alias in /v1/model/info and /v1/model_group/info
* fix(user_api_key_auth.py): move noisy warning message to debug
cleanup logs
* fix(types.py): cleanup pydantic v2 deprecated param
Fixes https://github.com/BerriAI/litellm/issues/5649
* docs(gemini.md): show how to pass inline data to gemini api
Fixes https://github.com/BerriAI/litellm/issues/5674
2024-09-12 23:04:06 -07:00
David Manouchehri
795047c37f
Add o1 models on OpenRouter. ( #5676 )
2024-09-12 22:16:10 -07:00
Krish Dholakia
00047de1c6
fix(user_dashboard.tsx): don't call /global/spend on startup ( #5668 )
...
at 1m+ rows, query timeouts cause ui errors
2024-09-12 22:15:52 -07:00
Krish Dholakia
d94d47424f
fix(proxy/utils.py): auto-update if required view missing from db. raise warning for optional views. ( #5675 )
...
Prevents missing optional views from blocking proxy startup.
2024-09-12 22:15:44 -07:00
Ishaan Jaff
fa01b5c7d9
bump: version 1.44.27 → 1.44.28
2024-09-12 19:17:34 -07:00
Ishaan Jaff
19a06d7842
[Fix-Router] Don't cooldown when only 1 deployment exists ( #5673 )
...
* fix get model list
* fix test custom callback router
* fix embedding fallback test
* fix router retry policy on AuthErrors
* fix router test
* add test for single deployments no cooldown test prod
* add test test_single_deployment_no_cooldowns_test_prod_mock_completion_calls
2024-09-12 19:14:58 -07:00
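The guard this fix adds can be sketched in a few lines; the function name and shape are illustrative, not the router's actual API.

```python
def deployments_to_cooldown(all_deployment_ids, failing_id):
    """Never cool down the only deployment in a model group —
    doing so would leave nothing to route requests to."""
    if len(all_deployment_ids) <= 1:
        return []
    return [failing_id]

print(deployments_to_cooldown(["d1"], "d1"))        # []: sole deployment stays up
print(deployments_to_cooldown(["d1", "d2"], "d1"))  # ['d1']: a fallback exists
```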
Ishaan Jaff
13ba22d6fd
docs: add o1
2024-09-12 19:06:13 -07:00
Ishaan Jaff
e7c9716841
[Feat-Perf] Use Batching + Squashing ( #5645 )
...
* use folder for slack alerting
* clean up slack alerting
* fix test alerting
2024-09-12 18:37:53 -07:00
Ishaan Jaff
fe5e0bcd15
Merge pull request #5666 from BerriAI/litellm_add_openai_o1
...
[Feat] Add OpenAI O1 Family Param mapping / config
2024-09-12 16:15:53 -07:00
Ishaan Jaff
a1fe2701f2
Merge branch 'main' into litellm_add_openai_o1
2024-09-12 16:15:43 -07:00
Ishaan Jaff
bb38e9cbf8
fix gcs logging
2024-09-12 15:24:04 -07:00
Ishaan Jaff
46ce4995b8
fix type errors
2024-09-12 14:49:43 -07:00
Ishaan Jaff
0f24f339f3
fix handle user message
2024-09-12 14:34:32 -07:00
Ishaan Jaff
ded40e4d41
bump openai to 1.45.0
2024-09-12 14:18:15 -07:00
Ishaan Jaff
14dc7b3b54
fix linting
2024-09-12 14:15:18 -07:00
Ishaan Jaff
a5a0773b19
fix handle o1 not supporting system message
2024-09-12 14:09:13 -07:00
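At launch the o1 models rejected the `system` role, so the fix above remaps system messages before sending. A minimal sketch of that workaround (remapping to `user` is one plausible strategy, shown here for illustration):

```python
def adapt_messages_for_o1(messages):
    """Remap system messages for o1 models, which did not accept
    the system role at launch."""
    adapted = []
    for m in messages:
        if m["role"] == "system":
            # Downgrade the system prompt to a user message.
            adapted.append({"role": "user", "content": m["content"]})
        else:
            adapted.append(m)
    return adapted

print(adapt_messages_for_o1([
    {"role": "system", "content": "Be terse."},
    {"role": "user", "content": "Hi"},
])[0]["role"])  # user
```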
Ishaan Jaff
3490862795
bump: version 1.44.26 → 1.44.27
2024-09-12 13:41:05 -07:00
Ishaan Jaff
d2510a04a2
fix pricing
2024-09-12 13:41:01 -07:00
Ishaan Jaff
f5e9e9fc9a
add o1 reasoning tests
2024-09-12 13:40:15 -07:00
Krish Dholakia
c76d2c6ade
Refactor 'check_view_exists' logic ( #5659 )
...
* fix(proxy/utils.py): comment out auto-upsert logic in check_view_exists
Prevents proxy from failing on startup due to faulty logic
* fix(db/migration_scripts/create_views.py): fix 'DailyTagSpend' quotation on check
* fix(create_views.py): monthly global spend time period should be 30d not 20d
* fix(schema.prisma): index on startTime and endUser for efficient UI querying
2024-09-12 13:39:50 -07:00
David Manouchehri
5c1a70be21
Fix token and remove dups. ( #5662 )
2024-09-12 13:33:35 -07:00
Ishaan Jaff
fed9c89cc7
add OpenAI o1 config
2024-09-12 13:22:59 -07:00
David Manouchehri
b4f97763f0
(models): Add o1 pricing. ( #5661 )
2024-09-12 11:47:04 -07:00
Ishaan Jaff
fab176fc20
Merge pull request #5660 from lowjiansheng/js-openai-o1
...
Add gpt o1 and o1 mini models
2024-09-12 11:35:06 -07:00
lowjiansheng
3afe70c1f2
gpt o1 and o1 mini
2024-09-13 02:27:57 +08:00
Ishaan Jaff
ead1e0c708
Merge pull request #5655 from BerriAI/litellm_testing_clean_up
...
[Fix Ci/cd] Separate testing pipeline for litellm router
2024-09-12 11:05:26 -07:00
Ishaan Jaff
085e1751ad
mark test as flaky
2024-09-12 09:29:37 -07:00
Ishaan Jaff
bea34c9231
fix config.yml
2024-09-12 09:28:45 -07:00
Ishaan Jaff
90d096b639
ci/cd run again
2024-09-12 08:42:34 -07:00