Commit graph

1839 commits

Each entry below lists the author, commit SHA1, message, and date.
Krish Dholakia
4657a40ef1
LiteLLM Minor Fixes and Improvements (09/12/2024) (#5658)
* fix(factory.py): handle tool call content as list

Fixes https://github.com/BerriAI/litellm/issues/5652

* fix(factory.py): enforce stronger typing

* fix(router.py): return model alias in /v1/model/info and /v1/model_group/info

* fix(user_api_key_auth.py): move noisy warning message to debug

clean up logs

* fix(types.py): cleanup pydantic v2 deprecated param

Fixes https://github.com/BerriAI/litellm/issues/5649

* docs(gemini.md): show how to pass inline data to gemini api

Fixes https://github.com/BerriAI/litellm/issues/5674
2024-09-12 23:04:06 -07:00
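The inline-data docs change above maps onto litellm's OpenAI-style message format. A minimal sketch, assuming the `data:` URL pattern litellm uses for image content; the model name and image file are placeholders:

```python
import base64

import litellm

# read and base64-encode the image so it can be inlined in the request (placeholder file)
with open("chart.png", "rb") as f:
    b64_image = base64.b64encode(f.read()).decode("utf-8")

response = litellm.completion(
    model="gemini/gemini-1.5-flash",  # placeholder model name
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {
                    # inline data is passed as a data: URL, OpenAI-style
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{b64_image}"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```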
Krish Dholakia
d94d47424f
fix(proxy/utils.py): auto-update if required view missing from db. raise warning for optional views. (#5675)
Prevents missing optional views from blocking proxy startup.
2024-09-12 22:15:44 -07:00
Ishaan Jaff
e7c9716841
[Feat-Perf] Use Batching + Squashing (#5645)
* use folder for slack alerting

* clean up slack alerting

* fix test alerting
2024-09-12 18:37:53 -07:00
Ishaan Jaff
7c9591881c use callback_settings when initializing otel 2024-09-09 16:05:48 -07:00
Ishaan Jaff
805e4c5754 add spend_report_frequency as a general setting 2024-09-07 11:44:58 -07:00
Krish Dholakia
72e961af3c
LiteLLM Minor Fixes and Improvements (09/06/2024) (#5567)
* fix(utils.py): return citations for perplexity streaming

Fixes https://github.com/BerriAI/litellm/issues/5535

* fix(anthropic/chat.py): support fallbacks for anthropic streaming (#5542)

* fix(anthropic/chat.py): support fallbacks for anthropic streaming

Fixes https://github.com/BerriAI/litellm/issues/5512

* fix(anthropic/chat.py): use module level http client if none given (prevents early client closure)

* fix: fix linting errors

* fix(http_handler.py): fix raise_for_status error handling

* test: retry flaky test

* fix otel type

* fix(bedrock/embed): fix error raising

* test(test_openai_batches_and_files.py): skip azure batches test (for now) - quota exceeded

* fix(test_router.py): skip azure batch route test (for now) - hit batch quota limits

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>

* All `model_group_alias` should show up in `/models`, `/model/info`, `/model_group/info` (#5539) (see the sketch after this entry)

* fix(router.py): support returning model_alias model names in `/v1/models`

* fix(proxy_server.py): support returning model aliases on `/model/info`

* feat(router.py): support returning model group alias for `/model_group/info`

* fix(proxy_server.py): fix linting errors

* fix(proxy_server.py): fix linting errors

* build(model_prices_and_context_window.json): add amazon titan text premier pricing information

Closes https://github.com/BerriAI/litellm/issues/5560

* feat(litellm_logging.py): log standard logging response object for pass through endpoints. Allows bedrock /invoke agent calls to be correctly logged to langfuse + s3

* fix(success_handler.py): fix linting error

* fix(success_handler.py): fix linting errors

* fix(team_endpoints.py): Allows admin to update team member budgets

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-09-06 17:16:24 -07:00
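The `model_group_alias` items in the entry above refer to the Router's alias map. A minimal sketch, assuming a single OpenAI deployment; the alias name and key are placeholders:

```python
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-3.5-turbo",
            "litellm_params": {
                "model": "gpt-3.5-turbo",
                "api_key": "sk-...",  # placeholder
            },
        }
    ],
    # requests for "gpt-4o-alias" route to the "gpt-3.5-turbo" group; after
    # this change the alias is also listed by /models, /model/info, and
    # /model_group/info on the proxy
    model_group_alias={"gpt-4o-alias": "gpt-3.5-turbo"},
)

response = router.completion(
    model="gpt-4o-alias",
    messages=[{"role": "user", "content": "hi"}],
)
```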
Krish Dholakia
f584021f7c
LiteLLM Minor Fixes and Improvements (#5537)
* fix(vertex_ai): Fixes issue where multimodal message without text was failing vertex calls

Fixes https://github.com/BerriAI/litellm/issues/5515

* fix(azure.py): move to using httphandler for oidc token calls

Fixes issue where ssl certificates weren't being picked up as expected

Closes https://github.com/BerriAI/litellm/issues/5522

* feat: Allow admin to set a default_max_internal_user_budget in config, and allow setting more specific values as env vars (see the config sketch after this entry)

* fix(proxy_server.py): fix read for max_internal_user_budget

* build(model_prices_and_context_window.json): add regional gpt-4o-2024-08-06 pricing

Closes https://github.com/BerriAI/litellm/issues/5540

* test: skip re-test
2024-09-05 18:03:34 -07:00
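The budget change above is config-driven. A minimal sketch, assuming the proxy maps `litellm_settings` keys from config.yaml onto module-level litellm attributes of the same name; the attribute and env var names here are assumptions taken from the commit message:

```python
import os

import litellm

# default budget (USD) applied to internal users when no explicit budget is set;
# per the commit, a more specific value can come from an env var instead
# (the env var name here is an assumption)
litellm.max_internal_user_budget = float(
    os.environ.get("LITELLM_MAX_INTERNAL_USER_BUDGET", "10.0")
)
```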
Krish Dholakia
1e7e538261
LiteLLM Minor fixes + improvements (09/04/2024) (#5505)
* Minor IAM AWS OIDC Improvements (#5246)

* AWS IAM: Temporary tokens are valid across all regions after being issued, so it is wasteful to request one for each region.

* AWS IAM: Include an inline policy, to help reduce misuse of overly permissive IAM roles.

* (test_bedrock_completion.py): Ensure we are testing cross AWS region OIDC flow.

* fix(router.py): log rejected requests

Fixes https://github.com/BerriAI/litellm/issues/5498

* refactor: don't use verbose_logger.exception, if exception is raised

The user might already have handling for this, but alerting systems in prod will raise it as an unhandled error.

* fix(datadog.py): support setting datadog source as an env var

Fixes https://github.com/BerriAI/litellm/issues/5508

* docs(logging.md): add dd_source to datadog docs

* fix(proxy_server.py): expose `/customer/list` endpoint for showing all customers

* (bedrock): Fix usage with Cloudflare AI Gateway, and proxies in general. (#5509)

* feat(anthropic.py): support 'cache_control' param for content when it is a string

* Revert "(bedrock): Fix usage with Cloudflare AI Gateway, and proxies in gener…" (#5519)

This reverts commit 3fac0349c2.

* refactor: ci/cd run again

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
2024-09-04 22:16:55 -07:00
Ishaan Jaff
20a5bbe6a6 fix allow general guardrails on free tier 2024-09-04 19:59:32 -07:00
Krish Dholakia
be3c7b401e
LiteLLM Minor fixes + improvements (09/03/2024) (#5488)
* fix(internal_user_endpoints.py): set budget_reset_at for /user/update

* fix(vertex_and_google_ai_studio_gemini.py): handle accumulated json

Fixes https://github.com/BerriAI/litellm/issues/5479

* fix(vertex_ai_and_gemini.py): fix assistant message function call when content is not None

Fixes https://github.com/BerriAI/litellm/issues/5490

* fix(proxy_server.py): generic state uuid for okta sso

* fix(lago.py): improve debug logs

Debugging for https://github.com/BerriAI/litellm/issues/5477

* docs(bedrock.md): add bedrock cross-region inferencing to docs

* fix(azure.py): return azure response headers on aembedding call

* feat(azure.py): return azure response headers for `/audio/transcription`

* fix(types/utils.py): standardize deepseek / anthropic prompt caching usage information (see the usage sketch after this entry)

Closes https://github.com/BerriAI/litellm/issues/5285

* docs(usage.md): add docs on litellm usage object

* test(test_completion.py): mark flaky test
2024-09-03 21:21:34 -07:00
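The usage standardization above means prompt-cache counters surface on the response's usage object regardless of provider. A minimal sketch; the cache-specific attribute names follow Anthropic's naming and are assumptions here:

```python
import litellm

response = litellm.completion(
    model="anthropic/claude-3-5-sonnet-20240620",  # placeholder model
    messages=[{"role": "user", "content": "hi"}],
)

usage = response.usage
print(usage.prompt_tokens, usage.completion_tokens, usage.total_tokens)

# cache counters (attribute names are assumptions based on Anthropic's API)
print(getattr(usage, "cache_creation_input_tokens", None))
print(getattr(usage, "cache_read_input_tokens", None))
```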
Ishaan Jaff
3c898e23ea refactor secret managers 2024-09-03 10:58:02 -07:00
Ishaan Jaff
b0178a85cf refactor get_secret 2024-09-03 10:42:12 -07:00
Krish Dholakia
9f3fa29624
feat(router.py): Support Loadbalancing batch azure api endpoints (#5469)
* feat(router.py): initial commit for loadbalancing azure batch api endpoints

Closes https://github.com/BerriAI/litellm/issues/5396

* fix(router.py): working `router.acreate_file()`

* feat(router.py): working router.acreate_batch endpoint

* feat(router.py): expose router.aretrieve_batch function

Makes it easy for the user to retrieve batch information

* feat(router.py): support 'router.alist_batches' endpoint

Adds support for getting all batches across all endpoints

* feat(router.py): working loadbalancing on `/v1/files`

* feat(proxy_server.py): working loadbalancing on `/v1/batches`

* feat(proxy_server.py): working loadbalancing on Retrieve + List batch
2024-09-02 21:32:55 -07:00
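A minimal end-to-end sketch of the load-balanced batch flow in this entry, using the `router.acreate_file` / `router.acreate_batch` / `router.aretrieve_batch` methods named in the commit; the deployment details are placeholders, and the parameter names follow the OpenAI batch API shape, which is an assumption here:

```python
import asyncio

from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "azure-batch",  # placeholder model group
            "litellm_params": {
                "model": "azure/gpt-4o",
                "api_key": "my-azure-key",  # placeholder
                "api_base": "https://my-endpoint.openai.azure.com",  # placeholder
            },
        }
    ]
)


async def main():
    # upload the .jsonl input, load-balanced across the group's deployments
    file_obj = await router.acreate_file(
        model="azure-batch",
        file=open("batch_input.jsonl", "rb"),
        purpose="batch",
    )
    # create the batch against whichever deployment holds the file
    batch = await router.acreate_batch(
        model="azure-batch",
        input_file_id=file_obj.id,
        endpoint="/v1/chat/completions",
        completion_window="24h",
    )
    # retrieve batch status (this signature is an assumption)
    print(await router.aretrieve_batch(model="azure-batch", batch_id=batch.id))


asyncio.run(main())
```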
Ishaan Jaff
aa13977136 refactor vtx image gen 2024-09-02 17:35:51 -07:00
Ishaan Jaff
56f10224df
Merge pull request #5457 from BerriAI/litellm_track_spend_logs_for_vertex_pass_through_endpoints
[Feat-Proxy] track spend logs for vertex pass through endpoints
2024-08-31 16:30:15 -07:00
Ishaan Jaff
b35bfb0302 fix cost tracking for vertex ai native 2024-08-31 08:22:27 -07:00
Ishaan Jaff
7d746064ab add gcs bucket base 2024-08-30 10:41:39 -07:00
Ishaan Jaff
ad88c7d0a8 show all error types on swagger 2024-08-29 18:50:41 -07:00
Ishaan Jaff
fb5be57bb8 v0 add rerank on litellm proxy 2024-08-27 17:28:39 -07:00
Ishaan Jaff
74f0e60962 fix set Caching Default Off 2024-08-24 09:43:39 -07:00
Krrish Dholakia
ac9a1e65ab fix(proxy_server.py): fix post /v1/batches endpoint
Fixes https://github.com/BerriAI/litellm/issues/5279#issuecomment-2307919820
2024-08-23 20:38:00 -07:00
Krrish Dholakia
ab28e55b76 fix(proxy_server.py): support env vars for controlling global max parallel request retry/timeouts
Fixes issue where litellm module-level settings weren't applied to global retries, due to init-time ordering
2024-08-23 16:06:08 -07:00
Ishaan Jaff
1b1e0f2d77 init custom guardrail class 2024-08-23 10:54:42 -07:00
Krish Dholakia
76b3db334b
Merge branch 'main' into litellm_azure_batch_apis 2024-08-22 19:07:54 -07:00
Krrish Dholakia
735fc804ed fix(proxy_server.py): expose flag to disable retries when max parallel request limit is hit 2024-08-22 16:49:52 -07:00
Krrish Dholakia
63cd94c32a fix: fix linting errors 2024-08-22 15:51:59 -07:00
Krrish Dholakia
8625663458 feat(proxy_server.py): support azure batch api endpoints 2024-08-22 15:21:43 -07:00
Krish Dholakia
68cb5cae58
Merge branch 'main' into litellm_redis_cluster 2024-08-22 11:06:14 -07:00
Ishaan Jaff
a120135dd1 fix allow setting LiteLLM license as .env 2024-08-22 10:05:00 -07:00
Ishaan Jaff
cc8e6f1d44 fix allow setting license in config.yaml 2024-08-22 09:45:15 -07:00
Ishaan Jaff
2be984ebee add docstring for /embeddings and /completions 2024-08-22 09:30:47 -07:00
Ishaan Jaff
f6e80b0031 add doc string for /chat/completions swagger 2024-08-22 09:27:40 -07:00
Ishaan Jaff
a174cbdd72
Merge branch 'main' into litellm_pass_through_vtx_multi_modal 2024-08-21 17:23:22 -07:00
Ishaan Jaff
e9537c6560 proxy - print embedding request when received 2024-08-21 17:00:18 -07:00
Krish Dholakia
72169fd5c4
Merge branch 'main' into litellm_disable_storing_master_key_hash_in_db 2024-08-21 15:37:25 -07:00
Krrish Dholakia
e2d7539690 feat(caching.py): redis cluster support
Closes https://github.com/BerriAI/litellm/issues/4358
2024-08-21 15:01:52 -07:00
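A minimal sketch of pointing litellm's cache at a Redis Cluster per the commit above; the node addresses are placeholders, and `redis_startup_nodes` as the parameter name is an assumption:

```python
import litellm
from litellm import Cache

# hand the cache the cluster's startup nodes instead of a single host/port
litellm.cache = Cache(
    type="redis",
    redis_startup_nodes=[
        {"host": "127.0.0.1", "port": 7001},  # placeholder nodes
        {"host": "127.0.0.1", "port": 7002},
    ],
)
```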
Ishaan Jaff
d6493b0e7f docs semantic caching qdrant 2024-08-21 13:03:41 -07:00
Krrish Dholakia
89014dfc07 feat(proxy_server.py): support disabling storing master key hash in db, for spend tracking 2024-08-21 12:35:37 -07:00
Krrish Dholakia
6f8840daa1 fix(proxy_server.py): fix invalid login message to not show passed in pwd
Closes https://github.com/BerriAI/litellm/issues/5290
2024-08-20 08:56:57 -07:00
Ishaan Jaff
9ef6ae2f7c
Merge pull request #4868 from msabramo/allow-not-displaying-feedback-box
Allow not displaying feedback box
2024-08-20 08:53:45 -07:00
Ishaan Jaff
c7b3978655
Merge pull request #5288 from BerriAI/litellm_aporia_refactor
[Feat] V2 aporia guardrails litellm
2024-08-19 20:41:45 -07:00
Ishaan Jaff
8cd1963c11 feat - guardrails v2 2024-08-19 18:24:20 -07:00
Krrish Dholakia
1701c48ad5 feat(langfuse_endpoints.py): support langfuse pass through endpoints by default 2024-08-19 17:28:34 -07:00
Ishaan Jaff
613bd1babd feat - return applied guardrails in response headers 2024-08-19 11:56:20 -07:00
Ishaan Jaff
4685b9909a feat - allow accessing data post success call 2024-08-19 11:35:33 -07:00
Krish Dholakia
ff6ff133ee
Merge pull request #5260 from BerriAI/google_ai_studio_pass_through
Pass-through endpoints for Gemini - Google AI Studio
2024-08-17 13:51:51 -07:00
Ishaan Jaff
feb8c3c5b4
Merge pull request #5259 from BerriAI/litellm_return_remaining_tokens_in_header
[Feat] return `x-litellm-key-remaining-requests-{model}`: 1, `x-litellm-key-remaining-tokens-{model}`: None in response headers
2024-08-17 12:41:16 -07:00
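A minimal sketch of reading the per-model headers named in this entry from a proxy response; the proxy URL, key, and model name are placeholders:

```python
import requests

resp = requests.post(
    "http://localhost:4000/v1/chat/completions",  # placeholder proxy URL
    headers={"Authorization": "Bearer sk-1234"},  # placeholder virtual key
    json={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "hi"}],
    },
)

# header names are parameterized by the model the key is limited on
print(resp.headers.get("x-litellm-key-remaining-requests-gpt-3.5-turbo"))
print(resp.headers.get("x-litellm-key-remaining-tokens-gpt-3.5-turbo"))
```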
Ishaan Jaff
ee0f772b5c feat return remaining tokens for model for api key 2024-08-17 12:35:10 -07:00
Krrish Dholakia
bc0023a409 feat(google_ai_studio_endpoints.py): support pass-through endpoint for all google ai studio requests
New Feature
2024-08-17 10:46:59 -07:00
Ishaan Jaff
5985c7e933 feat - use common helper for getting model group 2024-08-17 10:46:04 -07:00