Commit graph

3368 commits

Author SHA1 Message Date
Ishaan Jaff
c6ca835046 docs files api
2024-12-24 20:46:43 -08:00
Ishaan Jaff
47e12802df
(feat) /batches Add support for using /batches endpoints in OAI format (#7402)
* run azure testing on ci/cd

* update docs on azure batches endpoints

* add input azure.jsonl

* refactor - use separate file for batches endpoints

* fixes for passing custom llm provider to /batch endpoints

* pass custom llm provider to files endpoints

* update azure batches doc

* add info for azure batches api

* update batches endpoints

* use simple helper for raising proxy exception

* update config.yml

* fix imports

* update tests

* use existing settings

* update env var used

* update configs

* update config.yml

* update ft testing
2024-12-24 16:58:05 -08:00
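A hedged sketch of what the PR above enables: driving Azure batch jobs through the proxy's OpenAI-format /files and /batches endpoints. The base_url, api_key, and the `custom_llm_provider` extra_body hint are assumptions drawn from the bullets above, not a confirmed API surface.

```python
from openai import OpenAI

# Point the OpenAI SDK at a LiteLLM proxy (URL/key are placeholders).
client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

# Upload the batch input file, hinting the Azure provider (assumed param).
batch_file = client.files.create(
    file=open("azure.jsonl", "rb"),
    purpose="batch",
    extra_body={"custom_llm_provider": "azure"},
)

# Create the batch job in standard OpenAI format.
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
    extra_body={"custom_llm_provider": "azure"},
)
print(batch.id, batch.status)
```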
Krish Dholakia
c3edfc2c92
LiteLLM Minor Fixes & Improvements (12/23/2024) - p3 (#7394)
* build(model_prices_and_context_window.json): add gemini-1.5-flash context caching

* fix(context_caching/transformation.py): just use last identified cache point

Fixes https://github.com/BerriAI/litellm/issues/6738

* fix(context_caching/transformation.py): pick first contiguous block - handles system message error from google

Fixes https://github.com/BerriAI/litellm/issues/6738

* fix(vertex_ai/gemini/): track context caching tokens

* refactor(gemini/): place transformation.py inside `chat/` folder

make it easy for user to know we support the equivalent endpoint

* fix: fix import

* refactor(vertex_ai/): move vertex_ai cost calc inside vertex_ai/ folder

make it easier to see cost calculation logic

* fix: fix linting errors

* fix: fix circular import

* feat(gemini/cost_calculator.py): support gemini context caching cost calculation

generalizes anthropic's cost calculation function and uses it across anthropic + gemini

* build(model_prices_and_context_window.json): add cost tracking for gemini-1.5-flash-002 w/ context caching

Closes https://github.com/BerriAI/litellm/issues/6891

* docs(gemini.md): add gemini context caching architecture diagram

make it easier for user to understand how context caching works

* docs(gemini.md): link to relevant gemini context caching code

* docs(gemini/context_caching): add readme in github, make it easy for dev to know context caching is supported + where to go for code

* fix(llm_cost_calc/utils.py): handle gemini 128k token diff cost calc scenario

* fix(deepseek/cost_calculator.py): support deepseek context caching cost calculation

* test: fix test
2024-12-23 22:02:52 -08:00
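A minimal sketch of the context-caching flow these commits cost-track, assuming Anthropic-style `cache_control` content blocks also mark Gemini cache points (the exact message shape is an assumption):

```python
import litellm

# A large, repeated context block marked as cacheable. Per the fix above,
# the first contiguous cached block is used as the cache point.
long_context = "<large shared document text>"  # placeholder

response = litellm.completion(
    model="gemini/gemini-1.5-flash-002",
    messages=[
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": long_context,
                    "cache_control": {"type": "ephemeral"},  # assumed cache marker
                }
            ],
        },
        {"role": "user", "content": "Summarize the key terms."},
    ],
)
# Cached-token usage should now feed the gemini cost calculator.
print(response.usage)
```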
Ishaan Jaff
442d309bcd update release notes 2024-12-23 21:48:33 -08:00
Ishaan Jaff
f4a2143b82 update release notes 2024-12-23 21:43:47 -08:00
Ishaan Jaff
1f466ec9cf release notes 2024-12-23 21:38:56 -08:00
Ishaan Jaff
957fbc6dfb docs batches 2024-12-23 21:24:06 -08:00
Ishaan Jaff
43077a88d5 docs add files to supported endpoints 2024-12-23 20:51:34 -08:00
Krish Dholakia
48316520f4
LiteLLM Minor Fixes & Improvements (12/23/2024) - P2 (#7386)
* fix(main.py): support 'mock_timeout=true' param

allows mock requests on proxy to have a time delay, for testing

* fix(main.py): ensure mock timeouts raise litellm.Timeout error

triggers retry/fallbacks

* fix: fix fallback + mock timeout testing

* fix(router.py): always return remaining tpm/rpm limits, if limits are known

allows for rate limit headers to be guaranteed

* docs(timeout.md): add docs on mock timeout = true

* fix(main.py): fix linting errors

* test: fix test
2024-12-23 17:41:27 -08:00
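A rough sketch of the `mock_timeout` behavior described above: since it raises `litellm.Timeout`, retries/fallbacks fire exactly as for a real timeout. The interplay of `mock_timeout` and `timeout` here is an assumption from the commit bullets.

```python
import litellm

try:
    litellm.completion(
        model="gpt-4o",
        messages=[{"role": "user", "content": "hi"}],
        mock_timeout=True,  # simulate a hung request (no real API call)
        timeout=1,          # assumed: delay before the mock timeout fires
    )
except litellm.Timeout:
    print("mock timeout raised - retries/fallbacks would trigger here")
```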
Krish Dholakia
db59e08958
Litellm dev 12 23 2024 p1 (#7383)
* feat(guardrails_endpoint.py): new `/guardrails/list` endpoint

Allow users to view what the available guardrails are

* docs: document new `/guardrails/list` endpoint

* docs(enterprise.md): update docs

* fix(openai/transcription/handler.py): support cost tracking on vtt + srt formats

* fix(openai/transcriptions/handler.py): default to 'verbose_json' response format if 'text' or 'json' response_format received. ensures 'duration' param is received for all audio transcription requests

* fix: fix linting errors

* fix: remove unused import
2024-12-23 16:33:31 -08:00
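Assuming the new endpoint sits on the proxy at `/guardrails/list` (URL and key below are placeholders), listing the configured guardrails might look like:

```python
import requests

resp = requests.get(
    "http://localhost:4000/guardrails/list",
    headers={"Authorization": "Bearer sk-1234"},
)
print(resp.json())  # expected: the guardrails configured on the proxy
```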
Krish Dholakia
3671829e39
Complete 'requests' library removal (#7350)
* refactor: initial commit moving watsonx_text to base_llm_http_handler + clarifying new provider directory structure

* refactor(watsonx/completion/handler.py): move to using base llm http handler

removes 'requests' library usage

* fix(watsonx_text/transformation.py): fix result transformation

migrates to transformation.py, for usage with base llm http handler

* fix(streaming_handler.py): migrate watsonx streaming to transformation.py

ensures streaming works with base llm http handler

* fix(streaming_handler.py): fix streaming linting errors and remove watsonx conditional logic

* fix(watsonx/): fix chat route after the completion route refactor

* refactor(watsonx/embed): refactor watsonx to use base llm http handler for embedding calls as well

* refactor(base.py): remove requests library usage from litellm

* build(pyproject.toml): remove requests library usage

* fix: fix linting errors

* fix: fix linting errors

* fix(types/utils.py): fix validation errors for modelresponsestream

* fix(replicate/handler.py): fix linting errors

* fix(litellm_logging.py): handle modelresponsestream object

* fix(streaming_handler.py): fix modelresponsestream args

* fix: remove unused imports

* test: fix test

* fix: fix test

* test: fix test

* test: fix tests

* test: fix test

* test: fix patch target

* test: fix test
2024-12-22 07:21:25 -08:00
Ishaan Jaff
8b1ea40e7b add img to release notes
2024-12-21 21:24:16 -08:00
Ishaan Jaff
bb801ad3d9 update release notes 2024-12-21 21:20:13 -08:00
Krish Dholakia
1aed988c11
Litellm docs update (#7365)
* docs: fix release notes url + fix url

* docs(index.md): add link to github releases

* docs(index.md): add linkedin social url to release notes
2024-12-21 21:09:50 -08:00
Ishaan Jaff
31f5dcf8e5 docs 2024-12-21 20:52:39 -08:00
Ishaan Jaff
291229ee46 docs add 1.55.8 changelog 2024-12-21 20:51:39 -08:00
Ishaan Jaff
5a8f67c171 release notes v1.55.8 2024-12-21 20:31:54 -08:00
Ishaan Jaff
a3e732de39
(chore) - enforce model budgets on virtual keys as enterprise feature (#7353)
* docs - enforce model budget as enterprise feature

* docs link to correct place
2024-12-21 14:18:53 -08:00
Ishaan Jaff
6107f9f3f3
[Bug fix]: Triton /infer handler incompatible with batch responses (#7337)
* migrate triton to base llm http handler

* clean up triton handler.py

* use transform functions for triton

* add TritonConfig

* get openai params for triton

* use triton embedding config

* test_completion_triton_generate_api

* test_completion_triton_infer_api

* fix TritonConfig doc string

* use TritonResponseIterator

* fix triton embeddings

* docs triton chat usage
2024-12-20 20:59:40 -08:00
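A hedged sketch of the chat usage this PR documents, assuming the `triton/` prefix plus an `api_base` pointing at the Triton server's generate endpoint (model name and URL are placeholders):

```python
import litellm

response = litellm.completion(
    model="triton/llama-3-8b",  # hypothetical Triton model name
    api_base="http://localhost:8000/v2/models/llama-3-8b/generate",
    messages=[{"role": "user", "content": "hello"}],
)
print(response.choices[0].message.content)
```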
Krish Dholakia
70a9ea99f2
Control fallback prompts client-side (#7334)
* feat(router.py): support passing model-specific messages in fallbacks

* docs(routing.md): separate router timeouts into separate doc

allow for 1 fallbacks doc (across proxy/router)

* docs(routing.md): cleanup router docs

* docs(reliability.md): cleanup docs

* docs(reliability.md): cleaned up fallback doc

just have 1 doc across sdk/proxy

simplifies docs

* docs(reliability.md): add setting model-specific fallback prompts

* fix: fix linting errors

* test: skip test causing openai rate limit errors

* test: fix test

* test: run vertex test first to catch error
2024-12-20 19:09:53 -08:00
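A sketch of the model-specific fallback prompts feature, assuming each fallback entry can carry its own `messages` as the first bullet suggests (deployment names and the exact dict shape are assumptions):

```python
from litellm import Router

router = Router(
    model_list=[
        {"model_name": "gpt-3.5-turbo", "litellm_params": {"model": "gpt-3.5-turbo"}},
        {"model_name": "claude-3-haiku", "litellm_params": {"model": "claude-3-haiku-20240307"}},
    ]
)

response = router.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "long prompt tuned for gpt-3.5"}],
    # On fallback, use a prompt tuned for the fallback model instead.
    fallbacks=[
        {
            "model": "claude-3-haiku",
            "messages": [{"role": "user", "content": "shorter prompt tuned for claude"}],
        }
    ],
)
```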
jravi-fireworks
f8cf11f6d5
Fix LiteLLM documentation (#7333)
Co-authored-by: Jetashree Ravi <jetashreeravi@Jetashrees-MBP.attlocal.net>
2024-12-20 15:04:23 -08:00
Krish Dholakia
27a4d08604
Litellm dev 2024 12 19 p3 (#7322)
* fix(utils.py): remove unsupported optional params (if drop_params=True) before passing into map openai params

Fixes https://github.com/BerriAI/litellm/issues/7242

* test: new test for langfuse prompt management hook

Addresses https://github.com/BerriAI/litellm/issues/3893#issuecomment-2549080296

* feat(main.py): add 'get_chat_completion_prompt' customlogger hook

allows for langfuse prompt management

Addresses https://github.com/BerriAI/litellm/issues/3893#issuecomment-2549080296

* feat(langfuse_prompt_management.py): working e2e langfuse prompt management

works with `langfuse/` route

* feat(main.py): initial tracing for dynamic langfuse params

allows admin to specify langfuse keys by model in model_list

* feat(main.py): support passing langfuse credentials dynamically

* fix(langfuse_prompt_management.py): create langfuse client based on dynamic callback params

allows dynamic langfuse params to work

* fix: fix linting errors

* docs(prompt_management.md): refactor docs for sdk + proxy prompt management tutorial

* docs(prompt_management.md): cleanup doc

* docs: cleanup topnav

* docs(prompt_management.md): update docs to be easier to use

* fix: remove unused imports

* docs(prompt_management.md): add architectural overview doc

* fix(litellm_logging.py): fix dynamic param passing

* fix(langfuse_prompt_management.py): fix linting errors

* fix: fix linting errors

* fix: use typing_extensions for typealias to ensure python3.8 compatibility

* test: use stream_options in test to account for tiktoken diff

* fix: improve import error message, and check run test earlier
2024-12-20 13:30:16 -08:00
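A sketch of the end-to-end flow above, assuming the `langfuse/` model prefix plus a `prompt_id` param resolve the managed prompt from Langfuse (prompt id and variable names are placeholders):

```python
import os
import litellm

os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-..."
os.environ["LANGFUSE_SECRET_KEY"] = "sk-..."

response = litellm.completion(
    model="langfuse/gpt-3.5-turbo",        # route via langfuse prompt management
    prompt_id="my-prompt",                  # hypothetical Langfuse prompt id
    prompt_variables={"topic": "routing"},  # filled into the managed prompt
    messages=[{"role": "user", "content": "hi"}],  # standard messages param
)
```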
Krrish Dholakia
834067c570 docs: refactor admin ui docs 2024-12-20 10:20:07 -08:00
Ishaan Jaff
35076212ef docs infinity rerank api docs 2024-12-19 18:51:55 -08:00
Ishaan Jaff
3a6dba8853 docs base rerank config 2024-12-19 18:33:32 -08:00
Ishaan Jaff
236561cdb8 docs add ref prs 2024-12-19 18:31:38 -08:00
Ishaan Jaff
617ac63d14
(feat) add infinity rerank models (#7321)
* Support Infinity Reranker (custom reranking models) (#7247)

* Support Infinity Reranker

* Clean code

* Included transformation.py

* Clean code

* Added Infinity reranker test

* Clean code

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>

* transform_rerank_response

* update handler.py

* infinity rerank updates

* ci/cd run again

* add infinity unit tests

* docs add instruction on how to add a new provider for rerank

---------

Co-authored-by: Hao Shan <53949959+haoshan98@users.noreply.github.com>
2024-12-19 18:30:28 -08:00
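A hedged usage sketch for the new provider, assuming the `infinity/` prefix with an `api_base` pointing at a local Infinity server (model id and URL are placeholders):

```python
import litellm

response = litellm.rerank(
    model="infinity/BAAI/bge-reranker-base",  # hypothetical reranker id
    api_base="http://localhost:7997",         # local Infinity server
    query="What is the capital of France?",
    documents=["Paris is the capital of France.", "Berlin is in Germany."],
)
print(response.results)
```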
Ishaan Jaff
6261ec3599
(feat proxy) v2 - model max budgets (#7302)
* clean up unused code

* add _PROXY_VirtualKeyModelMaxBudgetLimiter

* adjust type imports

* working _PROXY_VirtualKeyModelMaxBudgetLimiter

* fix user_api_key_model_max_budget

* fix user_api_key_model_max_budget

* update naming

* update naming

* fix changes to RouterBudgetLimiting

* test_call_with_key_over_model_budget

* test_call_with_key_over_model_budget

* handle _get_request_model_budget_config

* e2e test for test_call_with_key_over_model_budget

* clean up test

* run ci/cd again

* add validate_model_max_budget

* docs fix

* update doc

* add e2e testing for _PROXY_VirtualKeyModelMaxBudgetLimiter

* test_unit_test_max_model_budget_limiter.py
2024-12-18 19:42:46 -08:00
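A sketch of setting a per-model budget on a virtual key, assuming a `model_max_budget` field on `/key/generate`; the inner `budget_limit`/`time_period` field names are assumptions inferred from the bullets above:

```python
import requests

resp = requests.post(
    "http://localhost:4000/key/generate",        # proxy admin endpoint
    headers={"Authorization": "Bearer sk-1234"},  # master key (placeholder)
    json={
        "model_max_budget": {
            # hypothetical shape: per-model budget + reset window
            "gpt-4o": {"budget_limit": 0.0001, "time_period": "1d"},
        }
    },
)
print(resp.json()["key"])
```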
Krish Dholakia
5253f639cd
fix(health.md): add rerank model health check information (#7295)
* fix(health.md): add rerank model health check information

* build(model_prices_and_context_window.json): add gemini 2.0 for google ai studio - pricing + commercial rate limits

* build(model_prices_and_context_window.json): add gemini-2.0 supports audio output = true

* docs(team_model_add.md): clarify allowing teams to add models is an enterprise feature

* fix(o1_transformation.py): add support for 'n', 'response_format' and 'stop' params for o1 and 'stream_options' param for o1-mini

* build(model_prices_and_context_window.json): add 'supports_system_message' to supporting openai models

needed as o1-preview and o1-mini models don't support 'system message'

* fix(o1_transformation.py): translate system message based on if o1 model supports it

* fix(o1_transformation.py): return 'stream' param support if o1-mini/o1-preview

o1 currently doesn't support streaming, but the other model versions do

Fixes https://github.com/BerriAI/litellm/issues/7292

* fix(o1_transformation.py): return tool calling/response_format in supported params if model map says so

Fixes https://github.com/BerriAI/litellm/issues/7292

* fix: fix linting errors

* fix: update '_transform_messages'

* fix(o1_transformation.py): fix provider passed for supported param checks

* test(base_llm_unit_tests.py): skip test if api takes >5s to respond

* fix(utils.py): return false in 'supports_factory' if can't find value

* fix(o1_transformation.py): always return stream + stream_options as supported params + handle stream options being passed in for azure o1

* feat(openai.py): support stream faking natively in openai handler

Allows o1 calls to be faked for just the "o1" model, allows native streaming for o1-mini, o1-preview

 Fixes https://github.com/BerriAI/litellm/issues/7292

* fix(openai.py): use inference param instead of original optional param
2024-12-18 19:18:10 -08:00
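A sketch of the streaming behavior described above: `stream=True` on "o1" is faked by the handler, while o1-mini/o1-preview stream natively. The calling code is an assumption; the model names come from the PR.

```python
import litellm

# For "o1", litellm fakes the stream (one response, chunked client-side);
# o1-mini / o1-preview stream natively.
stream = litellm.completion(
    model="o1",
    messages=[{"role": "user", "content": "plan a 3-step experiment"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
```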
Krish Dholakia
6a45ee1ef7
fix(hosted_vllm/transformation.py): return fake api key, if none given (#7301)
* fix(hosted_vllm/transformation.py): return fake api key, if none given. Prevents httpx error

Fixes https://github.com/BerriAI/litellm/issues/7291

* test: fix test

* fix(main.py): add hosted_vllm/ support for embeddings endpoint

Closes https://github.com/BerriAI/litellm/issues/7290

* docs(vllm.md): add docs on vllm embeddings usage

* fix(__init__.py): fix sambanova model test

* fix(base_llm_unit_tests.py): skip pydantic obj test if model takes >5s to respond
2024-12-18 18:41:53 -08:00
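A sketch of the new embeddings path, assuming `hosted_vllm/` routes embeddings to an OpenAI-compatible vLLM server (model and URL are placeholders; per the first fix, a fake api key is substituted if none is given):

```python
import litellm

response = litellm.embedding(
    model="hosted_vllm/BAAI/bge-small-en",  # hypothetical vLLM-served model
    api_base="http://localhost:8000/v1",    # local vLLM server
    input=["hello world"],
    # api_key can be omitted - a fake key now prevents the httpx error
)
print(len(response.data[0]["embedding"]))
```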
Ishaan Jaff
246e3bafc8
(feat - proxy) Add status_code to litellm_proxy_total_requests_metric_total (#7293)
* fix _select_model_name_for_cost_calc docstring

* add STATUS_CODE  to prometheus

* test prometheus unit tests

* test_prometheus_unit_tests.py

* update Proxy Level Tracking Metrics docs

* fix test_proxy_failure_metrics

* fix test_proxy_failure_metrics
2024-12-18 15:55:02 -08:00
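Hedged illustration: after this change, the proxy's Prometheus scrape output should carry a `status_code` label on the total-requests metric (the endpoint path and exact label set are assumptions):

```python
import requests

metrics_text = requests.get("http://localhost:4000/metrics").text
# Expect lines like (label set is an assumption):
# litellm_proxy_total_requests_metric_total{status_code="200",...} 42.0
for line in metrics_text.splitlines():
    if line.startswith("litellm_proxy_total_requests_metric_total"):
        print(line)
```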
Krish Dholakia
2f08341a08
Litellm dev readd prompt caching (#7299)
* fix(router.py): re-add saving model id on prompt caching valid successful deployment

* fix(router.py): introduce optional pre_call_checks

isolate prompt caching logic in a separate file

* fix(prompt_caching_deployment_check.py): fix import

* fix(router.py): new 'async_filter_deployments' event hook

allows custom logger to filter deployments returned to routing strategy

* feat(prompt_caching_deployment_check.py): initial working commit of prompt caching based routing

* fix(cooldown_callbacks.py): fix linting error

* fix(budget_limiter.py): move budget logger to async_filter_deployment hook

* test: add unit test

* test(test_router_helper_utils.py): add unit testing

* fix(budget_limiter.py): fix linting errors

* docs(config_settings.md): add 'optional_pre_call_checks' to router_settings param docs
2024-12-18 15:13:49 -08:00
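A sketch of wiring the new hook, assuming `optional_pre_call_checks` is accepted directly by `Router` the way the `router_settings` docs bullet suggests:

```python
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "claude-3-5-sonnet",
            "litellm_params": {"model": "anthropic/claude-3-5-sonnet-20240620"},
        }
    ],
    # Route follow-up requests with a prompt-caching-valid prompt back to
    # the deployment that already holds the cache.
    optional_pre_call_checks=["prompt_caching"],
)
```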
Ishaan Jaff
523beedb4c update docs 2024-12-18 11:16:30 -08:00
Ishaan Jaff
9c30387e66 tag budgets fixes 2024-12-18 10:28:37 -08:00
Ishaan Jaff
f3c546b79e
(feat) proxy Azure Blob Storage - Add support for AZURE_STORAGE_ACCOUNT_KEY Auth (#7280)
* add upload_to_azure_data_lake_with_azure_account_key

* async_upload_payload_to_azure_blob_storage

* docs add AZURE_STORAGE_ACCOUNT_KEY

* add azure-storage-file-datalake
2024-12-17 17:35:45 -08:00
Krish Dholakia
cd5bdfcb7a
docs(input.md): document 'extra_headers' param support (#7268)
* docs(input.md): document 'extra_headers' param support

* fix: #7239 to move Nova topK parameter to `additionalModelRequestFields` (#7240)

Co-authored-by: Ryan Hoium <rhoium>

---------

Co-authored-by: ryanh-ai <3118399+ryanh-ai@users.noreply.github.com>
2024-12-17 07:19:14 -08:00
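The documented `extra_headers` param in a minimal sketch (header name and value are placeholders):

```python
import litellm

response = litellm.completion(
    model="claude-3-5-sonnet-20240620",
    messages=[{"role": "user", "content": "hi"}],
    # Forwarded on the outbound provider request.
    extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},
)
```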
Ishaan Jaff
f3b13a9af3
(feat) Add Bedrock knowledge base pass through endpoints (#7267)
* bugfix: Proxy Routing for Bedrock Knowledgebase URLs are incorrect (#7097)

* Fixing routing bug where bedrock knowledgebase urls were being generated incorrectly

* Preparing for PR

* Preparing for PR

* Preparing for PR

---------

Co-authored-by: Luke Birk <lb0737@att.com>

* fix _is_bedrock_agent_runtime_route

* docs - Query Knowledge Base

* test_is_bedrock_agent_runtime_route

* fix bedrock_proxy_route

---------

Co-authored-by: LBirk <2731718+LBirk@users.noreply.github.com>
Co-authored-by: Luke Birk <lb0737@att.com>
2024-12-16 22:19:34 -08:00
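A hedged sketch of the pass-through call, assuming the proxy mirrors the Bedrock agent-runtime path under `/bedrock` (the exact path shape is an assumption from the PR description; the request body follows AWS's Retrieve API):

```python
import requests

resp = requests.post(
    # knowledge-base routes now resolve to the agent-runtime endpoint
    "http://localhost:4000/bedrock/knowledgebases/MY_KB_ID/retrieve",
    headers={"Authorization": "Bearer sk-1234"},
    json={"retrievalQuery": {"text": "What is our refund policy?"}},
)
print(resp.json())
```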
Ishaan Jaff
3c984ed60e
(feat) Add Azure Blob Storage Logging Integration (#7265)
* add path to http handler

* AzureBlobStorageLogger

* test_azure_blob_storage

* use constants for Azure storage

* use helper get_azure_ad_token_from_entrata_id

* azure blob storage support

* get_azure_ad_token_from_azure_storage

* fix import

* azure logging

* docs azure storage

* add docs on azure blobs

* add premium user check

* add azure_storage  as identified logging callback

* async_upload_payload_to_azure_blob_storage

* docs azure storage

* callback_class_str_to_classType
2024-12-16 22:18:22 -08:00
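A sketch of enabling the integration, using the `azure_storage` callback name from the bullets above and account-key auth per #7280; the env var names besides AZURE_STORAGE_ACCOUNT_KEY are assumptions, and note the bullets gate this behind a premium-user check:

```python
import os
import litellm

# Storage target + account-key auth (values are placeholders)
os.environ["AZURE_STORAGE_ACCOUNT_NAME"] = "my-storage-account"
os.environ["AZURE_STORAGE_ACCOUNT_KEY"] = "..."
os.environ["AZURE_STORAGE_FILE_SYSTEM"] = "litellm-logs"  # assumed var name

litellm.callbacks = ["azure_storage"]  # identified logging callback

litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hi"}],
)
```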
Ishaan Jaff
d1124a736c docs add response format on main pages
2024-12-16 08:41:12 -08:00
Ishaan Jaff
5a4b6cd9f5 docs update 2024-12-16 08:06:06 -08:00
Ishaan Jaff
7103198805
(feat) Add Tag-based budgets on litellm router / proxy (#7236)
* add BudgetConfig

* add _get_tags_from_request_kwargs

* test_tag_budgets_e2e_test_expect_to_fail

* add a check for request tags

* fix _async_get_cache_keys_for_router_budget_limiting

* fix test

* fix _sync_in_memory_spend_with_redis

* _async_get_cache_keys_for_router_budget_limiting

* fix _init_tag_budgets

* fix type casting

* docs show error for tag budget limit hit

* fix _get_tags_from_request_kwargs

* fix undo change
2024-12-14 17:28:36 -08:00
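A hedged sketch, assuming a `tag_budget_config` router setting keyed by tag with per-tag limits, and tags passed via request metadata; both shapes are assumptions inferred from the `BudgetConfig` and `_get_tags_from_request_kwargs` bullets above:

```python
from litellm import Router

router = Router(
    model_list=[
        {"model_name": "gpt-4o", "litellm_params": {"model": "gpt-4o"}},
    ],
    # hypothetical shape: budget + reset window per request tag
    tag_budget_config={
        "product:chat": {"max_budget": 10.0, "budget_duration": "1d"},
    },
)

response = router.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hi"}],
    metadata={"tags": ["product:chat"]},  # spend counts against this tag
)
```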
Krish Dholakia
ec36353b41
fix(main.py): fix retries being multiplied when using openai sdk (#7221)
* fix(main.py): fix retries being multiplied when using openai sdk

Closes https://github.com/BerriAI/litellm/pull/7130

* docs(prompt_management.md): add langfuse prompt management doc

* feat(team_endpoints.py): allow teams to add their own models

Enables teams to call their own finetuned models via the proxy

* test: add better enforcement check testing for `/model/new` now that teams can add their own models

* docs(team_model_add.md): tutorial for allowing teams to add their own models

* test: fix test
2024-12-14 11:56:55 -08:00
Ishaan Jaff
163529b40b
(feat - Router / Proxy ) Allow setting budget limits per LLM deployment (#7220)
* fix test_deployment_budget_limits_e2e_test

* refactor async_log_success_event to track spend for provider + deployment

* fix format

* rename class to RouterBudgetLimiting

* rename func

* rename types used for budgets

* add new types for deployment budgets

* add budget limits for deployments

* fix checking budgets set for provider

* update file names

* fix linting error

* _track_provider_remaining_budget_prometheus

* async_filter_deployments

* fix model list passed to router

* update error

* test_deployment_budgets_e2e_test_expect_to_fail

* fix test case

* run deployment budget limits
2024-12-13 19:15:51 -08:00
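A hedged sketch of per-deployment budgets, assuming `max_budget`/`budget_duration` live on each deployment's `litellm_params` (field placement and names are assumptions from the PR bullets):

```python
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-4o",
            "litellm_params": {
                "model": "gpt-4o",
                "max_budget": 5.0,        # assumed: USD budget per deployment
                "budget_duration": "1d",  # assumed: reset window
            },
        }
    ]
)
```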
Krish Dholakia
b150faff90
Litellm dev 12 13 2024 p1 (#7219)
* fix(litellm_logging.py): pass user metadata to langsmith on sdk calls

* fix(litellm_logging.py): pass nested user metadata to logging integration - e.g. langsmith

* fix(exception_mapping_utils.py): catch and clarify watsonx `/text/chat` endpoint not supported error message.

Closes https://github.com/BerriAI/litellm/issues/7213

* fix(watsonx/common_utils.py): accept new 'WATSONX_IAM_URL' env var

allows user to use local watsonx

Fixes https://github.com/BerriAI/litellm/issues/4991

* fix(litellm_logging.py): cleanup unused function

* test: skip bad ibm test
2024-12-13 19:01:28 -08:00
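A sketch of pointing litellm at a local watsonx install via the new env var (all values are placeholders; WATSONX_APIKEY/WATSONX_URL are litellm's standard watsonx vars):

```python
import os
import litellm

os.environ["WATSONX_IAM_URL"] = "https://iam.local.example.com/identity/token"
os.environ["WATSONX_APIKEY"] = "..."
os.environ["WATSONX_URL"] = "https://watsonx.local.example.com"

response = litellm.completion(
    model="watsonx/ibm/granite-13b-chat-v2",
    messages=[{"role": "user", "content": "hi"}],
)
```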
Ishaan Jaff
621c713400
(docs) Document StandardLoggingPayload Spec (#7201)
* add slp spec to docs

* docs slp

* test slp enforcement
2024-12-12 14:00:42 -08:00
Ishaan Jaff
b45777c268
(Feat) DataDog Logger - Add HOSTNAME and POD_NAME to DataDog logs (#7189)
* add unit test for test_datadog_static_methods

* docs dd vars

* test_datadog_payload_environment_variables

* test_datadog_static_methods

* docs env vars

* fix table
2024-12-12 12:06:26 -08:00
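A sketch of what gets attached, assuming the logger reads `HOSTNAME` and `POD_NAME` straight from the environment (values are placeholders):

```python
import os
import litellm

os.environ["DD_API_KEY"] = "..."
os.environ["DD_SITE"] = "us5.datadoghq.com"
os.environ["HOSTNAME"] = "litellm-proxy-1"  # attached to each DataDog log
os.environ["POD_NAME"] = "litellm-pod-abc"  # attached to each DataDog log

litellm.success_callback = ["datadog"]
litellm.completion(model="gpt-4o", messages=[{"role": "user", "content": "hi"}])
```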
Ishaan Jaff
153ab055d6
(feat) add response_time to StandardLoggingPayload - logged on datadog, gcs_bucket, s3_bucket etc (#7199)
* feat - add response_time to slp

* test_get_response_time

* docs slp

* fix test_datadog_logging_http_request
2024-12-12 12:04:43 -08:00
dependabot[bot]
b328d42ebc
build(deps): bump nanoid from 3.3.7 to 3.3.8 in /docs/my-website (#7159)
Bumps [nanoid](https://github.com/ai/nanoid) from 3.3.7 to 3.3.8.
- [Release notes](https://github.com/ai/nanoid/releases)
- [Changelog](https://github.com/ai/nanoid/blob/main/CHANGELOG.md)
- [Commits](https://github.com/ai/nanoid/compare/3.3.7...3.3.8)

---
updated-dependencies:
- dependency-name: nanoid
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-10 23:51:05 -08:00
Krish Dholakia
350cfc36f7
Litellm merge pr (#7161)
* build: merge branch

* test: fix openai naming

* fix(main.py): fix openai renaming

* style: ignore function length for config factory

* fix(sagemaker/): fix routing logic

* fix: fix imports

* fix: fix override
2024-12-10 22:49:26 -08:00
Krish Dholakia
0c0498dd60
Litellm dev 12 07 2024 (#7086)
* fix(main.py): support passing max retries to azure/openai embedding integrations

Fixes https://github.com/BerriAI/litellm/issues/7003

* feat(team_endpoints.py): allow updating team model aliases

Closes https://github.com/BerriAI/litellm/issues/6956

* feat(router.py): allow specifying model id as fallback - skips any cooldown check

Allows a default model to be checked if all models in cooldown

s/o @micahjsmith

* docs(reliability.md): add fallback to specific model to docs

* fix(utils.py): new 'is_prompt_caching_valid_prompt' helper util

Allows user to identify if messages/tools have prompt caching

Related issue: https://github.com/BerriAI/litellm/issues/6784

* feat(router.py): store model id for prompt caching valid prompt

Allows routing to that model id on subsequent requests

* fix(router.py): only cache if prompt is valid prompt caching prompt

prevents storing unnecessary items in cache

* feat(router.py): support routing prompt caching enabled models to previous deployments

Closes https://github.com/BerriAI/litellm/issues/6784

* test: fix linting errors

* feat(databricks/): convert basemodel to dict and exclude none values

allow passing pydantic message to databricks

* fix(utils.py): ensure all chat completion messages are dict

* (feat) Track `custom_llm_provider` in LiteLLMSpendLogs (#7081)

* add custom_llm_provider to SpendLogsPayload

* add custom_llm_provider to SpendLogs

* add custom llm provider to SpendLogs payload

* test_spend_logs_payload

* Add MLflow to the side bar (#7031)

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* (bug fix) SpendLogs update DB catch all possible DB errors for retrying  (#7082)

* catch DB_CONNECTION_ERROR_TYPES

* fix DB retry mechanism for SpendLog updates

* use DB_CONNECTION_ERROR_TYPES in auth checks

* fix exp back off for writing SpendLogs

* use _raise_failed_update_spend_exception to ensure errors print as NON blocking

* test_update_spend_logs_multiple_batches_with_failure

* (Feat) Add StructuredOutputs support for Fireworks.AI (#7085)

* fix model cost map fireworks ai "supports_response_schema": true,

* fix supports_response_schema

* fix map openai params fireworks ai

* test_map_response_format

* test_map_response_format

* added deepinfra/Meta-Llama-3.1-405B-Instruct (#7084)

* bump: version 1.53.9 → 1.54.0

* fix deepinfra

* litellm db fixes LiteLLM_UserTable (#7089)

* ci/cd queue new release

* fix llama-3.3-70b-versatile

* refactor - use consistent file naming convention `AI21/` -> `ai21`  (#7090)

* fix refactor - use consistent file naming convention

* ci/cd run again

* fix naming structure

* fix use consistent naming (#7092)

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com>
Co-authored-by: ali sayyah <ali.sayyah2@gmail.com>
2024-12-08 00:30:33 -08:00
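From the first bullet of the merge above: max retries now pass through to the azure/openai embedding clients. A minimal sketch (deployment details are placeholders):

```python
import litellm

response = litellm.embedding(
    model="azure/text-embedding-ada-002",
    input=["hello world"],
    api_base="https://my-endpoint.openai.azure.com",
    api_key="...",
    api_version="2024-02-01",
    max_retries=3,  # now forwarded to the underlying azure/openai client
)
```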