* feat(internal_user_endpoints.py): return 'total_tokens' in `/user/daily/analytics`
* test(test_internal_user_endpoints.py): add unit test to assert spend metrics and dailyspend metadata always report the same fields
* build(schema.prisma): record success + failure calls to daily user table
allows understanding why model requests might exceed provider requests (e.g. user hit rate limit error)
* fix(internal_user_endpoints.py): report success / failure requests in API
* fix(proxy/utils.py): default to success
status can be missing or none at times for successful requests
* feat(new_usage.tsx): show success/failure calls on UI
* style(new_usage.tsx): ui cleanup
* fix: fix linting error
* fix: fix linting error
* feat(litellm-proxy-extras/): add new migration files
* build(pyproject.toml): add new dev dependencies - for type checking
* build: reformat files to fit black
* ci: reformat to fit black
* ci(test-litellm.yml): make tests run clear
* build(pyproject.toml): add ruff
* fix: fix ruff checks
* build(mypy/): fix mypy linting errors
* fix(hashicorp_secret_manager.py): fix passing cert for tls auth
* build(mypy/): resolve all mypy errors
* test: update test
* fix: fix black formatting
* build(pre-commit-config.yaml): use poetry run black
* fix(proxy_server.py): fix linting error
* fix: fix ruff safe representation error
* feat(spend_management_endpoints.py): expose new endpoint for querying user's usage at 1m+ spend logs
Allows user to view their spend at 1m+ spend logs
* build(schema.prisma): add api_requests to dailyuserspend table
* build(migration.sql): add migration file for new column to daily user spend table
* build(prisma_client.py): add logic for copying over migration folder, if deploy/migrations present in expected location
enables easier testing of prisma migration flow
* build(ui/): initial commit successfully using the dailyuserspend table on the UI
* refactor(internal_user_endpoints.py): refactor `/user/daily/activity` to give breakdowns by provider/model/key
* feat: feature parity (cost page) with existing 'usage' page
* build(ui/): add activity tab to new_usage.tsx
gets to feature parity on 'All Up' page of 'usage.tsx'
* fix(proxy/utils.py): count number of api requests in daily user spend table
allows us to see activity by model on new usage tab
* style(new_usage.tsx): fix y-axis to be in ascending order of date
* fix: fix linting errors
* fix: fix ruff check errors
* fix test_moderations_bad_model
* use async_post_call_failure_hook
* basic logging errors in DB
* show status on ui
* show status on ui
* ui show request / response side by side
* stash fixes
* working, track raw request
* track error info in metadata
* fix showing error / request / response logs
* show traceback on error viewer
* ui with traceback of error
* fix async_post_call_failure_hook
* fix(http_parsing_utils.py): orjson can throw errors on some emoji's in text, default to json.loads
* test_get_error_information
* fix code quality
* rename proxy track cost callback test
* _should_store_errors_in_spend_logs
* feature flag error logs
* Revert "_should_store_errors_in_spend_logs"
This reverts commit 7f345df477.
* Revert "feature flag error logs"
This reverts commit 0e90c022bb.
* test_spend_logs_payload
* fix OTEL log_db_metrics
* fix import json
* fix ui linting error
* test_async_post_call_failure_hook
* test_chat_completion_bad_model_with_spend_logs
---------
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
* use class ResetBudgetJob
* refactor reset budget job
* update reset_budget job
* refactor reset budget job
* fix LiteLLM_UserTable
* refactor reset budget job
* add telemetry for reset budget job
* dd - log service success/failure on DD
* add detailed reset budget reset info on DD
* initialize_scheduled_background_jobs
* refactor reset budget job
* trigger service failure hook when fails to reset a budget for team, key, user
* fix resetBudgetJob
* unit testing for ResetBudgetJob
* test_duration_in_seconds_basic
* testing for triggering service logging
* fix logs on test teams fail
* remove unused imports
* fix import duration in s
* duration_in_seconds
* fix(client_initialization_utils.py): handle custom llm provider set with valid value not from model name
* fix(handle_jwt.py): handle groups not existing in jwt token
if user not in group, this won't exist
* fix(handle_jwt.py): add new `enforce_team_based_model_access` flag to jwt auth
allows proxy admin to enforce user can only call model if team has access
* feat(navbar.tsx): expose new dropdown in navbar - allow org admin to create teams within org context
* fix(navbar.tsx): remove non-functional cogicon
* fix(proxy/utils.py): include user-org memberships in `/user/info` response
return orgs user is a member of and the user role within org
* feat(organization_endpoints.py): allow internal user to query `/organizations/list` and get all orgs they belong to
enables org admin to select org they belong to, to create teams
* fix(navbar.tsx): show change in ui when org switcher clicked
* feat(page.tsx): update user role based on org they're in
allows org admin to create teams in the org context
* feat(teams.tsx): working e2e flow for allowing org admin to add new teams
* style(navbar.tsx): clarify switching orgs on UI is in BETA
* fix(organization_endpoints.py): handle getting but not setting members
* test: fix test
* fix(client_initialization_utils.py): revert custom llm provider handling fix - causing unintended issues
* docs(token_auth.md): cleanup docs
* track org id in spend logs
* read org id from team table
* show user_api_key_org_id in spend logs
* test_spend_logs_payload
* test_spend_logs_with_org_id
* test_spend_logs_with_org_id
* fix(http_handler.py): support passing ssl verify dynamically and using the correct httpx client based on passed ssl verify param
Fixes https://github.com/BerriAI/litellm/issues/6499
* feat(llm_http_handler.py): support passing `ssl_verify=False` dynamically in call args
Closes https://github.com/BerriAI/litellm/issues/6499
* fix(proxy/utils.py): prevent bad logs from breaking all cost tracking + reset list regardless of success/failure
prevents malformed logs from causing all spend tracking to break since they're constantly retried
* test(test_proxy_utils.py): add test to ensure bad log is dropped
* test(test_proxy_utils.py): ensure in-memory spend logs reset after bad log error
* test(test_user_api_key_auth.py): add unit test to ensure end user id as str works
* fix(auth_utils.py): ensure extracted end user id is always a str
prevents db cost tracking errors
* test(test_auth_utils.py): ensure get end user id from request body always returns a string
* test: update tests
* test: skip bedrock test- behaviour now supported
* test: fix testing
* refactor(spend_tracking_utils.py): reduce size of get_logging_payload
* test: fix test
* bump: version 1.59.4 → 1.59.5
* Revert "bump: version 1.59.4 → 1.59.5"
This reverts commit 1182b46b2e.
* fix(utils.py): fix spend logs retry logic
* fix(spend_tracking_utils.py): fix get tags
* fix(spend_tracking_utils.py): fix end user id spend tracking on pass-through endpoints
* fix(bedrock/converse_handler.py): fix bedrock region name on async calls
* fix(utils.py): fix split model handling
Fixes bedrock cost calculation when region name is given
* feat(_health_endpoints.py): support health checking datadog integration
Closes https://github.com/BerriAI/litellm/issues/7921
* feat(proxy/utils.py): get associated litellm budget from db in combined_view for key
allows user to create rate limit tiers and associate those to keys
* feat(proxy/_types.py): update the value of key-level tpm/rpm/model max budget metrics with the associated budget table values if set
allows rate limit tiers to be easily applied to keys
* docs(rate_limit_tiers.md): add doc on setting rate limit / budget tiers
make feature discoverable
* feat(key_management_endpoints.py): return litellm_budget_table value in key generate
make it easy for user to know associated budget on key creation
* fix(key_management_endpoints.py): document 'budget_id' param in `/key/generate`
* docs(key_management_endpoints.py): document budget_id usage
* refactor(budget_management_endpoints.py): refactor budget endpoints into separate file - makes it easier to run documentation testing against it
* docs(test_api_docs.py): add budget endpoints to ci/cd doc test + add missing param info to docs
* fix(customer_endpoints.py): use new pydantic obj name
* docs(user_management_heirarchy.md): add simple doc explaining teams/keys/org/users on litellm
* Litellm dev 12 26 2024 p2 (#7432)
* (Feat) Add logging for `POST v1/fine_tuning/jobs` (#7426)
* init commit ft jobs logging
* add ft logging
* add logging for FineTuningJob
* simple FT Job create test
* (docs) - show all supported Azure OpenAI endpoints in overview (#7428)
* azure batches
* update doc
* docs azure endpoints
* docs endpoints on azure
* docs azure batches api
* docs azure batches api
* fix(key_management_endpoints.py): fix key update to actually work
* test(test_key_management.py): add e2e test asserting ui key update call works
* fix: proxy/_types - fix linting erros
* test: update test
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
* fix: test
* fix(parallel_request_limiter.py): enforce tpm/rpm limits on key from tiers
* fix: fix linting errors
* test: fix test
* fix: remove unused import
* test: update test
* docs(customer_endpoints.py): document new model_max_budget param
* test: specify unique key alias
* docs(budget_management_endpoints.py): document new model_max_budget param
* test: fix test
* test: fix tests
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
* fix(main.py): fix retries being multiplied when using openai sdk
Closes https://github.com/BerriAI/litellm/pull/7130
* docs(prompt_management.md): add langfuse prompt management doc
* feat(team_endpoints.py): allow teams to add their own models
Enables teams to call their own finetuned models via the proxy
* test: add better enforcement check testing for `/model/new` now that teams can add their own models
* docs(team_model_add.md): tutorial for allowing teams to add their own models
* test: fix test
* fix(main.py): support passing max retries to azure/openai embedding integrations
Fixes https://github.com/BerriAI/litellm/issues/7003
* feat(team_endpoints.py): allow updating team model aliases
Closes https://github.com/BerriAI/litellm/issues/6956
* feat(router.py): allow specifying model id as fallback - skips any cooldown check
Allows a default model to be checked if all models in cooldown
s/o @micahjsmith
* docs(reliability.md): add fallback to specific model to docs
* fix(utils.py): new 'is_prompt_caching_valid_prompt' helper util
Allows user to identify if messages/tools have prompt caching
Related issue: https://github.com/BerriAI/litellm/issues/6784
* feat(router.py): store model id for prompt caching valid prompt
Allows routing to that model id on subsequent requests
* fix(router.py): only cache if prompt is valid prompt caching prompt
prevents storing unnecessary items in cache
* feat(router.py): support routing prompt caching enabled models to previous deployments
Closes https://github.com/BerriAI/litellm/issues/6784
* test: fix linting errors
* feat(databricks/): convert basemodel to dict and exclude none values
allow passing pydantic message to databricks
* fix(utils.py): ensure all chat completion messages are dict
* (feat) Track `custom_llm_provider` in LiteLLMSpendLogs (#7081)
* add custom_llm_provider to SpendLogsPayload
* add custom_llm_provider to SpendLogs
* add custom llm provider to SpendLogs payload
* test_spend_logs_payload
* Add MLflow to the side bar (#7031)
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
* (bug fix) SpendLogs update DB catch all possible DB errors for retrying (#7082)
* catch DB_CONNECTION_ERROR_TYPES
* fix DB retry mechanism for SpendLog updates
* use DB_CONNECTION_ERROR_TYPES in auth checks
* fix exp back off for writing SpendLogs
* use _raise_failed_update_spend_exception to ensure errors print as NON blocking
* test_update_spend_logs_multiple_batches_with_failure
* (Feat) Add StructuredOutputs support for Fireworks.AI (#7085)
* fix model cost map fireworks ai "supports_response_schema": true,
* fix supports_response_schema
* fix map openai params fireworks ai
* test_map_response_format
* test_map_response_format
* added deepinfra/Meta-Llama-3.1-405B-Instruct (#7084)
* bump: version 1.53.9 → 1.54.0
* fix deepinfra
* litellm db fixes LiteLLM_UserTable (#7089)
* ci/cd queue new release
* fix llama-3.3-70b-versatile
* refactor - use consistent file naming convention `AI21/` -> `ai21` (#7090)
* fix refactor - use consistent file naming convention
* ci/cd run again
* fix naming structure
* fix use consistent naming (#7092)
---------
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com>
Co-authored-by: ali sayyah <ali.sayyah2@gmail.com>
* catch DB_CONNECTION_ERROR_TYPES
* fix DB retry mechanism for SpendLog updates
* use DB_CONNECTION_ERROR_TYPES in auth checks
* fix exp back off for writing SpendLogs
* use _raise_failed_update_spend_exception to ensure errors print as NON blocking
* test_update_spend_logs_multiple_batches_with_failure
* feat(langfuse/): support langfuse prompt management
Initial working commit for langfuse prompt management support
Closes https://github.com/BerriAI/litellm/issues/6269
* test: update test
* fix(litellm_logging.py): suppress linting error
* fix get_standard_logging_object_payload
* fix async_post_call_failure_hook
* fix post_call_failure_hook
* fix change
* fix _is_proxy_only_error
* fix async_post_call_failure_hook
* fix getting request body
* remove redundant code
* use a well named original function name for auth errors
* fix logging auth fails on DD
* fix using request body
* use helper for _handle_logging_proxy_only_error
* feat(pass_through_endpoints/): support logging anthropic/gemini pass through calls to langfuse/s3/etc.
* fix(utils.py): allow disabling end user cost tracking with new param
Allows proxy admin to disable cost tracking for end user - keeps prometheus metrics small
* docs(configs.md): add disable_end_user_cost_tracking reference to docs
* feat(key_management_endpoints.py): add support for restricting access to `/key/generate` by team/proxy level role
Enables admin to restrict key creation, and assign team admins to handle distributing keys
* test(test_key_management.py): add unit testing for personal / team key restriction checks
* docs: add docs on restricting key creation
* docs(finetuned_models.md): add new guide on calling finetuned models
* docs(input.md): cleanup anthropic supported params
Closes https://github.com/BerriAI/litellm/issues/6856
* test(test_embedding.py): add test for passing extra headers via embedding
* feat(cohere/embed): pass client to async embedding
* feat(rerank.py): add `/v1/rerank` if missing for cohere base url
Closes https://github.com/BerriAI/litellm/issues/6844
* fix(main.py): pass extra_headers param to openai
Fixes https://github.com/BerriAI/litellm/issues/6836
* fix(litellm_logging.py): don't disable global callbacks when dynamic callbacks are set
Fixes issue where global callbacks - e.g. prometheus were overriden when langfuse was set dynamically
* fix(handler.py): fix linting error
* fix: fix typing
* build: add conftest to proxy_admin_ui_tests/
* test: fix test
* fix: fix linting errors
* test: fix test
* fix: fix pass through testing
* feat(customer_endpoints.py): support passing budget duration via `/customer/new` endpoint
Closes https://github.com/BerriAI/litellm/issues/5651
* docs: add missing params to swagger + api documentation test
* docs: add documentation for all key endpoints
documents all params on swagger
* docs(internal_user_endpoints.py): document all /user/new params
Ensures all params are documented
* docs(team_endpoints.py): add missing documentation for team endpoints
Ensures 100% param documentation on swagger
* docs(organization_endpoints.py): document all org params
Adds documentation for all params in org endpoint
* docs(customer_endpoints.py): add coverage for all params on /customer endpoints
ensures all /customer/* params are documented
* ci(config.yml): add endpoint doc testing to ci/cd
* fix: fix internal_user_endpoints.py
* fix(internal_user_endpoints.py): support 'duration' param
* fix(partner_models/main.py): fix anthropic re-raise exception on vertex
* fix: fix pydantic obj