Commit graph

450 commits

Author SHA1 Message Date
Krish Dholakia
20495cb415 test(base_llm_unit_tests.py): add test to ensure drop params is respe… (#8224)
* test(base_llm_unit_tests.py): add test to ensure drop params is respected

* fix(types/prometheus.py): use typing_extensions for python3.8 compatibility

* build: add cherry picked commits
2025-02-03 16:04:44 -08:00
Krish Dholakia
8900b18504 Complete o3 model support (#8183)
* fix(o_series_transformation.py): add 'reasoning_effort' as o series model param

Closes https://github.com/BerriAI/litellm/issues/8182

* fix(main.py): ensure `reasoning_effort` is a mapped openai param

* refactor(azure/): rename o1_[x] files to o_series_[x]

* refactor(base_llm_unit_tests.py): refactor testing for o series reasoning effort

* test(test_azure_o_series.py): have azure o series tests correctly inherit from base o series model tests

* feat(base_utils.py): support translating 'developer' role to 'system' role for non-openai providers

Makes it easy to switch from openai to anthropic

* fix: fix linting errors

* fix(base_llm_unit_tests.py): fix test

* fix(main.py): add missing param
2025-02-02 22:36:37 -08:00
Krish Dholakia
16b5de07af Doc updates + management endpoint fixes (#8138)
* Litellm dev 01 29 2025 p4 (#8107)

* fix(key_management_endpoints.py): always get db team

Fixes https://github.com/BerriAI/litellm/issues/7983

* test(test_key_management.py): add unit test enforcing check_db_only is always true on key generate checks

* test: fix test

* test: skip gemini thinking

* Litellm dev 01 29 2025 p3 (#8106)

* fix(__init__.py): reduces size of __init__.py and reduces scope for errors by using correct param

* refactor(__init__.py): refactor init by cleaning up redundant params

* refactor(__init__.py): move more constants into constants.py

cleanup root

* refactor(__init__.py): more cleanup

* feat(__init__.py): expose new 'disable_hf_tokenizer_download' param

enables hf model usage in offline env

* docs(config_settings.md): document new disable_hf_tokenizer_download param

* fix: fix linting error

* fix: fix unsafe comparison

* test: fix test

* docs(public_teams.md): add doc showing how to expose public teams for users to join

* docs: add beta disclaimer on public teams

* test: update tests
2025-01-30 22:56:41 -08:00
Ishaan Jaff
9d8769fa1c (Feat) pass through vertex - allow using credentials defined on litellm router for vertex pass through (#8100)
* test_add_vertex_pass_through_deployment

* VertexPassThroughRouter

* fix use_in_pass_through

* VertexPassThroughRouter

* fix vertex_credentials

* allow using _initialize_deployment_for_pass_through

* test_add_vertex_pass_through_deployment

* _set_default_vertex_config

* fix verbose_proxy_logger

* fix use_in_pass_through

* fix _get_token_and_url

* test_get_vertex_location_from_url

* test_get_vertex_credentials_none

* run pt unit testing again

* fix add_vertex_credentials

* test_adding_deployments.py

* rename file
2025-01-29 17:54:02 -08:00
Krish Dholakia
e092635838 Bedrock document processing fixes (#8005)
* refactor(factory.py): refactor async bedrock message transformation to use async get request for image url conversion

improve latency of bedrock call

* test(test_bedrock_completion.py): add unit testing to ensure async image url get called for async bedrock call

* refactor(factory.py): refactor bedrock translation to use BedrockImageProcessor

reduces duplicate code

* fix(factory.py): fix bug not allowing pdf's to be processed

* fix(factory.py): fix bedrock converse document understanding with image url

* docs(bedrock.md): clarify all bedrock document types are supported

* refactor: cleanup redundant test + unused imports

* perf: improve perf with reusable clients

* test: fix test
2025-01-28 17:48:32 -08:00
Krish Dholakia
e96788ac0b Litellm dev 01 25 2025 p4 (#8006)
* feat(main.py): use asyncio.sleep for mock_Timeout=true on async request

adds unit testing to ensure proxy does not fail if specific Openai requests hang (e.g. recent o1 outage)

* fix(streaming_handler.py): fix deepseek r1 return reasoning content on streaming

Fixes https://github.com/BerriAI/litellm/issues/7942

* Revert "fix(streaming_handler.py): fix deepseek r1 return reasoning content on streaming"

This reverts commit 7a052a64e3.

* fix(deepseek-r-1): return reasoning_content as a top-level param

ensures compatibility with existing tools that use it

* fix: fix linting error
2025-01-26 08:01:05 -08:00
Ishaan Jaff
d1bc955d97 (Feat) - allow setting default_on guardrails (#7973)
* test_default_on_guardrail

* update debug on custom guardrail

* refactor guardrails init

* guardrail registry

* allow switching guardrails default_on

* fix circle import issue

* fix bedrock applying guardrails where content is a list

* fix unused import

* docs default on guardrail

* docs fix per api key
2025-01-24 10:14:05 -08:00
Krish Dholakia
fe460f19f5 Add datadog health check support + fix bedrock converse cost tracking w/ region name specified (#7958)
* fix(bedrock/converse_handler.py): fix bedrock region name on async calls

* fix(utils.py): fix split model handling

Fixes bedrock cost calculation when region name is given

* feat(_health_endpoints.py): support health checking datadog integration

Closes https://github.com/BerriAI/litellm/issues/7921
2025-01-23 22:17:09 -08:00
Ishaan Jaff
fda318ea30 (Feat) - emit litellm_team_budget_reset_at_metric and litellm_api_key_budget_remaining_hours_metric on prometheus (#7946)
* set litellm_team_budget_reset_at_metric

* add _get_team_info_from_db_lru_cached

* _set_team_budget_metrics

* e2e test_team_budget_metrics

* update doc string

* add _get_remaining_hours_for_budget_reset

* fix team endpoints

* _get_remaining_hours_for_budget_reset

* _set_key_budget_metrics  on startup

* test_key_budget_metrics

* prom fixes for emitting key / team metrics

* fix _set_api_key_budget_metrics_after_api_request

* test_increment_remaining_budget_metrics

* unit test test_increment_remaining_budget_metrics

* test_initialize_remaining_budget_metrics
2025-01-23 18:12:47 -08:00
Ishaan Jaff
a89fbe0e6f add litellm_team_budget_reset_at_metric 2025-01-23 08:54:19 -08:00
Krish Dholakia
b286bab075 Add attempted-retries and timeout values to response headers + more testing (#7926)
* feat(router.py): add retry headers to response

makes it easy to add testing to ensure model-specific retries are respected

* fix(add_retry_headers.py): clarify attempted retries vs. max retries

* test(test_fallbacks.py): add test for checking if max retries set for model is respected

* test(test_fallbacks.py): assert values for attempted retries and max retries are as expected

* fix(utils.py): return timeout in litellm proxy response headers

* test(test_fallbacks.py): add test to assert model specific timeout used on timeout error

* test: add bad model with timeout to proxy

* fix: fix linting error

* fix(router.py): fix get model list from model alias

* test: loosen test restriction - account for other events on proxy
2025-01-22 22:19:44 -08:00
Krish Dholakia
bf1639cb92 Litellm dev 01 22 2025 p4 (#7932)
* feat(main.py): add new 'provider_specific_header' param

allows passing extra header for specific provider

* fix(litellm_pre_call_utils.py): add unit test for pre call utils

* test(test_bedrock_completion.py): skip test now that bedrock supports this
2025-01-22 21:52:07 -08:00
Krish Dholakia
4d89da9c97 Deepseek r1 support + watsonx qa improvements (#7907)
* fix(types/utils.py): support returning 'reasoning_content' for deepseek models

Fixes https://github.com/BerriAI/litellm/issues/7877#issuecomment-2603813218

* fix(convert_dict_to_response.py): return deepseek response in provider_specific_field

allows for separating openai vs. non-openai params in model response

* fix(utils.py): support 'provider_specific_field' in delta chunk as well

allows deepseek reasoning content chunk to be returned to user from stream as well

Fixes https://github.com/BerriAI/litellm/issues/7877#issuecomment-2603813218

* fix(watsonx/chat/handler.py): fix passing space id to watsonx on chat route

* fix(watsonx/): fix watsonx_text/ route with space id

* fix(watsonx/): qa item - also adds better unit testing for watsonx embedding calls

* fix(utils.py): rename to '..fields'

* fix: fix linting errors

* fix(utils.py): fix typing - don't show provider-specific field if none or empty - prevents default respons
e from being non-oai compatible

* fix: cleanup unused imports

* docs(deepseek.md): add docs for deepseek reasoning model
2025-01-21 23:13:15 -08:00
Ishaan Jaff
d1f86ad111 (Feat - prometheus) - emit litellm_overhead_latency_metric (#7913)
* add track_llm_api_timing

* add track_llm_api_timing

* test_litellm_overhead

* use ResponseMetadata class for setting hidden params and response overhead

* instrument http handler

* fix track_llm_api_timing

* track_llm_api_timing

* emit response overhead on hidden params

* fix resp metadata

* fix make_sync_openai_embedding_request

* test_aaaaatext_completion_endpoint fixes

* _get_value_from_hidden_params

* set_hidden_params

* test_litellm_overhead

* test_litellm_overhead

* test_litellm_overhead

* fix import

* test_litellm_overhead_stream

* add LiteLLMLoggingObject

* use diff folder for testing

* use diff folder for overhead testing

* test litellm overhead

* use typing

* clear typing

* test_litellm_overhead

* fix async_streaming

* update_response_metadata

* move test file

* emit litellm_overhead_latency_metric on prometheus

* add prometheus callback

* litellm_overhead_latency_metric_bucket

* fix apply hidden params

* fix StandardLoggingHiddenParams
2025-01-21 20:36:30 -08:00
Krish Dholakia
dec558ba4c Litellm dev 01 21 2025 p1 (#7898)
* fix(utils.py): don't pass 'anthropic-beta' header to vertex - will cause request to fail

* fix(utils.py): add flag to allow user to disable filtering invalid headers

ensure user can control behaviour

* style(utils.py): cleanup message

* test(test_utils.py): add unit test to cover invalid header filtering

* fix(proxy_server.py): fix custom openapi schema generation

* fix(utils.py): pass extra headers if set

* fix(main.py): fix image variation to use 'client' param
2025-01-21 20:36:11 -08:00
Ishaan Jaff
a4d3276bed (Feat) datadog_llm_observability callback - emit request_tags on logs (#7883)
* dd - emit tags on llm obs payload

* dd  - show requester tags on traces

* test_get_datadog_tags

* _get_datadog_tags

* fix dd POD_NAME

* test_get_datadog_tags
2025-01-20 20:36:27 -08:00
Krish Dholakia
37d89887e6 LiteLLM Minor Fixes & Improvements (01/18/2025) - p1 (#7857)
* OllamaChatConfig supports JSON schema response format in optional parameters (#7832)

* fix(types/router.py): handle none values for bool types

Fixes https://github.com/BerriAI/litellm/issues/7855#issuecomment-2599781974

* test: handle no hf token in env

---------

Co-authored-by: trislaz <35226192+trislaz@users.noreply.github.com>
2025-01-18 19:03:50 -08:00
Krish Dholakia
2b58f16fda refactor: make bedrock image transformation requests async (#7840)
* refactor: initial commit for using separate sync vs. async transformation routes for bedrock

ensures no blocking calls e.g. when converting image url to b64

* perf(converse_transformation.py): make bedrock converse transformation async

asyncify's the bedrock message transformation - useful for handling image urls for bedrock

* fix(converse_handler.py): fix logging for async streaming

* style: cleanup unused imports
2025-01-17 20:14:15 -08:00
Ishaan Jaff
b6d1ab6152 test_completion_mistral_api_mistral_large_function_call 2025-01-16 22:27:48 -08:00
Ishaan Jaff
2177bdc836 (datadog llm observability) - fixes + improvements for using datadog llm observability logging integration (#7824)
* dd llm obs fixes

* _ensure_string_content

* fix _get_dd_llm_obs_payload_metadata
2025-01-16 22:02:24 -08:00
Krish Dholakia
000d3152a8 Litellm dev 01 14 2025 p1 (#7771)
* First-class Aim Guardrails support (#7738)

* initial aim support

* add tests

* docs(langsmith_integration.md): cleanup

* style: cleanup unused imports

---------

Co-authored-by: Tomer Bin <117278227+hxtomer@users.noreply.github.com>
2025-01-14 16:18:21 -08:00
Krish Dholakia
8ee79dd5d9 [BETA] Add OpenAI /images/variations + Topaz API support (#7700)
* feat(main.py): initial commit for `/image/variations` endpoint support

* refactor(base_llm/): introduce new base llm base config for image variation endpoints

* refactor(openai/image_variations/transformation.py): implement openai image variation transformation handler

* fix: test

* feat(openai/): working openai `/image/variation` endpoint calls via sdk

* feat(topaz/): topaz sync image variation call support

Addresses https://github.com/BerriAI/litellm/issues/7593

'

* fix(topaz/transformation.py): fix linting errors

* fix(openai/image_variations/handler.py): fix passing json data

* fix(main.py): image_variation/

support async image variation route - `aimage_variation`

* fix(test_get_model_info.py): fix test

* fix: cleanup unused imports

* feat(openai/): add async `/image/variations` endpoint support

* feat(topaz/): support async `/image/variations` calls

* fix: test

* fix(utils.py): fix get_model_info_helper for no model info w/ provider config

handles situation where model info is not known but provider config exists

* test(test_router_fallbacks.py): mark flaky test

* fix: fix unused imports

* test: bump otel load test perf threshold - accounts for current load tests hitting same server
2025-01-11 23:27:46 -08:00
Krish Dholakia
953c021aa7 Litellm dev 01 10 2025 p3 (#7682)
* feat(langfuse.py): log the used prompt when prompt management used

* test: fix test

* docs(self_serve.md): add doc on restricting personal key creation on ui

* feat(s3.py): support s3 logging with team alias prefixes (if available)

New preview feature

* fix(main.py): remove old if block - simplify to just await if coroutine returned

fixes lm_studio async embedding error

* fix(langfuse.py): handle get prompt check
2025-01-10 21:56:42 -08:00
Ishaan Jaff
9e2b1101c0 (litellm sdk - perf improvement) - use O(1) set lookups for checking llm providers / models (#7672)
* fix get model info logic to use O(1) lookups

* perf - use O(1) lookup for get llm provider
2025-01-10 14:16:30 -08:00
Krish Dholakia
75c3ddfc9e fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini p… (#7660)
* fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini process url

* refactor(router.py): refactor '_prompt_management_factory' to use logging obj get_chat_completion logic

deduplicates code

* fix(litellm_logging.py): update 'get_chat_completion_prompt' to update logging object messages

* docs(prompt_management.md): update prompt management to be in beta

given feedback - this still needs to be revised (e.g. passing in user message, not ignoring)

* refactor(prompt_management_base.py): introduce base class for prompt management

allows consistent behaviour across prompt management integrations

* feat(prompt_management_base.py): support adding client message to template message + refactor langfuse prompt management to use prompt management base

* fix(litellm_logging.py): log prompt id + prompt variables to langfuse if set

allows tracking what prompt was used for what purpose

* feat(litellm_logging.py): log prompt management metadata in standard logging payload + use in langfuse

allows logging prompt id / prompt variables to langfuse

* test: fix test

* fix(router.py): cleanup unused imports

* fix: fix linting error

* fix: fix trace param typing

* fix: fix linting errors

* fix: fix code qa check
2025-01-10 07:31:59 -08:00
Krish Dholakia
6d8cfeaf14 LiteLLM Minor Fixes & Improvements (01/08/2025) - p2 (#7643)
* fix(streaming_chunk_builder_utils.py): add test for groq tool calling + streaming + combine chunks

Addresses https://github.com/BerriAI/litellm/issues/7621

* fix(streaming_utils.py): fix modelresponseiterator for openai like chunk parser

ensures chunk parser uses the correct tool call id when translating the chunk

 Fixes https://github.com/BerriAI/litellm/issues/7621

* build(model_hub.tsx): display cost pricing on model hub

* build(model_hub.tsx): show cost per token pricing + complete model information

* fix(types/utils.py): fix usage object handling
2025-01-08 19:45:19 -08:00
Ishaan Jaff
81d1826c25 [Feature]: - allow print alert log to console (#7534)
* update send_to_webhook

* test_print_alerting_payload_warning

* add alerting_args spec

* test_alerting.py
2025-01-03 17:48:13 -08:00
Krish Dholakia
796913fb30 Fix langfuse prompt management on proxy (#7535)
* fix(types/utils.py): support langfuse + humanloop routes on llm router

* fix(main.py): remove acompletion elif block

just await if coroutine returned
2025-01-03 12:42:37 -08:00
Ishaan Jaff
3a454ee2ce (perf) use aiohttp for custom_openai (#7514)
* use aiohttp handler

* BaseLLMAIOHTTPHandler

* use CustomOpenAIChatConfig

* CustomOpenAIChatConfig

* CustomOpenAIChatConfig

* fix linting

* AiohttpOpenAIChatConfig

* fix order

* aiohttp_openai
2025-01-02 22:15:17 -08:00
Krish Dholakia
02ff7b0a8a Litellm dev 01 01 2025 p1 (#7498)
* refactor(prometheus.py): refactor to remove `_tag` metrics and incorporate in regular metrics

* fix(prometheus.py): handle label values not set in enum values

* feat(prometheus.py): working e2e custom metadata labels

* docs(prometheus.md): update docs to clarify how custom metrics would work

* test(test_prometheus_unit_tests.py): fix test

* test: add unit testing
2025-01-01 18:59:28 -08:00
Krish Dholakia
b0f570ee16 Litellm dev 12 30 2024 p2 (#7495)
* test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model

* fix(base_llm_unit_tests.py): handle azure o1 preview response format tests

skip as o1 on azure doesn't support tool calling yet

* fix: initial commit of azure o1 handler using openai caller

simplifies calling + allows fake streaming logic alr. implemented for openai to just work

* feat(azure/o1_handler.py): fake o1 streaming for azure o1 models

azure does not currently support streaming for o1

* feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info

enables user to toggle on when azure allows o1 streaming without needing to bump versions

* style(router.py): remove 'give feedback/get help' messaging when router is used

Prevents noisy messaging

Closes https://github.com/BerriAI/litellm/issues/5942

* fix(types/utils.py): handle none logprobs

Fixes https://github.com/BerriAI/litellm/issues/328

* fix(exception_mapping_utils.py): fix error str unbound error

* refactor(azure_ai/): move to openai_like chat completion handler

allows for easy swapping of api base url's (e.g. ai.services.com)

Fixes https://github.com/BerriAI/litellm/issues/7275

* refactor(azure_ai/): move to base llm http handler

* fix(azure_ai/): handle differing api endpoints

* fix(azure_ai/): make sure all unit tests are passing

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting error

* fix: fix linting errors

* fix(azure_ai/transformation.py): handle extra body param

* fix(azure_ai/transformation.py): fix max retries param handling

* fix: fix test

* test(test_azure_o1.py): fix test

* fix(llm_http_handler.py): support handling azure ai unprocessable entity error

* fix(llm_http_handler.py): handle sync invalid param error for azure ai

* fix(azure_ai/): streaming support with base_llm_http_handler

* fix(llm_http_handler.py): working sync stream calls with unprocessable entity handling for azure ai

* fix: fix linting errors

* fix(llm_http_handler.py): fix linting error

* fix(azure_ai/): handle cohere tool call invalid index param error
2025-01-01 18:57:29 -08:00
Ishaan Jaff
0cbecbe185 (Feat) - LiteLLM Use UsernamePasswordCredential for Azure OpenAI (#7496)
* add get_azure_ad_token_from_username_password

* docs azure use username / password for auth

* update doc

* get_azure_ad_token_from_username_password

* test test_get_azure_ad_token_from_username_password
2025-01-01 14:11:27 -08:00
Ishaan Jaff
0b4d529af8 (feat) POST /fine_tuning/jobs support passing vertex specific hyper params (#7490)
* update convert_openai_request_to_vertex

* test_create_vertex_fine_tune_jobs_mocked

* fix order of methods

* update LiteLLMFineTuningJobCreate

* update OpenAIFineTuningHyperparameters

* update vertex hyper params in response

* _transform_openai_hyperparameters_to_vertex_hyperparameters

* supervised_tuning_spec["hyperParameters"] fix

* fix mapping for ft params testing

* docs fine tuning apis

* fix test_convert_basic_openai_request_to_vertex_request

* update hyperparams for create fine tuning

* fix linting

* test_create_vertex_fine_tune_jobs_mocked_with_hyperparameters

* run ci/cd again

* test_convert_basic_openai_request_to_vertex_request
2025-01-01 07:44:48 -08:00
Krish Dholakia
c46c1e6ea0 Prometheus - custom metrics support + other improvements (#7489)
* fix(prometheus.py): refactor litellm_input_tokens_metric to use label factory

makes adding new metrics easier

* feat(prometheus.py): add 'request_model' to 'litellm_input_tokens_metric'

* refactor(prometheus.py): refactor 'litellm_output_tokens_metric' to use label factory

makes adding new metrics easier

* feat(prometheus.py): emit requested model in 'litellm_output_tokens_metric'

* feat(prometheus.py): support tracking success events with custom metrics

* refactor(prometheus.py): refactor '_set_latency_metrics' to just use the initially created enum values dictionary

reduces scope for missing values

* feat(prometheus.py): refactor all tags to support custom metadata tags

enables metadata tags to be used across for e2e tracking

* fix(prometheus.py): fix requested model on success event enum_values

* test: fix test

* test: fix test

* test: handle filenotfound error

* docs(prometheus.md): add new values to prometheus

* docs(prometheus.md): document adding custom metrics on prometheus

* bump: version 1.56.5 → 1.56.6
2025-01-01 07:41:50 -08:00
Ishaan Jaff
a39cac313c (Feat) - Add PagerDuty Alerting Integration (#7478)
* define basic types

* fix verbose_logger.exception statement

* fix basic alerting

* test pager duty alerting

* test_pagerduty_alerting_high_failure_rate

* PagerDutyAlerting

* async_log_failure_event

* use pre_call_hook

* add _request_is_completed helper util

* update AlertingConfig

* rename PagerDutyInternalEvent

* _send_alert_if_thresholds_crossed

* use pagerduty as _custom_logger_compatible_callbacks_literal

* fix slack alerting imports

* fix imports in slack alerting

* PagerDutyAlerting

* fix _load_alerting_settings

* test_pagerduty_hanging_request_alerting

* working pager duty alerting

* fix linting

* doc pager duty alerting

* update hanging_response_handler

* fix import location

* update failure_threshold

* update async_pre_call_hook

* docs pagerduty

* test - callback_class_str_to_classType

* fix linting errors

* fix linting + testing error

* PagerDutyAlerting

* test_pagerduty_hanging_request_alerting

* fix unused imports

* docs pager duty

* @pytest.mark.flaky(retries=6, delay=2)

* test_model_info_bedrock_converse_enforcement
2025-01-01 07:12:51 -08:00
Krish Dholakia
03fa654b97 Litellm dev 12 31 2024 p1 (#7488)
* fix(internal_user_endpoints.py): fix team list sort - handle team_alias being set + None

* fix(key_management_endpoints.py): allow team admin to create key for member via admin ui

Fixes https://github.com/BerriAI/litellm/issues/7482

* fix(proxy_server.py): allow querying info on specific model group via `/model_group/info`

allows client-side user to get model info from proxy

* fix(proxy_server.py): add docstring on `/model_group/info` showing how to filter by model name

* test(test_proxy_utils.py): add unit test for returning model group info filtered

* fix(proxy_server.py): fix query param

* fix(test_Get_model_info.py): handle no whitelisted bedrock modells
2024-12-31 23:21:51 -08:00
Krish Dholakia
39a11ad272 Fix team-based logging to langfuse + allow custom tokenizer on /token_counter endpoint (#7493)
* fix(langfuse_prompt_management.py): migrate dynamic logging to langfuse custom logger compatible class

* fix(langfuse_prompt_management.py): support failure callback logging to langfuse as well

* feat(proxy_server.py): support setting custom tokenizer on config.yaml

Allows customizing value for `/utils/token_counter`

* fix(proxy_server.py): fix linting errors

* test: skip if file not found

* style: cleanup unused import

* docs(configs.md): add docs on setting custom tokenizer
2024-12-31 23:18:41 -08:00
Krish Dholakia
77c13df55d HumanLoop integration for Prompt Management (#7479)
* feat(humanloop.py): initial commit for humanloop prompt management integration

Closes https://github.com/BerriAI/litellm/issues/213

* feat(humanloop.py): working e2e humanloop prompt management integration

Closes https://github.com/BerriAI/litellm/issues/213

* fix(humanloop.py): fix linting errors

* fix: fix linting erro

* fix: fix test

* test: handle filenotfound error
2024-12-30 22:26:03 -08:00
Krish Dholakia
0178e75cd9 Litellm dev 12 30 2024 p1 (#7480)
* test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model

* fix(base_llm_unit_tests.py): handle azure o1 preview response format tests

skip as o1 on azure doesn't support tool calling yet

* fix: initial commit of azure o1 handler using openai caller

simplifies calling + allows fake streaming logic alr. implemented for openai to just work

* feat(azure/o1_handler.py): fake o1 streaming for azure o1 models

azure does not currently support streaming for o1

* feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info

enables user to toggle on when azure allows o1 streaming without needing to bump versions

* style(router.py): remove 'give feedback/get help' messaging when router is used

Prevents noisy messaging

Closes https://github.com/BerriAI/litellm/issues/5942

* test: fix azure o1 test

* test: fix tests

* fix: fix test
2024-12-30 21:52:52 -08:00
Ishaan Jaff
eac201598b test_rerank_response_assertions (#7476) 2024-12-30 10:12:56 -08:00
Krish Dholakia
9150722a00 Litellm dev 12 28 2024 p2 (#7458)
* docs(sidebar.js): docs for support model access groups for wildcard routes

* feat(key_management_endpoints.py): add check if user is premium_user when adding model access group for wildcard route

* refactor(docs/): make control model access a root-level doc in proxy sidebar

easier to discover how to control model access on litellm

* docs: more cleanup

* feat(fireworks_ai/): add document inlining support

Enables user to call non-vision models with images/pdfs/etc.

* test(test_fireworks_ai_translation.py): add unit testing for fireworks ai transform inline helper util

* docs(docs/): add document inlining details to fireworks ai docs

* feat(fireworks_ai/): allow user to dynamically disable auto add transform inline

allows client-side disabling of this feature for proxy users

* feat(fireworks_ai/): return 'supports_vision' and 'supports_pdf_input' true on all fireworks ai models

now true as fireworks ai supports document inlining

* test: fix tests

* fix(router.py): add unit testing for _is_model_access_group_for_wildcard_route
2024-12-28 19:38:06 -08:00
Krish Dholakia
ebc28b1921 Litellm dev 12 28 2024 p3 (#7464)
* feat(deepgram/): initial e2e support for deepgram stt

Uses deepgram's `/listen` endpoint to transcribe speech to text

 Closes https://github.com/BerriAI/litellm/issues/4875

* fix: fix linting errors

* test: fix test
2024-12-28 19:18:58 -08:00
Ishaan Jaff
6ec5ed8b3c (Feat) Log Guardrails run, guardrail response on logging integrations (#7445)
* add guardrail_information to SLP

* use standard_logging_guardrail_information

* track StandardLoggingGuardrailInformation

* use log_guardrail_information

* use log_guardrail_information

* docs guardrails

* docs guardrails

* update quick start

* fix presidio logging for sync functions

* update Guardrail type

* enforce add_standard_logging_guardrail_information_to_request_data

* update gd docs
2024-12-27 15:01:56 -08:00
Ishaan Jaff
e65cc581b3 (feat) /guardrails/list show guardrail info params (#7442)
* add GuardrailInfoResponse

* add list_guardrails

* test_get_guardrails_list_response
2024-12-27 14:35:00 -08:00
Krish Dholakia
f30260343b Litellm dev 12 26 2024 p3 (#7434)
* build(model_prices_and_context_window.json): update groq models to specify 'supports_vision' parameter

Closes https://github.com/BerriAI/litellm/issues/7433

* docs(groq.md): add groq vision example to docs

Closes https://github.com/BerriAI/litellm/issues/7433

* fix(prometheus.py): refactor self.litellm_proxy_failed_requests_metric to use label factory

* feat(prometheus.py): new 'litellm_proxy_failed_requests_by_tag_metric'

allows tracking failed requests by tag on proxy

* fix(prometheus.py): fix exception logging

* feat(prometheus.py): add new 'litellm_request_total_latency_by_tag_metric'

enables tracking latency by use-case

* feat(prometheus.py): add new llm api latency by tag metric

* feat(prometheus.py): new litellm_deployment_latency_per_output_token_by_tag metric

allows tracking deployment latency by tag

* fix(prometheus.py): refactor 'litellm_requests_metric' to use enum values + label factory

* feat(prometheus.py): new litellm_proxy_total_requests_by_tag metric

allows tracking total requests by tag

* feat(prometheus.py): new metric litellm_deployment_successful_fallbacks_by_tag

allows tracking deployment fallbacks by tag

* fix(prometheus.py): new 'litellm_deployment_failed_fallbacks_by_tag' metric

allows tracking failed fallbacks on deployment by custom tag

* test: fix test

* test: rename test to run earlier

* test: skip flaky test
2024-12-26 21:21:16 -08:00
Krish Dholakia
d6a2beb342 Support budget/rate limit tiers for keys (#7429)
* feat(proxy/utils.py): get associated litellm budget from db in combined_view for key

allows user to create rate limit tiers and associate those to keys

* feat(proxy/_types.py): update the value of key-level tpm/rpm/model max budget metrics with the associated budget table values if set

allows rate limit tiers to be easily applied to keys

* docs(rate_limit_tiers.md): add doc on setting rate limit / budget tiers

make feature discoverable

* feat(key_management_endpoints.py): return litellm_budget_table value in key generate

make it easy for user to know associated budget on key creation

* fix(key_management_endpoints.py): document 'budget_id' param in `/key/generate`

* docs(key_management_endpoints.py): document budget_id usage

* refactor(budget_management_endpoints.py): refactor budget endpoints into separate file - makes it easier to run documentation testing against it

* docs(test_api_docs.py): add budget endpoints to ci/cd doc test + add missing param info to docs

* fix(customer_endpoints.py): use new pydantic obj name

* docs(user_management_heirarchy.md): add simple doc explaining teams/keys/org/users on litellm

* Litellm dev 12 26 2024 p2 (#7432)

* (Feat) Add logging for `POST v1/fine_tuning/jobs`  (#7426)

* init commit ft jobs logging

* add ft logging

* add logging for FineTuningJob

* simple FT Job create test

* (docs) - show all supported Azure OpenAI endpoints in overview  (#7428)

* azure batches

* update doc

* docs azure endpoints

* docs endpoints on azure

* docs azure batches api

* docs azure batches api

* fix(key_management_endpoints.py): fix key update to actually work

* test(test_key_management.py): add e2e test asserting ui key update call works

* fix: proxy/_types - fix linting erros

* test: update test

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>

* fix: test

* fix(parallel_request_limiter.py): enforce tpm/rpm limits on key from tiers

* fix: fix linting errors

* test: fix test

* fix: remove unused import

* test: update test

* docs(customer_endpoints.py): document new model_max_budget param

* test: specify unique key alias

* docs(budget_management_endpoints.py): document new model_max_budget param

* test: fix test

* test: fix tests

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-12-26 19:05:27 -08:00
Ishaan Jaff
aff0f974bc (Feat) Add logging for POST v1/fine_tuning/jobs (#7426)
* init commit ft jobs logging

* add ft logging

* add logging for FineTuningJob

* simple FT Job create test
2024-12-26 08:58:47 -08:00
Krish Dholakia
8567342bd4 Litellm dev 12 25 2024 p3 (#7421)
* refactor(prometheus.py): refactor to use a factory method for setting label values

allows for enforcing end user id disabling on prometheus e2e

* fix: fix linting error

* fix(prometheus.py): ensure label factory drops end-user value if disabled by user

* fix(prometheus.py): specify service_type in end user tracking get

* test: fix test

* test: add unit test for prometheus factory

* test: improve test (cover flag not set scenario)

* test(test_prometheus.py): e2e test covering if 'end_user_id' shows up in testing if disabled

scrapes the `/metrics` endpoint and scans text to check if id appears in emitted metrics

* fix(prometheus.py): stringify status code before logging it
2024-12-25 18:54:24 -08:00
Krish Dholakia
7af1f8a0c7 Litellm dev 12 25 2025 p2 (#7420)
* test: add new test image embedding to base llm unit tests

Addresses https://github.com/BerriAI/litellm/issues/6515

* fix(bedrock/embed/multimodal-embeddings): strip data prefix from image urls for bedrock multimodal embeddings

Fix https://github.com/BerriAI/litellm/issues/6515

* feat: initial commit for fireworks ai audio transcription support

Relevant issue: https://github.com/BerriAI/litellm/issues/7134

* test: initial fireworks ai test

* feat(fireworks_ai/): implemented fireworks ai audio transcription config

* fix(utils.py): register fireworks ai audio transcription config, in config manager

* fix(utils.py): add fireworks ai param translation to 'get_optional_params_transcription'

* refactor(fireworks_ai/): define text completion route with model name handling

moves model name handling to specific fireworks routes, as required by their api

* refactor(fireworks_ai/chat): define transform_Request - allows fixing model if accounts/ is missing

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting errors

* fix(handler.py): fix linting errors

* fix(main.py): fix tgai text completion route

* refactor(together_ai/completion): refactors together ai text completion route to just use provider transform request

* refactor: move test_fine_tuning_api out of local_testing

reduces local testing ci/cd time
2024-12-25 18:35:34 -08:00
Ishaan Jaff
5612103ea3 (feat) Support Dynamic Params for guardrails (#7415)
* update CustomGuardrail

* unit test custom guardrails

* add dynamic params for aporia

* add dynamic params to bedrock guard

* add dynamic params for all guardrails

* fix linting

* fix should_run_guardrail

* _validate_premium_user

* update guardrail doc

* doc update

* update code q

* should_run_guardrail
2024-12-25 16:07:29 -08:00