Commit graph

2779 commits

Krish Dholakia
1e403a8447
Litellm dev 10 29 2024 (#6502)
* fix(core_helpers.py): return None instead of raising a 'kwargs is None' error

Closes https://github.com/BerriAI/litellm/issues/6500

* docs(cost_tracking.md): cleanup doc

* fix(vertex_and_google_ai_studio.py): handle function call with no params passed in

Closes https://github.com/BerriAI/litellm/issues/6495

* test(test_router_timeout.py): add test for router timeout + retry logic

* test: update test to use module level values

* (fix) Prometheus - Log Postgres DB latency, status on prometheus  (#6484)

* fix logging DB fails on prometheus

* unit testing log to otel wrapper

* unit testing for service logger + prometheus

* use LATENCY buckets for service logging
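To make the bucket change concrete, here is a minimal sketch of pinning explicit histogram buckets with prometheus_client; the metric name and bucket edges below are illustrative, not litellm's actual LATENCY_BUCKETS values:

```python
from prometheus_client import Histogram

# Illustrative bucket edges in seconds; litellm defines its real
# LATENCY_BUCKETS in its prometheus types file.
LATENCY_BUCKETS = (0.005, 0.025, 0.1, 0.5, 1.0, 2.5, 5.0, 10.0, 30.0, 60.0)

db_latency = Histogram(
    "litellm_postgres_latency_seconds",  # hypothetical metric name
    "Latency of postgres calls made by the proxy",
    buckets=LATENCY_BUCKETS,
)

db_latency.observe(0.042)  # record one timed DB call into the fixed buckets
```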

* fix service logging

* docs clarify vertex vs gemini

* (router_strategy/) ensure all async functions use async cache methods (#6489)

* fix router strat

* use async set / get cache in router_strategy

* add coverage for router strategy

* fix imports

* fix batch_get_cache

* use async methods for least busy

* fix least busy use async methods

* fix test_dual_cache_increment

* test async_get_available_deployment when routing_strategy="least-busy"

* (fix) proxy - fix when `STORE_MODEL_IN_DB` should be set (#6492)

* set store_model_in_db at the top

* correctly use store_model_in_db global

* (fix) `PrometheusServicesLogger` `_get_metric` should return metric in Registry  (#6486)

* fix logging DB fails on prometheus

* unit testing log to otel wrapper

* unit testing for service logger + prometheus

* use LATENCY buckets for service logging

* fix service logging

* fix _get_metric in prom services logger

* add clear doc string

* unit testing for prom service logger

* bump: version 1.51.0 → 1.51.1

* Add `azure/gpt-4o-mini-2024-07-18` to model_prices_and_context_window.json (#6477)

* Update utils.py (#6468)

Fixed missing keys

* (perf) Litellm redis router fix - ~100ms improvement (#6483)

* docs(exception_mapping.md): add missing exception types

Fixes https://github.com/Aider-AI/aider/issues/2120#issuecomment-2438971183

* fix(main.py): register custom model pricing with specific key

Ensure custom model pricing is registered to the specific model+provider key combination
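As a hedged sketch of registering pricing under a specific model+provider key via litellm.register_model (the deployment name and prices below are made up):

```python
import litellm

# Register pricing under the provider-qualified key ("azure/<model>") so the
# custom cost applies only to that model+provider combination.
litellm.register_model({
    "azure/my-custom-gpt-4o": {         # hypothetical deployment name
        "input_cost_per_token": 5e-06,  # example prices, not real ones
        "output_cost_per_token": 1.5e-05,
        "litellm_provider": "azure",
        "mode": "chat",
    }
})
```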

* test: make testing more robust for custom pricing

* fix(redis_cache.py): instrument otel logging for sync redis calls

ensures complete coverage for all redis cache calls

* refactor: pass parent_otel_span for redis caching calls in router

allows for more observability into what calls are causing latency issues

* test: update tests with new params

* refactor: ensure e2e otel tracing for router

* refactor(router.py): add more otel tracing across router

catch all latency issues for router requests

* fix: fix linting error

* fix(router.py): fix linting error

* fix: fix test

* test: fix tests

* fix(dual_cache.py): pass ttl to redis cache

* fix: fix param

* perf(cooldown_cache.py): improve cooldown cache: store cache results in memory for 5s, preventing a redis call from being made on each request

reduces 100ms latency per call with caching enabled on router
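The underlying idea, as a minimal standalone sketch (not litellm's actual implementation): front redis with a short-lived in-process cache so hot request paths skip the network round trip.

```python
import time
from typing import Any, Dict, Tuple

class InMemoryFrontedCache:
    """Serve reads from process memory for ~5s before falling back to redis."""

    def __init__(self, redis_client, ttl_seconds: float = 5.0):
        self.redis = redis_client
        self.ttl = ttl_seconds
        self._mem: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        hit = self._mem.get(key)
        if hit is not None and (time.time() - hit[0]) < self.ttl:
            return hit[1]                       # fresh in-memory result
        value = self.redis.get(key)             # only hit redis on miss/expiry
        self._mem[key] = (time.time(), value)
        return value
```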

* fix: fix test

* fix(cooldown_cache.py): handle if a result is None

* fix(cooldown_cache.py): add debug statements

* refactor(dual_cache.py): move to using an in-memory check for batch get cache, to prevent redis from being hit for every call

* fix(cooldown_cache.py): fix linting error

* refactor(prometheus.py): move to using standard logging payload for reading the remaining request / tokens

Ensures prometheus token tracking works for anthropic as well

* fix: fix linting error

* fix(redis_cache.py): make sure ttl is always int (handle float values)

Fixes issue where redis_client.ex was not working correctly due to float ttl
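The fix's shape, as an illustrative snippet (redis's `EX` option expects an integer number of seconds):

```python
from typing import Optional

def set_with_ttl(redis_client, key: str, value: str, ttl: Optional[float]) -> None:
    if ttl is not None:
        ttl = int(ttl)  # e.g. 5.0 -> 5; a float here broke redis expiry calls
    redis_client.set(key, value, ex=ttl)
```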

* fix: fix linting error

* test: update test

* fix: fix linting error

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: vibhanshu-ob <115142120+vibhanshu-ob@users.noreply.github.com>
2024-10-29 22:04:16 -07:00
Krish Dholakia
6b9be5092f
LiteLLM Minor Fixes & Improvements (10/28/2024) (#6475)
* fix(anthropic/chat/transformation.py): support anthropic disable_parallel_tool_use param

Fixes https://github.com/BerriAI/litellm/issues/6456

* feat(anthropic/chat/transformation.py): support anthropic computer tool use

Closes https://github.com/BerriAI/litellm/issues/6427

* fix(vertex_ai/common_utils.py): parse out '$schema' when calling vertex ai

Fixes issue when trying to call vertex from vercel sdk

* fix(main.py): add 'extra_headers' support for azure on all translation endpoints

Fixes https://github.com/BerriAI/litellm/issues/6465

* fix: fix linting errors

* fix(transformation.py): handle no beta headers for anthropic

* test: cleanup test

* fix: fix linting error

* fix: fix linting errors

* fix: fix linting errors

* fix(transformation.py): handle dummy tool call

* fix(main.py): fix linting error

* fix(azure.py): pass required param

* LiteLLM Minor Fixes & Improvements (10/24/2024) (#6441)

* fix(azure.py): handle /openai/deployment in azure api base

* fix(factory.py): fix faulty anthropic tool result translation check

Fixes https://github.com/BerriAI/litellm/issues/6422

* fix(gpt_transformation.py): add support for parallel_tool_calls to azure

Fixes https://github.com/BerriAI/litellm/issues/6440

* fix(factory.py): support anthropic prompt caching for tool results

* fix(vertex_ai/common_utils): don't pop non-null required field

Fixes https://github.com/BerriAI/litellm/issues/6426

* feat(vertex_ai.py): support code_execution tool call for vertex ai + gemini

Closes https://github.com/BerriAI/litellm/issues/6434

* build(model_prices_and_context_window.json): Add 'supports_assistant_prefill' for bedrock claude-3-5-sonnet v2 models

Closes https://github.com/BerriAI/litellm/issues/6437

* fix(types/utils.py): fix linting

* test: update test to include required fields

* test: fix test

* test: handle flaky test

* test: remove e2e test - hitting gemini rate limits

* Litellm dev 10 26 2024 (#6472)

* docs(exception_mapping.md): add missing exception types

Fixes https://github.com/Aider-AI/aider/issues/2120#issuecomment-2438971183

* fix(main.py): register custom model pricing with specific key

Ensure custom model pricing is registered to the specific model+provider key combination

* test: make testing more robust for custom pricing

* fix(redis_cache.py): instrument otel logging for sync redis calls

ensures complete coverage for all redis cache calls

* (Testing) Add unit testing for DualCache - ensure in memory cache is used when expected  (#6471)

* test test_dual_cache_get_set

* unit testing for dual cache

* fix async_set_cache_sadd

* test_dual_cache_local_only

* redis otel tracing + async support for latency routing (#6452)

* docs(exception_mapping.md): add missing exception types

Fixes https://github.com/Aider-AI/aider/issues/2120#issuecomment-2438971183

* fix(main.py): register custom model pricing with specific key

Ensure custom model pricing is registered to the specific model+provider key combination

* test: make testing more robust for custom pricing

* fix(redis_cache.py): instrument otel logging for sync redis calls

ensures complete coverage for all redis cache calls

* refactor: pass parent_otel_span for redis caching calls in router

allows for more observability into what calls are causing latency issues

* test: update tests with new params

* refactor: ensure e2e otel tracing for router

* refactor(router.py): add more otel tracing across router

catch all latency issues for router requests

* fix: fix linting error

* fix(router.py): fix linting error

* fix: fix test

* test: fix tests

* fix(dual_cache.py): pass ttl to redis cache

* fix: fix param

* fix(dual_cache.py): set default value for parent_otel_span

* fix(transformation.py): support 'response_format' for anthropic calls

* fix(transformation.py): check for cache_control inside 'function' block

* fix: fix linting error

* fix: fix linting errors

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-10-29 17:20:24 -07:00
Ishaan Jaff
f9ba74ef87 docs clarify vertex vs gemini 2024-10-29 13:14:26 +05:30
Krish Dholakia
70111a7abd
Litellm dev 10 26 2024 (#6472)
* docs(exception_mapping.md): add missing exception types

Fixes https://github.com/Aider-AI/aider/issues/2120#issuecomment-2438971183

* fix(main.py): register custom model pricing with specific key

Ensure custom model pricing is registered to the specific model+provider key combination

* test: make testing more robust for custom pricing

* fix(redis_cache.py): instrument otel logging for sync redis calls

ensures complete coverage for all redis cache calls
2024-10-28 15:05:43 -07:00
Ishaan Jaff
030ece8c3f
(Feat) New Logging integration - add Datadog LLM Observability support (#6449)
* add type for dd llm obs request obj

* working dd llm obs

* datadog use well defined type

* clean up

* unit test test_create_llm_obs_payload

* fix linting

* add datadog_llm_observability

* add datadog_llm_observability

* docs DD LLM obs

* run testing again

* document DD_ENV

* test_create_llm_obs_payload
2024-10-28 22:01:32 +05:30
Krish Dholakia
c03e5da41f
LiteLLM Minor Fixes & Improvements (10/24/2024) (#6421)
* fix(utils.py): support passing dynamic api base to validate_environment

Returns True if only the api base is required and an api base is passed
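A hedged usage sketch (the model name is hypothetical, and the return shape shown is an assumption about validate_environment's output):

```python
import litellm

# If the provider only needs an api base and one is passed dynamically,
# the environment check should report nothing missing.
result = litellm.validate_environment(
    model="hosted_vllm/my-model",          # hypothetical self-hosted model
    api_base="http://localhost:8000/v1",
)
print(result)  # assumed shape: {'keys_in_environment': True, 'missing_keys': []}
```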

* fix(litellm_pre_call_utils.py): feature flag sending client headers to llm api

Fixes https://github.com/BerriAI/litellm/issues/6410

* fix(anthropic/chat/transformation.py): return correct error message

* fix(http_handler.py): add error response text in places where we expect it

* fix(factory.py): handle base case of no non-system messages to bedrock

Fixes https://github.com/BerriAI/litellm/issues/6411

* feat(cohere/embed): Support cohere image embeddings

Closes https://github.com/BerriAI/litellm/issues/6413

* fix(__init__.py): fix linting error

* docs(supported_embedding.md): add image embedding example to docs

* feat(cohere/embed): use cohere embedding returned usage for cost calc

* build(model_prices_and_context_window.json): add embed-english-v3.0 details (image cost + 'supports_image_input' flag)

* fix(cohere_transformation.py): fix linting error

* test(test_proxy_server.py): cleanup test

* test: cleanup test

* fix: fix linting errors
2024-10-25 15:55:56 -07:00
Krish Dholakia
1cd1d23fdf
LiteLLM Minor Fixes & Improvements (10/23/2024) (#6407)
* docs(bedrock.md): clarify bedrock auth in litellm docs

* fix(convert_dict_to_response.py): Fixes https://github.com/BerriAI/litellm/issues/6387

* feat(pattern_match_deployments.py): more robust handling for wildcard routes (model_name: custom_route/* -> openai/*)

Enables users to expose custom routes with dynamic handling
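An illustrative sketch of the wildcard-matching idea (not litellm's exact PatternMatchRouter code):

```python
import re
from typing import Dict, Optional

def route_for(model: str, patterns: Dict[str, str]) -> Optional[str]:
    for pattern, target in patterns.items():
        # turn "custom_route/*" into the regex ^custom_route/(.+)$
        regex = "^" + re.escape(pattern).replace(r"\*", "(.+)") + "$"
        match = re.match(regex, model)
        if match:
            # substitute the captured suffix into the target wildcard
            return target.replace("*", match.group(1))
    return None

print(route_for("custom_route/gpt-4o", {"custom_route/*": "openai/*"}))
# -> openai/gpt-4o
```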

* test: add more testing

* docs(custom_pricing.md): add debug tutorial for custom pricing

* test: skip codestral test - unreachable backend

* test: fix test

* fix(pattern_matching_deployments.py): fix typing

* test: cleanup codestral tests - backend api unavailable

* (refactor) prometheus async_log_success_event to be under 100 LOC  (#6416)

* unit testing for prometheus

* unit testing for success metrics

* use 1 helper for _increment_token_metrics

* use helper for _increment_remaining_budget_metrics

* use _increment_remaining_budget_metrics

* use _increment_top_level_request_and_spend_metrics

* use helper for _set_latency_metrics

* remove noqa violation

* fix test prometheus

* test prometheus

* unit testing for all prometheus helper functions

* fix prom unit tests

* fix unit tests prometheus

* fix unit test prom

* (refactor) router - use static methods for client init utils  (#6420)

* use InitalizeOpenAISDKClient

* use InitalizeOpenAISDKClient static method

* fix  # noqa: PLR0915

* (code cleanup) remove unused and undocumented logging integrations - litedebugger, berrispend  (#6406)

* code cleanup remove unused and undocumented code files

* fix unused logging integrations cleanup

* bump: version 1.50.3 → 1.50.4

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-10-24 19:01:41 -07:00
Ishaan Jaff
807e9dcea8
(docs + testing) Correctly document that the timeout value used by litellm proxy is 6000 seconds + add to best practices for prod (#6339)
* fix docs use documented timeout

* document request timeout

* add test for litellm.request_timeout

* add test for checking value of timeout
2024-10-23 14:09:35 +05:30
dependabot[bot]
64c3d3210c
build(deps): bump http-proxy-middleware in /docs/my-website (#6395)
Bumps [http-proxy-middleware](https://github.com/chimurai/http-proxy-middleware) from 2.0.6 to 2.0.7.
- [Release notes](https://github.com/chimurai/http-proxy-middleware/releases)
- [Changelog](https://github.com/chimurai/http-proxy-middleware/blob/v2.0.7/CHANGELOG.md)
- [Commits](https://github.com/chimurai/http-proxy-middleware/compare/v2.0.6...v2.0.7)

---
updated-dependencies:
- dependency-name: http-proxy-middleware
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-23 13:30:12 +05:30
Krish Dholakia
cb2563e3c0
Litellm dev 10 22 2024 (#6384)
* fix(utils.py): add 'disallowed_special' for token counting on .encode()

Fixes error when '<|endoftext|>' is in the string
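For context, this is the tiktoken behavior being worked around; the snippet below is a standalone illustration, not litellm's code:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# encode() raises by default when the text contains a special token;
# disallowed_special=() makes '<|endoftext|>' count as ordinary text.
tokens = enc.encode("hello <|endoftext|> world", disallowed_special=())
print(len(tokens))
```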

* Revert "(fix) standard logging metadata + add unit testing  (#6366)" (#6381)

This reverts commit 8359cb6fa9.

* add new 3.5 model card (#6378)

* Add claude 3 5 sonnet 20241022 models for all providers (#6380)

* Add Claude 3.5 v2 on Amazon Bedrock and Vertex AI.

* added anthropic/claude-3-5-sonnet-20241022

* add new 3.5 model card

---------

Co-authored-by: Paul Gauthier <paul@paulg.com>
Co-authored-by: lowjiansheng <15527690+lowjiansheng@users.noreply.github.com>

* test(skip-flaky-google-context-caching-test): google is not reliable. their sample code is also not working

* Fix metadata being overwritten in speech() (#6295)

* fix: adding missing redis cluster kwargs (#6318)

Co-authored-by: Ali Arian <ali.arian@breadfinancial.com>

* Add support for `max_completion_tokens` in Azure OpenAI (#6376)

Now that Azure supports `max_completion_tokens`, there's no need for special handling for this param; let it pass through. More details: https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models?tabs=python-secure#api-support
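A hedged usage sketch of the pass-through (the deployment name is illustrative):

```python
import litellm

response = litellm.completion(
    model="azure/my-gpt-4o-deployment",    # hypothetical Azure deployment
    messages=[{"role": "user", "content": "Summarize LiteLLM in one line."}],
    max_completion_tokens=64,              # now forwarded to Azure as-is
)
```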

* build(model_prices_and_context_window.json): add voyage-finance-2 pricing

Closes https://github.com/BerriAI/litellm/issues/6371

* build(model_prices_and_context_window.json): fix llama3.1 pricing model name on map

Closes https://github.com/BerriAI/litellm/issues/6310

* feat(realtime_streaming.py): just log specific events

Closes https://github.com/BerriAI/litellm/issues/6267

* fix(utils.py): more robust checking if unmapped vertex anthropic model belongs to that family of models

Fixes https://github.com/BerriAI/litellm/issues/6383

* Fix Ollama stream handling for tool calls with None content (#6155)

* test(test_max_completions): update test now that azure supports 'max_completion_tokens'

* fix(handler.py): fix linting error

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Low Jian Sheng <15527690+lowjiansheng@users.noreply.github.com>
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
Co-authored-by: Paul Gauthier <paul@paulg.com>
Co-authored-by: John HU <hszqqq12@gmail.com>
Co-authored-by: Ali Arian <113945203+ali-arian@users.noreply.github.com>
Co-authored-by: Ali Arian <ali.arian@breadfinancial.com>
Co-authored-by: Anand Taralika <46954145+taralika@users.noreply.github.com>
Co-authored-by: Nolan Tremelling <34580718+NolanTrem@users.noreply.github.com>
2024-10-22 21:18:54 -07:00
Ishaan Jaff
b75019c1a5
(feat) Arize - Allow using Arize HTTP endpoint (#6364)
* arize use helper for get_arize_opentelemetry_config

* use helper to get Arize OTEL config

* arize add helpers for arize

* docs allow using arize http endpoint

* fix importing OTEL for Arize

* use static methods for ArizeLogger

* fix ArizeLogger tests
2024-10-23 09:38:35 +05:30
Ishaan Jaff
7853cb791d fix docs configs.md 2024-10-22 18:16:47 +05:30
Ishaan Jaff
7a5f997fc9
(refactor) remove berrispendLogger - unused logging integration (#6363)
* fix remove berrispendLogger

* remove unused clickhouse logger
2024-10-22 16:53:25 +05:30
Krrish Dholakia
e6e518ad58 docs(sidebars.js): add jina ai to left nav 2024-10-21 21:48:06 -07:00
Krrish Dholakia
e30a2743b7 docs(sidebars.js): add jina ai embedding to docs 2024-10-21 21:47:10 -07:00
Krish Dholakia
2b9db05e08
feat(proxy_cli.py): add new 'log_config' cli param (#6352)
* feat(proxy_cli.py): add new 'log_config' cli param

Allows passing logging.conf to uvicorn on startup
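Roughly what the new flag enables under the hood; the app import string and file path here are assumptions, not verified against litellm's entrypoint:

```python
import uvicorn

uvicorn.run(
    "litellm.proxy.proxy_server:app",  # assumed proxy app import string
    host="0.0.0.0",
    port=4000,
    log_config="logging.conf",         # ini-style logging config for uvicorn
)
```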

* docs(cli.md): add logging conf to uvicorn cli docs

* fix(get_llm_provider_logic.py): fix default api base for litellm_proxy

Fixes https://github.com/BerriAI/litellm/issues/6332

* feat(openai_like/embedding): Add support for jina ai embeddings

Closes https://github.com/BerriAI/litellm/issues/6337

* docs(deploy.md): update entrypoint.sh filepath post-refactor

Fixes outdated docs

* feat(prometheus.py): emit time_to_first_token metric on prometheus

Closes https://github.com/BerriAI/litellm/issues/6334

* fix(prometheus.py): only emit time to first token metric if stream is True

enables more accurate ttft usage

* test: handle vertex api instability

* fix(get_llm_provider_logic.py): fix import

* fix(openai.py): fix deepinfra default api base

* fix(anthropic/transformation.py): remove anthropic beta header (#6361)
2024-10-21 21:25:58 -07:00
Krish Dholakia
905ebeb924
feat(custom_logger.py): expose new async_dataset_hook for modifying… (#6331)
* feat(custom_logger.py): expose new `async_dataset_hook` for modifying/rejecting argilla items before logging

Allows the user more control over what gets logged to argilla for annotations
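The exact hook signature isn't shown in the commit message; one possible shape, assuming the hook receives a candidate item and returns it (or None to drop it):

```python
from litellm.integrations.custom_logger import CustomLogger

class MyArgillaFilter(CustomLogger):
    # Signature and return contract are assumptions for illustration;
    # consult litellm's docs for the real hook interface.
    async def async_dataset_hook(self, item):
        if "secret" in str(item):  # reject items with sensitive content
            return None            # assumed: returning None drops the item
        return item
```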

* feat(google_ai_studio_endpoints.py): add new `/azure/*` pass through route

enables pass-through for azure provider

* feat(utils.py): support checking ollama `/api/show` endpoint for retrieving ollama model info

Fixes https://github.com/BerriAI/litellm/issues/6322

* fix(user_api_key_auth.py): add `/key/delete` to an allowed_ui_routes

Fixes https://github.com/BerriAI/litellm/issues/6236

* fix(user_api_key_auth.py): remove type ignore

* fix(user_api_key_auth.py): route ui vs. api token checks differently

Fixes https://github.com/BerriAI/litellm/issues/6238

* feat(internal_user_endpoints.py): support setting models as a default internal user param

Closes https://github.com/BerriAI/litellm/issues/6239

* fix(user_api_key_auth.py): fix exception string

* fix(user_api_key_auth.py): fix error string

* fix: fix test
2024-10-20 09:00:04 -07:00
Ishaan Jaff
4cbdad9fc5
doc - using gpt-4o-audio-preview (#6326)
* doc on audio models

* doc supports vision

* doc audio input / output
2024-10-19 09:34:56 +05:30
Ishaan Jaff
e35fc3203e
doc fix Turn on / off caching per Key. (#6297) 2024-10-18 15:43:04 +05:30
Krrish Dholakia
6e7e96211c docs(argilla.md): add sampling rate to argilla calls 2024-10-17 22:54:12 -07:00
Krrish Dholakia
4f5ff65882 docs(argilla.md): add doc on argilla logging 2024-10-17 22:51:55 -07:00
Krrish Dholakia
87f256bec1 docs(user_keys.md): add regex doc for clientside auth params 2024-10-17 22:42:29 -07:00
Krish Dholakia
f252350881
LiteLLM Minor Fixes & Improvements (10/17/2024) (#6293)
* fix(ui_sso.py): fix faulty admin only check

Fixes https://github.com/BerriAI/litellm/issues/6286

* refactor(sso_helper_utils.py): refactor /sso/callback to use helper utils, covered by unit testing

Prevent future regressions

* feat(prompt_factory): support 'ensure_alternating_roles' param

Closes https://github.com/BerriAI/litellm/issues/6257
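A sketch of what an ensure_alternating_roles pass might do (illustrative, not litellm's exact logic): merge consecutive same-role messages so providers that require strict user/assistant alternation accept the prompt.

```python
from typing import Dict, List

def ensure_alternating_roles(messages: List[Dict[str, str]]) -> List[Dict[str, str]]:
    merged: List[Dict[str, str]] = []
    for msg in messages:
        if merged and merged[-1]["role"] == msg["role"]:
            merged[-1]["content"] += "\n" + msg["content"]  # fold into previous turn
        else:
            merged.append(dict(msg))
    return merged

print(ensure_alternating_roles([
    {"role": "user", "content": "hi"},
    {"role": "user", "content": "are you there?"},
]))  # -> one merged user message
```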

* fix(proxy/utils.py): add dailytagspend to expected views

* feat(auth_utils.py): support setting regex for clientside auth credentials

Fixes https://github.com/BerriAI/litellm/issues/6203

* build(cookbook): add tutorial for mlflow + langchain + litellm proxy tracing

* feat(argilla.py): add argilla logging integration

Closes https://github.com/BerriAI/litellm/issues/6201

* fix: fix linting errors

* fix: fix ruff error

* test: fix test

* fix: update vertex ai assumption - parts not always guaranteed (#6296)

* docs(configs.md): add argila env var to docs
2024-10-17 22:09:11 -07:00
Krish Dholakia
38a9a106d2
LiteLLM Minor Fixes & Improvements (10/16/2024) (#6265)
* fix(caching_handler.py): handle positional arguments in add cache logic

Fixes https://github.com/BerriAI/litellm/issues/6264

* feat(litellm_pre_call_utils.py): allow forwarding openai org id to backend client

https://github.com/BerriAI/litellm/issues/6237

* docs(configs.md): add 'forward_openai_org_id' to docs

* fix(proxy_server.py): return model info if user_model is set

Fixes https://github.com/BerriAI/litellm/issues/6233

* fix(hosted_vllm/chat/transformation.py): don't set tools unless non-none

* fix(openai.py): improve debug log for openai 'str' error

Addresses https://github.com/BerriAI/litellm/issues/6272

* fix(proxy_server.py): fix linting error

* fix(proxy_server.py): fix linting errors

* test: skip WIP test

* docs(openai.md): add docs on passing openai org id from client to openai
2024-10-16 22:16:23 -07:00
yujonglee
43878bd2a0
remove ask mode (#6271) 2024-10-16 22:01:50 -07:00
Krish Dholakia
54ebdbf7ce
LiteLLM Minor Fixes & Improvements (10/15/2024) (#6242)
* feat(litellm_pre_call_utils.py): support forwarding request headers to backend llm api

* fix(litellm_pre_call_utils.py): handle custom litellm key header

* test(router_code_coverage.py): check if all router functions are dire… (#6186)

* test(router_code_coverage.py): check if all router functions are directly tested

prevent regressions

* docs(configs.md): document all environment variables (#6185)

* docs: make it easier to find anthropic/openai prompt caching doc

* added codecov yml (#6207)

* fix codecov.yaml

* run ci/cd again

* (refactor) caching use LLMCachingHandler for async_get_cache and set_cache  (#6208)

* use folder for caching

* fix importing caching

* fix clickhouse pyright

* fix linting

* fix correctly pass kwargs and args

* fix test case for embedding

* fix linting

* fix embedding caching logic

* fix refactor handle utils.py

* fix test_embedding_caching_azure_individual_items_reordered

* (feat) prometheus have well defined latency buckets (#6211)

* fix prometheus have well defined latency buckets

* use a well-defined latency bucket

* use types file for prometheus logging

* add test for LATENCY_BUCKETS

* fix prom testing

* fix config.yml

* (refactor caching) use LLMCachingHandler for caching streaming responses  (#6210)

* use folder for caching

* fix importing caching

* fix clickhouse pyright

* fix linting

* fix correctly pass kwargs and args

* fix test case for embedding

* fix linting

* fix embedding caching logic

* fix refactor handle utils.py

* refactor async set stream cache

* fix linting

* bump (#6187)

* update code cov yaml

* fix config.yml

* add caching component to code cov

* fix config.yml ci/cd

* add coverage for proxy auth

* (refactor caching) use common `_retrieve_from_cache` helper  (#6212)

* use folder for caching

* fix importing caching

* fix clickhouse pyright

* fix linting

* fix correctly pass kwargs and args

* fix test case for embedding

* fix linting

* fix embedding caching logic

* fix refactor handle utils.py

* refactor async set stream cache

* fix linting

* refactor - use _retrieve_from_cache

* refactor use _convert_cached_result_to_model_response

* fix linting errors

* bump: version 1.49.2 → 1.49.3

* fix code cov components

* test(test_router_helpers.py): add router component unit tests

* test: add additional router tests

* test: add more router testing

* test: add more router testing + more mock functions

* ci(router_code_coverage.py): fix check

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>

* bump: version 1.49.3 → 1.49.4

* (refactor) use helper function `_assemble_complete_response_from_streaming_chunks` to assemble complete responses in caching and logging callbacks (#6220)

* (refactor) use _assemble_complete_response_from_streaming_chunks

* add unit test for test_assemble_complete_response_from_streaming_chunks_1

* fix assemble complete_streaming_response

* config add logging_testing

* add logging_coverage in codecov

* test test_assemble_complete_response_from_streaming_chunks_3

* add unit tests for _assemble_complete_response_from_streaming_chunks

* fix remove unused / junk function

* add test for streaming_chunks when error assembling

* (refactor) OTEL - use safe_set_attribute for setting attributes (#6226)

* otel - use safe_set_attribute for setting attributes

* fix OTEL only use safe_set_attribute

* (fix) prompt caching cost calculation OpenAI, Azure OpenAI  (#6231)

* fix prompt caching cost calculation

* fix testing for prompt cache cost calc

* fix(allowed_model_region): allow us as allowed region (#6234)

* test(router_code_coverage.py): check if all router functions are dire… (#6186)

* test(router_code_coverage.py): check if all router functions are directly tested

prevent regressions

* docs(configs.md): document all environment variables (#6185)

* docs: make it easier to find anthropic/openai prompt caching doc

* added codecov yml (#6207)

* fix codecov.yaml

* run ci/cd again

* (refactor) caching use LLMCachingHandler for async_get_cache and set_cache  (#6208)

* use folder for caching

* fix importing caching

* fix clickhouse pyright

* fix linting

* fix correctly pass kwargs and args

* fix test case for embedding

* fix linting

* fix embedding caching logic

* fix refactor handle utils.py

* fix test_embedding_caching_azure_individual_items_reordered

* (feat) prometheus have well defined latency buckets (#6211)

* fix prometheus have well defined latency buckets

* use a well-defined latency bucket

* use types file for prometheus logging

* add test for LATENCY_BUCKETS

* fix prom testing

* fix config.yml

* (refactor caching) use LLMCachingHandler for caching streaming responses  (#6210)

* use folder for caching

* fix importing caching

* fix clickhouse pyright

* fix linting

* fix correctly pass kwargs and args

* fix test case for embedding

* fix linting

* fix embedding caching logic

* fix refactor handle utils.py

* refactor async set stream cache

* fix linting

* bump (#6187)

* update code cov yaml

* fix config.yml

* add caching component to code cov

* fix config.yml ci/cd

* add coverage for proxy auth

* (refactor caching) use common `_retrieve_from_cache` helper  (#6212)

* use folder for caching

* fix importing caching

* fix clickhouse pyright

* fix linting

* fix correctly pass kwargs and args

* fix test case for embedding

* fix linting

* fix embedding caching logic

* fix refactor handle utils.py

* refactor async set stream cache

* fix linting

* refactor - use _retrieve_from_cache

* refactor use _convert_cached_result_to_model_response

* fix linting errors

* bump: version 1.49.2 → 1.49.3

* fix code cov components

* test(test_router_helpers.py): add router component unit tests

* test: add additional router tests

* test: add more router testing

* test: add more router testing + more mock functions

* ci(router_code_coverage.py): fix check

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>

* bump: version 1.49.3 → 1.49.4

* (refactor) use helper function `_assemble_complete_response_from_streaming_chunks` to assemble complete responses in caching and logging callbacks (#6220)

* (refactor) use _assemble_complete_response_from_streaming_chunks

* add unit test for test_assemble_complete_response_from_streaming_chunks_1

* fix assemble complete_streaming_response

* config add logging_testing

* add logging_coverage in codecov

* test test_assemble_complete_response_from_streaming_chunks_3

* add unit tests for _assemble_complete_response_from_streaming_chunks

* fix remove unused / junk function

* add test for streaming_chunks when error assembling

* (refactor) OTEL - use safe_set_attribute for setting attributes (#6226)

* otel - use safe_set_attribute for setting attributes

* fix OTEL only use safe_set_attribute

* fix(allowed_model_region): allow us as allowed region

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>

* fix(litellm_pre_call_utils.py): support 'us' region routing + fix header forwarding to filter on `x-` headers
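A minimal sketch of the header filter this bullet describes (combined with the stainless-header exclusion a few bullets below); the function name is illustrative:

```python
from typing import Dict

def forwardable_headers(headers: Dict[str, str]) -> Dict[str, str]:
    # only pass through custom "x-" headers; never auth or x-stainless-* headers
    return {
        k: v
        for k, v in headers.items()
        if k.lower().startswith("x-") and not k.lower().startswith("x-stainless")
    }

print(forwardable_headers({"X-Request-Id": "abc", "Authorization": "Bearer sk-..."}))
# -> {'X-Request-Id': 'abc'}
```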

* docs(customer_routing.md): fix region-based routing example

* feat(azure.py): handle empty arguments function call - azure

Closes https://github.com/BerriAI/litellm/issues/6241

* feat(guardrails_ai.py): support guardrails ai integration

Adds support for on-prem guardrails via guardrails ai

* fix(proxy/utils.py): prevent sql injection attack

Fixes https://huntr.com/bounties/a4f6d357-5b44-4e00-9cac-f1cc351211d2
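The standard shape of this class of fix, sketched with prisma's raw-query API (the table name and helper are assumptions, not litellm's exact code):

```python
from prisma import Prisma

async def get_token_row(db: Prisma, token: str):
    # vulnerable: await db.query_raw(f"SELECT ... WHERE token = '{token}'")
    # safe: bind user input as a query parameter instead of interpolating it
    return await db.query_raw(
        'SELECT * FROM "LiteLLM_VerificationToken" WHERE token = $1',
        token,
    )
```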

* fix: fix linting errors

* fix(litellm_pre_call_utils.py): don't log litellm api key in proxy server request headers

* fix(litellm_pre_call_utils.py): don't forward stainless headers

* docs(guardrails_ai.md): add guardrails ai quick start to docs

* test: handle flaky test

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Marcus Elwin <marcus@elwin.com>
2024-10-16 07:32:06 -07:00
Krish Dholakia
39486e2003
Litellm dev 10 14 2024 (#6221)
* fix(__init__.py): expose DualCache, RedisCache, InMemoryCache on root

prevents internal file refactors from impacting users

* feat(utils.py): handle invalid openai parallel tool calling response

Fixes https://community.openai.com/t/model-tries-to-call-unknown-function-multi-tool-use-parallel/490653

* docs(bedrock.md): clarify all bedrock models are supported

Closes https://github.com/BerriAI/litellm/issues/6168#issuecomment-2412082236
2024-10-14 22:11:14 -07:00
yujonglee
4132a97787
bump (#6187) 2024-10-14 18:22:54 +05:30
Ishaan Jaff
4d1b4beb3d
(refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208)
* use folder for caching

* fix importing caching

* fix clickhouse pyright

* fix linting

* fix correctly pass kwargs and args

* fix test case for embedding

* fix linting

* fix embedding caching logic

* fix refactor handle utils.py

* fix test_embedding_caching_azure_individual_items_reordered
2024-10-14 16:34:01 +05:30
Krrish Dholakia
806a1c4acc docs: make it easier to find anthropic/openai prompt caching doc 2024-10-13 18:34:13 -07:00
Krish Dholakia
15b44c3221
docs(configs.md): document all environment variables (#6185) 2024-10-13 09:57:03 -07:00
Krish Dholakia
2acb0c0675
Litellm Minor Fixes & Improvements (10/12/2024) (#6179)
* build(model_prices_and_context_window.json): add bedrock llama3.2 pricing

* build(model_prices_and_context_window.json): add bedrock cross region inference pricing

* Revert "(perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165)"

This reverts commit 2a5624af47.

* add azure/gpt-4o-2024-05-13 (#6174)

* LiteLLM Minor Fixes & Improvements (10/10/2024)  (#6158)

* refactor(vertex_ai_partner_models/anthropic): refactor anthropic to use partner model logic

* fix(vertex_ai/): support passing custom api base to partner models

Fixes https://github.com/BerriAI/litellm/issues/4317

* fix(proxy_server.py): Fix prometheus premium user check logic

* docs(prometheus.md): update quick start docs

* fix(custom_llm.py): support passing dynamic api key + api base

* fix(realtime_api/main.py): Add request/response logging for realtime api endpoints

Closes https://github.com/BerriAI/litellm/issues/6081

* feat(openai/realtime): add openai realtime api logging

Closes https://github.com/BerriAI/litellm/issues/6081

* fix(realtime_streaming.py): fix linting errors

* fix(realtime_streaming.py): fix linting errors

* fix: fix linting errors

* fix pattern match router

* Add literalai in the sidebar observability category (#6163)

* fix: add literalai in the sidebar

* fix: typo

* update (#6160)

* Feat: Add Langtrace integration (#5341)

* Feat: Add Langtrace integration

* add langtrace service name

* fix timestamps for traces

* add tests

* Discard Callback + use existing otel logger

* cleanup

* remove print statements

* remove callback

* add docs

* docs

* add logging docs

* format logging

* remove emoji and add litellm proxy example

* format logging

* format `logging.md`

* add langtrace docs to logging.md

* sync conflict

* docs fix

* (perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165)

* fix move s3 to use customLogger

* add basic s3 logging test

* add s3 to custom logger compatible

* use batch logger for s3

* s3 set flush interval and batch size

* fix s3 logging

* add notes on s3 logging

* fix s3 logging

* add basic s3 logging test

* fix s3 type errors

* add test for sync logging on s3

* fix: fix to debug log

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Ali Waleed <ali@scale3labs.com>

* docs(custom_llm_server.md): update doc on passing custom params

* fix(pass_through_endpoints.py): don't require headers

Fixes https://github.com/BerriAI/litellm/issues/6128

* feat(utils.py): add support for caching rerank endpoints

Closes https://github.com/BerriAI/litellm/issues/6144

* feat(litellm_logging.py'): add response headers for failed requests

Closes https://github.com/BerriAI/litellm/issues/6159

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Ali Waleed <ali@scale3labs.com>
2024-10-12 11:48:34 -07:00
Krish Dholakia
11f9df923a
LiteLLM Minor Fixes & Improvements (10/10/2024) (#6158)
* refactor(vertex_ai_partner_models/anthropic): refactor anthropic to use partner model logic

* fix(vertex_ai/): support passing custom api base to partner models

Fixes https://github.com/BerriAI/litellm/issues/4317

* fix(proxy_server.py): Fix prometheus premium user check logic

* docs(prometheus.md): update quick start docs

* fix(custom_llm.py): support passing dynamic api key + api base

* fix(realtime_api/main.py): Add request/response logging for realtime api endpoints

Closes https://github.com/BerriAI/litellm/issues/6081

* feat(openai/realtime): add openai realtime api logging

Closes https://github.com/BerriAI/litellm/issues/6081

* fix(realtime_streaming.py): fix linting errors

* fix(realtime_streaming.py): fix linting errors

* fix: fix linting errors

* fix pattern match router

* Add literalai in the sidebar observability category (#6163)

* fix: add literalai in the sidebar

* fix: typo

* update (#6160)

* Feat: Add Langtrace integration (#5341)

* Feat: Add Langtrace integration

* add langtrace service name

* fix timestamps for traces

* add tests

* Discard Callback + use existing otel logger

* cleanup

* remove print statements

* remove callback

* add docs

* docs

* add logging docs

* format logging

* remove emoji and add litellm proxy example

* format logging

* format `logging.md`

* add langtrace docs to logging.md

* sync conflict

* docs fix

* (perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165)

* fix move s3 to use customLogger

* add basic s3 logging test

* add s3 to custom logger compatible

* use batch logger for s3

* s3 set flush interval and batch size

* fix s3 logging

* add notes on s3 logging

* fix s3 logging

* add basic s3 logging test

* fix s3 type errors

* add test for sync logging on s3

* fix: fix to debug log

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Ali Waleed <ali@scale3labs.com>
2024-10-11 23:04:36 -07:00
Ishaan Jaff
4e1c892dfc docs fix 2024-10-11 19:32:59 +05:30
Ali Waleed
7ec414a3cf
Feat: Add Langtrace integration (#5341)
* Feat: Add Langtrace integration

* add langtrace service name

* fix timestamps for traces

* add tests

* Discard Callback + use existing otel logger

* cleanup

* remove print statements

* remove callback

* add docs

* docs

* add logging docs

* format logging

* remove emoji and add litellm proxy example

* format logging

* format `logging.md`

* add langtrace docs to logging.md

* sync conflict
2024-10-11 19:19:53 +05:30
yujonglee
42174fde4e
update (#6160) 2024-10-11 19:18:56 +05:30
Willy Douhard
8b00d2a25f
Add literalai in the sidebar observability category (#6163)
* fix: add literalai in the sidebar

* fix: typo
2024-10-11 19:18:47 +05:30
Jacques Verré
4064bfc6dd
[Feat] Observability integration - Opik by Comet (#6062)
* Added Opik logging and evaluation

* Updated doc examples

* Default tags should be [] in case of appending

* WIP

* Work in progress

* Opik integration

* Opik integration

* Revert changes on litellm_logging.py

* Updated Opik integration for synchronous API calls

* Updated Opik documentation

---------

Co-authored-by: Douglas Blank <doug@comet.com>
Co-authored-by: Doug Blank <doug.blank@gmail.com>
2024-10-10 18:27:50 +05:30
Ishaan Jaff
89506053a4
(feat) use regex pattern matching for wildcard routing (#6150)
* use pattern matching for llm deployments

* code quality fix

* fix linting

* add types to PatternMatchRouter

* docs add example config for regex patterns
2024-10-10 18:24:16 +05:30
Krrish Dholakia
60baa65e0e docs(configs.md): add litellm config / s3 bucket object info in configs.md 2024-10-09 09:07:43 -07:00
Ishaan Jaff
b35da5014b doc onboarding orgs 2024-10-09 19:11:36 +05:30
Ishaan Jaff
5da6863804 docs rbac 2024-10-09 16:46:26 +05:30
Ishaan Jaff
399f50d558 fix rbac doc 2024-10-09 16:44:46 +05:30
Ishaan Jaff
0e83a68a69 doc - move rbac under auth 2024-10-09 15:27:32 +05:30
Ishaan Jaff
1fd437e263
(feat proxy) [beta] add support for organization role based access controls (#6112)
* track LiteLLM_OrganizationMembership

* add add_internal_user_to_organization

* add org membership to schema

* read organization membership when reading user info in auth checks

* add check for valid organization_id

* add test for test_create_new_user_in_organization

* test test_create_new_user_in_organization

* add new ADMIN role

* add test for org admins creating teams

* add test for test_org_admin_create_user_permissions

* test_org_admin_create_user_team_wrong_org_permissions

* test_org_admin_create_user_team_wrong_org_permissions

* fix organization_role_based_access_check

* fix getting user members

* fix TeamBase

* fix types used for use role

* fix type checks

* sync prisma schema

* docs - organization admins

* fix use organization_endpoints for /organization management

* add types for org member endpoints

* fix role name for org admin

* add type for member add response

* add organization/member_add

* add error handling for adding members to an org

* add nice doc string for organization/member_add

* fix test_create_new_user_in_organization

* linting fix

* use simple route changes

* fix types

* add organization member roles

* add org admin auth checks

* add auth checks for orgs

* test for creating teams as org admin

* simplify org id usage

* fix typo

* test test_org_admin_create_user_team_wrong_org_permissions

* fix type check issue

* code quality fix

* fix schema.prisma
2024-10-09 15:18:18 +05:30
Ishaan Jaff
d1760b1b04
(fix) clean up root repo - move entrypoint.sh and build_admin_ui to /docker (#6110)
* fix move docker files to docker folders

* move check file length

* fix docker hub deploy

* fix clean up root

* fix circle ci config
2024-10-08 11:34:43 +05:30
Krrish Dholakia
cc960da4b6 docs(azure.md): add o1 model support to config 2024-10-07 22:37:49 -07:00
Krish Dholakia
6729c9ca7f
LiteLLM Minor Fixes & Improvements (10/07/2024) (#6101)
* fix(utils.py): support dropping temperature param for azure o1 models

* fix(main.py): handle azure o1 streaming requests

o1 doesn't support streaming, so fake it to ensure code works as expected
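A sketch of the "fake streaming" idea (illustrative; litellm's real handling lives in main.py): make the normal blocking call, then yield its content as a single chunk so stream-consuming callers keep working.

```python
def fake_stream(completion_fn, **kwargs):
    kwargs.pop("stream", None)          # o1 rejects stream=True
    response = completion_fn(**kwargs)  # one ordinary blocking request
    yield response["choices"][0]["message"]["content"]  # whole reply as one chunk
```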

* feat(utils.py): expose `hosted_vllm/` endpoint, with tool handling for vllm

Fixes https://github.com/BerriAI/litellm/issues/6088

* refactor(internal_user_endpoints.py): cleanup unused params + update docstring

Closes https://github.com/BerriAI/litellm/issues/6100

* fix(main.py): expose custom image generation api support

Fixes https://github.com/BerriAI/litellm/issues/6097

* fix: fix linting errors

* docs(custom_llm_server.md): add docs on custom api for image gen calls

* fix(types/utils.py): handle dict type

* fix(types/utils.py): fix linting errors
2024-10-07 22:17:22 -07:00
Ishaan Jaff
ef815f3a84
(docs) add remaining litellm settings on configs.md doc (#6108)
* docs add litellm settings configs

* docs langfuse tags on config
2024-10-08 07:57:04 +05:30
Ishaan Jaff
2b370f8e9e
(docs) key based callbacks (#6107) 2024-10-08 07:12:01 +05:30