Commit graph

18075 commits

Author SHA1 Message Date
Ishaan Jaff
cf8c76d24b fix RerankResponse make meta optional (#6248) 2024-10-16 11:47:44 +05:30
Ishaan Jaff
a278c7600e bump: version 1.49.4 → 1.49.5 2024-10-16 11:46:33 +05:30
Ishaan Jaff
4eea0652eb (refactor) caching - use _sync_set_cache (#6224)
* caching - use _sync_set_cache

* add sync _sync_add_streaming_response_to_cache

* use caching class for cache storage
2024-10-16 10:38:07 +05:30
dependabot[bot]
a04fc1a921 Bump hono from 4.5.8 to 4.6.5 in /litellm-js/spend-logs (#6245)
Bumps [hono](https://github.com/honojs/hono) from 4.5.8 to 4.6.5.
- [Release notes](https://github.com/honojs/hono/releases)
- [Commits](https://github.com/honojs/hono/compare/v4.5.8...v4.6.5)

---
updated-dependencies:
- dependency-name: hono
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-16 10:37:31 +05:30
Ishaan Jaff
faa1fd07f8 fix arize handle optional params (#6243) 2024-10-16 08:33:40 +05:30
Ishaan Jaff
c25733e28e (fix) prompt caching cost calculation OpenAI, Azure OpenAI (#6231)
* fix prompt caching cost calculation

* fix testing for prompt cache cost calc
2024-10-15 18:55:31 +05:30
Ishaan Jaff
700a87204a (refactor) OTEL - use safe_set_attribute for setting attributes (#6226)
* otel - use safe_set_attribute for setting attributes

* fix OTEL only use safe_set_attribute
2024-10-15 13:39:29 +05:30
Ishaan Jaff
29ac8b1de9 (refactor) use helper function _assemble_complete_response_from_streaming_chunks to assemble complete responses in caching and logging callbacks (#6220)
* (refactor) use _assemble_complete_response_from_streaming_chunks

* add unit test for test_assemble_complete_response_from_streaming_chunks_1

* fix assemble complete_streaming_response

* config add logging_testing

* add logging_coverage in codecov

* test test_assemble_complete_response_from_streaming_chunks_3

* add unit tests for _assemble_complete_response_from_streaming_chunks

* fix remove unused / junk function

* add test for streaming_chunks when error assembling
2024-10-15 12:45:12 +05:30
Krrish Dholakia
d6e0e1ac2e bump: version 1.49.3 → 1.49.4 2024-10-14 22:44:14 -07:00
Krish Dholakia
8705f1af92 test(router_code_coverage.py): check if all router functions are dire… (#6186)
* test(router_code_coverage.py): check if all router functions are directly tested

prevent regressions

* docs(configs.md): document all environment variables (#6185)

* docs: make it easier to find anthropic/openai prompt caching doc

* added codecov yml (#6207)

* fix codecov.yaml

* run ci/cd again

* (refactor) caching use LLMCachingHandler for async_get_cache and set_cache  (#6208)

* use folder for caching

* fix importing caching

* fix clickhouse pyright

* fix linting

* fix correctly pass kwargs and args

* fix test case for embedding

* fix linting

* fix embedding caching logic

* fix refactor handle utils.py

* fix test_embedding_caching_azure_individual_items_reordered

* (feat) prometheus have well defined latency buckets (#6211)

* fix prometheus have well defined latency buckets

* use a well-defined latency bucket

* use types file for prometheus logging

* add test for LATENCY_BUCKETS

* fix prom testing

* fix config.yml

* (refactor caching) use LLMCachingHandler for caching streaming responses  (#6210)

* use folder for caching

* fix importing caching

* fix clickhouse pyright

* fix linting

* fix correctly pass kwargs and args

* fix test case for embedding

* fix linting

* fix embedding caching logic

* fix refactor handle utils.py

* refactor async set stream cache

* fix linting

* bump (#6187)

* update code cov yaml

* fix config.yml

* add caching component to code cov

* fix config.yml ci/cd

* add coverage for proxy auth

* (refactor caching) use common `_retrieve_from_cache` helper  (#6212)

* use folder for caching

* fix importing caching

* fix clickhouse pyright

* fix linting

* fix correctly pass kwargs and args

* fix test case for embedding

* fix linting

* fix embedding caching logic

* fix refactor handle utils.py

* refactor async set stream cache

* fix linting

* refactor - use _retrieve_from_cache

* refactor use _convert_cached_result_to_model_response

* fix linting errors

* bump: version 1.49.2 → 1.49.3

* fix code cov components

* test(test_router_helpers.py): add router component unit tests

* test: add additional router tests

* test: add more router testing

* test: add more router testing + more mock functions

* ci(router_code_coverage.py): fix check

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
2024-10-14 22:44:00 -07:00
Krish Dholakia
bcd1a52834 Litellm dev 10 14 2024 (#6221)
* fix(__init__.py): expose DualCache, RedisCache, InMemoryCache on root

abstract internal file refactors from impacting users

* feat(utils.py): handle invalid openai parallel tool calling response

Fixes https://community.openai.com/t/model-tries-to-call-unknown-function-multi-tool-use-parallel/490653

* docs(bedrock.md): clarify all bedrock models are supported

Closes https://github.com/BerriAI/litellm/issues/6168#issuecomment-2412082236
2024-10-14 22:11:14 -07:00
Ishaan Jaff
2296e0d363 fix importing Cache from litellm (#6219) 2024-10-15 08:47:23 +05:30
Ishaan Jaff
ece65164fb (refactor router.py ) - PR 3 - Ensure all functions under 100 lines (#6181)
* add flake 8 check

* split up litellm _acompletion

* fix get model client

* refactor use common func to add metadata to kwargs

* use common func to get timeout

* re-use helper to _get_async_model_client

* use _handle_mock_testing_rate_limit_error

* fix docstring for _handle_mock_testing_rate_limit_error

* fix function_with_retries

* use helper for mock testing fallbacks

* router - use 1 func for simple_shuffle

* add doc string for simple_shuffle

* use 1 function for filtering cooldown deployments

* fix use common helper to _get_fallback_model_group_from_fallbacks
2024-10-14 21:27:54 +05:30
Ishaan Jaff
a2844c851b fix code cov components 2024-10-14 20:35:41 +05:30
Ishaan Jaff
17f09257a8 bump: version 1.49.2 → 1.49.3 2024-10-14 19:13:01 +05:30
Ishaan Jaff
7a8934127e (refactor caching) use common _retrieve_from_cache helper (#6212)
* use folder for caching

* fix importing caching

* fix clickhouse pyright

* fix linting

* fix correctly pass kwargs and args

* fix test case for embedding

* fix linting

* fix embedding caching logic

* fix refactor handle utils.py

* refactor async set stream cache

* fix linting

* refactor - use _retrieve_from_cache

* refactor use _convert_cached_result_to_model_response

* fix linting errors
2024-10-14 19:12:41 +05:30
Ishaan Jaff
8c48d1215b add coverage for proxy auth 2024-10-14 18:50:10 +05:30
Ishaan Jaff
e62d827f6b fix config.yml ci/cd 2024-10-14 18:42:31 +05:30
Ishaan Jaff
4a2ae50a69 add caching component to code cov 2024-10-14 18:40:24 +05:30
Ishaan Jaff
b3a062ce07 fix config.yml 2024-10-14 18:38:07 +05:30
Ishaan Jaff
80b113366b update code cov yaml 2024-10-14 18:31:22 +05:30
yujonglee
d298c8fdd0 bump (#6187) 2024-10-14 18:22:54 +05:30
Ishaan Jaff
046c6db99b (refactor caching) use LLMCachingHandler for caching streaming responses (#6210)
* use folder for caching

* fix importing caching

* fix clickhouse pyright

* fix linting

* fix correctly pass kwargs and args

* fix test case for embedding

* fix linting

* fix embedding caching logic

* fix refactor handle utils.py

* refactor async set stream cache

* fix linting
2024-10-14 17:46:45 +05:30
Ishaan Jaff
779d4cebfe fix config.yml 2024-10-14 17:28:52 +05:30
Ishaan Jaff
2e5debbfd2 fix prom testing 2024-10-14 17:23:40 +05:30
Ishaan Jaff
1ee5194e03 (feat) prometheus have well defined latency buckets (#6211)
* fix prometheus have well defined latency buckets

* use a well-defined latency bucket

* use types file for prometheus logging

* add test for LATENCY_BUCKETS
2024-10-14 17:16:01 +05:30
Ishaan Jaff
ba56e37244 (refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208)
* use folder for caching

* fix importing caching

* fix clickhouse pyright

* fix linting

* fix correctly pass kwargs and args

* fix test case for embedding

* fix linting

* fix embedding caching logic

* fix refactor handle utils.py

* fix test_embedding_caching_azure_individual_items_reordered
2024-10-14 16:34:01 +05:30
Ishaan Jaff
49f6a7660b run ci/cd again 2024-10-14 11:50:42 +05:30
Ishaan Jaff
dd81840d88 fix codecov.yaml 2024-10-14 11:40:51 +05:30
Ishaan Jaff
c1856d2ab4 added codecov yml (#6207) 2024-10-14 11:38:43 +05:30
Krrish Dholakia
19a4973f4b docs: make it easier to find anthropic/openai prompt caching doc 2024-10-13 18:34:13 -07:00
Krish Dholakia
15a0c90a7a docs(configs.md): document all environment variables (#6185) 2024-10-13 09:57:03 -07:00
Krish Dholakia
aef5e9ad30 ci(config.yml): add further testing coverage to codecov (#6184)
Improve visibility into current testing
2024-10-12 22:37:46 -07:00
Krish Dholakia
a76aa256bb ci(config.yml): add local_testing tests to codecov coverage check (#6183)
* ci(config.yml): add local_testing tests to codecov coverage check

* fix(config.yml): wait for local_testing to complete before uploading coverage
2024-10-12 18:51:40 -07:00
Krrish Dholakia
b99dd96e1d ci(config.yml): run all router tests 2024-10-12 16:01:57 -07:00
Krish Dholakia
d7abcc0d54 build(config.yml): add codecov to repo (#6172)
* build(config.yml): add codecov to repo

ensures all commits have testing coverage

* build(config.yml): fix ci config

* build: fix .yml

* build(config.yml): fix ci/cd

* ci(config.yml): specify module to measure code coverage for

* ci(config.yml): update config.yml version

* ci: trigger new run

* ci(config.yml): store combine

* build(config.yml): check files before combine

* ci(config.yml): fix check

* ci(config.yml): add codecov coverage to ci/cd

* ci(config.yml): add codecov to router tests

* ci(config.yml): wait for router testing to complete before running codecov upload

* ci(config.yml): handle multiple coverage.xml's

* fix(router.py): cleanup print stack

* ci(config.yml): fix config

* ci(config.yml): fix config
2024-10-12 14:48:17 -07:00
Krish Dholakia
85dc4873ed Litellm Minor Fixes & Improvements (10/12/2024) (#6179)
* build(model_prices_and_context_window.json): add bedrock llama3.2 pricing

* build(model_prices_and_context_window.json): add bedrock cross region inference pricing

* Revert "(perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165)"

This reverts commit 2a5624af47.

* add azure/gpt-4o-2024-05-13 (#6174)

* LiteLLM Minor Fixes & Improvements (10/10/2024)  (#6158)

* refactor(vertex_ai_partner_models/anthropic): refactor anthropic to use partner model logic

* fix(vertex_ai/): support passing custom api base to partner models

Fixes https://github.com/BerriAI/litellm/issues/4317

* fix(proxy_server.py): Fix prometheus premium user check logic

* docs(prometheus.md): update quick start docs

* fix(custom_llm.py): support passing dynamic api key + api base

* fix(realtime_api/main.py): Add request/response logging for realtime api endpoints

Closes https://github.com/BerriAI/litellm/issues/6081

* feat(openai/realtime): add openai realtime api logging

Closes https://github.com/BerriAI/litellm/issues/6081

* fix(realtime_streaming.py): fix linting errors

* fix(realtime_streaming.py): fix linting errors

* fix: fix linting errors

* fix pattern match router

* Add literalai in the sidebar observability category (#6163)

* fix: add literalai in the sidebar

* fix: typo

* update (#6160)

* Feat: Add Langtrace integration (#5341)

* Feat: Add Langtrace integration

* add langtrace service name

* fix timestamps for traces

* add tests

* Discard Callback + use existing otel logger

* cleanup

* remove print statements

* remove callback

* add docs

* docs

* add logging docs

* format logging

* remove emoji and add litellm proxy example

* format logging

* format `logging.md`

* add langtrace docs to logging.md

* sync conflict

* docs fix

* (perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165)

* fix move s3 to use customLogger

* add basic s3 logging test

* add s3 to custom logger compatible

* use batch logger for s3

* s3 set flush interval and batch size

* fix s3 logging

* add notes on s3 logging

* fix s3 logging

* add basic s3 logging test

* fix s3 type errors

* add test for sync logging on s3

* fix: fix to debug log

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Ali Waleed <ali@scale3labs.com>

* docs(custom_llm_server.md): update doc on passing custom params

* fix(pass_through_endpoints.py): don't require headers

Fixes https://github.com/BerriAI/litellm/issues/6128

* feat(utils.py): add support for caching rerank endpoints

Closes https://github.com/BerriAI/litellm/issues/6144

* feat(litellm_logging.py): add response headers for failed requests

Closes https://github.com/BerriAI/litellm/issues/6159

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Ali Waleed <ali@scale3labs.com>
2024-10-12 11:48:34 -07:00
Ishaan Jaff
1d0b5647e8 bump: version 1.49.1 → 1.49.2 2024-10-12 16:02:17 +05:30
Ishaan Jaff
048a46c5da (fix) provider wildcard routing - when models specified without provider prefix (#6173)
* fix wildcard routing scenario

* fix pattern matching hits
2024-10-12 16:01:21 +05:30
Ishaan Jaff
7faac40aec (fix) batch_completion fails with bedrock due to extraneous [max_workers] key (#6176)
* fix batch_completion

* fix import batch completion

* fix batch completion usage
2024-10-12 14:10:24 +05:30
Krish Dholakia
17fa7c17ec LiteLLM Minor Fixes & Improvements (10/10/2024) (#6158)
* refactor(vertex_ai_partner_models/anthropic): refactor anthropic to use partner model logic

* fix(vertex_ai/): support passing custom api base to partner models

Fixes https://github.com/BerriAI/litellm/issues/4317

* fix(proxy_server.py): Fix prometheus premium user check logic

* docs(prometheus.md): update quick start docs

* fix(custom_llm.py): support passing dynamic api key + api base

* fix(realtime_api/main.py): Add request/response logging for realtime api endpoints

Closes https://github.com/BerriAI/litellm/issues/6081

* feat(openai/realtime): add openai realtime api logging

Closes https://github.com/BerriAI/litellm/issues/6081

* fix(realtime_streaming.py): fix linting errors

* fix(realtime_streaming.py): fix linting errors

* fix: fix linting errors

* fix pattern match router

* Add literalai in the sidebar observability category (#6163)

* fix: add literalai in the sidebar

* fix: typo

* update (#6160)

* Feat: Add Langtrace integration (#5341)

* Feat: Add Langtrace integration

* add langtrace service name

* fix timestamps for traces

* add tests

* Discard Callback + use existing otel logger

* cleanup

* remove print statements

* remove callback

* add docs

* docs

* add logging docs

* format logging

* remove emoji and add litellm proxy example

* format logging

* format `logging.md`

* add langtrace docs to logging.md

* sync conflict

* docs fix

* (perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165)

* fix move s3 to use customLogger

* add basic s3 logging test

* add s3 to custom logger compatible

* use batch logger for s3

* s3 set flush interval and batch size

* fix s3 logging

* add notes on s3 logging

* fix s3 logging

* add basic s3 logging test

* fix s3 type errors

* add test for sync logging on s3

* fix: fix to debug log

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Ali Waleed <ali@scale3labs.com>
2024-10-11 23:04:36 -07:00
Ishaan Jaff
6afb3e4bf5 add azure/gpt-4o-2024-05-13 (#6174) 2024-10-12 10:47:45 +05:30
Ishaan Jaff
78110c008d Revert "(perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165)"
This reverts commit 2a5624af47.
2024-10-12 07:08:30 +05:30
Ishaan Jaff
2e1cd56cb3 (perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165)
* fix move s3 to use customLogger

* add basic s3 logging test

* add s3 to custom logger compatible

* use batch logger for s3

* s3 set flush interval and batch size

* fix s3 logging

* add notes on s3 logging

* fix s3 logging

* add basic s3 logging test

* fix s3 type errors

* add test for sync logging on s3
2024-10-11 19:49:03 +05:30
Ishaan Jaff
99d3ee0434 docs fix 2024-10-11 19:32:59 +05:30
Ali Waleed
f3a24d22d5 Feat: Add Langtrace integration (#5341)
* Feat: Add Langtrace integration

* add langtrace service name

* fix timestamps for traces

* add tests

* Discard Callback + use existing otel logger

* cleanup

* remove print statements

* remove callback

* add docs

* docs

* add logging docs

* format logging

* remove emoji and add litellm proxy example

* format logging

* format `logging.md`

* add langtrace docs to logging.md

* sync conflict
2024-10-11 19:19:53 +05:30
yujonglee
ed7cedeb5b update (#6160) 2024-10-11 19:18:56 +05:30
Willy Douhard
2fc2348a30 Add literalai in the sidebar observability category (#6163)
* fix: add literalai in the sidebar

* fix: typo
2024-10-11 19:18:47 +05:30
Ishaan Jaff
4aaffc6276 fix pattern match router 2024-10-11 12:12:57 +05:30
Ishaan Jaff
5b497c64db bump: version 1.49.0 → 1.49.1 2024-10-11 00:14:03 +05:30