Krish Dholakia | 54ebdbf7ce | 2024-10-16 07:32:06 -07:00
LiteLLM Minor Fixes & Improvements (10/15/2024) (#6242)
* feat(litellm_pre_call_utils.py): support forwarding request headers to backend llm api
* fix(litellm_pre_call_utils.py): handle custom litellm key header
* test(router_code_coverage.py): check if all router functions are directly tested (#6186)
* test(router_code_coverage.py): check if all router functions are directly tested
prevent regressions
* docs(configs.md): document all environment variables (#6185)
* docs: make it easier to find anthropic/openai prompt caching doc
* added codecov yml (#6207)
* fix codecov.yaml
* run ci/cd again
* (refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208)
* use folder for caching
* fix importing caching
* fix clickhouse pyright
* fix linting
* fix correctly pass kwargs and args
* fix test case for embedding
* fix linting
* fix embedding caching logic
* fix: refactor handling in utils.py
* fix test_embedding_caching_azure_individual_items_reordered
* (feat) prometheus: have well-defined latency buckets (#6211)
* fix prometheus to have well-defined latency buckets
* use a well-defined latency bucket
* use types file for prometheus logging
* add test for LATENCY_BUCKETS
* fix prom testing
* fix config.yml
* (refactor caching) use LLMCachingHandler for caching streaming responses (#6210)
* use folder for caching
* fix importing caching
* fix clickhouse pyright
* fix linting
* fix correctly pass kwargs and args
* fix test case for embedding
* fix linting
* fix embedding caching logic
* fix: refactor handling in utils.py
* refactor async set stream cache
* fix linting
* bump (#6187)
* update code cov yaml
* fix config.yml
* add caching component to code cov
* fix config.yml ci/cd
* add coverage for proxy auth
* (refactor caching) use common `_retrieve_from_cache` helper (#6212)
* use folder for caching
* fix importing caching
* fix clickhouse pyright
* fix linting
* fix correctly pass kwargs and args
* fix test case for embedding
* fix linting
* fix embedding caching logic
* fix: refactor handling in utils.py
* refactor async set stream cache
* fix linting
* refactor - use _retrieve_from_cache
* refactor use _convert_cached_result_to_model_response
* fix linting errors
* bump: version 1.49.2 → 1.49.3
* fix code cov components
* test(test_router_helpers.py): add router component unit tests
* test: add additional router tests
* test: add more router testing
* test: add more router testing + more mock functions
* ci(router_code_coverage.py): fix check
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
* bump: version 1.49.3 → 1.49.4
* (refactor) use helper function `_assemble_complete_response_from_streaming_chunks` to assemble complete responses in caching and logging callbacks (#6220)
* (refactor) use _assemble_complete_response_from_streaming_chunks
* add unit test for test_assemble_complete_response_from_streaming_chunks_1
* fix assemble complete_streaming_response
* config add logging_testing
* add logging_coverage in codecov
* test test_assemble_complete_response_from_streaming_chunks_3
* add unit tests for _assemble_complete_response_from_streaming_chunks
* fix remove unused / junk function
* add test for streaming_chunks when error assembling
* (refactor) OTEL - use safe_set_attribute for setting attributes (#6226)
* otel - use safe_set_attribute for setting attributes
* fix OTEL only use safe_set_attribute
* (fix) prompt caching cost calculation OpenAI, Azure OpenAI (#6231)
* fix prompt caching cost calculation
* fix testing for prompt cache cost calc
* fix(allowed_model_region): allow us as allowed region (#6234)
* fix(litellm_pre_call_utils.py): support 'us' region routing + fix header forwarding to filter on `x-` headers
* docs(customer_routing.md): fix region-based routing example
* feat(azure.py): handle empty arguments function call - azure
Closes https://github.com/BerriAI/litellm/issues/6241
* feat(guardrails_ai.py): support guardrails ai integration
Adds support for on-prem guardrails via guardrails ai
* fix(proxy/utils.py): prevent sql injection attack
Fixes https://huntr.com/bounties/a4f6d357-5b44-4e00-9cac-f1cc351211d2
* fix: fix linting errors
* fix(litellm_pre_call_utils.py): don't log litellm api key in proxy server request headers
* fix(litellm_pre_call_utils.py): don't forward stainless headers
* docs(guardrails_ai.md): add guardrails ai quick start to docs
* test: handle flaky test
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Marcus Elwin <marcus@elwin.com>
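
A minimal sketch of the header-forwarding rule described in this commit (forward only `x-` headers, never the litellm key or the stainless SDK headers). The helper name and excluded-header names below are illustrative assumptions, not litellm's actual code:

```python
# Sketch, not litellm's implementation: whitelist filter over incoming
# request headers before they are forwarded to the backend LLM API.

EXCLUDED_PREFIXES = ("x-stainless-",)     # SDK telemetry headers, never forwarded
EXCLUDED_HEADERS = {"x-litellm-api-key"}  # assumed name for the custom litellm key header


def get_forwardable_headers(headers: dict) -> dict:
    """Return only request headers that are safe to forward to the backend."""
    forwarded = {}
    for name, value in headers.items():
        lowered = name.lower()
        if not lowered.startswith("x-"):
            continue  # forward only x- prefixed headers
        if lowered in EXCLUDED_HEADERS or lowered.startswith(EXCLUDED_PREFIXES):
            continue  # never leak auth or stainless headers downstream
        forwarded[name] = value
    return forwarded


print(get_forwardable_headers({
    "Authorization": "Bearer sk-...",  # dropped: not an x- header
    "x-stainless-os": "MacOS",         # dropped: stainless SDK header
    "x-custom-tenant": "acme",         # forwarded
}))
# -> {'x-custom-tenant': 'acme'}
```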
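
The "well-defined latency buckets" change pins the Prometheus histograms to an explicit `LATENCY_BUCKETS` tuple instead of the client default. A sketch with `prometheus_client`; the boundaries and metric name below are illustrative, not the values litellm ships:

```python
import time

from prometheus_client import Histogram

# Illustrative boundaries (seconds); litellm keeps its real LATENCY_BUCKETS
# in a prometheus types file and the values may differ.
LATENCY_BUCKETS = (
    0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5,
    1.0, 2.5, 5.0, 10.0, 30.0, 60.0, float("inf"),
)

request_latency = Histogram(
    "llm_request_latency_seconds",   # hypothetical metric name
    "End-to-end LLM request latency",
    buckets=LATENCY_BUCKETS,
)

with request_latency.time():  # the observed duration lands in a known bucket
    time.sleep(0.05)
```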
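
`_assemble_complete_response_from_streaming_chunks` centralizes the "stitch streamed deltas back into one response" logic shared by the caching and logging callbacks. A standalone sketch of that technique over plain dicts; litellm's helper operates on its own response classes and also handles mid-stream errors:

```python
def assemble_from_streaming_chunks(chunks: list[dict]) -> dict:
    """Stitch chat-completion deltas back into one complete response (sketch)."""
    content_parts: list[str] = []
    finish_reason = None
    for chunk in chunks:
        choice = chunk["choices"][0]
        delta = choice.get("delta", {})
        if delta.get("content"):
            content_parts.append(delta["content"])
        if choice.get("finish_reason"):
            finish_reason = choice["finish_reason"]
    return {
        "choices": [{
            "message": {"role": "assistant", "content": "".join(content_parts)},
            "finish_reason": finish_reason,
        }]
    }


chunks = [
    {"choices": [{"delta": {"content": "Hel"}, "finish_reason": None}]},
    {"choices": [{"delta": {"content": "lo"}, "finish_reason": None}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},
]
print(assemble_from_streaming_chunks(chunks))
# -> assistant message "Hello" with finish_reason "stop"
```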
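
The OTEL change routes every attribute write through `safe_set_attribute`, since OpenTelemetry attribute values must be str/bool/int/float (or sequences of those) and a stray dict or None can make instrumentation raise mid-request. A sketch of the idea, with a stand-in span so it runs without an OTEL SDK:

```python
def safe_set_attribute(span, key: str, value) -> None:
    """Set a span attribute, coercing values OTEL would reject (sketch)."""
    if value is None:
        return  # skip unset values entirely
    if not isinstance(value, (str, bool, int, float)):
        value = str(value)  # coerce dicts, UUIDs, exceptions, etc.
    span.set_attribute(key, value)


class DemoSpan:  # stand-in so the sketch runs without an OTEL SDK
    def set_attribute(self, key, value):
        print(f"{key}={value!r}")


safe_set_attribute(DemoSpan(), "gen_ai.request.model", "gpt-4")
safe_set_attribute(DemoSpan(), "metadata", {"team": "eng"})  # coerced to str
```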
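
The proxy/utils.py fix for the huntr report applies the standard remedy for SQL injection: stop interpolating user input into SQL strings and bind it as a query parameter instead. A generic sketch (sqlite3 so it runs anywhere; the proxy's own data layer differs):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE keys (token TEXT, team TEXT)")
conn.execute("INSERT INTO keys VALUES ('sk-123', 'eng')")

user_input = "eng' OR '1'='1"  # hostile input

# Vulnerable pattern: building the SQL string by hand.
# query = f"SELECT token FROM keys WHERE team = '{user_input}'"  # DON'T

# Fix: bind user input as a parameter; it can never change the query shape.
rows = conn.execute(
    "SELECT token FROM keys WHERE team = ?", (user_input,)
).fetchall()
print(rows)  # [] because the injection attempt matches nothing
```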

Krish Dholakia | fac3b2ee42 | 2024-10-05 17:04:00 -04:00
Add pyright to ci/cd + Fix remaining type-checking errors (#6082)
* fix: fix type-checking errors
* fix: fix additional type-checking errors
* fix: additional type-checking error fixes
* fix: fix additional type-checking errors
* fix: additional type-check fixes
* fix: fix all type-checking errors + add pyright to ci/cd
* fix: fix incorrect import
* ci(config.yml): use mypy on ci/cd
* fix: fix type-checking errors in utils.py
* fix: fix all type-checking errors on main.py
* fix: fix mypy linting errors
* fix(anthropic/cost_calculator.py): fix linting errors
* fix: fix mypy linting errors
* fix: fix linting errors
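
Most of the fixes in a PR like this are the same pyright complaint repeated: an `Optional[...]` value used where a plain type is required. A representative before/after, with illustrative names:

```python
from typing import Optional


def get_api_base(api_base: Optional[str]) -> str:
    # Before: `return api_base` makes pyright report
    # "str | None is not assignable to str".
    # After: narrow the Optional explicitly.
    if api_base is None:
        raise ValueError("api_base must be set")
    return api_base  # pyright now knows this is str
```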

Krrish Dholakia | 6e9f048618 | 2024-07-11 13:18:36 -07:00
fix: move to using pydantic obj for setting values
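
Moving from raw dicts to a pydantic object means values are validated at assignment time instead of failing later. A generic sketch of the pattern; the model and field names are illustrative, not litellm's:

```python
from pydantic import BaseModel, ValidationError


class TeamSettings(BaseModel):  # illustrative model, not litellm's
    max_budget: float = 0.0
    tpm_limit: int = 0


settings = TeamSettings(max_budget=10.0, tpm_limit=1000)
print(settings.max_budget)  # 10.0

try:
    TeamSettings(max_budget="not-a-number")  # a raw dict would accept this silently
except ValidationError as err:
    print(len(err.errors()), "validation error")
```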

Krrish Dholakia | 8fa2cf15ee | 2024-07-11 09:34:46 -07:00
fix(watsonx.py): fix watsonx process response
Fixes https://github.com/BerriAI/litellm/issues/4654

Krrish Dholakia | bb905d7243 | 2024-07-08 07:36:41 -07:00
fix(utils.py): support 'drop_params' for 'parallel_tool_calls'
Closes https://github.com/BerriAI/litellm/issues/4584
OpenAI-only param
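
Since `parallel_tool_calls` is OpenAI-only, enabling `drop_params` lets litellm strip it before calling providers that reject it. A usage sketch (model name illustrative; real credentials needed to actually run it):

```python
import litellm

litellm.drop_params = True  # drop params the target provider doesn't support

# parallel_tool_calls is OpenAI-only; with drop_params set, litellm removes
# it for other providers instead of raising an unsupported-param error.
response = litellm.completion(
    model="claude-3-sonnet-20240229",  # illustrative non-OpenAI model
    messages=[{"role": "user", "content": "hi"}],
    parallel_tool_calls=False,
)
```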

Simon Sanchez Viloria | 8caec84cc2 | 2024-07-07 18:55:42 +02:00
(fix - watsonx.py) Added missing imports, moved/fixed scope of process_response

Simon Sanchez Viloria | 06e6f52358 | 2024-07-07 17:59:37 +02:00
(fix - watsonx) Fixed issues with watsonx embedding/async endpoints

Krish Dholakia | 6a5cf7f83a | 2024-07-05 11:40:59 -07:00
Revert "(fix) fixed bug with the watsonx embedding endpoint"

Simon Sanchez Viloria | e4014d3e0e | 2024-07-04 14:26:17 +02:00
fix linting error

Simon Sanchez Viloria | 574de1d7fd | 2024-07-04 14:12:14 +02:00
(fix) fixed watsonx embedding returning a coroutine
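
"Embedding returning a coroutine" is the classic symptom of calling an async function from a sync path without running it. A minimal reproduction of the bug class and the fix:

```python
import asyncio


async def aembedding(text: str) -> list[float]:
    await asyncio.sleep(0)  # stand-in for the real async HTTP call
    return [0.1, 0.2]


def embedding_buggy(text: str):
    return aembedding(text)  # BUG: returns a coroutine object, not the vector


def embedding_fixed(text: str) -> list[float]:
    return asyncio.run(aembedding(text))  # run the coroutine to completion


print(embedding_buggy("hi"))  # <coroutine object ...>, the reported symptom
print(embedding_fixed("hi"))  # [0.1, 0.2]
```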

Krish Dholakia | 86d0c0ae4e | 2024-05-11 11:36:22 -07:00
Merge pull request #3582 from BerriAI/litellm_explicit_region_name_setting
feat(router.py): allow setting model_region in litellm_params
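
With a region set in `litellm_params`, the router can filter deployments by region before picking one. A config sketch in Python; the `region_name` key and `enable_pre_call_checks` flag follow litellm's docs but should be treated as assumptions here:

```python
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-4",
            "litellm_params": {
                "model": "azure/gpt-4-eu",
                "api_base": "https://my-eu-endpoint.openai.azure.com",
                "api_key": "...",
                "region_name": "eu",
            },
        },
        {
            "model_name": "gpt-4",
            "litellm_params": {
                "model": "azure/gpt-4-us",
                "api_base": "https://my-us-endpoint.openai.azure.com",
                "api_key": "...",
                "region_name": "us",  # 'us' later allowed as a region, see #6234 above
            },
        },
    ],
    enable_pre_call_checks=True,  # assumed flag enabling region-aware filtering
)
```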

Krrish Dholakia | 6714854bb7 | 2024-05-11 11:04:00 -07:00
feat(router.py): support region routing for bedrock, vertex ai, watsonx

Simon Sanchez Viloria | e1372de9ee | 2024-05-10 12:09:09 +02:00
Merge branch 'main' into feature/watsonx-integration

Simon Sanchez Viloria | 170fd11c82 | 2024-05-10 11:53:33 +02:00
(fix) watsonx.py: fixed linting errors and made sure stream chunks always return usage

Krish Dholakia | 8015bc1c47 | 2024-05-09 07:44:15 -07:00
Revert "Add support for async streaming to watsonx provider "

Krish Dholakia | 3f13251241 | 2024-05-08 22:19:05 -07:00
Merge pull request #3479 from simonsanvil/feature/watsonx-integration
Add support for async streaming to watsonx provider

Krrish Dholakia | 6575143460 | 2024-05-08 16:00:08 -07:00
feat(proxy_server.py): return litellm version in response headers
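
With the proxy returning its version in response headers, clients can detect the running build. A client-side sketch; the `x-litellm-version` header name is an assumption based on the commit title, not a documented contract:

```python
import httpx

# Assumes a litellm proxy running locally with a valid key.
resp = httpx.get(
    "http://localhost:4000/health",
    headers={"Authorization": "Bearer sk-1234"},
)
print(resp.headers.get("x-litellm-version"))  # assumed header name
```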

Simon Sanchez Viloria | 6181d1eaad | 2024-05-06 17:27:14 +02:00
Merge branch 'main' into feature/watsonx-integration

Simon Sanchez Viloria | 83a274b54b | 2024-05-06 17:08:40 +02:00
(feat) support for async streaming to watsonx provider

Krrish Dholakia | 48f19cf839 | 2024-04-27 11:06:18 -07:00
feat(utils.py): unify common auth params across azure/vertex_ai/bedrock/watsonx
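
Unifying common auth params means one neutral credential set (key, base, region) gets translated into each provider's expected kwargs, instead of every provider parsing its own. A standalone sketch of the mapping idea; all names below are hypothetical:

```python
# Hypothetical mapping from neutral credential keys to per-provider kwargs.
PROVIDER_PARAM_MAP = {
    "azure":   {"api_key": "api_key", "api_base": "api_base"},
    "bedrock": {"api_key": "aws_access_key_id", "region": "aws_region_name"},
    "watsonx": {"api_key": "apikey", "api_base": "url", "region": "region_name"},
}


def to_provider_kwargs(provider: str, api_key: str, api_base: str = "", region: str = "") -> dict:
    common = {"api_key": api_key, "api_base": api_base, "region": region}
    mapping = PROVIDER_PARAM_MAP[provider]
    return {dst: common[src] for src, dst in mapping.items() if common[src]}


print(to_provider_kwargs("watsonx", api_key="abc", api_base="https://eu-de.ml.cloud.ibm.com"))
# -> {'apikey': 'abc', 'url': 'https://eu-de.ml.cloud.ibm.com'}
```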

Krrish Dholakia | c9d7437d16 | 2024-04-27 10:15:27 -07:00
fix(watsonx.py): use common litellm params for api key, api base, etc.

Simon Sanchez Viloria | 777b4b2bbc | 2024-04-24 12:55:25 +02:00
(feat) make manage_response work with request.request instead of httpx.Request

Simon Sanchez Viloria | e64aceea91 | 2024-04-23 12:16:04 +02:00
(feat) Update WatsonX credentials and variable names

Simon Sanchez Viloria | 74d2ba0a23 | 2024-04-23 12:01:13 +02:00
feat - watsonx refactoring, removed dependency, and added support for embedding calls

Simon Sanchez Viloria | 6edb133733 | 2024-04-20 20:06:46 +02:00
Added support for IBM watsonx.ai models
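
A usage sketch for the new provider. The env var names follow litellm's watsonx docs and the model id is illustrative; a real watsonx.ai project is needed to run this:

```python
import os

import litellm

# Credential env vars per litellm's watsonx docs (treat names as assumptions).
os.environ["WATSONX_URL"] = "https://us-south.ml.cloud.ibm.com"
os.environ["WATSONX_APIKEY"] = "..."
os.environ["WATSONX_PROJECT_ID"] = "..."

response = litellm.completion(
    model="watsonx/ibm/granite-13b-chat-v2",  # illustrative model id
    messages=[{"role": "user", "content": "Hello from watsonx"}],
)
print(response.choices[0].message.content)
```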