Ishaan Jaff
a69c670baa
(refactor) use helper function _assemble_complete_response_from_streaming_chunks
to assemble complete responses in caching and logging callbacks ( #6220 )
...
* (refactor) use _assemble_complete_response_from_streaming_chunks
* add unit test for test_assemble_complete_response_from_streaming_chunks_1
* fix assemble complete_streaming_response
* config add logging_testing
* add logging_coverage in codecov
* test test_assemble_complete_response_from_streaming_chunks_3
* add unit tests for _assemble_complete_response_from_streaming_chunks
* fix remove unused / junk function
* add test for streaming_chunks when error assembling
2024-10-15 12:45:12 +05:30
Krish Dholakia
1eb435e50a
test(router_code_coverage.py): check if all router functions are dire… ( #6186 )
...
* test(router_code_coverage.py): check if all router functions are directly tested
prevent regressions
* docs(configs.md): document all environment variables (#6185 )
* docs: make it easier to find anthropic/openai prompt caching doc
* aded codecov yml (#6207 )
* fix codecov.yaml
* run ci/cd again
* (refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208 )
* use folder for caching
* fix importing caching
* fix clickhouse pyright
* fix linting
* fix correctly pass kwargs and args
* fix test case for embedding
* fix linting
* fix embedding caching logic
* fix refactor handle utils.py
* fix test_embedding_caching_azure_individual_items_reordered
* (feat) prometheus have well defined latency buckets (#6211 )
* fix prometheus have well defined latency buckets
* use a well define latency bucket
* use types file for prometheus logging
* add test for LATENCY_BUCKETS
* fix prom testing
* fix config.yml
* (refactor caching) use LLMCachingHandler for caching streaming responses (#6210 )
* use folder for caching
* fix importing caching
* fix clickhouse pyright
* fix linting
* fix correctly pass kwargs and args
* fix test case for embedding
* fix linting
* fix embedding caching logic
* fix refactor handle utils.py
* refactor async set stream cache
* fix linting
* bump (#6187 )
* update code cov yaml
* fix config.yml
* add caching component to code cov
* fix config.yml ci/cd
* add coverage for proxy auth
* (refactor caching) use common `_retrieve_from_cache` helper (#6212 )
* use folder for caching
* fix importing caching
* fix clickhouse pyright
* fix linting
* fix correctly pass kwargs and args
* fix test case for embedding
* fix linting
* fix embedding caching logic
* fix refactor handle utils.py
* refactor async set stream cache
* fix linting
* refactor - use _retrieve_from_cache
* refactor use _convert_cached_result_to_model_response
* fix linting errors
* bump: version 1.49.2 → 1.49.3
* fix code cov components
* test(test_router_helpers.py): add router component unit tests
* test: add additional router tests
* test: add more router testing
* test: add more router testing + more mock functions
* ci(router_code_coverage.py): fix check
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
2024-10-14 22:44:00 -07:00
Krish Dholakia
39486e2003
Litellm dev 10 14 2024 ( #6221 )
...
* fix(__init__.py): expose DualCache, RedisCache, InMemoryCache on root
abstract internal file refactors from impacting users
* feat(utils.py): handle invalid openai parallel tool calling response
Fixes https://community.openai.com/t/model-tries-to-call-unknown-function-multi-tool-use-parallel/490653
* docs(bedrock.md): clarify all bedrock models are supported
Closes https://github.com/BerriAI/litellm/issues/6168#issuecomment-2412082236
2024-10-14 22:11:14 -07:00
Ishaan Jaff
cda0a993e2
fix importing Cache from litellm ( #6219 )
2024-10-15 08:47:23 +05:30
Ishaan Jaff
d0a3052937
(refactor router.py ) - PR 3 - Ensure all functions under 100 lines ( #6181 )
...
* add flake 8 check
* split up litellm _acompletion
* fix get model client
* refactor use commong func to add metadata to kwargs
* use common func to get timeout
* re-use helper to _get_async_model_client
* use _handle_mock_testing_rate_limit_error
* fix docstring for _handle_mock_testing_rate_limit_error
* fix function_with_retries
* use helper for mock testing fallbacks
* router - use 1 func for simple_shuffle
* add doc string for simple_shuffle
* use 1 function for filtering cooldown deployments
* fix use common helper to _get_fallback_model_group_from_fallbacks
2024-10-14 21:27:54 +05:30
Ishaan Jaff
603299e3c8
(feat) prometheus have well defined latency buckets ( #6211 )
...
* fix prometheus have well defined latency buckets
* use a well define latency bucket
* use types file for prometheus logging
* add test for LATENCY_BUCKETS
2024-10-14 17:16:01 +05:30
Ishaan Jaff
4d1b4beb3d
(refactor) caching use LLMCachingHandler for async_get_cache and set_cache ( #6208 )
...
* use folder for caching
* fix importing caching
* fix clickhouse pyright
* fix linting
* fix correctly pass kwargs and args
* fix test case for embedding
* fix linting
* fix embedding caching logic
* fix refactor handle utils.py
* fix test_embedding_caching_azure_individual_items_reordered
2024-10-14 16:34:01 +05:30
Ishaan Jaff
20e50d7002
run ci/cd again
2024-10-14 11:50:42 +05:30
Krish Dholakia
15b44c3221
docs(configs.md): document all environment variables ( #6185 )
2024-10-13 09:57:03 -07:00
Krish Dholakia
fc13c023b7
build(config.yml): add codecov to repo ( #6172 )
...
* build(config.yml): add codecov to repo
ensures all commits have testing coverage
* build(config.yml): fix ci config
* build: fix .yml
* build(config.yml): fix ci/cd
* ci(config.yml): specify module to measure code coverage for
* ci(config.yml): update config.yml version
* ci: trigger new run
* ci(config.yml): store combine
* build(config.yml): check files before combine
* ci(config.yml): fix check
* ci(config.yml): add codecov coverage to ci/cd
* ci(config.yml): add codecov to router tests
* ci(config.yml): wait for router testing to complete before running codecov upload
* ci(config.yml): handle multiple coverage.xml's
* fix(router.py): cleanup print stack
* ci(config.yml): fix config
* ci(config.yml): fix config
2024-10-12 14:48:17 -07:00
Krish Dholakia
2acb0c0675
Litellm Minor Fixes & Improvements (10/12/2024) ( #6179 )
...
* build(model_prices_and_context_window.json): add bedrock llama3.2 pricing
* build(model_prices_and_context_window.json): add bedrock cross region inference pricing
* Revert "(perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165 )"
This reverts commit 2a5624af47
.
* add azure/gpt-4o-2024-05-13 (#6174 )
* LiteLLM Minor Fixes & Improvements (10/10/2024) (#6158 )
* refactor(vertex_ai_partner_models/anthropic): refactor anthropic to use partner model logic
* fix(vertex_ai/): support passing custom api base to partner models
Fixes https://github.com/BerriAI/litellm/issues/4317
* fix(proxy_server.py): Fix prometheus premium user check logic
* docs(prometheus.md): update quick start docs
* fix(custom_llm.py): support passing dynamic api key + api base
* fix(realtime_api/main.py): Add request/response logging for realtime api endpoints
Closes https://github.com/BerriAI/litellm/issues/6081
* feat(openai/realtime): add openai realtime api logging
Closes https://github.com/BerriAI/litellm/issues/6081
* fix(realtime_streaming.py): fix linting errors
* fix(realtime_streaming.py): fix linting errors
* fix: fix linting errors
* fix pattern match router
* Add literalai in the sidebar observability category (#6163 )
* fix: add literalai in the sidebar
* fix: typo
* update (#6160 )
* Feat: Add Langtrace integration (#5341 )
* Feat: Add Langtrace integration
* add langtrace service name
* fix timestamps for traces
* add tests
* Discard Callback + use existing otel logger
* cleanup
* remove print statments
* remove callback
* add docs
* docs
* add logging docs
* format logging
* remove emoji and add litellm proxy example
* format logging
* format `logging.md`
* add langtrace docs to logging.md
* sync conflict
* docs fix
* (perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165 )
* fix move s3 to use customLogger
* add basic s3 logging test
* add s3 to custom logger compatible
* use batch logger for s3
* s3 set flush interval and batch size
* fix s3 logging
* add notes on s3 logging
* fix s3 logging
* add basic s3 logging test
* fix s3 type errors
* add test for sync logging on s3
* fix: fix to debug log
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Ali Waleed <ali@scale3labs.com>
* docs(custom_llm_server.md): update doc on passing custom params
* fix(pass_through_endpoints.py): don't require headers
Fixes https://github.com/BerriAI/litellm/issues/6128
* feat(utils.py): add support for caching rerank endpoints
Closes https://github.com/BerriAI/litellm/issues/6144
* feat(litellm_logging.py'): add response headers for failed requests
Closes https://github.com/BerriAI/litellm/issues/6159
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Ali Waleed <ali@scale3labs.com>
2024-10-12 11:48:34 -07:00
Ishaan Jaff
80ecf0829c
(fix) provider wildcard routing - when models specificed without provider prefix ( #6173 )
...
* fix wildcard routing scenario
* fix pattern matching hits
2024-10-12 16:01:21 +05:30
Ishaan Jaff
b032e898c2
(fix) batch_completion fails with bedrock due to extraneous [max_workers] key ( #6176 )
...
* fix batch_completion
* fix import batch completion
* fix batch completion usage
2024-10-12 14:10:24 +05:30
Krish Dholakia
11f9df923a
LiteLLM Minor Fixes & Improvements (10/10/2024) ( #6158 )
...
* refactor(vertex_ai_partner_models/anthropic): refactor anthropic to use partner model logic
* fix(vertex_ai/): support passing custom api base to partner models
Fixes https://github.com/BerriAI/litellm/issues/4317
* fix(proxy_server.py): Fix prometheus premium user check logic
* docs(prometheus.md): update quick start docs
* fix(custom_llm.py): support passing dynamic api key + api base
* fix(realtime_api/main.py): Add request/response logging for realtime api endpoints
Closes https://github.com/BerriAI/litellm/issues/6081
* feat(openai/realtime): add openai realtime api logging
Closes https://github.com/BerriAI/litellm/issues/6081
* fix(realtime_streaming.py): fix linting errors
* fix(realtime_streaming.py): fix linting errors
* fix: fix linting errors
* fix pattern match router
* Add literalai in the sidebar observability category (#6163 )
* fix: add literalai in the sidebar
* fix: typo
* update (#6160 )
* Feat: Add Langtrace integration (#5341 )
* Feat: Add Langtrace integration
* add langtrace service name
* fix timestamps for traces
* add tests
* Discard Callback + use existing otel logger
* cleanup
* remove print statments
* remove callback
* add docs
* docs
* add logging docs
* format logging
* remove emoji and add litellm proxy example
* format logging
* format `logging.md`
* add langtrace docs to logging.md
* sync conflict
* docs fix
* (perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165 )
* fix move s3 to use customLogger
* add basic s3 logging test
* add s3 to custom logger compatible
* use batch logger for s3
* s3 set flush interval and batch size
* fix s3 logging
* add notes on s3 logging
* fix s3 logging
* add basic s3 logging test
* fix s3 type errors
* add test for sync logging on s3
* fix: fix to debug log
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Ali Waleed <ali@scale3labs.com>
2024-10-11 23:04:36 -07:00
Ishaan Jaff
91ecb36277
Revert "(perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] ( #6165 )"
...
This reverts commit 2a5624af47
.
2024-10-12 07:08:30 +05:30
Ishaan Jaff
2a5624af47
(perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] ( #6165 )
...
* fix move s3 to use customLogger
* add basic s3 logging test
* add s3 to custom logger compatible
* use batch logger for s3
* s3 set flush interval and batch size
* fix s3 logging
* add notes on s3 logging
* fix s3 logging
* add basic s3 logging test
* fix s3 type errors
* add test for sync logging on s3
2024-10-11 19:49:03 +05:30
Jacques Verré
4064bfc6dd
[Feat] Observability integration - Opik by Comet ( #6062 )
...
* Added Opik logging and evaluation
* Updated doc examples
* Default tags should be [] in case appending
* WIP
* Work in progress
* Opik integration
* Opik integration
* Revert changes on litellm_logging.py
* Updated Opik integration for synchronous API calls
* Updated Opik documentation
---------
Co-authored-by: Douglas Blank <doug@comet.com>
Co-authored-by: Doug Blank <doug.blank@gmail.com>
2024-10-10 18:27:50 +05:30
Ishaan Jaff
89506053a4
(feat) use regex pattern matching for wildcard routing ( #6150 )
...
* use pattern matching for llm deployments
* code quality fix
* fix linting
* add types to PatternMatchRouter
* docs add example config for regex patterns
2024-10-10 18:24:16 +05:30
Krish Dholakia
6005450c8f
LiteLLM Minor Fixes & Improvements (10/09/2024) ( #6139 )
...
* fix(utils.py): don't return 'none' response headers
Fixes https://github.com/BerriAI/litellm/issues/6123
* fix(vertex_and_google_ai_studio_gemini.py): support parsing out additional properties and strict value for tool calls
Fixes https://github.com/BerriAI/litellm/issues/6136
* fix(cost_calculator.py): set default character value to none
Fixes https://github.com/BerriAI/litellm/issues/6133#issuecomment-2403290196
* fix(google.py): fix cost per token / cost per char conversion
Fixes https://github.com/BerriAI/litellm/issues/6133#issuecomment-2403370287
* build(model_prices_and_context_window.json): update gemini pricing
Fixes https://github.com/BerriAI/litellm/issues/6133
* build(model_prices_and_context_window.json): update gemini pricing
* fix(litellm_logging.py): fix streaming caching logging when 'turn_off_message_logging' enabled
Stores unredacted response in cache
* build(model_prices_and_context_window.json): update gemini-1.5-flash pricing
* fix(cost_calculator.py): fix default prompt_character count logic
Fixes error in gemini cost calculation
* fix(cost_calculator.py): fix cost calc for tts models
2024-10-10 00:42:11 -07:00
Ishaan Jaff
1fd437e263
(feat proxy) [beta] add support for organization role based access controls ( #6112 )
...
* track LiteLLM_OrganizationMembership
* add add_internal_user_to_organization
* add org membership to schema
* read organization membership when reading user info in auth checks
* add check for valid organization_id
* add test for test_create_new_user_in_organization
* test test_create_new_user_in_organization
* add new ADMIN role
* add test for org admins creating teams
* add test for test_org_admin_create_user_permissions
* test_org_admin_create_user_team_wrong_org_permissions
* test_org_admin_create_user_team_wrong_org_permissions
* fix organization_role_based_access_check
* fix getting user members
* fix TeamBase
* fix types used for use role
* fix type checks
* sync prisma schema
* docs - organization admins
* fix use organization_endpoints for /organization management
* add types for org member endpoints
* fix role name for org admin
* add type for member add response
* add organization/member_add
* add error handling for adding members to an org
* add nice doc string for oranization/member_add
* fix test_create_new_user_in_organization
* linting fix
* use simple route changes
* fix types
* add organization member roles
* add org admin auth checks
* add auth checks for orgs
* test for creating teams as org admin
* simplify org id usage
* fix typo
* test test_org_admin_create_user_team_wrong_org_permissions
* fix type check issue
* code quality fix
* fix schema.prisma
2024-10-09 15:18:18 +05:30
Krish Dholakia
9695c1af10
LiteLLM Minor Fixes & Improvements (10/08/2024) ( #6119 )
...
* refactor(cost_calculator.py): move error line to debug - https://github.com/BerriAI/litellm/issues/5683#issuecomment-2398599498
* fix(migrate-hidden-params-to-read-from-standard-logging-payload): Fixes https://github.com/BerriAI/litellm/issues/5546#issuecomment-2399994026
* fix(types/utils.py): mark weight as a litellm param
Fixes https://github.com/BerriAI/litellm/issues/5781
* feat(internal_user_endpoints.py): fix /user/info + show user max budget as default max budget
Fixes https://github.com/BerriAI/litellm/issues/6117
* feat: support returning team member budget in `/user/info`
Sets user max budget in team as max budget on ui
Closes https://github.com/BerriAI/litellm/issues/6117
* bug fix for optional parameter passing to replicate (#6067 )
Signed-off-by: Mandana Vaziri <mvaziri@us.ibm.com>
* fix(o1_transformation.py): handle o1 temperature=0
o1 doesn't support temp=0, allow admin to drop this param
* test: fix test
---------
Signed-off-by: Mandana Vaziri <mvaziri@us.ibm.com>
Co-authored-by: Mandana Vaziri <mvaziri@us.ibm.com>
2024-10-08 21:57:03 -07:00
Ishaan Jaff
92a1924112
trigger ci/cd run
2024-10-08 20:16:37 +05:30
Ishaan Jaff
d1760b1b04
(fix) clean up root repo - move entrypoint.sh and build_admin_ui to /docker ( #6110 )
...
* fix move docker files to docker folders
* move check file length
* fix docker hub deploy
* fix clean up root
* fix circle ci config
2024-10-08 11:34:43 +05:30
Krish Dholakia
6729c9ca7f
LiteLLM Minor Fixes & Improvements (10/07/2024) ( #6101 )
...
* fix(utils.py): support dropping temperature param for azure o1 models
* fix(main.py): handle azure o1 streaming requests
o1 doesn't support streaming, fake it to ensure code works as expected
* feat(utils.py): expose `hosted_vllm/` endpoint, with tool handling for vllm
Fixes https://github.com/BerriAI/litellm/issues/6088
* refactor(internal_user_endpoints.py): cleanup unused params + update docstring
Closes https://github.com/BerriAI/litellm/issues/6100
* fix(main.py): expose custom image generation api support
Fixes https://github.com/BerriAI/litellm/issues/6097
* fix: fix linting errors
* docs(custom_llm_server.md): add docs on custom api for image gen calls
* fix(types/utils.py): handle dict type
* fix(types/utils.py): fix linting errors
2024-10-07 22:17:22 -07:00
Ishaan Jaff
2c8bba293f
(bug fix) TTL not being set for embedding caching requests ( #6095 )
...
* fix ttl for cache pipeline settings
* add test for caching
* add test for setting ttls on redis caching
2024-10-07 15:53:18 +05:30
Ishaan Jaff
a7628317cd
(proxy ui) - fix view user pagination ( #6094 )
...
* ui - fix view user pagination
* add new internal user test
2024-10-07 12:31:55 +05:30
Krish Dholakia
49d8b2be46
fix(utils.py): fix fix pydantic obj to schema creation for vertex en… ( #6071 )
...
* fix(utils.py): fix fix pydantic obj to schema creation for vertex endpoints
Fixes https://github.com/BerriAI/litellm/issues/6027
* test(test_completion.pyu): skip test - avoid hitting gemini rate limits
* fix(common_utils.py): fix ruff linting error
2024-10-06 00:25:55 -04:00
Krish Dholakia
04e5963b65
Litellm expose disable schema update flag ( #6085 )
...
* fix: enable new 'disable_prisma_schema_update' flag
* build(config.yml): remove setup remote docker step
* ci(config.yml): give container time to start up
* ci(config.yml): update test
* build(config.yml): actually start docker
* build(config.yml): simplify grep check
* fix(prisma_client.py): support reading disable_schema_update via env vars
* ci(config.yml): add test to check if all general settings are documented
* build(test_General_settings.py): check available dir
* ci: check ../ repo path
* build: check ./
* build: fix test
2024-10-05 21:26:51 -04:00
Krish Dholakia
f2c0a31e3c
LiteLLM Minor Fixes & Improvements (10/05/2024) ( #6083 )
...
* docs(prompt_caching.md): add prompt caching cost calc example to docs
* docs(prompt_caching.md): add proxy examples to docs
* feat(utils.py): expose new helper `supports_prompt_caching()` to check if a model supports prompt caching
* docs(prompt_caching.md): add docs on checking model support for prompt caching
* build: fix invalid json
2024-10-05 18:59:11 -04:00
Krish Dholakia
fac3b2ee42
Add pyright to ci/cd + Fix remaining type-checking errors ( #6082 )
...
* fix: fix type-checking errors
* fix: fix additional type-checking errors
* fix: additional type-checking error fixes
* fix: fix additional type-checking errors
* fix: additional type-check fixes
* fix: fix all type-checking errors + add pyright to ci/cd
* fix: fix incorrect import
* ci(config.yml): use mypy on ci/cd
* fix: fix type-checking errors in utils.py
* fix: fix all type-checking errors on main.py
* fix: fix mypy linting errors
* fix(anthropic/cost_calculator.py): fix linting errors
* fix: fix mypy linting errors
* fix: fix linting errors
2024-10-05 17:04:00 -04:00
Ishaan Jaff
3cb04480fb
(code clean up) use a folder for gcs bucket logging + add readme in folder ( #6080 )
...
* refactor gcs bucket
* add readme
2024-10-05 16:58:10 +05:30
Ishaan Jaff
ab0b536143
(feat) add azure openai cost tracking for prompt caching ( #6077 )
...
* add azure o1 models to model cost map
* add azure o1 cost tracking
* fix azure cost calc
* add get llm provider test
2024-10-05 15:04:18 +05:30
Ishaan Jaff
3682f661d8
(feat) add cost tracking for OpenAI prompt caching ( #6055 )
...
* add cache_read_input_token_cost for prompt caching models
* add prompt caching for latest models
* add openai cost calculator
* add openai prompt caching test
* fix lint check
* add not on how usage._cache_read_input_tokens is used
* fix cost calc whisper openai
* use output_cost_per_second
* add input_cost_per_second
2024-10-05 14:20:15 +05:30
Ishaan Jaff
c84cfe977e
(feat) add /key/health endpoint to test key based logging ( #6073 )
...
* add /key/health endpoint
* add /key/health endpoint
* fix return from /key/health
* update doc string
* fix doc string for /key/health
* add test for /key/health
* fix linting
* docs /key/health
2024-10-05 11:56:55 +05:30
Krish Dholakia
2e5c46ef6d
LiteLLM Minor Fixes & Improvements (10/04/2024) ( #6064 )
...
* fix(litellm_logging.py): ensure cache hits are scrubbed if 'turn_off_message_logging' is enabled
* fix(sagemaker.py): fix streaming to raise error immediately
Fixes https://github.com/BerriAI/litellm/issues/6054
* (fixes) gcs bucket key based logging (#6044 )
* fixes for gcs bucket logging
* fix StandardCallbackDynamicParams
* fix - gcs logging when payload is not serializable
* add test_add_callback_via_key_litellm_pre_call_utils_gcs_bucket
* working success callbacks
* linting fixes
* fix linting error
* add type hints to functions
* fixes for dynamic success and failure logging
* fix for test_async_chat_openai_stream
* fix handle case when key based logging vars are set as os.environ/ vars
* fix prometheus track cooldown events on custom logger (#6060 )
* (docs) add 1k rps load test doc (#6059 )
* docs 1k rps load test
* docs load testing
* docs load testing litellm
* docs load testing
* clean up load test doc
* docs prom metrics for load testing
* docs using prometheus on load testing
* doc load testing with prometheus
* (fixes) docs + qa - gcs key based logging (#6061 )
* fixes for required values for gcs bucket
* docs gcs bucket logging
* bump: version 1.48.12 → 1.48.13
* ci/cd run again
* bump: version 1.48.13 → 1.48.14
* update load test doc
* (docs) router settings - on litellm config (#6037 )
* add yaml with all router settings
* add docs for router settings
* docs router settings litellm settings
* (feat) OpenAI prompt caching models to model cost map (#6063 )
* add prompt caching for latest models
* add cache_read_input_token_cost for prompt caching models
* fix(litellm_logging.py): check if param is iterable
Fixes https://github.com/BerriAI/litellm/issues/6025#issuecomment-2393929946
* fix(factory.py): support passing an 'assistant_continue_message' to prevent bedrock error
Fixes https://github.com/BerriAI/litellm/issues/6053
* fix(databricks/chat): handle streaming responses
* fix(factory.py): fix linting error
* fix(utils.py): unify anthropic + deepseek prompt caching information to openai format
Fixes https://github.com/BerriAI/litellm/issues/6069
* test: fix test
* fix(types/utils.py): support all openai roles
Fixes https://github.com/BerriAI/litellm/issues/6052
* test: fix test
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-10-04 21:28:53 -04:00
Ishaan Jaff
3c59d188ef
ci/cd run again
2024-10-04 17:19:26 +05:30
Ishaan Jaff
670ecda4e2
(fixes) gcs bucket key based logging ( #6044 )
...
* fixes for gcs bucket logging
* fix StandardCallbackDynamicParams
* fix - gcs logging when payload is not serializable
* add test_add_callback_via_key_litellm_pre_call_utils_gcs_bucket
* working success callbacks
* linting fixes
* fix linting error
* add type hints to functions
* fixes for dynamic success and failure logging
* fix for test_async_chat_openai_stream
2024-10-04 11:56:10 +05:30
Krish Dholakia
09f0c09ba4
fix(utils.py): return openai streaming prompt caching tokens ( #6051 )
...
* fix(utils.py): return openai streaming prompt caching tokens
Closes https://github.com/BerriAI/litellm/issues/6038
* fix(main.py): fix error in finish_reason updates
2024-10-03 22:20:13 -04:00
Krish Dholakia
5c33d1c9af
Litellm Minor Fixes & Improvements (10/03/2024) ( #6049 )
...
* fix(proxy_server.py): remove spendlog fixes from proxy startup logic
Moves https://github.com/BerriAI/litellm/pull/4794 to `/db_scripts` and cleans up some caching-related debug info (easier to trace debug logs)
* fix(langfuse_endpoints.py): Fixes https://github.com/BerriAI/litellm/issues/6041
* fix(azure.py): fix health checks for azure audio transcription models
Fixes https://github.com/BerriAI/litellm/issues/5999
* Feat: Add Literal AI Integration (#5653 )
* feat: add Literal AI integration
* update readme
* Update README.md
* fix: address comments
* fix: remove literalai sdk
* fix: use HTTPHandler
* chore: add test
* fix: add asyncio lock
* fix(literal_ai.py): fix linting errors
* fix(literal_ai.py): fix linting errors
* refactor: cleanup
---------
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
2024-10-03 18:02:28 -04:00
Ishaan Jaff
4e88fd65e1
(feat) openai prompt caching (non streaming) - add prompt_tokens_details in usage response ( #6039 )
...
* add prompt_tokens_details in usage response
* use _prompt_tokens_details as a param in Usage
* fix linting errors
* fix type error
* fix ci/cd deps
* bump deps for openai
* bump deps openai
* fix llm translation testing
* fix llm translation embedding
2024-10-03 23:31:10 +05:30
Krish Dholakia
9fccb4a0da
fix(factory.py): bedrock: merge consecutive tool + user messages ( #6028 )
...
* fix(factory.py): bedrock: merge consecutive tool + user messages
Fixes https://github.com/BerriAI/litellm/issues/6007
* LiteLLM Minor Fixes & Improvements (10/02/2024) (#6023 )
* feat(together_ai/completion): handle together ai completion calls
* fix: handle list of int / list of list of int for text completion calls
* fix(utils.py): check if base model in bedrock converse model list
Fixes https://github.com/BerriAI/litellm/issues/6003
* test(test_optional_params.py): add unit tests for bedrock optional param mapping
Fixes https://github.com/BerriAI/litellm/issues/6003
* feat(utils.py): enable passing dummy tool call for anthropic/bedrock calls if tool_use blocks exist
Fixes https://github.com/BerriAI/litellm/issues/5388
* fixed an issue with tool use of claude models with anthropic and bedrock (#6013 )
* fix(utils.py): handle empty schema for anthropic/bedrock
Fixes https://github.com/BerriAI/litellm/issues/6012
* fix: fix linting errors
* fix: fix linting errors
* fix: fix linting errors
* fix(proxy_cli.py): fix import route for app + health checks path (#6026 )
* (testing): Enable testing us.anthropic.claude-3-haiku-20240307-v1:0. (#6018 )
* fix(proxy_cli.py): fix import route for app + health checks gettsburg.wav
Fixes https://github.com/BerriAI/litellm/issues/5999
---------
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
---------
Co-authored-by: Ved Patwardhan <54766411+vedpatwardhan@users.noreply.github.com>
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
* fix(factory.py): correctly handle content in tool block
---------
Co-authored-by: Ved Patwardhan <54766411+vedpatwardhan@users.noreply.github.com>
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
2024-10-03 09:16:25 -04:00
Ishaan Jaff
1ab886f80d
(contributor PRs) oct 3rd, 2024 ( #6034 )
...
* Do not skip important tests for OIDC. (#6017 )
* [Bug] Skip monthly slack alert if there was no spend (#6015 )
* Fix: skip slack alert if there was no spend
* Skip monthly report when there was no spend
---------
Co-authored-by: María Paz Cuturi <paz@MacBook-Pro-de-Paz.local>
---------
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
Co-authored-by: Paz <paz@tryolabs.com>
Co-authored-by: María Paz Cuturi <paz@MacBook-Pro-de-Paz.local>
2024-10-03 17:12:34 +05:30
Ishaan Jaff
d92696a303
(feat) add nvidia nim embeddings ( #6032 )
...
* nvidia nim support embedding config
* add nvidia config in init
* nvidia nim embeddings
* docs nvidia nim embeddings
* docs embeddings on nvidia nim
* fix llm translation test
2024-10-03 17:12:14 +05:30
Ishaan Jaff
21e05a0f3e
(feat proxy) add key based logging for GCS bucket ( #6031 )
...
* init litellm langfuse / gcs credentials in litellm logging obj
* add gcs key based test
* rename vars
* save standard_callback_dynamic_params in model call details
* add working gcs bucket key based logging
* test_basic_gcs_logging_per_request
* linting fix
* add doc on gcs bucket team based logging
2024-10-03 15:24:31 +05:30
Ishaan Jaff
835db6ae98
(load testing) add vertex_ai embeddings load test ( #6004 )
...
* use vertex llm as base class for embeddings
* use correct vertex class in main.py
* set_headers in vertex llm base
* add types for vertex embedding requests
* add embedding handler for vertex
* use async mode for vertex embedding tests
* use vertexAI textEmbeddingConfig
* fix linting
* add sync and async mode testing for vertex ai embeddings
* add basic load test
* add vertex ai load test on ci cd
2024-10-03 14:39:15 +05:30
Krish Dholakia
14165d3648
LiteLLM Minor Fixes & Improvements (10/02/2024) ( #6023 )
...
* feat(together_ai/completion): handle together ai completion calls
* fix: handle list of int / list of list of int for text completion calls
* fix(utils.py): check if base model in bedrock converse model list
Fixes https://github.com/BerriAI/litellm/issues/6003
* test(test_optional_params.py): add unit tests for bedrock optional param mapping
Fixes https://github.com/BerriAI/litellm/issues/6003
* feat(utils.py): enable passing dummy tool call for anthropic/bedrock calls if tool_use blocks exist
Fixes https://github.com/BerriAI/litellm/issues/5388
* fixed an issue with tool use of claude models with anthropic and bedrock (#6013 )
* fix(utils.py): handle empty schema for anthropic/bedrock
Fixes https://github.com/BerriAI/litellm/issues/6012
* fix: fix linting errors
* fix: fix linting errors
* fix: fix linting errors
* fix(proxy_cli.py): fix import route for app + health checks path (#6026 )
* (testing): Enable testing us.anthropic.claude-3-haiku-20240307-v1:0. (#6018 )
* fix(proxy_cli.py): fix import route for app + health checks gettsburg.wav
Fixes https://github.com/BerriAI/litellm/issues/5999
---------
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
---------
Co-authored-by: Ved Patwardhan <54766411+vedpatwardhan@users.noreply.github.com>
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
2024-10-02 22:00:28 -04:00
David Manouchehri
8995ff49ae
(testing): Enable testing us.anthropic.claude-3-haiku-20240307-v1:0. ( #6018 )
2024-10-02 12:02:22 -04:00
Krish Dholakia
d57be47b0f
Litellm ruff linting enforcement ( #5992 )
...
* ci(config.yml): add a 'check_code_quality' step
Addresses https://github.com/BerriAI/litellm/issues/5991
* ci(config.yml): check why circle ci doesn't pick up this test
* ci(config.yml): fix to run 'check_code_quality' tests
* fix(__init__.py): fix unprotected import
* fix(__init__.py): don't remove unused imports
* build(ruff.toml): update ruff.toml to ignore unused imports
* fix: fix: ruff + pyright - fix linting + type-checking errors
* fix: fix linting errors
* fix(lago.py): fix module init error
* fix: fix linting errors
* ci(config.yml): cd into correct dir for checks
* fix(proxy_server.py): fix linting error
* fix(utils.py): fix bare except
causes ruff linting errors
* fix: ruff - fix remaining linting errors
* fix(clickhouse.py): use standard logging object
* fix(__init__.py): fix unprotected import
* fix: ruff - fix linting errors
* fix: fix linting errors
* ci(config.yml): cleanup code qa step (formatting handled in local_testing)
* fix(_health_endpoints.py): fix ruff linting errors
* ci(config.yml): just use ruff in check_code_quality pipeline for now
* build(custom_guardrail.py): include missing file
* style(embedding_handler.py): fix ruff check
2024-10-01 19:44:20 -04:00
Ishaan Jaff
eef9bad9a6
(performance improvement - vertex embeddings) ~111.11% faster ( #6000 )
...
* use vertex llm as base class for embeddings
* use correct vertex class in main.py
* set_headers in vertex llm base
* add types for vertex embedding requests
* add embedding handler for vertex
* use async mode for vertex embedding tests
* use vertexAI textEmbeddingConfig
* fix linting
* add sync and async mode testing for vertex ai embeddings
2024-10-01 14:16:21 -07:00
Ishaan Jaff
045ecf3ffb
(feat proxy slack alerting) - allow opting in to getting key / internal user alerts ( #5990 )
...
* define all slack alert types
* use correct type hints for alert type
* use correct defaults on slack alerting
* add readme for slack alerting
* fix linting error
* update readme
* docs all alert types
* update slack alerting docs
* fix slack alerting docs
* handle new testing dir structure
* fix config for testing
* fix testing folder related imports
* fix /tests import errors
* fix import stream_chunk_testdata
* docs alert types
* fix test test_langfuse_trace_id
* fix type checks for slack alerting
* fix outage alerting test slack
2024-10-01 10:49:22 -07:00