Krish Dholakia
47e811d6ce
fix(llm_http_handler.py): fix fake streaming ( #10061 )
...
* fix(llm_http_handler.py): fix fake streaming
allows groq to work with llm_http_handler
* fix(groq.py): migrate groq to openai like config
ensures json mode handling works correctly
2025-04-16 10:15:11 -07:00
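For context on the two fixes above, a minimal sketch of the affected paths, assuming a groq model name and prompts that are purely illustrative:

```python
import litellm

# JSON-mode handling that the groq config migration fixes; model and
# prompt are illustrative.
response = litellm.completion(
    model="groq/llama3-8b-8192",
    messages=[{"role": "user", "content": "Return a JSON object with a 'city' key."}],
    response_format={"type": "json_object"},
)
print(response.choices[0].message.content)

# The streaming path ("fake streaming"): chunks are yielded to the caller
# even when the underlying provider call is non-streaming.
for chunk in litellm.completion(
    model="groq/llama3-8b-8192",
    messages=[{"role": "user", "content": "hi"}],
    stream=True,
):
    print(chunk.choices[0].delta.content or "", end="")
```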
Krish Dholakia
c603680d2a
fix(stream_chunk_builder_utils.py): don't set index on modelresponse ( #10063 )
...
* fix(stream_chunk_builder_utils.py): don't set index on modelresponse
* test: update tests
2025-04-16 10:11:47 -07:00
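A short sketch of the utility this commit touches: `stream_chunk_builder` reassembles streamed chunks into a single ModelResponse (which, per the fix, should no longer carry a chunk `index`). Model and prompt are assumptions:

```python
import litellm

messages = [{"role": "user", "content": "Say hello."}]
# Collect the streamed chunks, then rebuild one ModelResponse from them.
chunks = list(litellm.completion(model="gpt-4o-mini", messages=messages, stream=True))
rebuilt = litellm.stream_chunk_builder(chunks, messages=messages)
print(rebuilt.choices[0].message.content)
```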
dependabot[bot]
7b7b43e1a7
build(deps): bump http-proxy-middleware in /docs/my-website ( #10064 )
...
Bumps [http-proxy-middleware](https://github.com/chimurai/http-proxy-middleware) from 2.0.7 to 2.0.9.
- [Release notes](https://github.com/chimurai/http-proxy-middleware/releases)
- [Changelog](https://github.com/chimurai/http-proxy-middleware/blob/v2.0.9/CHANGELOG.md)
- [Commits](https://github.com/chimurai/http-proxy-middleware/compare/v2.0.7...v2.0.9)
---
updated-dependencies:
- dependency-name: http-proxy-middleware
dependency-version: 2.0.9
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-04-16 09:55:44 -07:00
Ishaan Jaff
a9e8a36f89
[Bug Fix] Azure Blob Storage fixes ( #10059 )
...
* Simple fix for #9339 - upgrade the underlying library and cache the azure storage client (#9965 )
* fix - use constants for caching azure storage client
---------
Co-authored-by: Adrian Lyjak <adrian@chatmeter.com>
2025-04-16 09:47:10 -07:00
Krrish Dholakia
a743b6fc1f
fix(bedrock/common_utils.py): add us-west-1 to us regions
2025-04-16 08:00:39 -07:00
Krrish Dholakia
6b7d20c911
test: fix test
2025-04-16 07:57:10 -07:00
ChaoFu Yang
c07eea864e
/utils/token_counter: get model_info from deployment directly ( #10047 )
2025-04-16 07:53:18 -07:00
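A hedged sketch of calling the proxy endpoint this commit changes; base URL, API key, and model name are assumptions:

```python
import requests

resp = requests.post(
    "http://localhost:4000/utils/token_counter",
    headers={"Authorization": "Bearer sk-1234"},
    json={
        "model": "gpt-4o-mini",  # per the commit, model_info now comes from the deployment directly
        "messages": [{"role": "user", "content": "hello"}],
    },
)
print(resp.json())
```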
Michael Leshchinsky
e19d05980c
Add litellm call id passing to Aim guardrails on pre and post-hooks calls ( #10021 )
...
* Add litellm_call_id passing to aim guardrails on pre and post-hooks
* Add test that ensures that pre_call_hook receives litellm call id when common_request_processing called
2025-04-16 07:41:28 -07:00
Ishaan Jaff
ca593e003a
bump litellm-proxy-extras==0.1.9
2025-04-15 22:49:24 -07:00
Ishaan Jaff
1d4fea509d
ui new build
2025-04-15 22:36:44 -07:00
Ishaan Jaff
ad09d250ef
test fix azure deprecated mistral
2025-04-15 22:32:14 -07:00
Ishaan Jaff
dcc43e797a
[Docs] Auto prompt caching ( #10044 )
...
* docs prompt cache controls
* doc fix auto prompt caching
2025-04-15 22:29:47 -07:00
Krish Dholakia
fdfa1108a6
Add property ordering for vertex ai schema ( #9828 ) + Fix combining multiple tool calls ( #10040 )
...
* fix #9783: Retain schema field ordering for google gemini and vertex (#9828 )
* test: update test
* refactor(groq.py): initial commit migrating groq to base_llm_http_handler
* fix(streaming_chunk_builder_utils.py): fix how tool content is combined
Fixes https://github.com/BerriAI/litellm/issues/10034
* fix(vertex_ai/common_utils.py): prevent infinite loop in helper function
* fix(groq/chat/transformation.py): handle groq streaming errors correctly
* fix(groq/chat/transformation.py): handle max_retries
---------
Co-authored-by: Adrian Lyjak <adrian@chatmeter.com>
2025-04-15 22:29:25 -07:00
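To illustrate the schema-ordering half of this change, a sketch assuming an illustrative Gemini model and schema; with the fix, the insertion order of `properties` should be retained rather than reordered in transit:

```python
import litellm

schema = {
    "type": "object",
    "properties": {  # key order here should now be preserved end-to-end
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "city": {"type": "string"},
    },
    "required": ["name", "age", "city"],
}
response = litellm.completion(
    model="gemini/gemini-1.5-pro",
    messages=[{"role": "user", "content": "Describe a person as JSON."}],
    response_format={"type": "json_schema", "json_schema": {"name": "person", "schema": schema}},
)
print(response.choices[0].message.content)
```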
Krish Dholakia
1b9b745cae
Fix gcs pub sub logging with env var GCS_PROJECT_ID ( #10042 )
...
* fix(pub_sub.py): fix passing project id in pub sub call
Fixes issue where GCS_PUBSUB_PROJECT_ID was not being used
* test(test_pub_sub.py): add unit test to prevent future regressions
* test: fix test
2025-04-15 21:50:48 -07:00
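A minimal sketch of the logging setup this fixes, with dummy values; the env var names follow the commit body and the callback name follows the litellm docs:

```python
import os
import litellm

os.environ["GCS_PUBSUB_PROJECT_ID"] = "my-gcp-project"  # per the fix, now passed to the pub/sub call
os.environ["GCS_PUBSUB_TOPIC_ID"] = "litellm-logs"

litellm.callbacks = ["gcs_pubsub"]
litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "hi"}],
)
```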
Ishaan Jaff
b3f37b860d
test fix azure deprecated mistral ai
2025-04-15 21:42:40 -07:00
Ishaan Jaff
bd88263b29
[Feat - Cost Tracking improvement] Track prompt caching metrics in DailyUserSpendTransactions ( #10029 )
...
* stash changes
* emit cache read/write tokens to daily spend update
* emit cache read/write tokens on daily activity
* update types.ts
* docs prompt caching
* undo ui change
* fix activity metrics
* fix prompt caching metrics
* fix typed dict fields
* fix get_aggregated_daily_spend_update_transactions
* fix aggregating cache tokens
* test_cache_token_fields_aggregation
* daily_transaction
* add cache_creation_input_tokens and cache_read_input_tokens to LiteLLM_DailyUserSpend
* test_daily_spend_update_queue.py
2025-04-15 21:40:57 -07:00
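For context, a sketch of where the tracked fields come from, assuming an Anthropic model with prompt caching; the field names mirror the commit's `cache_creation_input_tokens` / `cache_read_input_tokens`:

```python
import litellm

response = litellm.completion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=[
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": "Long shared context..." * 100,
                    "cache_control": {"type": "ephemeral"},  # marks this block cacheable
                }
            ],
        },
        {"role": "user", "content": "Summarize the context."},
    ],
)
usage = response.usage
print(getattr(usage, "cache_creation_input_tokens", None))  # cache write tokens
print(getattr(usage, "cache_read_input_tokens", None))      # cache read tokens
```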
Ishaan Jaff
d32d6fe03e
[UI] Bug Fix - Show created_at and updated_at for Users Page ( #10033 )
...
* add created_at and updated_at as fields for internal user table
* test_get_users_includes_timestamps
2025-04-15 21:15:44 -07:00
Ishaan Jaff
70d740332f
[UI Polish] UI fixes for cache control injection settings ( #10031 )
...
* ui fixes for cache control
* docs inject cache control settings
2025-04-15 21:10:08 -07:00
Ishaan Jaff
65f8015221
test fix - azure deprecated azure ai mistral
2025-04-15 21:08:55 -07:00
Krish Dholakia
9b77559ccf
Add aggregate team based usage logging ( #10039 )
...
* feat(schema.prisma): initial commit adding aggregate table for team spend
allows team spend to be visible at 1m+ logs
* feat(db_spend_update_writer.py): support logging aggregate team spend
allows usage dashboard to work at 1m+ logs
* feat(litellm-proxy-extras/): add new migration file
* fix(db_spend_update_writer.py): fix return type
* build: bump requirements
* fix: fix ruff error
2025-04-15 20:58:48 -07:00
Krish Dholakia
d3e7a137ad
Revert "fix #9783 : Retain schema field ordering for google gemini and vertex …" ( #10038 )
...
This reverts commit e3729f9855.
2025-04-15 19:21:33 -07:00
Adrian Lyjak
e3729f9855
fix #9783: Retain schema field ordering for google gemini and vertex ( #9828 )
2025-04-15 19:12:02 -07:00
Marc Abramowitz
837a6948d8
Fix typo: Entrata -> Entra in code ( #9922 )
...
* Fix typo: Entrata -> Entra
* Fix a few more
2025-04-15 17:31:18 -07:00
dependabot[bot]
81e7741107
build(deps): bump @babel/runtime in /ui/litellm-dashboard ( #10001 )
...
Bumps [@babel/runtime](https://github.com/babel/babel/tree/HEAD/packages/babel-runtime) from 7.23.9 to 7.27.0.
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.27.0/packages/babel-runtime)
---
updated-dependencies:
- dependency-name: "@babel/runtime"
dependency-version: 7.27.0
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-04-15 16:35:26 -07:00
Joakim Lorentz
c9cf43df5b
chore(docs): Update logging.md ( #10006 )
...
Fixes a missing slash in OTEL_ENDPOINT example
2025-04-15 16:34:55 -07:00
Ishaan Jaff
09df3815b8
docs cache control injection points
2025-04-15 15:43:58 -07:00
Krrish Dholakia
ef80d25f16
bump: version 1.66.1 → 1.66.2
2025-04-15 13:52:46 -07:00
Krrish Dholakia
8424171c2a
fix(config_settings.md): cleanup
2025-04-15 13:41:22 -07:00
Krish Dholakia
6b5f093087
Revert "Fix case where only system messages are passed to Gemini ( #9992 )" ( #10027 )
...
This reverts commit 2afd922f8c.
2025-04-15 13:34:03 -07:00
Nolan Tremelling
2afd922f8c
Fix case where only system messages are passed to Gemini ( #9992 )
2025-04-15 13:30:49 -07:00
Michael Schmid
14bcc9a6c9
feat: update region configuration in AmazonBedrockGlobalConfig ( #9430 )
2025-04-15 09:59:32 -07:00
Krrish Dholakia
aff0d1a18c
docs(cohere.md): add cohere cost tracking support to docs
2025-04-14 23:46:58 -07:00
Krish Dholakia
33ead69c0a
Support checking provider /models endpoints on proxy /v1/models endpoint ( #9958 )
...
* feat(utils.py): support global flag for 'check_provider_endpoints'
enables setting this for `/models` on proxy
* feat(utils.py): add caching to 'get_valid_models'
Prevents checking endpoint repeatedly
* fix(utils.py): ensure mutations don't impact cached results
* test(test_utils.py): add unit test to confirm cache invalidation logic
* feat(utils.py): get_valid_models - support passing litellm params dynamically
Allows for checking endpoints based on received credentials
* test: update test
* feat(model_checks.py): pass router credentials to get_valid_models - ensures it checks correct credentials
* refactor(utils.py): refactor for simpler functions
* fix: fix linting errors
* fix(utils.py): fix test
* fix(utils.py): set valid providers to custom_llm_provider, if given
* test: update test
* fix: fix ruff check error
2025-04-14 23:23:20 -07:00
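A sketch of the flag described in the commit notes; the parameter name is taken from the commit message, so treat the exact signature as an assumption:

```python
from litellm.utils import get_valid_models

# Queries each configured provider's /models endpoint; per the commit,
# results are cached and mutations don't leak into the cached copy.
models = get_valid_models(check_provider_endpoints=True)
print(models[:10])
```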
Eoous
e94eb4ec70
env for litellm.modify_params ( #9964 )
2025-04-14 22:33:56 -07:00
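A sketch of the toggle this adds; the exact env var name (LITELLM_MODIFY_PARAMS) is an assumption here:

```python
import os
import litellm

os.environ["LITELLM_MODIFY_PARAMS"] = "True"  # assumed env var name for this setting
# equivalent in-code toggle, which the env var mirrors:
litellm.modify_params = True
```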
Ishaan Jaff
4f9bcd9b94
fix mock tests ( #10003 )
2025-04-14 22:09:22 -07:00
Krish Dholakia
9b0f871129
Add /vllm/* and /mistral/* passthrough endpoints (adds support for Mistral OCR via passthrough)
...
* feat(llm_passthrough_endpoints.py): support mistral passthrough
Closes https://github.com/BerriAI/litellm/issues/9051
* feat(llm_passthrough_endpoints.py): initial commit for adding vllm passthrough route
* feat(vllm/common_utils.py): add new vllm model info route
make it possible to use vllm passthrough route via factory function
* fix(llm_passthrough_endpoints.py): add all methods to vllm passthrough route
* fix: fix linting error
* fix: fix linting error
* fix: fix ruff check
* fix(proxy/_types.py): add new passthrough routes
* docs(config_settings.md): add mistral env vars to docs
2025-04-14 22:06:33 -07:00
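A hedged sketch of the new passthrough routes; the proxy URL, key, and OCR payload shape (based on Mistral's public /v1/ocr API) are assumptions:

```python
import requests

# Requests to /mistral/* (and likewise /vllm/*) on the proxy are forwarded
# to the upstream provider.
resp = requests.post(
    "http://localhost:4000/mistral/v1/ocr",
    headers={"Authorization": "Bearer sk-1234"},
    json={
        "model": "mistral-ocr-latest",
        "document": {"type": "document_url", "document_url": "https://example.com/doc.pdf"},
    },
)
print(resp.status_code, resp.json())
```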
Krish Dholakia
8faf56922c
Fix azure tenant id check from env var + response_format check on api_version 2025+ ( #9993 )
...
* fix(azure/common_utils.py): check for azure tenant id, client id, client secret in env var
Fixes https://github.com/BerriAI/litellm/issues/9598#issuecomment-2801966027
* fix(azure/gpt_transformation.py): fix passing response_format to azure when api year = 2025
Fixes https://github.com/BerriAI/litellm/issues/9703
* test: monkeypatch azure api version in test
* test: update testing
* test: fix test
* test: update test
* docs(config_settings.md): document env vars
2025-04-14 22:02:35 -07:00
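A minimal sketch of the env-var path this fixes, with dummy credentials; the deployment name and api_base are illustrative:

```python
import os
import litellm

# With tenant/client credentials in the env, litellm can authenticate to
# Azure without an explicit api_key.
os.environ["AZURE_TENANT_ID"] = "<tenant-id>"
os.environ["AZURE_CLIENT_ID"] = "<client-id>"
os.environ["AZURE_CLIENT_SECRET"] = "<client-secret>"

response = litellm.completion(
    model="azure/my-gpt-4o-deployment",
    api_base="https://my-endpoint.openai.azure.com",
    api_version="2025-01-01-preview",  # 2025+ api_version, where response_format handling changed
    messages=[{"role": "user", "content": "Return JSON."}],
    response_format={"type": "json_object"},
)
```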
Ishaan Jaff
ce2595f56a
bump: version 1.66.0 → 1.66.1
2025-04-14 21:30:07 -07:00
Ishaan Jaff
b210639dce
ui new build
2025-04-14 21:19:21 -07:00
Ishaan Jaff
c1a642ce20
[UI] Allow setting prompt cache_control_injection_points ( #10000 )
...
* test_anthropic_cache_control_hook_system_message
* test_anthropic_cache_control_hook.py
* should_run_prompt_management_hooks
* fix should_run_prompt_management_hooks
* test_anthropic_cache_control_hook_specific_index
* fix test
* fix linting errors
* ChatCompletionCachedContent
* initial commit for cache control
* fixes ui design
* fix inserting cache_control_injection_points
* fix entering cache control points
* fixes for using cache control on ui + backend
* update cache control settings on edit model page
* fix init custom logger compatible class
* fix linting errors
* fix linting errors
* fix get_chat_completion_prompt
2025-04-14 21:17:42 -07:00
Ishaan Jaff
6cfa50d278
[Feat] Add support for cache_control_injection_points for Anthropic API, Bedrock API ( #9996 )
...
* test_anthropic_cache_control_hook_system_message
* test_anthropic_cache_control_hook.py
* should_run_prompt_management_hooks
* fix should_run_prompt_management_hooks
* test_anthropic_cache_control_hook_specific_index
* fix test
* fix linting errors
* ChatCompletionCachedContent
2025-04-14 20:50:13 -07:00
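A sketch of the feature as used from the SDK; the parameter shape follows the litellm docs for this feature, and the model and prompts are illustrative:

```python
import litellm

response = litellm.completion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=[
        {"role": "system", "content": "Very long reusable system prompt..."},
        {"role": "user", "content": "First question"},
    ],
    # Inject a cache_control block at the chosen point instead of
    # hand-editing message content.
    cache_control_injection_points=[
        {"location": "message", "role": "system"},
    ],
)
```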
Krish Dholakia
2ed593e052
Updated cohere v2 passthrough ( #9997 )
...
* Add cohere `/v2/chat` pass-through cost tracking support (#8235 )
* feat(cohere_passthrough_handler.py): initial working commit with cohere passthrough cost tracking
* fix(v2_transformation.py): support cohere /v2/chat endpoint
* fix: fix linting errors
* fix: fix import
* fix(v2_transformation.py): fix linting error
* test: handle openai exception change
2025-04-14 19:51:01 -07:00
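A hedged sketch of the passthrough route this updates; the proxy URL, key, and payload (based on Cohere's v2 chat API shape) are assumptions:

```python
import requests

# Calls routed through /cohere/v2/chat are forwarded to Cohere, with
# cost tracking applied to the response.
resp = requests.post(
    "http://localhost:4000/cohere/v2/chat",
    headers={"Authorization": "Bearer sk-1234"},
    json={
        "model": "command-r-plus",
        "messages": [{"role": "user", "content": "hello"}],
    },
)
print(resp.json())
```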
Marc Klingen
db857c74d4
chore: ordering of logging & observability docs ( #9994 )
2025-04-14 16:49:04 -07:00
Emerson Gomes
a2bc0c0f36
Fix cost for Phi-4-multimodal output token ( #9880 )
2025-04-14 14:31:34 -07:00
Ishaan Jaff
24447eb0cd
fix gpt 4.1 costs ( #9991 )
2025-04-14 12:50:14 -07:00
Krish Dholakia
bbb7541c22
build(model_prices_and_context_window.json): add gpt-4.1 pricing ( #9990 )
...
* build(model_prices_and_context_window.json): add gpt-4.1 pricing
* build(model_prices_and_context_window.json): add gpt-4.1-mini and gpt-4.1-nano model support
2025-04-14 12:14:46 -07:00
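With the new pricing entries, cost helpers can price these models; a small sketch, assuming an illustrative prompt:

```python
import litellm

response = litellm.completion(
    model="gpt-4.1-mini",
    messages=[{"role": "user", "content": "hi"}],
)
# Looks up per-token prices from model_prices_and_context_window.json.
print(litellm.completion_cost(completion_response=response))
```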
Ishaan Jaff
64bb89c70f
docs fix
2025-04-12 21:20:54 -07:00
Ishaan Jaff
0e99f83cc2
team info fix default index
2025-04-12 21:06:57 -07:00
Ishaan Jaff
999a9b4ac8
bump: version 1.65.8 → 1.66.0
2025-04-12 20:45:20 -07:00
Ishaan Jaff
72c1f7e09a
ui new build
2025-04-12 20:42:43 -07:00