Commit graph

64 commits

Author SHA1 Message Date
Ishaan Jaff
d1536a3a01 ensure passthrough_logging_payload is filled in kwargs 2025-04-21 18:12:20 -07:00
Krish Dholakia
2ed593e052
Updated cohere v2 passthrough (#9997)
* Add cohere `/v2/chat` pass-through cost tracking support (#8235)

* feat(cohere_passthrough_handler.py): initial working commit with cohere passthrough cost tracking

* fix(v2_transformation.py): support cohere /v2/chat endpoint

* fix: fix linting errors

* fix: fix import

* fix(v2_transformation.py): fix linting error

* test: handle openai exception change
2025-04-14 19:51:01 -07:00
Ishaan Jaff
08a3620414
[Bug Fix] Add support for UploadFile on LLM Pass through endpoints (OpenAI, Azure etc) (#9853)
* http passthrough file handling

* fix make_multipart_http_request

* test_pass_through_file_operations

* unit tests for file handling
2025-04-09 15:29:20 -07:00
Krish Dholakia
9b7ebb6a7d
build(pyproject.toml): add new dev dependencies - for type checking (#9631)
* build(pyproject.toml): add new dev dependencies - for type checking

* build: reformat files to fit black

* ci: reformat to fit black

* ci(test-litellm.yml): make tests run clear

* build(pyproject.toml): add ruff

* fix: fix ruff checks

* build(mypy/): fix mypy linting errors

* fix(hashicorp_secret_manager.py): fix passing cert for tls auth

* build(mypy/): resolve all mypy errors

* test: update test

* fix: fix black formatting

* build(pre-commit-config.yaml): use poetry run black

* fix(proxy_server.py): fix linting error

* fix: fix ruff safe representation error
2025-03-29 11:02:13 -07:00
Krrish Dholakia
532af66bbd feat(pass_through_endpoints.py): return api base on pass-through exception
enables easy debugging on backend api errors
2025-03-20 20:19:52 -07:00
Krrish Dholakia
943e036851 feat(pass_through_endpoints.py): support returning api-base on pass-through endpoints
Make it easier to debug what the api base sent to provider was
2025-03-20 20:11:49 -07:00
Ishaan Jaff
7546dfde41 use correct get custom headers 2025-03-12 17:16:51 -07:00
Krrish Dholakia
c7ceeaa4d7 fix(pass_through_endpoints.py): fix linting error 2025-03-12 12:00:05 -07:00
Steve Farthing
ffce48ed3c Merge branch 'stevefarthing/bing-search-pass-thru' of github.com:sfarthin/litellm into stevefarthing/bing-search-pass-thru
# Conflicts:
#	litellm/proxy/pass_through_endpoints/pass_through_endpoints.py
2025-03-11 08:15:23 -04:00
Steve Farthing
dbfb7ebdaf
Merge branch 'main' into stevefarthing/bing-search-pass-thru 2025-03-11 08:06:56 -04:00
Steve Farthing
e8a859720b Feedback 2025-03-11 08:01:31 -04:00
Steve Farthing
75b713974f Bing Search Pass Thru 2025-03-11 08:01:31 -04:00
Ishaan Jaff
378e3d9e4d
(Proxy improvement) - Raise BadRequestError when unknown model passed in request (#8886)
* fix safe access model in request body

* litellm.BadRequestError

* don't pass model in request body

* test_chat_completion_bad_model
2025-02-27 19:30:57 -08:00
Ishaan Jaff
24df2331ec
(fix) Anthropic pass through cost tracking (#8874)
* fix _create_anthropic_response_logging_payload

* fix - pass through don't create standard logging payload

* fix logged key hash

* test_init_kwargs_for_pass_through_endpoint_basic

* test_unit_test_anthropic_pass_through

* fix anthropic pass through logging handler
2025-02-27 15:42:43 -08:00
Ishaan Jaff
81039d8faf
(Bug fix) - allow using Assistants GET, DELETE on /openai pass through routes (#8818)
* test_openai_assistants_e2e_operations

* test openai assistants pass through

* fix GET request on pass through handler

* _make_non_streaming_http_request

* _is_assistants_api_request

* test_openai_assistants_e2e_operations

* test_openai_assistants_e2e_operations

* openai_proxy_route

* docs openai pass through

* docs openai pass through

* docs openai pass through

* test pass through handler

* Potential fix for code scanning alert no. 2240: Incomplete URL substring sanitization

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2025-02-25 19:19:00 -08:00
Krrish Dholakia
7bfd816d3b build: merge commit 1b15568af7
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 14s
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Feb 17 21:37:36 2025 -0800

    fix(proxy/_types.py): fix linting error

commit dc4d5cffa6
Author: Krrish Dholakia <krrishdholakia@gmail.com>
2025-02-17 21:56:00 -08:00
Steve Farthing
9724ee94df Feedback 2025-02-04 21:11:19 -05:00
Steve Farthing
fe0f9213af Bing Search Pass Thru 2025-01-27 08:58:04 -05:00
Krish Dholakia
80d6bbec29
Litellm dev 01 14 2025 p2 (#7772)
* feat(pass_through_endpoints.py): fix anthropic end user cost tracking

* fix(anthropic/chat/transformation.py): use returned provider model for anthropic

handles anthropic `-latest` tag in request body throwing cost calculation errors

ensures we can be accurate in our model cost tracking

* feat(model_prices_and_context_window.json): add gemini-2.0-flash-thinking-exp pricing

* test: update test to use assumption that user_api_key_dict can get anthropic user id

* test: fix test

* fix: fix test

* fix(anthropic_pass_through.py): uncomment previous anthropic end-user cost tracking code block

can't guarantee user api key dict always has end user id - too many code paths

* fix(user_api_key_auth.py): this allows end user id from request body to always be read and set in auth object

* fix(auth_check.py): fix linting error

* test: fix auth check

* fix(auth_utils.py): fix get end user id to handle metadata = None
2025-01-15 21:34:50 -08:00
Ishaan Jaff
7c44b9f25f
Add /openai pass through route on litellm proxy (#7412)
* add pt oai route - proxy

* pass through use safe read request body
2024-12-25 20:15:59 -08:00
Ishaan Jaff
c7f14e936a
(code quality) run ruff rule to ban unused imports (#7313)
* remove unused imports

* fix AmazonConverseConfig

* fix test

* fix import

* ruff check fixes

* test fixes

* fix testing

* fix imports
2024-12-19 12:33:42 -08:00
Ishaan Jaff
21003c4337
Code Quality Improvement - use vertex_ai/ as folder name for vertexAI (#7166)
* fix rename vertex ai

* run ci/cd again
2024-12-11 00:32:41 -08:00
Ishaan Jaff
dcea31e50a run ci/cd again for new release 2024-11-26 00:26:27 -08:00
Ishaan Jaff
552c0dd7a4
(fix) pass through endpoints - run logging async + use thread pool executor for sync logging callbacks (#6907)
* run pass through logging async

* fix use thread_pool_executor for pass through logging

* test_pass_through_request_logging_failure_with_stream

* fix anthropic pt logging test

* test_pass_through_request_logging_failure
2024-11-25 22:52:05 -08:00
Ishaan Jaff
f77bf49772
feat - allow sending tags on vertex pass through requests (#6876)
* feat - allow tagging vertex JS SDK request

* add unit testing for passing headers for pass through endpoints

* fix allow using vertex_ai as the primary way for pass through vertex endpoints

* docs on vertex js pass tags

* add e2e test for vertex pass through with spend tags

* add e2e tests for streaming vertex JS with tags

* fix vertex ai testing
2024-11-25 12:12:09 -08:00
Ishaan Jaff
d81ae45827
(Perf / latency improvement) improve pass through endpoint latency to ~50ms (before PR was 400ms) (#6874)
* use correct location for types

* fix types location

* perf improvement for pass through endpoints

* update lint check

* fix import

* fix ensure async clients test

* fix azure.py health check

* fix ollama
2024-11-22 18:47:26 -08:00
Ishaan Jaff
b2b3e40d13
(feat) use @google-cloud/vertexai js sdk with litellm (#6873)
* stash gemini JS test

* add vertex js sdj example

* handle vertex pass through separately

* tes vertex JS sdk

* fix vertex_proxy_route

* use PassThroughStreamingHandler

* fix PassThroughStreamingHandler

* use common _create_vertex_response_logging_payload_for_generate_content

* test vertex js

* add working vertex jest tests

* move basic bass through test

* use good name for test

* test vertex

* test_chunk_processor_yields_raw_bytes

* unit tests for streaming

* test_convert_raw_bytes_to_str_lines

* run unit tests 1st

* simplify local

* docs add usage example for js

* use get_litellm_virtual_key

* add unit tests for vertex pass through
2024-11-22 16:50:10 -08:00
Ishaan Jaff
6717929206
(Feat) Allow passing litellm_metadata to pass through endpoints + Add e2e tests for /anthropic/ usage tracking (#6864)
* allow passing _litellm_metadata in pass through endpoints

* fix _create_anthropic_response_logging_payload

* include litellm_call_id in logging

* add e2e testing for anthropic spend logs

* add testing for spend logs payload

* add example with anthropic python SDK
2024-11-21 21:41:05 -08:00
Ishaan Jaff
b8af46e1a2
(feat) Add usage tracking for streaming /anthropic passthrough routes (#6842)
* use 1 file for AnthropicPassthroughLoggingHandler

* add support for anthropic streaming usage tracking

* ci/cd run again

* fix - add real streaming for anthropic pass through

* remove unused function stream_response

* working anthropic streaming logging

* fix code quality

* fix use 1 file for vertex success handler

* use helper for _handle_logging_vertex_collected_chunks

* enforce vertex streaming to use sse for streaming

* test test_basic_vertex_ai_pass_through_streaming_with_spendlog

* fix type hints

* add comment

* fix linting

* add pass through logging unit testing
2024-11-21 19:36:03 -08:00
Ishaan Jaff
c107bae7ae
(feat) add usage / cost tracking for Anthropic passthrough routes (#6835)
* move _process_response in transformation

* fix AnthropicConfig test

* add AnthropicConfig

* fix anthropic_passthrough_handler

* fix get_response_body

* fix check for streaming response

* use 1 helper to return stream_response on passthrough
2024-11-20 17:25:12 -08:00
Ishaan Jaff
51ffe93e77
(docs) add docstrings for all /key, /user, /team, /customer endpoints (#6804)
* use helper to handle_exception_on_proxy

* add doc string for /key/regenerate

* use 1 helper for handle_exception_on_proxy

* add doc string for /key/block

* add doc string for /key/unblock

* remove deprecated function

* remove deprecated endpoints

* remove incorrect tag for endpoint

* fix linting

* fix /key/regenerate

* fix regen key

* fix use port 4000 for user endpoints

* fix clean up - use separate file for customer endpoints

* add docstring for user/update

* fix imports

* doc string /user/list

* doc string for /team/delete

* fix team block endpoint

* fix import block user

* add doc string for /team/unblock

* add doc string for /team/list

* add doc string for /team/info

* add doc string for key endpoints

* fix customer_endpoints

* add doc string for customer endpoints

* fix import new_end_user

* fix testing

* fix import new_end_user

* fix add check for allow_user_auth
2024-11-18 19:44:06 -08:00
Ishaan Jaff
610974b4fc
(code quality) add ruff check PLR0915 for too-many-statements (#6309)
* ruff add PLR0915

* add noqa for PLR0915

* fix noqa

* add # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* add # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915
2024-10-18 15:36:49 +05:30
Krish Dholakia
2acb0c0675
Litellm Minor Fixes & Improvements (10/12/2024) (#6179)
* build(model_prices_and_context_window.json): add bedrock llama3.2 pricing

* build(model_prices_and_context_window.json): add bedrock cross region inference pricing

* Revert "(perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165)"

This reverts commit 2a5624af47.

* add azure/gpt-4o-2024-05-13 (#6174)

* LiteLLM Minor Fixes & Improvements (10/10/2024)  (#6158)

* refactor(vertex_ai_partner_models/anthropic): refactor anthropic to use partner model logic

* fix(vertex_ai/): support passing custom api base to partner models

Fixes https://github.com/BerriAI/litellm/issues/4317

* fix(proxy_server.py): Fix prometheus premium user check logic

* docs(prometheus.md): update quick start docs

* fix(custom_llm.py): support passing dynamic api key + api base

* fix(realtime_api/main.py): Add request/response logging for realtime api endpoints

Closes https://github.com/BerriAI/litellm/issues/6081

* feat(openai/realtime): add openai realtime api logging

Closes https://github.com/BerriAI/litellm/issues/6081

* fix(realtime_streaming.py): fix linting errors

* fix(realtime_streaming.py): fix linting errors

* fix: fix linting errors

* fix pattern match router

* Add literalai in the sidebar observability category (#6163)

* fix: add literalai in the sidebar

* fix: typo

* update (#6160)

* Feat: Add Langtrace integration (#5341)

* Feat: Add Langtrace integration

* add langtrace service name

* fix timestamps for traces

* add tests

* Discard Callback + use existing otel logger

* cleanup

* remove print statments

* remove callback

* add docs

* docs

* add logging docs

* format logging

* remove emoji and add litellm proxy example

* format logging

* format `logging.md`

* add langtrace docs to logging.md

* sync conflict

* docs fix

* (perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165)

* fix move s3 to use customLogger

* add basic s3 logging test

* add s3 to custom logger compatible

* use batch logger for s3

* s3 set flush interval and batch size

* fix s3 logging

* add notes on s3 logging

* fix s3 logging

* add basic s3 logging test

* fix s3 type errors

* add test for sync logging on s3

* fix: fix to debug log

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Ali Waleed <ali@scale3labs.com>

* docs(custom_llm_server.md): update doc on passing custom params

* fix(pass_through_endpoints.py): don't require headers

Fixes https://github.com/BerriAI/litellm/issues/6128

* feat(utils.py): add support for caching rerank endpoints

Closes https://github.com/BerriAI/litellm/issues/6144

* feat(litellm_logging.py'): add response headers for failed requests

Closes https://github.com/BerriAI/litellm/issues/6159

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Ali Waleed <ali@scale3labs.com>
2024-10-12 11:48:34 -07:00
Krish Dholakia
3933fba41f
LiteLLM Minor Fixes & Improvements (09/19/2024) (#5793)
* fix(model_prices_and_context_window.json): add cost tracking for more vertex llama3.1 model

8b and 70b models

* fix(proxy/utils.py): handle data being none on pre-call hooks

* fix(proxy/): create views on initial proxy startup

fixes base case, where user starts proxy for first time

 Fixes https://github.com/BerriAI/litellm/issues/5756

* build(config.yml): fix vertex version for test

* feat(ui/): support enabling/disabling slack alerting

Allows admin to turn on/off slack alerting through ui

* feat(rerank/main.py): support langfuse logging

* fix(proxy/utils.py): fix linting errors

* fix(langfuse.py): log clean metadata

* test(tests): replace deprecated openai model
2024-09-20 08:19:52 -07:00
Krish Dholakia
72e961af3c
LiteLLM Minor Fixes and Improvements (08/06/2024) (#5567)
* fix(utils.py): return citations for perplexity streaming

Fixes https://github.com/BerriAI/litellm/issues/5535

* fix(anthropic/chat.py): support fallbacks for anthropic streaming (#5542)

* fix(anthropic/chat.py): support fallbacks for anthropic streaming

Fixes https://github.com/BerriAI/litellm/issues/5512

* fix(anthropic/chat.py): use module level http client if none given (prevents early client closure)

* fix: fix linting errors

* fix(http_handler.py): fix raise_for_status error handling

* test: retry flaky test

* fix otel type

* fix(bedrock/embed): fix error raising

* test(test_openai_batches_and_files.py): skip azure batches test (for now) quota exceeded

* fix(test_router.py): skip azure batch route test (for now) - hit batch quota limits

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>

* All `model_group_alias` should show up in `/models`, `/model/info` , `/model_group/info` (#5539)

* fix(router.py): support returning model_alias model names in `/v1/models`

* fix(proxy_server.py): support returning model alias'es on `/model/info`

* feat(router.py): support returning model group alias for `/model_group/info`

* fix(proxy_server.py): fix linting errors

* fix(proxy_server.py): fix linting errors

* build(model_prices_and_context_window.json): add amazon titan text premier pricing information

Closes https://github.com/BerriAI/litellm/issues/5560

* feat(litellm_logging.py): log standard logging response object for pass through endpoints. Allows bedrock /invoke agent calls to be correctly logged to langfuse + s3

* fix(success_handler.py): fix linting error

* fix(success_handler.py): fix linting errors

* fix(team_endpoints.py): Allows admin to update team member budgets

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-09-06 17:16:24 -07:00
Ishaan Jaff
c364d311b9 rename type 2024-09-04 16:33:36 -07:00
Ishaan Jaff
8eda374d79 feat log request / response on pass through endpoints 2024-09-04 16:26:32 -07:00
Ishaan Jaff
42b95c5979 code cleanup 2024-09-02 16:36:19 -07:00
Ishaan Jaff
a6d4a27207 pass through track usage for streaming endpoints 2024-09-02 16:11:20 -07:00
Ishaan Jaff
73d0a78444 use chunk_processort 2024-09-02 15:51:52 -07:00
Ishaan Jaff
f50374e81d use helper class for pass through success handler 2024-08-30 15:52:47 -07:00
Ishaan Jaff
bcc0f99476 fix pass through endpoints 2024-08-21 17:21:22 -07:00
Krrish Dholakia
e747127e3b fix(pass_through_endpoints.py): fix query param pass through 2024-08-19 21:38:30 -07:00
Krrish Dholakia
663a0c1b83 feat(Support-pass-through-for-bedrock-endpoints): Allows pass-through support for bedrock endpoints 2024-08-17 17:57:43 -07:00
Krrish Dholakia
f7a2e04426 feat(pass_through_endpoints.py): add pass-through support for all cohere endpoints 2024-08-17 16:57:55 -07:00
Krrish Dholakia
db54b66457 style(vertex_httpx.py): make vertex error string more helpful 2024-08-17 15:09:55 -07:00
Krrish Dholakia
fd44cf8d26 feat(pass_through_endpoints.py): support streaming requests 2024-08-17 12:46:57 -07:00
Krrish Dholakia
bc0023a409 feat(google_ai_studio_endpoints.py): support pass-through endpoint for all google ai studio requests
New Feature
2024-08-17 10:46:59 -07:00
Krrish Dholakia
b56ecd7e02 fix(pass_through_endpoints.py): fix returned response headers for pass-through endpoitns 2024-08-17 09:00:00 -07:00
Krrish Dholakia
61f4b71ef7 refactor: replace .error() with .exception() logging for better debugging on sentry 2024-08-16 09:22:47 -07:00