Commit graph

18440 commits

Author SHA1 Message Date
Ishaan Jaff
d52aae4e82 ui new build 2024-11-25 22:42:59 -08:00
Ishaan Jaff
e952c666f3
(UI fix) UI does not reload when you login / open a new tab (#6909)
* store current page on url

* update menu history
2024-11-25 22:41:45 -08:00
Ishaan Jaff
c60261c3bc
(feat) Add support for using @google/generative-ai JS with LiteLLM Proxy (#6899)
* feat - allow using gemini js SDK with LiteLLM

* add auth for gemini_proxy_route

* basic local test for js

* test cost tagging gemini js requests

* add js sdk test for gemini with litellm

* add docs on gemini JS SDK

* run node.js tests

* fix google ai studio tests

* fix vertex js spend test
2024-11-25 13:13:03 -08:00
Ishaan Jaff
f77bf49772
feat - allow sending tags on vertex pass through requests (#6876)
* feat - allow tagging vertex JS SDK request

* add unit testing for passing headers for pass through endpoints

* fix allow using vertex_ai as the primary way for pass through vertex endpoints

* docs on vertex js pass tags

* add e2e test for vertex pass through with spend tags

* add e2e tests for streaming vertex JS with tags

* fix vertex ai testing
2024-11-25 12:12:09 -08:00
Ishaan Jaff
c73ce95c01
(feat) - provider budget improvements - ensure provider budgets work with multiple proxy instances + improve latency to ~90ms (#6886)
* use 1 file for duration_in_seconds

* add to readme.md

* re use duration_in_seconds

* fix importing _extract_from_regex, get_last_day_of_month

* fix import

* update provider budget routing

* fix - remove dup test

* add support for using in multi instance environments

* test_in_memory_redis_sync_e2e

* test_in_memory_redis_sync_e2e

* fix test_in_memory_redis_sync_e2e

* fix code quality check

* fix test provider budgets

* working provider budget tests

* add fixture for provider budget routing

* fix router testing for provider budgets

* add comments on provider budget routing

* use RedisPipelineIncrementOperation

* add redis async_increment_pipeline

* use redis async_increment_pipeline

* use lower value for testing

* use redis async_increment_pipeline

* use consistent key name for increment op

* add handling for budget windows

* fix typing async_increment_pipeline

* fix set attr

* add clear doc strings

* unit testing for provider budgets

* test_redis_increment_pipeline
2024-11-24 16:36:19 -08:00
Ishaan Jaff
34bfebe470
(QOL improvement) Provider budget routing - allow using 1s, 1d, 1mo, 2mo etc (#6885)
* use 1 file for duration_in_seconds

* add to readme.md

* re use duration_in_seconds

* fix importing _extract_from_regex, get_last_day_of_month

* fix import

* update provider budget routing

* fix - remove dup test
2024-11-23 16:59:46 -08:00
Ishaan Jaff
e69678a9b3 update doc title 2024-11-23 16:25:00 -08:00
Krrish Dholakia
3d8c0bad58 build(ui/): update ui build 2024-11-24 05:32:26 +05:30
Ishaan Jaff
afc69761de
docs - have 1 section for routing +load balancing (#6884)
* docs - have 1 section for routing +load balancing

* remove emoji
2024-11-23 15:56:57 -08:00
Krrish Dholakia
50314a66ca bump: version 1.52.14 → 1.52.15 2024-11-23 23:43:30 +05:30
Krrish Dholakia
19a7932329 build: update ui build 2024-11-23 23:32:08 +05:30
Krish Dholakia
424b8b0231
Litellm dev 11 23 2024 (#6881)
* build(ui/create_key_button.tsx): support adding tags for cost tracking/routing when making key

* LiteLLM Minor Fixes & Improvements (11/23/2024)  (#6870)

* feat(pass_through_endpoints/): support logging anthropic/gemini pass through calls to langfuse/s3/etc.

* fix(utils.py): allow disabling end user cost tracking with new param

Allows proxy admin to disable cost tracking for end user - keeps prometheus metrics small

* docs(configs.md): add disable_end_user_cost_tracking reference to docs

* feat(key_management_endpoints.py): add support for restricting access to `/key/generate` by team/proxy level role

Enables admin to restrict key creation, and assign team admins to handle distributing keys

* test(test_key_management.py): add unit testing for personal / team key restriction checks

* docs: add docs on restricting key creation

* docs(finetuned_models.md): add new guide on calling finetuned models

* docs(input.md): cleanup anthropic supported params

Closes https://github.com/BerriAI/litellm/issues/6856

* test(test_embedding.py): add test for passing extra headers via embedding

* feat(cohere/embed): pass client to async embedding

* feat(rerank.py): add `/v1/rerank` if missing for cohere base url

Closes https://github.com/BerriAI/litellm/issues/6844

* fix(main.py): pass extra_headers param to openai

Fixes https://github.com/BerriAI/litellm/issues/6836

* fix(litellm_logging.py): don't disable global callbacks when dynamic callbacks are set

Fixes issue where global callbacks - e.g. prometheus were overriden when langfuse was set dynamically

* fix(handler.py): fix linting error

* fix: fix typing

* build: add conftest to proxy_admin_ui_tests/

* test: fix test

* fix: fix linting errors

* test: fix test

* fix: fix pass through testing

* feat(key_management_endpoints.py): allow proxy_admin to enforce params on key creation

allows admin to force team keys to have tags

* build(ui/): show teams in leftnav + allow team admin to add new members

* build(ui/): show created tags in dropdown

makes it easier for admin to add tags to keys

* test(test_key_management.py): fix test

* test: fix test

* fix playwright e2e ui test

* fix e2e ui testing deps

* fix: fix linting errors

* fix e2e ui testing

* fix e2e ui testing, only run e2e ui testing in playwright

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-11-23 22:37:16 +05:30
Ishaan Jaff
6b6353d4e7 fix e2e ui testing, only run e2e ui testing in playwright 2024-11-23 08:50:10 -08:00
Ishaan Jaff
f3ffa67553 fix e2e ui testing 2024-11-23 08:45:14 -08:00
Ishaan Jaff
fb5f458448 fix e2e ui testing deps 2024-11-23 08:39:11 -08:00
Ishaan Jaff
a8b4e1cc03 fix playwright e2e ui test 2024-11-23 08:34:55 -08:00
Krish Dholakia
7e9d8b58f6
LiteLLM Minor Fixes & Improvements (11/23/2024) (#6870)
* feat(pass_through_endpoints/): support logging anthropic/gemini pass through calls to langfuse/s3/etc.

* fix(utils.py): allow disabling end user cost tracking with new param

Allows proxy admin to disable cost tracking for end user - keeps prometheus metrics small

* docs(configs.md): add disable_end_user_cost_tracking reference to docs

* feat(key_management_endpoints.py): add support for restricting access to `/key/generate` by team/proxy level role

Enables admin to restrict key creation, and assign team admins to handle distributing keys

* test(test_key_management.py): add unit testing for personal / team key restriction checks

* docs: add docs on restricting key creation

* docs(finetuned_models.md): add new guide on calling finetuned models

* docs(input.md): cleanup anthropic supported params

Closes https://github.com/BerriAI/litellm/issues/6856

* test(test_embedding.py): add test for passing extra headers via embedding

* feat(cohere/embed): pass client to async embedding

* feat(rerank.py): add `/v1/rerank` if missing for cohere base url

Closes https://github.com/BerriAI/litellm/issues/6844

* fix(main.py): pass extra_headers param to openai

Fixes https://github.com/BerriAI/litellm/issues/6836

* fix(litellm_logging.py): don't disable global callbacks when dynamic callbacks are set

Fixes issue where global callbacks - e.g. prometheus were overriden when langfuse was set dynamically

* fix(handler.py): fix linting error

* fix: fix typing

* build: add conftest to proxy_admin_ui_tests/

* test: fix test

* fix: fix linting errors

* test: fix test

* fix: fix pass through testing
2024-11-23 15:17:40 +05:30
Ishaan Jaff
d81ae45827
(Perf / latency improvement) improve pass through endpoint latency to ~50ms (before PR was 400ms) (#6874)
* use correct location for types

* fix types location

* perf improvement for pass through endpoints

* update lint check

* fix import

* fix ensure async clients test

* fix azure.py health check

* fix ollama
2024-11-22 18:47:26 -08:00
dependabot[bot]
772b2f9cd2
Bump cross-spawn from 7.0.3 to 7.0.6 in /ui/litellm-dashboard (#6865)
Bumps [cross-spawn](https://github.com/moxystudio/node-cross-spawn) from 7.0.3 to 7.0.6.
- [Changelog](https://github.com/moxystudio/node-cross-spawn/blob/master/CHANGELOG.md)
- [Commits](https://github.com/moxystudio/node-cross-spawn/compare/v7.0.3...v7.0.6)

---
updated-dependencies:
- dependency-name: cross-spawn
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-22 17:42:08 -08:00
Ishaan Jaff
97cde31113
fix tests (#6875) 2024-11-22 17:35:38 -08:00
Ishaan Jaff
b2b3e40d13
(feat) use @google-cloud/vertexai js sdk with litellm (#6873)
* stash gemini JS test

* add vertex js sdj example

* handle vertex pass through separately

* tes vertex JS sdk

* fix vertex_proxy_route

* use PassThroughStreamingHandler

* fix PassThroughStreamingHandler

* use common _create_vertex_response_logging_payload_for_generate_content

* test vertex js

* add working vertex jest tests

* move basic bass through test

* use good name for test

* test vertex

* test_chunk_processor_yields_raw_bytes

* unit tests for streaming

* test_convert_raw_bytes_to_str_lines

* run unit tests 1st

* simplify local

* docs add usage example for js

* use get_litellm_virtual_key

* add unit tests for vertex pass through
2024-11-22 16:50:10 -08:00
Ishaan Jaff
5930c42e74 fix coverage 2024-11-22 16:21:22 -08:00
Ishaan Jaff
377cfeb24f add pass_through_unit_testing 2024-11-22 16:20:16 -08:00
Krrish Dholakia
d8e5134935 test: skip flaky test 2024-11-22 19:23:36 +05:30
Ishaan Jaff
a6220f7a40 test - also try diff host for langfuse 2024-11-21 23:51:58 -08:00
Ishaan Jaff
701c154e35 fix test_aaateam_logging 2024-11-21 23:47:38 -08:00
Ishaan Jaff
8856256730 fix doc format 2024-11-21 23:29:40 -08:00
Ishaan Jaff
20f2bf4bbd bump: version 1.52.13 → 1.52.14 2024-11-21 23:19:02 -08:00
Ishaan Jaff
b903134cc9 ci/cd run again 2024-11-21 23:12:54 -08:00
Ishaan Jaff
952dbb9eb7 test_langfuse_masked_input_output 2024-11-21 22:59:36 -08:00
Ishaan Jaff
366a6895e2 test_langfuse_masked_input_output 2024-11-21 22:54:18 -08:00
Ishaan Jaff
be0f0dd345 test_langfuse_masked_input_output 2024-11-21 22:51:19 -08:00
Ishaan Jaff
027967d260 test_langfuse_logging_audio_transcriptions 2024-11-21 22:46:23 -08:00
Ishaan Jaff
f398c9b172 fix test_aaateam_logging 2024-11-21 22:36:44 -08:00
Ishaan Jaff
5a2e5b43c4 fix test_aaapass_through_endpoint_pass_through_keys_langfuse 2024-11-21 22:05:00 -08:00
Ishaan Jaff
e0921da38c test_team_logging 2024-11-21 22:01:12 -08:00
Ishaan Jaff
f77bd9a99c test_aaalangfuse_logging_metadata 2024-11-21 21:56:36 -08:00
Ishaan Jaff
14124bab45 docs - Send litellm_metadata (tags) 2024-11-21 21:46:49 -08:00
Ishaan Jaff
6717929206
(Feat) Allow passing litellm_metadata to pass through endpoints + Add e2e tests for /anthropic/ usage tracking (#6864)
* allow passing _litellm_metadata in pass through endpoints

* fix _create_anthropic_response_logging_payload

* include litellm_call_id in logging

* add e2e testing for anthropic spend logs

* add testing for spend logs payload

* add example with anthropic python SDK
2024-11-21 21:41:05 -08:00
Ishaan Jaff
b8af46e1a2
(feat) Add usage tracking for streaming /anthropic passthrough routes (#6842)
* use 1 file for AnthropicPassthroughLoggingHandler

* add support for anthropic streaming usage tracking

* ci/cd run again

* fix - add real streaming for anthropic pass through

* remove unused function stream_response

* working anthropic streaming logging

* fix code quality

* fix use 1 file for vertex success handler

* use helper for _handle_logging_vertex_collected_chunks

* enforce vertex streaming to use sse for streaming

* test test_basic_vertex_ai_pass_through_streaming_with_spendlog

* fix type hints

* add comment

* fix linting

* add pass through logging unit testing
2024-11-21 19:36:03 -08:00
Ishaan Jaff
920f4c9f82
(fix) add linting check to ban creating AsyncHTTPHandler during LLM calling (#6855)
* fix triton

* fix TEXT_COMPLETION_CODESTRAL

* fix REPLICATE

* fix CLARIFAI

* fix HUGGINGFACE

* add test_no_async_http_handler_usage

* fix PREDIBASE

* fix anthropic use get_async_httpx_client

* fix vertex fine tuning

* fix dbricks get_async_httpx_client

* fix get_async_httpx_client vertex

* fix get_async_httpx_client

* fix get_async_httpx_client

* fix make_async_azure_httpx_request

* fix check_for_async_http_handler

* test: cleanup mistral model

* add check for AsyncClient

* fix check_for_async_http_handler

* fix get_async_httpx_client

* fix tests using in_memory_llm_clients_cache

* fix langfuse import

* fix import

---------

Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
2024-11-21 19:03:02 -08:00
Ishaan Jaff
71ebf47cef
fix latency issues on google ai studio (#6852) 2024-11-21 19:02:08 -08:00
Krrish Dholakia
2903fd4164 docs: update json mode docs 2024-11-22 03:00:45 +05:30
Krrish Dholakia
b8edef389c bump: version 1.52.12 → 1.52.13 2024-11-22 02:29:16 +05:30
Krish Dholakia
7e5085dc7b
Litellm dev 11 21 2024 (#6837)
* Fix Vertex AI function calling invoke: use JSON format instead of protobuf text format. (#6702)

* test: test tool_call conversion when arguments is empty dict

Fixes https://github.com/BerriAI/litellm/issues/6833

* fix(openai_like/handler.py): return more descriptive error message

Fixes https://github.com/BerriAI/litellm/issues/6812

* test: skip overloaded model

* docs(anthropic.md): update anthropic docs to show how to route to any new model

* feat(groq/): fake stream when 'response_format' param is passed

Groq doesn't support streaming when response_format is set

* feat(groq/): add response_format support for groq

Closes https://github.com/BerriAI/litellm/issues/6845

* fix(o1_handler.py): remove fake streaming for o1

Closes https://github.com/BerriAI/litellm/issues/6801

* build(model_prices_and_context_window.json): add groq llama3.2b model pricing

Closes https://github.com/BerriAI/litellm/issues/6807

* fix(utils.py): fix handling ollama response format param

Fixes https://github.com/BerriAI/litellm/issues/6848#issuecomment-2491215485

* docs(sidebars.js): refactor chat endpoint placement

* fix: fix linting errors

* test: fix test

* test: fix test

* fix(openai_like/handler): handle max retries

* fix(streaming_handler.py): fix streaming check for openai-compatible providers

* test: update test

* test: correctly handle model is overloaded error

* test: update test

* test: fix test

* test: mark flaky test

---------

Co-authored-by: Guowang Li <Guowang@users.noreply.github.com>
2024-11-22 01:53:52 +05:30
Ishaan Jaff
a7d5536872
(fix) passthrough - allow internal users to access /anthropic (#6843)
* fix /anthropic/

* test llm_passthrough_router

* fix test_gemini_pass_through_endpoint
2024-11-21 11:46:50 -08:00
Krrish Dholakia
50d2510b60 test: cleanup mistral model 2024-11-21 23:44:50 +05:30
Ishaan Jaff
ddfe687b13
(fix) don't block proxy startup if license check fails & using prometheus (#6839)
* fix - don't block proxy startup if not a premium user

* test_litellm_proxy_server_config_with_prometheus

* add test for proxy startup

* fix remove unused test

* fix startup test

* add comment on bad-license
2024-11-20 17:55:39 -08:00
Ishaan Jaff
cc1f8ff0ba
(testing) - add e2e tests for anthropic pass through endpoints (#6840)
* tests - add e2e tests for anthropic pass through

* fix swagger

* fix pass through tests
2024-11-20 17:55:13 -08:00
Ishaan Jaff
c107bae7ae
(feat) add usage / cost tracking for Anthropic passthrough routes (#6835)
* move _process_response in transformation

* fix AnthropicConfig test

* add AnthropicConfig

* fix anthropic_passthrough_handler

* fix get_response_body

* fix check for streaming response

* use 1 helper to return stream_response on passthrough
2024-11-20 17:25:12 -08:00