102 commits

Author SHA1 Message Date
Krrish Dholakia
3560f0ef2c refactor: move all testing to top-level of repo
Closes https://github.com/BerriAI/litellm/issues/486
2024-09-28 21:08:14 -07:00
Ishaan Jaff
f4613a100d [Perf Proxy] parallel request limiter - use one cache update call (#5932)
* fix parallel request limiter - use one cache update call

* ci/cd run again

* run ci/cd again

* use docker username password

* fix config.yml

* fix config

* fix config

* fix config.yml

* ci/cd run again

* use correct typing for batch set cache

* fix async_set_cache_pipeline

* fix only check user id tpm / rpm limits when limits set

* fix test_openai_azure_embedding_with_oidc_and_cf
2024-09-27 17:24:46 -07:00
Krish Dholakia
16c0307eab LiteLLM Minor Fixes & Improvements (09/24/2024) (#5880)
* LiteLLM Minor Fixes & Improvements (09/23/2024) (#5842)

* feat(auth_utils.py): enable admin to allow client-side credentials to be passed

Makes it easier for devs to experiment with finetuned fireworks ai models

* feat(router.py): allow setting configurable_clientside_auth_params for a model

Closes https://github.com/BerriAI/litellm/issues/5843

* build(model_prices_and_context_window.json): fix anthropic claude-3-5-sonnet max output token limit

Fixes https://github.com/BerriAI/litellm/issues/5850

* fix(azure_ai/): support content list for azure ai

Fixes https://github.com/BerriAI/litellm/issues/4237

* fix(litellm_logging.py): always set saved_cache_cost

Set to 0 by default

* fix(fireworks_ai/cost_calculator.py): add fireworks ai default pricing

handles calling 405b+ size models

* fix(slack_alerting.py): fix error alerting for failed spend tracking

Fixes regression with slack alerting error monitoring

* fix(vertex_and_google_ai_studio_gemini.py): handle gemini no candidates in streaming chunk error

* docs(bedrock.md): add llama3-1 models

* test: fix tests

* fix(azure_ai/chat): fix transformation for azure ai calls

* feat(azure_ai/embed): Add azure ai embeddings support

Closes https://github.com/BerriAI/litellm/issues/5861

* fix(azure_ai/embed): enable async embedding

* feat(azure_ai/embed): support azure ai multimodal embeddings

* fix(azure_ai/embed): support async multi modal embeddings

* feat(together_ai/embed): support together ai embedding calls

* feat(rerank/main.py): log source documents for rerank endpoints to langfuse

improves rerank endpoint logging

* fix(langfuse.py): support logging `/audio/speech` input to langfuse

* test(test_embedding.py): fix test

* test(test_completion_cost.py): fix helper util
2024-09-25 22:11:57 -07:00
Ishaan Jaff
1d630b61ad [Feat] Add fireworks AI embedding (#5812)
* add fireworks embedding models

* add fireworks ai

* fireworks ai embeddings support

* is_fireworks_embedding_model

* working fireworks embeddings

* fix health check * models

* fix embedding get optional params

* fix linting errors

* fix pick_cheapest_chat_model_from_llm_provider

* add fireworks ai litellm provider

* docs fireworks embedding models

* fixes for when azure ad token is passed
2024-09-20 22:23:28 -07:00
Krish Dholakia
f9e6507cd1 LiteLLM Minor Fixes + Improvements (#5474)
* feat(proxy/_types.py): add lago billing to callbacks ui

Closes https://github.com/BerriAI/litellm/issues/5472

* fix(anthropic.py): return anthropic prompt caching information

Fixes https://github.com/BerriAI/litellm/issues/5364

* feat(bedrock/chat.py): support 'json_schema' for bedrock models

Closes https://github.com/BerriAI/litellm/issues/5434

* fix(bedrock/embed/embeddings.py): support async embeddings for amazon titan models

* fix: linting fixes

* fix: handle key errors

* fix(bedrock/chat.py): fix bedrock ai21 streaming object

* feat(bedrock/embed): support bedrock embedding optional params

* fix(databricks.py): fix usage chunk

* fix(internal_user_endpoints.py): apply internal user defaults, if user role updated

Fixes issue where user update wouldn't apply defaults

* feat(slack_alerting.py): provide multiple slack channels for a given alert type

multiple channels might be interested in receiving an alert for a given type

* docs(alerting.md): add multiple channel alerting to docs
2024-09-02 14:29:57 -07:00
Krish Dholakia
37f9705d6e Bedrock Embeddings refactor + model support (#5462)
* refactor(bedrock): initial commit to refactor bedrock to a folder

Improve code readability + maintainability

* refactor: more refactor work

* fix: fix imports

* feat(bedrock/embeddings.py): support translating embedding into amazon embedding formats

* fix: fix linting errors

* test: skip test on end of life model

* fix(cohere/embed.py): fix linting error

* fix(cohere/embed.py): fix typing

* fix(cohere/embed.py): fix post-call logging for cohere embedding call

* test(test_embeddings.py): fix error message assertion in test
2024-09-01 13:29:58 -07:00
Krish Dholakia
d928220ed2 Merge pull request #5393 from BerriAI/litellm_gemini_embedding_support
feat(vertex_ai_and_google_ai_studio): Support Google AI Studio Embedding Endpoint
2024-08-28 13:46:28 -07:00
Krrish Dholakia
3cec00939e test(test_embeddings.py): fix test 2024-08-28 07:51:00 -07:00
Krrish Dholakia
a6ce27ca29 feat(batch_embed_content_transformation.py): support google ai studio /batchEmbedContent endpoint
Allows for multiple strings to be given for embedding
2024-08-27 19:23:50 -07:00
Krrish Dholakia
bb42146ffe feat(embeddings_handler.py): support async gemini embeddings 2024-08-27 18:31:57 -07:00
Krrish Dholakia
5b29ddd2a6 fix(embeddings_handler.py): initial working commit for google ai studio text embeddings /embedContent endpoint 2024-08-27 18:14:56 -07:00
Krrish Dholakia
77e6da78a1 fix: initial commit 2024-08-27 17:35:56 -07:00
Ishaan Jaff
df024fbbbc add testing for cohere embeddings 2024-08-09 12:08:25 -07:00
Krrish Dholakia
466dc9f32a fix(huggingface_restapi.py): fix hf embeddings optional param processing 2024-08-09 09:10:56 -07:00
Krrish Dholakia
51ccfa9e77 fix(huggingface_restapi.py): fixes issue where 'wait_for_model' was not being passed as expected 2024-08-09 08:36:35 -07:00
Krish Dholakia
653aefde40 Merge branch 'main' into litellm_async_cohere_calls 2024-07-30 15:35:20 -07:00
Krrish Dholakia
9b2eb1702b fix(cohere.py): support async cohere embedding calls 2024-07-30 14:49:07 -07:00
Krrish Dholakia
69afbc6091 feat(huggingface_restapi.py): Support multiple hf embedding types + async hf embeddings
Closes https://github.com/BerriAI/litellm/issues/3261
2024-07-30 13:32:03 -07:00
Krrish Dholakia
3cd3491920 test: cleanup testing 2024-07-24 19:47:50 -07:00
Krrish Dholakia
65705fde25 test(test_embedding.py): add simple azure embedding ad token test
Addresses https://github.com/BerriAI/litellm/issues/4859#issuecomment-2248838617
2024-07-24 13:38:03 -07:00
David Manouchehri
ced03d9d7f (test_embedding.py) - Re-enable embedding test with Azure OIDC. 2024-07-24 16:41:24 +00:00
David Manouchehri
4b89397136 (tests) - Skip embedding Azure AD test for now. 2024-07-24 15:42:57 +00:00
Krrish Dholakia
41fda47587 test(test_embedding.py): fix base url 2024-07-24 08:04:27 -07:00
David Manouchehri
2dcd9a5567 (test - azure): Add test for Azure OIDC auth. 2024-07-23 19:12:40 +00:00
Ishaan Jaff
48c365976f fix bedrock embedding test 2024-07-20 20:05:22 -07:00
Ishaan Jaff
613bbe306f fix triton embedding test 2024-07-17 17:29:22 -07:00
Krrish Dholakia
58ac2a7e2b docs(supported_embeddings.md): add doc on provider-specific params for embedding models 2024-07-09 12:39:10 -07:00
Simon Sanchez Viloria
e2827ee28b (test - watsonx) use MagicMock to mock httpx.AsyncClient endpoint for aembedding test 2024-07-07 18:55:42 +02:00
Simon Sanchez Viloria
ea952a57b0 (test - watsonx) Added tests for watsonx embeddings with mocked endpoints 2024-07-07 17:59:37 +02:00
Krrish Dholakia
43353c28b3 feat(databricks.py): add embedding model support 2024-05-23 18:22:03 -07:00
Krrish Dholakia
d4123951d9 test: handle watsonx rate limit error 2024-05-13 18:27:39 -07:00
Ishaan Jaff
0cde9473c9 test triton embeddings 2024-05-10 18:50:34 -07:00
Krrish Dholakia
3e8d9fc80d test: skip local test 2024-04-27 19:07:49 -07:00
Simon Sanchez Viloria
9fc30e8b31 (test) Added completion and embedding tests for watsonx provider 2024-04-24 12:52:29 +02:00
Ishaan Jaff
fb741d96ca test - voyage ai embedding 2024-04-03 20:54:35 -07:00
Krish Dholakia
c840fecdeb Merge pull request #2142 from vilmar-hillow/azure_embedding_ad_token
Fixed azure ad token not being processed properly in embedding models
2024-03-19 11:51:28 -07:00
Krrish Dholakia
4e1dc7d62e fix(cohere.py): return usage as a pydantic object not dict 2024-03-15 10:00:22 -07:00
Dmitry Supranovich
57ebb9582e Fixed azure ad token not being processed properly in embedding models 2024-03-12 21:29:24 -04:00
Krrish Dholakia
cdb960eb34 fix(vertex_ai.py): correctly parse optional params and pass vertex ai project 2024-03-06 14:00:50 -08:00
Krrish Dholakia
478307d4cf fix(bedrock.py): support anthropic messages api on bedrock (claude-3) 2024-03-04 17:15:47 -08:00
Krrish Dholakia
8a038e7da4 test: skip aws test - aws account suspended 2024-02-28 14:42:50 -08:00
Krrish Dholakia
4c951d20bc test: removing aws tests - account suspended - pending their approval 2024-02-28 13:46:20 -08:00
Krrish Dholakia
0ffdf57dec fix(vertex_ai.py): add async embedding support for vertex ai 2024-02-03 10:35:17 -08:00
Krrish Dholakia
1ba6882f76 fix(test_embedding.py): fix test 2024-02-03 09:49:23 -08:00
Krrish Dholakia
d9ba8668f4 feat(vertex_ai.py): vertex ai gecko text embedding support 2024-02-03 09:48:29 -08:00
ishaan-jaff
6488de36c4 (test) bedrock input validation - exceptions 2024-01-30 08:12:43 -08:00
ishaan-jaff
17370dc50f (test) dimension param - openai 2024-01-26 10:37:01 -08:00
ishaan-jaff
72790f44da (chore) cleanup testing file 2024-01-25 14:36:11 -08:00
ishaan-jaff
6cbde02cab (test) embedding models 2024-01-25 14:30:49 -08:00
Krrish Dholakia
39a1b4c3b5 fix(main.py): support custom pricing for embedding calls 2024-01-22 15:15:34 -08:00