Commit graph

534 commits

Author SHA1 Message Date
Krrish Dholakia
1262d89ab3 feat(gemini.py): add support for completion calls for gemini-pro (google ai studio) 2023-12-24 09:42:58 +05:30
Krrish Dholakia
eaaad79823 feat(ollama.py): add support for async ollama embeddings 2023-12-23 18:01:25 +05:30
Krrish Dholakia
be68796eba fix(router.py): add support for async image generation endpoints 2023-12-21 14:38:44 +05:30
Krrish Dholakia
f355e03515 feat(main.py): add async image generation support 2023-12-20 16:58:40 +05:30
Krrish Dholakia
b3962e483f feat(azure.py): add support for azure image generations endpoint 2023-12-20 16:37:21 +05:30
Krrish Dholakia
f0df28362a feat(ollama.py): add support for ollama function calling 2023-12-20 14:59:55 +05:30
Krish Dholakia
408f232bd7 Merge branch 'main' into main 2023-12-18 17:54:34 -08:00
Krrish Dholakia
34509d8dda fix(main.py): return async completion calls 2023-12-18 17:41:54 -08:00
ishaan-jaff
bd15c61a65 (feat) OR default transforms=[] 2023-12-18 10:59:09 +05:30
ishaan-jaff
3a97a2817f (fix) default args batch completion 2023-12-18 10:05:44 +05:30
ishaan-jaff
1e57c0c152 (feat) completion set function, function_call default None 2023-12-18 09:54:43 +05:30
ishaan-jaff
6b272076d7 (feat) openrouter set transforms=[] default 2023-12-18 09:16:33 +05:30
Joel Eriksson
a419d59542 Fix for issue that occurred when proxying to ollama
In the text_completion() function, it previously threw an exception at:
raw_response = response._hidden_params.get("original_response", None)

Because response was a coroutine object from an ollama_acompletion call,
I added an asyncio.iscoroutine() check for the response and handled it
by calling response = asyncio.run(response).

I also had to fix atext_completion(), where init_response was an instance
of TextCompletionResponse.

Since this case was not handled by the if-elif that checks whether init_response
is a coroutine, a dict, or a ModelResponse instance, response was left unbound,
which threw an exception on the "return response" line.

Note that a regular pyright-based linter detects that response is possibly
unbound, and that the same code pattern is used in multiple other places
in main.py.

I would suggest changing these cases from:

init_response = await loop.run_in_executor(...
if isinstance(init_response, ...
    response = init_response
elif asyncio.iscoroutine(init_response):
    response = await init_response

to just:

response = await loop.run_in_executor(
if asyncio.iscoroutine(response):
    response = await response

or, at the very least, including an else statement that sets response = init_response,
so that response is never unbound when the code proceeds.
2023-12-17 17:27:47 +02:00
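The pattern suggested in the commit message above can be sketched in isolation. The following is a minimal, hypothetical illustration, not litellm code: fake_acompletion and fake_completion_entrypoint are stand-ins for whatever the executor actually runs, and only the response-binding pattern matters. Binding the executor result directly to response and awaiting it only when asyncio.iscoroutine() is true means response can never be left unbound.

import asyncio

# Hypothetical stand-in for an async provider call (an acompletion-style
# coroutine); not an actual litellm function.
async def fake_acompletion():
    return {"object": "text_completion", "source": "async provider"}

# Hypothetical sync entrypoint: depending on the provider it may return either
# a finished result or an un-awaited coroutine, which is the situation the
# commit message describes.
def fake_completion_entrypoint(use_async: bool):
    if use_async:
        return fake_acompletion()
    return {"object": "text_completion", "source": "sync provider"}

async def atext_completion(use_async: bool):
    loop = asyncio.get_running_loop()
    # Bind the executor result straight to `response`...
    response = await loop.run_in_executor(None, fake_completion_entrypoint, use_async)
    # ...and await it only if it turns out to be a coroutine, so `response`
    # is always bound by the time it is returned.
    if asyncio.iscoroutine(response):
        response = await response
    return response

if __name__ == "__main__":
    print(asyncio.run(atext_completion(use_async=True)))   # coroutine path, awaited
    print(asyncio.run(atext_completion(use_async=False)))  # plain result path

Run directly, both calls print a dict; the use_async=True call exercises the coroutine branch that previously threw.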
Krrish Dholakia
a3c7a340a5 fix(ollama.py): fix sync ollama streaming 2023-12-16 21:23:21 -08:00
Krrish Dholakia
13d088b72e feat(main.py): add support for image generation endpoint 2023-12-16 21:07:29 -08:00
Krrish Dholakia
5b4ca42de6 docs(routing.md): add docs on using caching groups across deployments 2023-12-15 21:51:59 -08:00
Krrish Dholakia
84ad9f441e feat(router.py): support caching groups 2023-12-15 21:45:51 -08:00
ishaan-jaff
287633887e (feat) add ollama/llava 2023-12-16 10:35:27 +05:30
Krrish Dholakia
add153d110 fix(huggingface_restapi.py): add support for additional hf embedding formats 2023-12-15 21:02:41 -08:00
Krrish Dholakia
cab870f73a fix(ollama.py): fix ollama async streaming for /completions calls 2023-12-15 09:28:32 -08:00
Krrish Dholakia
e45b491985 fix(main.py): log user for embedding calls 2023-12-14 15:14:37 -08:00
Krrish Dholakia
1608dd7e0b fix(main.py): support async streaming for text completions endpoint 2023-12-14 13:56:32 -08:00
ishaan-jaff
7945664e61 (feat) add mistral api 2023-12-14 18:17:48 +05:30
ishaan-jaff
9526ee16c4 (fix) aembedding - don't pop aembedding out 2023-12-14 17:13:35 +05:30
Ishaan Jaff
cbc8cb08c8 Merge pull request #1112 from Undertone0809/add-cs-return-type-for-completion
refactor: add CustomStreamWrapper return type for completion
2023-12-14 16:52:11 +05:30
Krrish Dholakia
7b8851cce5 fix(ollama.py): fix async completion calls for ollama 2023-12-13 13:10:25 -08:00
Krrish Dholakia
69c29f8f86 fix(vertex_ai.py): add support for real async streaming + completion calls 2023-12-13 11:53:55 -08:00
Krrish Dholakia
07015843ac fix(vertex_ai.py): support optional params + enable async calls for gemini 2023-12-13 11:01:23 -08:00
Krrish Dholakia
ef7a6e3ae1 feat(vertex_ai.py): adds support for gemini-pro on vertex ai 2023-12-13 10:26:30 -08:00
ishaan-jaff
86e626edab (feat) pass vertex_ai/ as custom_llm_provider 2023-12-13 19:02:24 +03:00
zeeland
79ea466cf5 refactor: add CustomStreamWrapper return type for completion 2023-12-13 22:57:19 +08:00
Krrish Dholakia
8b07a6c046 fix(main.py): pass user_id + encoding_format for logging + to openai/azure 2023-12-12 15:46:44 -08:00
Krrish Dholakia
632d6e0bff fix(utils.py): add more logging 2023-12-12 15:46:12 -08:00
Krrish Dholakia
d6669fe9e3 fix(utils.py): add more logging 2023-12-12 15:46:00 -08:00
ishaan-jaff
b6b88370ca (fix) from re import T - junk import 2023-12-12 12:26:15 -08:00
Krrish Dholakia
9cf5ab468f fix(router.py): deepcopy initial model list, don't mutate it 2023-12-12 09:54:06 -08:00
Krrish Dholakia
2c1c75fdf0 fix(ollama.py): enable parallel ollama completion calls 2023-12-11 23:18:37 -08:00
Krrish Dholakia
3b6099633c test(test_custom_callback_router.py): add async azure testing for router 2023-12-11 16:40:35 -08:00
Krrish Dholakia
ad39afc0ad test(test_custom_callback_input.py): embedding callback tests for azure, openai, bedrock 2023-12-11 15:32:46 -08:00
Krrish Dholakia
b09ecb986e test(test_custom_callback_input.py): add bedrock testing 2023-12-11 13:00:01 -08:00
Krrish Dholakia
ea89a8a938 test(test_custom_callback_unit.py): adding unit tests for custom callbacks + fixing related bugs 2023-12-11 11:44:09 -08:00
ishaan-jaff
91f09365e8 (feat) access metadata in embedding kwargs() 2023-12-11 09:39:25 -08:00
ishaan-jaff
ee3c9d19a2 (feat) caching + stream - bedrock 2023-12-11 08:43:50 -08:00
Krish Dholakia
cc4a1d2603 Merge pull request #985 from estill01/patch-1
Enable setting default `model` value for `LiteLLM`, `Chat`, `Completions`
2023-12-09 13:59:00 -08:00
ishaan-jaff
c8b699c0aa (feat) custom logger: async stream, assemble chunks 2023-12-09 10:10:48 -08:00
ishaan-jaff
14d57ec7de (feat) proxy: log model_info + proxy_server request 2023-12-08 14:26:18 -08:00
ishaan-jaff
93b1df1c79 (feat) embedding - pass model_info, proxy_server request 2023-12-08 14:26:18 -08:00
ishaan-jaff
be94a8c478 (feat) pass model_info, proxy_server_request to callback 2023-12-08 14:26:18 -08:00
ishaan-jaff
762f28e4d7 (fix) make print_verbose non blocking 2023-12-07 17:31:32 -08:00
Krrish Dholakia
c1e95740b0 fix(bedrock.py): fix output format for cohere embeddings 2023-12-06 22:47:01 -08:00