Commit graph

52 commits

Author SHA1 Message Date
Krrish Dholakia
5f93cae3ff feat(proxy_server.py): return litellm version in response headers 2024-05-08 16:00:08 -07:00
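A minimal sketch of what "return litellm version in response headers" could look like, assuming a FastAPI app (litellm's proxy is FastAPI-based); the header name `x-litellm-version` is illustrative, not confirmed from the commit:

```python
# Sketch: attach the installed litellm version to every proxy response.
# Header name is an assumption for illustration.
from importlib.metadata import version

from fastapi import FastAPI, Request

app = FastAPI()

@app.middleware("http")
async def add_version_header(request: Request, call_next):
    response = await call_next(request)
    response.headers["x-litellm-version"] = version("litellm")
    return response
```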
Ishaan Jaff
d399947111 Merge pull request #3470 from mbektas/fix-ollama-embeddings
support sync ollama embeddings
2024-05-07 19:21:37 -07:00
Mehmet Bektas
1236638266 support sync ollama embeddings 2024-05-05 19:44:25 -07:00
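A sketch of a synchronous embedding call against a local Ollama server, using Ollama's documented `/api/embeddings` endpoint; the model name is illustrative:

```python
# Sketch: sync embedding request to Ollama.
import requests

def ollama_embedding(prompt: str, model: str = "nomic-embed-text") -> list[float]:
    resp = requests.post(
        "http://localhost:11434/api/embeddings",
        json={"model": model, "prompt": prompt},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["embedding"]  # Ollama returns {"embedding": [...]}
```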
Jack Collins
218f15de60 Fix: get format from data not optional_params ollama non-stream completion 2024-05-05 18:59:26 -07:00
Jack Collins
5bc934303c Add missing import itertools.chain 2024-05-05 18:54:08 -07:00
Jack Collins
5393c5459e Fix: Set finish_reason to tool_calls for non-stream responses in ollama 2024-05-05 18:52:31 -07:00
Jack Collins
2d43423138 Parse streamed function calls as single delta in ollama 2024-05-05 18:52:20 -07:00
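The three commits above reshape how streamed Ollama function calls are surfaced. A rough sketch of the idea — buffer the streamed fragments, flatten them with itertools.chain, parse the JSON once, and emit a single delta with finish_reason "tool_calls". Names are illustrative, not litellm's actual internals:

```python
# Sketch: merge streamed function-call fragments into one delta.
import json
from itertools import chain

def merge_tool_call_chunks(chunk_groups):
    # chunk_groups: iterable of lists of raw text fragments; chain
    # flattens them so the full JSON payload can be parsed once.
    raw = "".join(chain.from_iterable(chunk_groups))
    call = json.loads(raw)
    return {
        "delta": {
            "tool_calls": [{
                "function": {
                    "name": call["name"],
                    "arguments": json.dumps(call.get("arguments", {})),
                }
            }]
        },
        "finish_reason": "tool_calls",
    }
```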
Krish Dholakia
52f43c8c2e Merge branch 'main' into litellm_ollama_tool_call_reponse 2024-05-01 10:24:05 -07:00
frob
df4fd2a7bf Disable special tokens in ollama completion when counting tokens
Some models (e.g., codegemma) don't return a prompt_eval_count field, so ollama.py tries to compute the value by encoding the prompt. Unfortunately, FIM symbols used in the prompt (e.g., "<|fim_prefix|>") cause the encoder to throw an exception, so we disable special-token processing.
2024-04-19 21:38:42 +02:00
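The failure mode this commit describes is reproducible with tiktoken directly: cl100k_base treats the FIM markers as special tokens, and `encode()` raises on them unless the special-token check is disabled. A minimal sketch of the workaround:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
prompt = "<|fim_prefix|>def add(a, b):<|fim_suffix|>    return a + b<|fim_middle|>"

# enc.encode(prompt) raises ValueError: FIM markers are special tokens.
# disallowed_special=() skips the check and encodes them as plain text.
tokens = enc.encode(prompt, disallowed_special=())
print(len(tokens))
```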
Gregory Nwosu
0609da96e3 created defaults for response["eval_count"]
There is no way in litellm to disable Ollama's cache, which is what strips the eval_count keys from the response JSON.
This PR allows the code to create sensible defaults for when the response is empty.
See:
- https://github.com/ollama/ollama/issues/1573
- https://github.com/ollama/ollama/issues/2023
2024-04-08 02:03:54 +01:00
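A sketch of the fallback described above, assuming `response_json` is the parsed Ollama reply; counting tokens locally with tiktoken when `eval_count`/`prompt_eval_count` are absent is an illustrative stand-in for whatever default litellm actually applies:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def usage_with_defaults(response_json: dict, prompt: str) -> dict:
    # Fall back to local token counts when Ollama omits the fields.
    completion_text = response_json.get("response", "")
    prompt_tokens = response_json.get(
        "prompt_eval_count", len(enc.encode(prompt, disallowed_special=()))
    )
    completion_tokens = response_json.get(
        "eval_count", len(enc.encode(completion_text, disallowed_special=()))
    )
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }
```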
DaxServer
947ba9d15b docs: Update references to Ollama repository URL
Updated references to the Ollama repository URL from https://github.com/jmorganca/ollama to https://github.com/ollama/ollama.
2024-03-31 19:35:37 +02:00
Krrish Dholakia
f7dd1758bb fix(ollama.py): fix type issue 2024-03-28 15:01:56 -07:00
onukura
1bd60287ba Add a feature to ollama aembedding to accept batch input 2024-03-27 21:39:19 +00:00
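A hedged sketch of accepting batch input in an async embedding path: since Ollama's `/api/embeddings` endpoint takes a single prompt per call, one plausible shape is fanning out one request per prompt with asyncio.gather:

```python
# Sketch: batch input for async Ollama embeddings via concurrent requests.
import asyncio
import aiohttp

async def ollama_aembedding(prompts: list[str], model: str = "nomic-embed-text"):
    async with aiohttp.ClientSession() as session:

        async def embed_one(prompt: str) -> list[float]:
            async with session.post(
                "http://localhost:11434/api/embeddings",
                json={"model": model, "prompt": prompt},
            ) as resp:
                resp.raise_for_status()
                data = await resp.json()
                return data["embedding"]

        # One request per prompt, issued concurrently.
        return await asyncio.gather(*(embed_one(p) for p in prompts))
```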
onukura
6ee8f26746 Fix ollama embedding response 2024-03-25 16:26:49 +00:00
Lunik
a1be265052 🐛 fix: Ollama vision models call arguments (like : llava)
Signed-off-by: Lunik <lunik@tiwabbit.fr>
2024-02-26 17:52:55 +01:00
Krrish Dholakia
220a90527f fix(ollama.py): support format for ollama 2024-02-06 10:11:52 -08:00
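Ollama's generate API accepts a `format` field (e.g., "json") that constrains output to valid JSON; the fix presumably threads that option through. An illustrative raw payload:

```python
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama2",
        "prompt": "List three primary colors as a JSON array.",
        "format": "json",   # ask Ollama to emit valid JSON
        "stream": False,
    },
    timeout=60,
)
print(resp.json()["response"])
```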
Ishaan Jaff
175c4000da Merge pull request #1750 from vanpelt/patch-2
Re-raise exception in async ollama streaming
2024-02-05 08:12:17 -08:00
Krrish Dholakia
a2bb95be59 refactor(ollama.py): trigger rebuild 2024-02-03 20:23:43 -08:00
Krrish Dholakia
56110188fd fix(ollama.py): fix api connection error
https://github.com/BerriAI/litellm/issues/1735
2024-02-03 20:22:33 -08:00
Chris Van Pelt
547b9beefc Re-raise exception in async ollama streaming 2024-02-01 16:14:07 -08:00
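The gist of the fix, as a sketch: an async streaming generator should propagate errors to the caller rather than swallowing them, otherwise a failed stream is indistinguishable from an empty one:

```python
# Sketch: surface errors from an async Ollama stream.
import logging

logger = logging.getLogger(__name__)

async def stream_chunks(resp):
    try:
        async for line in resp.content:
            if line.strip():
                yield line
    except Exception as e:
        logger.error("ollama stream failed: %s", e)
        raise  # propagate instead of silently ending the stream
```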
Krrish Dholakia
635a34b543 fix(utils.py): fix streaming chunks to not return role, unless set 2024-02-01 09:55:56 -08:00
TheDiscoMole
02a73e14a3 changing ollama response parsing to expected behaviour 2024-01-19 23:36:24 +01:00
ishaan-jaff
3081dc525a (feat) litellm.completion - support ollama timeout 2024-01-09 10:34:41 +05:30
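Usage sketch for the timeout support, matching the commit's `litellm.completion` reference; model name and value are illustrative:

```python
import litellm

# timeout (seconds) now applies to Ollama requests as well.
response = litellm.completion(
    model="ollama/llama2",
    messages=[{"role": "user", "content": "Hello"}],
    timeout=10,
)
```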
Krrish Dholakia
d89a58ec54 fix(ollama.py): use tiktoken as backup for prompt token counting 2024-01-09 09:47:18 +05:30
Krrish Dholakia
79978c44ba refactor: add black formatting 2023-12-25 14:11:20 +05:30
Krrish Dholakia
b7a7c3a4e5 feat(ollama.py): add support for async ollama embeddings 2023-12-23 18:01:25 +05:30
Krrish Dholakia
a65dfdde94 test(test_completion.py-+-test_streaming.py): add ollama endpoint to ci/cd pipeline 2023-12-22 12:21:33 +05:30
Krrish Dholakia
ae288c97fb fix(ollama.py): use litellm.request timeout for async call timeout 2023-12-22 11:22:24 +05:30
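One plausible shape of this fix: derive the aiohttp session timeout from the module-level `litellm.request_timeout` setting rather than a hard-coded constant. A sketch:

```python
import aiohttp
import litellm

async def post_with_configured_timeout(url: str, payload: dict) -> dict:
    # Use litellm's global request timeout for the async HTTP call.
    timeout = aiohttp.ClientTimeout(total=litellm.request_timeout)
    async with aiohttp.ClientSession(timeout=timeout) as session:
        async with session.post(url, json=payload) as resp:
            resp.raise_for_status()
            return await resp.json()
```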
Krrish Dholakia
636ac9b605 feat(ollama.py): add support for ollama function calling 2023-12-20 14:59:55 +05:30
ishaan-jaff
3c37e0d58b (fix) proxy + ollama - raise exception correctly 2023-12-19 18:48:34 +05:30
Joel Eriksson
afcc83bb15 Fix bug when iterating over lines in ollama response
async for line in resp.content.iter_any() will return
incomplete lines when the lines are long, and that
results in an exception being thrown by json.loads()
when it tries to parse the incomplete JSON

The default behavior of the stream reader for aiohttp
response objects is to iterate over lines, so just
removing .iter_any() fixes the bug
2023-12-17 20:23:26 +02:00
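The distinction the commit relies on: `resp.content` is an aiohttp StreamReader, and iterating it directly yields complete newline-delimited lines, while `.iter_any()` yields arbitrary-sized chunks that can split a JSON object mid-line. A sketch of the corrected loop:

```python
import json

async def parse_ndjson_stream(resp):
    # Iterating the StreamReader directly yields whole lines;
    # resp.content.iter_any() could split a JSON object across chunks.
    async for line in resp.content:
        line = line.strip()
        if line:
            yield json.loads(line)
```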
Krrish Dholakia
5f4310f592 fix(ollama.py): fix sync ollama streaming 2023-12-16 21:23:21 -08:00
Krrish Dholakia
87df233a19 fix(health.md): add background health check details to docs 2023-12-16 10:31:59 -08:00
Krrish Dholakia
1da7d35218 feat(proxy_server.py): enable infinite retries on rate limited requests 2023-12-15 20:03:41 -08:00
Krrish Dholakia
3d6ade8f26 fix(ollama.py): fix ollama async streaming for /completions calls 2023-12-15 09:28:32 -08:00
Krish Dholakia
c230fa4cd7 Merge pull request #1122 from emsi/main
Fix #1119, no content when streaming.
2023-12-14 10:01:00 -08:00
Krrish Dholakia
2231601d5a fix(ollama.py): fix async completion calls for ollama 2023-12-13 13:10:25 -08:00
Mariusz Woloszyn
3b643676d9 Fix #1119, no content when streaming. 2023-12-13 21:42:35 +01:00
Krrish Dholakia
e452aec9ad fix(ollama.py): add support for async streaming 2023-12-12 16:44:20 -08:00
ishaan-jaff
eec316f3bb (fix) tkinter import 2023-12-12 12:18:25 -08:00
Krrish Dholakia
b80a81b419 fix(ollama.py): enable parallel ollama completion calls 2023-12-11 23:18:37 -08:00
ishaan-jaff
d25d4d26bd (feat) debug ollama POST request 2023-11-14 17:53:48 -08:00
Krrish Dholakia
753c722c9f refactor(ai21,-aleph-alpha,-ollama): making ai21, aleph-alpha, ollama compatible with openai v1 sdk 2023-11-11 17:49:13 -08:00
ishaan-jaff
6e3654d309 (feat) completion ollama raise exception when ollama resp != 200 2023-11-10 08:54:05 -08:00
Krrish Dholakia
d0b23a2722 refactor(all-files): removing all print statements; adding pre-commit + flake8 to prevent future regressions 2023-11-04 12:50:15 -07:00
ishaan-jaff
960481a540 (feat) ollama raise Exceptions + use LiteLLM stream wrapper 2023-10-11 17:00:39 -07:00
Krrish Dholakia
37d7837b63 feat(ollama.py): exposing ollama config 2023-10-06 15:52:58 -07:00
Krrish Dholakia
694265798d push cli tool 2023-09-26 13:30:47 -07:00
ishaan-jaff
ebce57dc2e fix async import error 2023-09-21 11:16:50 -07:00
ishaan-jaff
6bfde2496c conditional import async_generator 2023-09-21 11:09:57 -07:00