
75 commits

Author SHA1 Message Date
Krish Dholakia
6fdee99632 LiteLLM Minor fixes + improvements (08/04/2024) (#5505)
* Minor IAM AWS OIDC Improvements (#5246)

* AWS IAM: Temporary tokens are valid across all regions after being issued, so it is wasteful to request one for each region.

* AWS IAM: Include an inline policy, to help reduce misuse of overly permissive IAM roles.

* (test_bedrock_completion.py): Ensure we are testing the cross-region AWS OIDC flow.

* fix(router.py): log rejected requests

Fixes https://github.com/BerriAI/litellm/issues/5498

* refactor: don't use verbose_logger.exception if the exception is raised

The user might already have handling for this, but alerting systems in prod will surface it as an unhandled error.

* fix(datadog.py): support setting datadog source as an env var

Fixes https://github.com/BerriAI/litellm/issues/5508

* docs(logging.md): add dd_source to datadog docs

* fix(proxy_server.py): expose `/customer/list` endpoint for showing all customers

* (bedrock): Fix usage with Cloudflare AI Gateway, and proxies in general. (#5509)

* feat(anthropic.py): support 'cache_control' param for content when it is a string

* Revert "(bedrock): Fix usage with Cloudflare AI Gateway, and proxies in gener…" (#5519)

This reverts commit 3fac0349c2.

* refactor: ci/cd run again

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
2024-09-04 22:16:55 -07:00
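The AWS IAM bullets in the commit above describe a pattern worth spelling out: request the web-identity credentials once, scope them with an inline session policy, and reuse them in every region. Below is a minimal sketch of that pattern with boto3; the role ARN, token, session name, and policy are placeholders, not litellm's actual values or code.

```python
import json

import boto3

# Placeholders; litellm reads the real values from its own config/env.
ROLE_ARN = "arn:aws:iam::123456789012:role/bedrock-oidc-role"
OIDC_TOKEN = "<web-identity-token>"  # e.g. read from a mounted service-account token

# Inline session policy: even if the role itself is overly permissive,
# these temporary credentials can only invoke Bedrock models.
SESSION_POLICY = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "bedrock:InvokeModel",
            "bedrock:InvokeModelWithResponseStream",
        ],
        "Resource": "*",
    }],
})

# One STS call; the returned temporary credentials work in every region.
creds = boto3.client("sts").assume_role_with_web_identity(
    RoleArn=ROLE_ARN,
    RoleSessionName="litellm-bedrock",
    WebIdentityToken=OIDC_TOKEN,
    Policy=SESSION_POLICY,
)["Credentials"]

# Reuse the same credentials for clients in different regions instead of
# requesting a fresh token per region.
for region in ("us-east-1", "eu-west-1"):
    bedrock = boto3.client(
        "bedrock-runtime",
        region_name=region,
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
```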
Krrish Dholakia
1e0b85dfc6 feat(ollama.py): support ollama /api/embed endpoint
Closes https://github.com/BerriAI/litellm/issues/5291
2024-08-20 09:10:08 -07:00
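For reference, a minimal sketch of calling the newer endpoint directly, assuming a local Ollama server on the default port and an already-pulled embedding model (the model name is a placeholder):

```python
import httpx

# Hedged sketch: call Ollama's newer /api/embed endpoint directly.
resp = httpx.post(
    "http://localhost:11434/api/embed",
    json={"model": "nomic-embed-text", "input": ["hello world", "goodbye world"]},
    timeout=30.0,
)
resp.raise_for_status()
print(resp.json()["embeddings"])  # one vector per input string
```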
Krrish Dholakia
a449661223 fix(ollama.py): fix ollama embeddings - pass optional params
Fixes https://github.com/BerriAI/litellm/issues/5267
2024-08-19 08:45:26 -07:00
Krrish Dholakia
2874b94fb1 refactor: replace .error() with .exception() logging for better debugging on Sentry 2024-08-16 09:22:47 -07:00
thiswillbeyourgithub
ac8967c07b fix: wrong order of arguments for ollama 2024-08-08 17:19:17 +02:00
Krrish Dholakia
879289b06e fix(ollama.py): correctly raise ollama streaming error
Fixes https://github.com/BerriAI/litellm/issues/4974
2024-07-30 15:01:26 -07:00
Titusz
c528f7df17 Add missing num_gpu ollama configuration parameter 2024-07-18 17:51:56 +02:00
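num_gpu is one of Ollama's runtime options (how many layers to offload to the GPU; 0 forces CPU). A hedged sketch of what the option looks like on the raw Ollama API, with a placeholder model name:

```python
import httpx

# Hedged sketch: num_gpu rides along in Ollama's "options" object.
resp = httpx.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",          # placeholder model name
        "prompt": "Why is the sky blue?",
        "stream": False,
        "options": {"num_gpu": 0},  # 0 = CPU only; higher values offload layers to the GPU
    },
    timeout=120.0,
)
print(resp.json()["response"])
```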
Krrish Dholakia
c69193c321 fix: move to using pydantic obj for setting values 2024-07-11 13:18:36 -07:00
corrm
93cb6d6175 chore: Improve the OllamaConfig get_required_params, ollama_acompletion, and ollama_async_streaming functions 2024-06-24 05:55:22 +03:00
Krish Dholakia
ea4334f760 Merge branch 'main' into litellm_cleanup_traceback 2024-06-06 16:32:08 -07:00
Krrish Dholakia
e391e30285 refactor: replace 'traceback.print_exc()' with logging library
allows error logs to be in json format for otel logging
2024-06-06 13:47:43 -07:00
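A minimal illustration of the change described: logger.exception() records the traceback through the logging pipeline, where a JSON formatter or OTEL exporter can serialize it, instead of traceback.print_exc() writing raw text to stderr. The function below is a hypothetical example, not litellm code.

```python
import json
import logging

logger = logging.getLogger(__name__)

def parse_response(raw: str) -> dict:
    # Hypothetical helper, for illustration only.
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Replaces traceback.print_exc(): the traceback now flows through the
        # configured handlers/formatters (e.g. a JSON formatter for OTEL)
        # instead of going straight to stderr as unstructured text.
        logger.exception("failed to parse response")
        raise
```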
sha-ahammed
93e7b9346c feat: Add Ollama as a provider in the proxy UI 2024-06-05 16:48:38 +05:30
KX
ddb998fac1 fix: add missing seed parameter to ollama input
The current ollama interface does not allow for seed, which is supported per https://github.com/ollama/ollama/blob/main/docs/api.md#parameters and https://github.com/ollama/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values

This resolves that by adding handling of the seed parameter.
2024-05-31 01:47:56 +08:00
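With seed mapped through to Ollama, a fixed seed plus temperature=0 should make a completion reproducible. A hedged sketch; the model name is a placeholder for any locally pulled model.

```python
import litellm

# Hedged sketch: seed + temperature=0 for a reproducible Ollama completion.
resp = litellm.completion(
    model="ollama/llama3",  # placeholder model name
    messages=[{"role": "user", "content": "Pick a number between 1 and 100."}],
    seed=42,
    temperature=0,
)
print(resp.choices[0].message.content)
```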
frob
5ab5f2c29e Merge branch 'BerriAI:main' into ollama-image-handling 2024-05-09 20:25:30 +02:00
Krrish Dholakia
5f93cae3ff feat(proxy_server.py): return litellm version in response headers 2024-05-08 16:00:08 -07:00
frob
4a6867879d Merge branch 'BerriAI:main' into ollama-image-handling 2024-05-09 00:14:29 +02:00
Ishaan Jaff
d399947111 Merge pull request #3470 from mbektas/fix-ollama-embeddings
support sync ollama embeddings
2024-05-07 19:21:37 -07:00
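A hedged sketch of the synchronous embedding call this PR enables, assuming the OpenAI-compatible response shape litellm returns; the model name is a placeholder.

```python
import litellm

# Hedged sketch: synchronous embeddings against a local Ollama model.
resp = litellm.embedding(
    model="ollama/nomic-embed-text",   # placeholder embedding model
    input=["litellm supports ollama embeddings"],
)
print(len(resp.data[0]["embedding"]))  # dimensionality of the returned vector
```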
frob
c6e6efc902 Merge branch 'BerriAI:main' into ollama-image-handling 2024-05-06 18:06:45 +02:00
Mehmet Bektas
1236638266 support sync ollama embeddings 2024-05-05 19:44:25 -07:00
Jack Collins
218f15de60 Fix: get format from data, not optional_params, in ollama non-stream completion 2024-05-05 18:59:26 -07:00
Jack Collins
5bc934303c Add missing import itertools.chain 2024-05-05 18:54:08 -07:00
Jack Collins
5393c5459e Fix: Set finish_reason to tool_calls for non-stream responses in ollama 2024-05-05 18:52:31 -07:00
Jack Collins
2d43423138 Parse streamed function calls as single delta in ollama 2024-05-05 18:52:20 -07:00
frob
74a0508682 Merge branch 'BerriAI:main' into ollama-image-handling 2024-05-01 22:29:37 +02:00
Krish Dholakia
52f43c8c2e Merge branch 'main' into litellm_ollama_tool_call_reponse 2024-05-01 10:24:05 -07:00
frob
b986fbfa13 Merge branch 'BerriAI:main' into ollama-image-handling 2024-04-21 01:49:10 +02:00
frob
df4fd2a7bf Disable special tokens in ollama completion when counting tokens
Some models (e.g., codegemma) don't return a prompt_eval_count field, so ollama.py tries to compute the value by encoding the prompt. Unfortunately, FIM symbols used in the prompt (e.g., "<|fim_prefix|>") cause the encoder to throw an exception, so we disable special-token processing.
2024-04-19 21:38:42 +02:00
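A small illustration of the workaround, assuming tiktoken's cl100k_base encoding is the fallback tokenizer: by default the encoder raises on special tokens such as the FIM markers, and passing disallowed_special=() treats them as plain text so the count still succeeds.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
prompt = "<|fim_prefix|>def add(a, b):<|fim_suffix|>\n<|fim_middle|>"

# Default behaviour: special tokens in the text raise a ValueError.
# enc.encode(prompt)  # would raise

# Treating the FIM markers as ordinary text avoids the exception, so the
# fallback token count still works.
tokens = enc.encode(prompt, disallowed_special=())
print(len(tokens))
```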
frob
e0276feaf5 Update comment. 2024-04-16 01:12:24 +02:00
frob
8535e9eb9d Merge branch 'BerriAI:main' into ollama-image-handling 2024-04-13 21:42:58 +02:00
frob
dadf356a8e ollama also accepts PNG 2024-04-08 03:35:02 +02:00
frob
1732478592 Update ollama.py for image handling
Ollama wants plain base64 JPEG images, but some clients send data URIs and/or WebP. Remove the prefixes and convert all non-JPEG images to JPEG.
2024-04-08 03:28:24 +02:00
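A sketch of the normalization described, using Pillow; it illustrates the approach, not the exact code in ollama.py.

```python
import base64
import io

from PIL import Image

def normalize_image(image: str) -> str:
    """Return plain base64-encoded JPEG, whatever format the client sent."""
    # Strip a data-URI prefix such as "data:image/webp;base64,".
    if image.startswith("data:"):
        image = image.split(",", 1)[1]
    raw = base64.b64decode(image)
    img = Image.open(io.BytesIO(raw))
    if img.format == "JPEG":
        return base64.b64encode(raw).decode()
    # Re-encode everything else (WebP, PNG, ...) as JPEG.
    buf = io.BytesIO()
    img.convert("RGB").save(buf, format="JPEG")
    return base64.b64encode(buf.getvalue()).decode()
```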
Gregory Nwosu
0609da96e3 created defaults for response["eval_count"]
There is no way in litellm to disable the Ollama cache that removes the eval_count keys from the response JSON.
This PR allows the code to create sensible defaults for when the response omits them.
See:
- https://github.com/ollama/ollama/issues/1573
- https://github.com/ollama/ollama/issues/2023
2024-04-08 02:03:54 +01:00
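A hedged sketch of the defaulting approach: when Ollama omits eval_count / prompt_eval_count, fall back to a locally computed token count. The function name and fallback tokenizer here are illustrative, not litellm's exact code.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def usage_from_response(response_json: dict, prompt: str, completion_text: str) -> dict:
    # .get() supplies a locally computed default when Ollama omits the key.
    prompt_tokens = response_json.get("prompt_eval_count", len(enc.encode(prompt)))
    completion_tokens = response_json.get("eval_count", len(enc.encode(completion_text)))
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }
```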
frob
5c000fb6b1 Update ollama.py for image handling
Some clients (e.g., librechat) send images in data-URI format, not plain base64. Strip off the prefix when passing images to ollama.
2024-04-07 13:05:39 +02:00
DaxServer
947ba9d15b docs: Update references to Ollama repository url
Updated references to the Ollama repository URL from https://github.com/jmorganca/ollama to https://github.com/ollama/ollama.
2024-03-31 19:35:37 +02:00
Krrish Dholakia
f7dd1758bb fix(ollama.py): fix type issue 2024-03-28 15:01:56 -07:00
onukura
1bd60287ba Add a feature to ollama aembedding to accept batch input 2024-03-27 21:39:19 +00:00
onukura
6ee8f26746 Fix ollama embedding response 2024-03-25 16:26:49 +00:00
Lunik
a1be265052 🐛 fix: Ollama vision model call arguments (like: llava)
Signed-off-by: Lunik <lunik@tiwabbit.fr>
2024-02-26 17:52:55 +01:00
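For context, a hedged sketch of what a vision call looks like on Ollama's own chat API: the image goes in as plain base64 alongside the text. Model and file names are placeholders.

```python
import base64

import httpx

# Placeholder image file; Ollama expects plain base64 strings in "images".
with open("cat.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

resp = httpx.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llava",   # placeholder vision model
        "stream": False,
        "messages": [
            {"role": "user", "content": "What is in this picture?", "images": [image_b64]},
        ],
    },
    timeout=120.0,
)
print(resp.json()["message"]["content"])
```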
Krrish Dholakia
220a90527f fix(ollama.py): support format for ollama 2024-02-06 10:11:52 -08:00
Ishaan Jaff
175c4000da Merge pull request #1750 from vanpelt/patch-2
Re-raise exception in async ollama streaming
2024-02-05 08:12:17 -08:00
Krrish Dholakia
a2bb95be59 refactor(ollama.py): trigger rebuild 2024-02-03 20:23:43 -08:00
Krrish Dholakia
56110188fd fix(ollama.py): fix api connection error
https://github.com/BerriAI/litellm/issues/1735
2024-02-03 20:22:33 -08:00
Chris Van Pelt
547b9beefc Re-raise exception in async ollama streaming 2024-02-01 16:14:07 -08:00
Krrish Dholakia
635a34b543 fix(utils.py): fix streaming chunks to not return role, unless set 2024-02-01 09:55:56 -08:00
TheDiscoMole
02a73e14a3 changing ollama response parsing to expected behaviour 2024-01-19 23:36:24 +01:00
ishaan-jaff
3081dc525a (feat) litellm.completion - support ollama timeout 2024-01-09 10:34:41 +05:30
Krrish Dholakia
d89a58ec54 fix(ollama.py): use tiktoken as backup for prompt token counting 2024-01-09 09:47:18 +05:30
Krrish Dholakia
79978c44ba refactor: add black formatting 2023-12-25 14:11:20 +05:30
Krrish Dholakia
b7a7c3a4e5 feat(ollama.py): add support for async ollama embeddings 2023-12-23 18:01:25 +05:30
Krrish Dholakia
a65dfdde94 test(test_completion.py-+-test_streaming.py): add ollama endpoint to ci/cd pipeline 2023-12-22 12:21:33 +05:30