Commit graph

2221 commits

Author SHA1 Message Date
Krrish Dholakia
f0c4ff6e60 fix(vertex_ai_anthropic.py): support streaming, async completion, async streaming for vertex ai anthropic 2024-04-05 09:27:48 -07:00
lazyhope
596d50a72a
Merge branch 'BerriAI:main' into anthropic-tools-use-2024-04-04 2024-04-05 23:51:03 +08:00
Caixiaopig
3d96e810b0
Updating the default Anthropic Official Claude 3 max_tokens to 4096
fix bug
2024-04-05 09:45:57 -05:00
Zihao Li
342073c212 Clean up imports of XML processing functions 2024-04-05 22:36:18 +08:00
Krish Dholakia
eb34306099
Merge pull request #2665 from BerriAI/litellm_claude_vertex_ai
[WIP] feat(vertex_ai_anthropic.py): Add support for claude 3 on vertex ai
2024-04-05 07:06:04 -07:00
Zihao Li
71fdf31790 Refactor tool result submission and tool invoke conversion 2024-04-05 17:11:35 +08:00
Zihao Li
d2cf9d2cf1 Move tool definitions from system prompt to parameter and refactor tool calling parse 2024-04-05 16:01:40 +08:00
Caixiaopig
09463bc584
Updating the default Anthropic Claude 3 max_tokens to 4096
The default value of max_tokens used to be 256. If the client does not set a larger value, the model's output may be truncated, so the default value has been changed to 4096. This value is also the maximum output value described in the official interface.
see: https://docs.anthropic.com/claude/reference/messages_post
2024-04-05 14:44:40 +08:00
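The default-value change described in this commit can be sketched as a simple defaults merge — a minimal illustration only, not LiteLLM's actual code; the function name and structure are hypothetical:

```python
# Hypothetical sketch: fill in max_tokens only when the caller did not
# set it, mirroring the described change of the default from 256 to 4096.
DEFAULT_MAX_TOKENS = 4096

def apply_anthropic_defaults(params: dict) -> dict:
    merged = dict(params)
    # setdefault leaves an explicitly supplied max_tokens untouched.
    merged.setdefault("max_tokens", DEFAULT_MAX_TOKENS)
    return merged
```

A client that passes its own `max_tokens` keeps that value; only omitted values pick up the 4096 default.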
Krrish Dholakia
475144e5b7 fix(openai.py): support passing prompt as list instead of concat string 2024-04-03 15:23:20 -07:00
Krrish Dholakia
15e0099948 fix(proxy_server.py): return original model response via response headers - /v1/completions
to help devs with debugging
2024-04-03 13:05:43 -07:00
Krrish Dholakia
1d341970ba feat(vertex_ai_anthropic.py): add claude 3 on vertex ai support - working .completions call
.completions() call works
2024-04-02 22:07:39 -07:00
yishiyiyuan
5faa493d35 🐞 fix: djl vllm support
support the vllm response format on sagemaker, which only returns one choice.
2024-04-03 11:00:51 +08:00
Krrish Dholakia
919ec86b2b fix(openai.py): switch to using openai sdk for text completion calls 2024-04-02 15:08:12 -07:00
Krrish Dholakia
b07788d2a5 fix(openai.py): return logprobs for text completion calls 2024-04-02 14:05:56 -07:00
jdhuang
b6f98e408f Add sync iterator 2024-04-02 20:14:37 +08:00
Krrish Dholakia
ceabf726b0 fix(main.py): support max retries for transcription calls 2024-04-01 18:37:53 -07:00
Emir Ayar
c0336d3f40 add a third condition: list of text-content dictionaries 2024-03-31 22:43:30 +02:00
DaxServer
61b6f8be44 docs: Update references to Ollama repository url
Updated references to the Ollama repository URL from https://github.com/jmorganca/ollama to https://github.com/ollama/ollama.
2024-03-31 19:35:37 +02:00
Krrish Dholakia
49642a5b00 fix(factory.py): parse list in xml tool calling response (anthropic)
improves tool calling output parsing to check whether the response contains a list. Also returns the raw response to the user via `response._hidden_params["original_response"]`, so the user can see exactly what anthropic returned
2024-03-29 11:51:26 -07:00
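Parsing a list of tool calls out of an Anthropic-style XML block can be sketched with the standard library — the XML shape (`<function_calls>`/`<invoke>`/`<tool_name>`/`<parameters>`) is assumed here for illustration and is not taken from LiteLLM's factory.py:

```python
import xml.etree.ElementTree as ET

# Hypothetical sketch: extract every <invoke> element, so responses that
# contain a list of tool calls are handled, not just a single call.
def parse_invokes(xml_text: str) -> list:
    root = ET.fromstring(xml_text)
    calls = []
    for invoke in root.findall("invoke"):
        name = invoke.findtext("tool_name")
        params = {p.tag: p.text for p in invoke.find("parameters")}
        calls.append({"name": name, "arguments": params})
    return calls
```

Iterating over `findall("invoke")` is what makes multiple tool calls in one response work; a `find("invoke")` would silently drop all but the first.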
Krish Dholakia
b6cf3321f7
Merge pull request #2640 from mnicstruwig/fix/fix-xml-function-args-parsing
Fix XML function calling args parsing.
2024-03-29 10:11:52 -07:00
Krrish Dholakia
109cd93a39 fix(sagemaker.py): support model_id consistently. support dynamic args for async calls 2024-03-29 09:05:00 -07:00
Krrish Dholakia
d547944556 fix(sagemaker.py): support 'model_id' param for sagemaker
allow passing the inference component param to sagemaker in the same format as we handle it for bedrock
2024-03-29 08:43:17 -07:00
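Attaching an optional inference-component name to a SageMaker invocation can be sketched as conditional kwargs construction — a hypothetical helper, assuming the SageMaker Runtime `InvokeEndpoint` parameter `InferenceComponentName`; this is not LiteLLM's actual sagemaker.py code:

```python
# Hypothetical sketch: only include InferenceComponentName when the
# caller supplied a model_id, so plain endpoints keep working unchanged.
def build_invoke_args(endpoint: str, body: bytes, model_id=None) -> dict:
    args = {
        "EndpointName": endpoint,
        "Body": body,
        "ContentType": "application/json",
    }
    if model_id is not None:
        # Routes the request to a specific inference component.
        args["InferenceComponentName"] = model_id
    return args
```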
Krrish Dholakia
48af367885 fix(ollama.py): fix type issue 2024-03-28 15:01:56 -07:00
Krish Dholakia
28905c85b6
Merge pull request #2720 from onukura/ollama-batch-embedding
Batch embedding for Ollama
2024-03-28 14:58:55 -07:00
Krrish Dholakia
1e856443e1 feat(proxy/utils.py): enable updating db in a separate server 2024-03-27 16:02:36 -07:00
onukura
f86472518d Add a feature to ollama aembedding to accept batch input 2024-03-27 21:39:19 +00:00
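Accepting either a single prompt or a batch in an embedding entry point usually comes down to input normalization — a minimal, hypothetical sketch, not the actual ollama `aembedding` code:

```python
# Hypothetical sketch: normalize embedding input so a single string and
# a batch (list of strings) both become a list of prompts to embed.
def normalize_embedding_input(inp) -> list:
    if isinstance(inp, str):
        return [inp]
    return list(inp)
```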
Krish Dholakia
d259c754ef
Merge pull request #2701 from rmann-nflx/main
Updating the default Claude3 max tokens
2024-03-27 10:14:20 -07:00
Rob Mann
e80aae5c30 Updating the default Claude3 max tokens 2024-03-26 11:46:59 -04:00
onukura
2df63cc621 Fix ollama embedding response 2024-03-25 16:26:49 +00:00
Lucca Zenobio
0c4e76ce11 Merge branch 'main' of https://github.com/themrzmaster/litellm 2024-03-25 13:08:26 -03:00
Lucca Zenobio
cda78a5da0 update 2024-03-25 13:08:17 -03:00
Krrish Dholakia
2fabff06c0 fix(bedrock.py): fix supported openai params for bedrock claude 3 2024-03-23 16:02:15 -07:00
Krrish Dholakia
05029fdcc7 feat(vertex_ai_anthropic.py): Add support for claude 3 on vertex ai 2024-03-23 15:53:04 -07:00
Krrish Dholakia
b9143a0a00 fix(factory.py): fix anthropic check 2024-03-23 00:27:24 -07:00
Krrish Dholakia
42a7588b04 fix(anthropic.py): support async claude 3 tool calling + streaming
https://github.com/BerriAI/litellm/issues/2644
2024-03-22 19:57:01 -07:00
Krrish Dholakia
691a83b7dc fix(anthropic.py): handle multiple system prompts 2024-03-22 18:14:15 -07:00
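Anthropic's Messages API takes a single top-level system string, so handling multiple system prompts typically means collecting and concatenating them — a hypothetical sketch of that translation, not the actual anthropic.py code:

```python
# Hypothetical sketch: pull every system message out of an OpenAI-style
# message list and join them into one system string for Anthropic.
def split_system_messages(messages: list) -> tuple:
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return "\n".join(system_parts), rest
```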
Krrish Dholakia
dfcc0c9ff0 fix(ollama_chat.py): don't pop from dictionary while iterating through it 2024-03-22 08:18:22 -07:00
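The bug class this commit fixes is a general Python pitfall: removing keys from a dict while iterating over it raises `RuntimeError: dictionary changed size during iteration`. The standard fix is to iterate over a snapshot of the keys — a generic illustration, not the actual ollama_chat.py code:

```python
# Iterate over a snapshot (list of keys) so mutating the dict inside
# the loop is safe; iterating d directly while popping would raise
# RuntimeError.
def drop_none_values(d: dict) -> dict:
    for key in list(d.keys()):
        if d[key] is None:
            d.pop(key)
    return d
```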
Michael Struwig
3adfb70fc9 Fix XML function calling args parsing. 2024-03-22 15:05:29 +02:00
Krrish Dholakia
94f55aa6d9 fix(bedrock.py): support claude 3 function calling when stream=true
https://github.com/BerriAI/litellm/issues/2615
2024-03-21 18:39:03 -07:00
Krish Dholakia
33a433eb0a
Merge branch 'main' into litellm_llm_api_prompt_injection_check 2024-03-21 09:57:10 -07:00
Krrish Dholakia
84a540f2d6 build: fix mypy build issues 2024-03-21 08:27:23 -07:00
Lucca Zenóbio
c2d2607272
Merge branch 'BerriAI:main' into main 2024-03-21 10:47:37 -03:00
Lucca Zenobio
0c0780be83 extra headers 2024-03-21 10:43:27 -03:00
Krrish Dholakia
d91f9a9f50 feat(proxy_server.py): enable llm api based prompt injection checks
run user calls through an llm api to check for prompt injection attacks. This happens in parallel to the actual llm call using `async_moderation_hook`
2024-03-20 22:43:42 -07:00
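Running the injection check in parallel with the actual LLM call can be sketched with `asyncio.gather` — both coroutines below are stand-ins, and the names are hypothetical rather than LiteLLM's actual hook implementation:

```python
import asyncio

# Stand-in for the moderation LLM call described in the commit.
async def moderation_check(prompt: str) -> bool:
    await asyncio.sleep(0)  # placeholder for a real async API call
    return "ignore previous" not in prompt.lower()

# Stand-in for the actual completion call.
async def llm_call(prompt: str) -> str:
    await asyncio.sleep(0)  # placeholder for a real async API call
    return f"response to: {prompt}"

# Both coroutines run concurrently; the moderation verdict gates the
# response only after both have finished.
async def guarded_completion(prompt: str) -> str:
    ok, response = await asyncio.gather(moderation_check(prompt), llm_call(prompt))
    if not ok:
        raise ValueError("prompt injection detected")
    return response
```

The point of `gather` here is that the moderation check adds no latency beyond the slower of the two calls, instead of running before the completion sequentially.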
Lucca Zenobio
872ff6176d updates 2024-03-20 15:22:23 -03:00
Krrish Dholakia
90e17b5422 fix(handle_jwt.py): track spend for user using jwt auth 2024-03-20 10:55:52 -07:00
Krrish Dholakia
524c244dd9 fix(utils.py): support response_format param for ollama
https://github.com/BerriAI/litellm/issues/2580
2024-03-19 21:07:20 -07:00
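Supporting the OpenAI-style `response_format` param for Ollama amounts to translating it into Ollama's `format` option (Ollama accepts `format="json"`) — a hypothetical mapping sketch, not the actual utils.py code:

```python
# Hypothetical sketch: pop response_format and, when it requests a JSON
# object, set Ollama's equivalent "format" option instead.
def map_response_format(optional_params: dict) -> dict:
    out = dict(optional_params)
    rf = out.pop("response_format", None)
    if rf is not None and rf.get("type") == "json_object":
        out["format"] = "json"
    return out
```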
Krish Dholakia
97130bb34b
Merge pull request #2558 from lucasmrdt/main
fix(anthropic): tool calling detection
2024-03-19 11:48:05 -07:00
Krish Dholakia
59a4e1bfb6
Merge branch 'main' into litellm_non_openai_tool_call_prompt 2024-03-18 18:29:36 -07:00
garfeildma
45d31e33aa support multiple system message tranlation for bedrock claude-3 2024-03-18 19:41:15 +08:00