Merge pull request #4896 from BerriAI/docs_add_example_usage_with_mistral_python
Docs Proxy - add example usage with mistral SDK with Proxy
commit 87cebdefd3
4 changed files with 204 additions and 160 deletions
@@ -255,6 +255,12 @@ litellm --config your_config.yaml

## Using LiteLLM Proxy - Curl Request, OpenAI Package, Langchain

:::info

LiteLLM is compatible with several SDKs - including the OpenAI SDK, Anthropic SDK, Mistral SDK, LlamaIndex, and Langchain (JS, Python)

[More examples here](user_keys)

:::

<Tabs>

<TabItem value="Curl" label="Curl Request">

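For reference, a minimal request against the proxy's OpenAI-compatible `/chat/completions` endpoint looks roughly like this (a sketch, assuming the proxy is running locally on port `4000`):

```shell
curl --location 'http://0.0.0.0:4000/chat/completions' \
    --header 'Content-Type: application/json' \
    --data '{
      "model": "gpt-3.5-turbo",
      "messages": [
        {"role": "user", "content": "this is a test request, write a short poem"}
      ]
    }'
```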
@@ -396,165 +402,6 @@ print(response)

- POST `/key/generate` - generate a key to access the proxy
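As an illustration, generating a key is a single POST against that endpoint (a sketch, assuming `sk-1234` is your proxy master key; the body fields shown are optional restrictions):

```shell
curl --location 'http://0.0.0.0:4000/key/generate' \
    --header 'Authorization: Bearer sk-1234' \
    --header 'Content-Type: application/json' \
    --data '{"models": ["gpt-3.5-turbo"], "duration": "30d"}'
```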
## Using with OpenAI compatible projects
Set `base_url` to the LiteLLM Proxy server

<Tabs>
<TabItem value="openai" label="OpenAI v1.0.0+">

```python
import openai
client = openai.OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:4000"
)

# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(model="gpt-3.5-turbo", messages = [
    {
        "role": "user",
        "content": "this is a test request, write a short poem"
    }
])

print(response)

```
</TabItem>
<TabItem value="librechat" label="LibreChat">
|
||||
|
||||
#### Start the LiteLLM proxy
|
||||
```shell
|
||||
litellm --model gpt-3.5-turbo
|
||||
|
||||
#INFO: Proxy running on http://0.0.0.0:4000
|
||||
```
|
||||
|
||||
#### 1. Clone the repo
|
||||
|
||||
```shell
|
||||
git clone https://github.com/danny-avila/LibreChat.git
|
||||
```
|
||||
|
||||
|
||||
#### 2. Modify Librechat's `docker-compose.yml`
|
||||
LiteLLM Proxy is running on port `4000`, set `4000` as the proxy below
|
||||
```yaml
|
||||
OPENAI_REVERSE_PROXY=http://host.docker.internal:4000/v1/chat/completions
|
||||
```
|
||||
|
||||
#### 3. Save fake OpenAI key in Librechat's `.env`
|
||||
|
||||
Copy Librechat's `.env.example` to `.env` and overwrite the default OPENAI_API_KEY (by default it requires the user to pass a key).
|
||||
```env
|
||||
OPENAI_API_KEY=sk-1234
|
||||
```
|
||||
|
||||
#### 4. Run LibreChat:
|
||||
```shell
|
||||
docker compose up
|
||||
```
|
||||
</TabItem>
|
||||
|
||||
<TabItem value="continue-dev" label="ContinueDev">
|
||||
|
||||
Continue-Dev brings ChatGPT to VSCode. See how to [install it here](https://continue.dev/docs/quickstart).
|
||||
|
||||
In the [config.py](https://continue.dev/docs/reference/Models/openai) set this as your default model.
|
||||
```python
|
||||
default=OpenAI(
|
||||
api_key="IGNORED",
|
||||
model="fake-model-name",
|
||||
context_length=2048, # customize if needed for your model
|
||||
api_base="http://localhost:4000" # your proxy server url
|
||||
),
|
||||
```
|
||||
|
||||
Credits [@vividfog](https://github.com/ollama/ollama/issues/305#issuecomment-1751848077) for this tutorial.
|
||||
</TabItem>
|
||||
|
||||
<TabItem value="aider" label="Aider">
|
||||
|
||||
```shell
|
||||
$ pip install aider
|
||||
|
||||
$ aider --openai-api-base http://0.0.0.0:4000 --openai-api-key fake-key
|
||||
```
|
||||
</TabItem>
|
||||
<TabItem value="autogen" label="AutoGen">
|
||||
|
||||
```python
|
||||
pip install pyautogen
|
||||
```
|
||||
|
||||
```python
|
||||
from autogen import AssistantAgent, UserProxyAgent, oai
|
||||
config_list=[
|
||||
{
|
||||
"model": "my-fake-model",
|
||||
"api_base": "http://localhost:4000", #litellm compatible endpoint
|
||||
"api_type": "open_ai",
|
||||
"api_key": "NULL", # just a placeholder
|
||||
}
|
||||
]
|
||||
|
||||
response = oai.Completion.create(config_list=config_list, prompt="Hi")
|
||||
print(response) # works fine
|
||||
|
||||
llm_config={
|
||||
"config_list": config_list,
|
||||
}
|
||||
|
||||
assistant = AssistantAgent("assistant", llm_config=llm_config)
|
||||
user_proxy = UserProxyAgent("user_proxy")
|
||||
user_proxy.initiate_chat(assistant, message="Plot a chart of META and TESLA stock price change YTD.", config_list=config_list)
|
||||
```
|
||||
|
||||
Credits [@victordibia](https://github.com/microsoft/autogen/issues/45#issuecomment-1749921972) for this tutorial.
|
||||
</TabItem>
|
||||
|
||||
<TabItem value="guidance" label="guidance">
|
||||
A guidance language for controlling large language models.
|
||||
https://github.com/guidance-ai/guidance
|
||||
|
||||
**NOTE:** Guidance sends additional params like `stop_sequences` which can cause some models to fail if they don't support it.
|
||||
|
||||
**Fix**: Start your proxy using the `--drop_params` flag
|
||||
|
||||
```shell
|
||||
litellm --model ollama/codellama --temperature 0.3 --max_tokens 2048 --drop_params
|
||||
```
|
||||
|
||||
```python
|
||||
import guidance
|
||||
|
||||
# set api_base to your proxy
|
||||
# set api_key to anything
|
||||
gpt4 = guidance.llms.OpenAI("gpt-4", api_base="http://0.0.0.0:4000", api_key="anything")
|
||||
|
||||
experts = guidance('''
|
||||
{{#system~}}
|
||||
You are a helpful and terse assistant.
|
||||
{{~/system}}
|
||||
|
||||
{{#user~}}
|
||||
I want a response to the following question:
|
||||
{{query}}
|
||||
Name 3 world-class experts (past or present) who would be great at answering this?
|
||||
Don't answer the question yet.
|
||||
{{~/user}}
|
||||
|
||||
{{#assistant~}}
|
||||
{{gen 'expert_names' temperature=0 max_tokens=300}}
|
||||
{{~/assistant}}
|
||||
''', llm=gpt4)
|
||||
|
||||
result = experts(query='How can I be more productive?')
|
||||
print(result)
|
||||
```
|
||||
</TabItem>
|
||||
</Tabs>
|
||||
|
||||
## Debugging Proxy
|
||||
|
||||
Events that occur during normal operation
|
||||
|
|
|
@@ -1,7 +1,7 @@
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# Use with Langchain, OpenAI SDK, LlamaIndex, Instructor, Curl
# 💡 Use with Langchain, OpenAI SDK, LlamaIndex, Instructor, Curl

:::info
@@ -234,6 +234,26 @@ main();
```

</TabItem>
<TabItem value="mistral-py" label="Mistral Python SDK">

```python
import os
from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage


client = MistralClient(api_key="sk-1234", endpoint="http://0.0.0.0:4000")
chat_response = client.chat(
    model="mistral-small-latest",
    messages=[
        {"role": "user", "content": "this is a test request, write a short poem"}
    ],
)
print(chat_response.choices[0].message.content)
```
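This assumes the proxy `config.yaml` maps `mistral-small-latest` to a Mistral deployment; a sketch of the corresponding entry (mirroring the config change later in this commit):

```yaml
model_list:
  - model_name: mistral-small-latest
    litellm_params:
      model: mistral/mistral-small-latest
      api_key: "os.environ/MISTRAL_API_KEY"
```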

</TabItem>

<TabItem value="instructor" label="Instructor">

```python

@@ -566,6 +586,166 @@ curl --location 'http://0.0.0.0:4000/moderations' \

```
## Using with OpenAI compatible projects
Set `base_url` to the LiteLLM Proxy server

<Tabs>
<TabItem value="openai" label="OpenAI v1.0.0+">

```python
import openai
client = openai.OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:4000"
)

# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(model="gpt-3.5-turbo", messages = [
    {
        "role": "user",
        "content": "this is a test request, write a short poem"
    }
])

print(response)

```
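Streaming works through the proxy as well; a minimal sketch reusing the same client:

```python
# stream the completion token-by-token through the proxy
stream = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "this is a test request, write a short poem"}],
    stream=True,
)
for chunk in stream:
    # delta.content is None for some chunks (e.g. the final one)
    print(chunk.choices[0].delta.content or "", end="")
```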
</TabItem>
<TabItem value="librechat" label="LibreChat">

#### Start the LiteLLM proxy
```shell
litellm --model gpt-3.5-turbo

#INFO: Proxy running on http://0.0.0.0:4000
```

#### 1. Clone the repo

```shell
git clone https://github.com/danny-avila/LibreChat.git
```


#### 2. Modify Librechat's `docker-compose.yml`
LiteLLM Proxy runs on port `4000`; point LibreChat's reverse proxy at it:
```yaml
OPENAI_REVERSE_PROXY=http://host.docker.internal:4000/v1/chat/completions
```
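For example, in LibreChat's `docker-compose.yml` the variable sits under the API service's `environment` block (a sketch; the service name and surrounding settings may differ in your checkout):

```yaml
services:
  api:
    # ...existing LibreChat settings...
    environment:
      - OPENAI_REVERSE_PROXY=http://host.docker.internal:4000/v1/chat/completions
```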

#### 3. Save fake OpenAI key in Librechat's `.env`

Copy Librechat's `.env.example` to `.env` and overwrite the default OPENAI_API_KEY (by default LibreChat requires the user to pass a key).
```env
OPENAI_API_KEY=sk-1234
```

#### 4. Run LibreChat
```shell
docker compose up
```
</TabItem>

<TabItem value="continue-dev" label="ContinueDev">

Continue-Dev brings ChatGPT to VSCode. See how to [install it here](https://continue.dev/docs/quickstart).

In the [config.py](https://continue.dev/docs/reference/Models/openai), set this as your default model.
```python
default=OpenAI(
    api_key="IGNORED",
    model="fake-model-name",
    context_length=2048, # customize if needed for your model
    api_base="http://localhost:4000" # your proxy server url
),
```

Credits [@vividfog](https://github.com/ollama/ollama/issues/305#issuecomment-1751848077) for this tutorial.
</TabItem>

<TabItem value="aider" label="Aider">

```shell
$ pip install aider

$ aider --openai-api-base http://0.0.0.0:4000 --openai-api-key fake-key
```
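If the proxy serves several models, aider can be pinned to one of them; a sketch, assuming your aider version exposes a `--model` flag (check `aider --help`):

```shell
$ aider --openai-api-base http://0.0.0.0:4000 --openai-api-key fake-key --model gpt-3.5-turbo
```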
</TabItem>
<TabItem value="autogen" label="AutoGen">

```shell
pip install pyautogen
```

```python
from autogen import AssistantAgent, UserProxyAgent, oai
config_list=[
    {
        "model": "my-fake-model",
        "api_base": "http://localhost:4000", #litellm compatible endpoint
        "api_type": "open_ai",
        "api_key": "NULL", # just a placeholder
    }
]

response = oai.Completion.create(config_list=config_list, prompt="Hi")
print(response) # works fine

llm_config={
    "config_list": config_list,
}

assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent("user_proxy")
user_proxy.initiate_chat(assistant, message="Plot a chart of META and TESLA stock price change YTD.", config_list=config_list)
```

Credits [@victordibia](https://github.com/microsoft/autogen/issues/45#issuecomment-1749921972) for this tutorial.
</TabItem>

<TabItem value="guidance" label="guidance">
A guidance language for controlling large language models.
https://github.com/guidance-ai/guidance

**NOTE:** Guidance sends additional params like `stop_sequences`, which can cause some models to fail if they don't support them.

**Fix**: Start your proxy using the `--drop_params` flag

```shell
litellm --model ollama/codellama --temperature 0.3 --max_tokens 2048 --drop_params
```

```python
import guidance

# set api_base to your proxy
# set api_key to anything
gpt4 = guidance.llms.OpenAI("gpt-4", api_base="http://0.0.0.0:4000", api_key="anything")

experts = guidance('''
{{#system~}}
You are a helpful and terse assistant.
{{~/system}}

{{#user~}}
I want a response to the following question:
{{query}}
Name 3 world-class experts (past or present) who would be great at answering this?
Don't answer the question yet.
{{~/user}}

{{#assistant~}}
{{gen 'expert_names' temperature=0 max_tokens=300}}
{{~/assistant}}
''', llm=gpt4)

result = experts(query='How can I be more productive?')
print(result)
```
</TabItem>
</Tabs>


## Advanced

### (BETA) Batch Completions - pass multiple models
@@ -8,6 +8,10 @@ model_list:
    litellm_params:
      model: fireworks_ai/accounts/fireworks/models/llama-v3-70b-instruct
      api_key: "os.environ/FIREWORKS"
  - model_name: mistral-small-latest
    litellm_params:
      model: mistral/mistral-small-latest
      api_key: "os.environ/MISTRAL_API_KEY"
  - model_name: tts
    litellm_params:
      model: openai/tts-1
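
With the `mistral-small-latest` entry above in place, any OpenAI-style call through the proxy can target that model name; an illustrative sketch, assuming the proxy is running on port `4000` and `sk-1234` is a valid key:

```shell
curl --location 'http://0.0.0.0:4000/chat/completions' \
    --header 'Authorization: Bearer sk-1234' \
    --header 'Content-Type: application/json' \
    --data '{
      "model": "mistral-small-latest",
      "messages": [{"role": "user", "content": "this is a test request, write a short poem"}]
    }'
```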

litellm/proxy/tests/test_mistral_sdk.py (new file, 13 lines)

@@ -0,0 +1,13 @@
import os

from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage

client = MistralClient(api_key="sk-1234", endpoint="http://0.0.0.0:4000")
chat_response = client.chat(
    model="mistral-small-latest",
    messages=[
        {"role": "user", "content": "this is a test request, write a short poem"}
    ],
)
print(chat_response.choices[0].message.content)
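
To exercise the script manually (a sketch of the workflow; assumes a proxy config that includes `mistral-small-latest` and accepts the `sk-1234` test key):

```shell
# start the proxy with your config in one terminal
litellm --config /path/to/config.yaml

# then run the SDK script against it
python litellm/proxy/tests/test_mistral_sdk.py
```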