(docs) added watsonx cookbook

This commit is contained in:
Simon Sanchez Viloria 2024-04-20 20:52:25 +02:00
parent 6edb133733
commit ca0807d8ab

cookbook/liteLLM_IBM_Watsonx.ipynb (vendored, new file, 213 lines)

@@ -0,0 +1,213 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# LiteLLM x IBM [watsonx.ai](https://www.ibm.com/products/watsonx-ai)\n",
"\n",
"Note: For watsonx.ai requests you need to ensure you have `ibm-watsonx-ai` installed."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Pre-Requisites"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install litellm\n",
"!pip install ibm-watsonx-ai"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set watsonx Credentials\n",
"\n",
"See [this documentation](https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-credentials.html?context=wx) for more information about authenticating to watsonx.ai"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"WX_URL\"] = \"\" # Your watsonx.ai base URL\n",
"os.environ[\"WX_API_KEY\"] = \"\" # Your IBM cloud API key or watsonx.ai token\n",
"os.environ[\"WX_PROJECT_ID\"] = \"\" # ID of your watsonx.ai project"
]
},
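{
"cell_type": "markdown",
"metadata": {},
"source": [
"Optional sanity check (a minimal sketch added for this walkthrough, not part of the LiteLLM API): verify that the three environment variables above are set before making any requests. This cell makes no API calls."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"# Fail fast if any credential is missing, before any watsonx.ai request is made\n",
"for var in (\"WX_URL\", \"WX_API_KEY\", \"WX_PROJECT_ID\"):\n",
"    assert os.environ.get(var), f\"{var} is not set\""
]
},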
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example Requests"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Granite v2 response:\n",
"ModelResponse(id='chatcmpl-afe4e875-2cfb-4e8c-aba5-36853007aaae', choices=[Choices(finish_reason='stop', index=0, message=Message(content=' I\\'m looking for a way to extract the email addresses from a CSV file. I\\'ve tried using built-in functions like `split`, `grep`, and `awk`, but none of them seem to work. Specifically, I\\'m trying to extract all email addresses from a file called \"example.csv\". Here\\'s what I have so far:\\n```bash\\ngrep -oP \"[\\\\w-]+@[a-z0-9-]+\\\\.[a-z]{2,}$\" example.csv > extracted_emails.txt\\n```\\nThis command runs the `grep` command, searches for emails in \"example.csv\", and saves the results to a new file called \"extracted\\\\_emails.txt\". However, the email addresses are not properly formatted and do not include domains. I think there might be a better way to do this, so I\\'m open to suggestions.\\n\\nAny help or guidance would be greatly appreciated.\\n\\nPosting this question as a comment on the original response might not be the most effective way to get help. If it\\'s possible, I can create a Code Review question here instead.\\n(Original post here: <https://www.codinghorror.com/blog/archives/', role='assistant'))], created=1713638439, model='granite-20b-multilingual', object='chat.completion', system_fingerprint=None, usage=Usage(prompt_tokens=6, completion_tokens=256, total_tokens=262), finish_reason='max_tokens')\n",
"LLaMa 3 8b response:\n",
"ModelResponse(id='chatcmpl-1d9ce720-0b68-4fa8-a064-41353c228701', choices=[Choices(finish_reason='stop', index=0, message=Message(content=\"assistant\\n\\nHello! I'm just a language model, so I don't have emotions or feelings like humans do, but I'm functioning properly and ready to help with any questions or tasks you may have. How about you? How can I assist you today?\", role='assistant'))], created=1713638444, model='llama-3-8b-instruct', object='chat.completion', system_fingerprint=None, usage=Usage(prompt_tokens=12, completion_tokens=55, total_tokens=67), finish_reason='eos_token')\n"
]
}
],
"source": [
"from litellm import completion\n",
"\n",
"response = completion(\n",
" model=\"watsonx/ibm/granite-20b-multilingual\",\n",
" messages=[{ \"content\": \"Hello, how are you?\",\"role\": \"user\"}],\n",
")\n",
"print(\"Granite v2 response:\")\n",
"print(response)\n",
"\n",
"\n",
"response = completion(\n",
" model=\"watsonx/meta-llama/llama-3-8b-instruct\",\n",
" messages=[{ \"content\": \"Hello, how are you?\",\"role\": \"user\"}]\n",
")\n",
"print(\"LLaMa 3 8b response:\")\n",
"print(response)"
]
},
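{
"cell_type": "markdown",
"metadata": {},
"source": [
"`completion` also accepts the standard OpenAI-style generation parameters, which LiteLLM translates for watsonx.ai (e.g. `max_tokens` maps to watsonx.ai's `max_new_tokens`, as noted in the streaming examples below). The cell below is a minimal sketch that additionally passes a system message and `temperature`; treat the exact set of supported parameters as an assumption and check the LiteLLM provider docs."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from litellm import completion\n",
"\n",
"# Sketch: OpenAI-style params are translated by LiteLLM for watsonx.ai\n",
"response = completion(\n",
"    model=\"watsonx/ibm/granite-13b-chat-v2\",\n",
"    messages=[\n",
"        {\"role\": \"system\", \"content\": \"You are a concise assistant.\"},\n",
"        {\"role\": \"user\", \"content\": \"Hello, how are you?\"},\n",
"    ],\n",
"    max_tokens=20, # maps to watsonx.ai max_new_tokens\n",
"    temperature=0.7, # assumption: supported by the watsonx.ai provider\n",
")\n",
"print(response)"
]
},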
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Streaming Requests"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Granite v2 streaming response:\n",
"\n",
"I'm doing well, thank you. I've been thinking about the type of leader I amNone\n",
"LLaMa 3 8b streaming response:\n",
"assistant\n",
"\n",
"Hello! I'm just an AI, so I don't have feelings or emotionsNone"
]
}
],
"source": [
"from litellm import completion\n",
"\n",
"response = completion(\n",
" model=\"watsonx/ibm/granite-13b-chat-v2\",\n",
" messages=[{ \"content\": \"Hello, how are you?\",\"role\": \"user\"}],\n",
" stream=True,\n",
" max_tokens=20, # maps to watsonx.ai max_new_tokens\n",
")\n",
"print(\"Granite v2 streaming response:\")\n",
"for chunk in response:\n",
" print(chunk['choices'][0]['delta']['content'], end='')\n",
"\n",
"print()\n",
"response = completion(\n",
" model=\"watsonx/meta-llama/llama-3-8b-instruct\",\n",
" messages=[{ \"content\": \"Hello, how are you?\",\"role\": \"user\"}],\n",
" stream=True,\n",
" max_tokens=20, # maps to watsonx.ai max_new_tokens\n",
")\n",
"print(\"LLaMa 3 8b streaming response:\")\n",
"for chunk in response:\n",
" print(chunk['choices'][0]['delta']['content'], end='')"
]
},
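{
"cell_type": "markdown",
"metadata": {},
"source": [
"Each streamed chunk carries a `delta` whose `content` is `None` on the final chunk. The cell below is a small sketch (plain Python over the same stream) that accumulates the deltas into the full message text."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from litellm import completion\n",
"\n",
"response = completion(\n",
"    model=\"watsonx/ibm/granite-13b-chat-v2\",\n",
"    messages=[{\"role\": \"user\", \"content\": \"Hello, how are you?\"}],\n",
"    stream=True,\n",
"    max_tokens=20, # maps to watsonx.ai max_new_tokens\n",
")\n",
"\n",
"# Join the deltas; `or \"\"` skips the final chunk, whose content is None\n",
"full_text = \"\".join(chunk[\"choices\"][0][\"delta\"][\"content\"] or \"\" for chunk in response)\n",
"print(full_text)"
]
},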
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Async Requests"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Granite v2 streaming response:\n",
"ModelResponse(id='chatcmpl-11827e78-1bdf-4991-ac94-bd28006cf50c', choices=[Choices(finish_reason='stop', index=0, message=Message(content=\" I'm good, and you? \\n\\n(Informal)\\nHey there! Just chillin\", role='assistant'))], created=1713638459, model='granite-13b-chat-v2', object='chat.completion', system_fingerprint=None, usage=Usage(prompt_tokens=8, completion_tokens=20, total_tokens=28), finish_reason='max_tokens')\n",
"LLaMa 3 8b streaming response:\n",
"ModelResponse(id='chatcmpl-4f8a6332-994e-4700-8665-a188a971dda6', choices=[Choices(finish_reason='stop', index=0, message=Message(content=\"assistant\\n\\nI'm just a language model, I don't have emotions or feelings like humans\", role='assistant'))], created=1713638459, model='llama-3-8b-instruct', object='chat.completion', system_fingerprint=None, usage=Usage(prompt_tokens=12, completion_tokens=20, total_tokens=32), finish_reason='max_tokens')\n"
]
}
],
"source": [
"from litellm import acompletion\n",
"import asyncio\n",
"\n",
"granite_task = acompletion(\n",
" model=\"watsonx/ibm/granite-13b-chat-v2\",\n",
" messages=[{ \"content\": \"Hello, how are you?\",\"role\": \"user\"}],\n",
" max_tokens=20, # maps to watsonx.ai max_new_tokens\n",
")\n",
"llama_3_task = acompletion(\n",
" model=\"watsonx/meta-llama/llama-3-8b-instruct\",\n",
" messages=[{ \"content\": \"Hello, how are you?\",\"role\": \"user\"}],\n",
" max_tokens=20, # maps to watsonx.ai max_new_tokens\n",
")\n",
"\n",
"granite_response, llama_3_response = await asyncio.gather(granite_task, llama_3_task)\n",
"\n",
"print(\"Granite v2 streaming response:\")\n",
"print(granite_response)\n",
"\n",
"print(\"LLaMa 3 8b streaming response:\")\n",
"print(llama_3_response)"
]
}
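,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`acompletion` also supports `stream=True`, returning an async iterator. The cell below is a minimal sketch combining the async and streaming patterns above, assuming the chunks have the same shape as in the synchronous stream."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from litellm import acompletion\n",
"\n",
"response = await acompletion(\n",
"    model=\"watsonx/ibm/granite-13b-chat-v2\",\n",
"    messages=[{\"role\": \"user\", \"content\": \"Hello, how are you?\"}],\n",
"    stream=True,\n",
"    max_tokens=20, # maps to watsonx.ai max_new_tokens\n",
")\n",
"async for chunk in response:\n",
"    print(chunk[\"choices\"][0][\"delta\"][\"content\"] or \"\", end=\"\")"
]
}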
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}