{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# LiteLLM x IBM [watsonx.ai](https://www.ibm.com/products/watsonx-ai)\n",
    "\n",
    "Note: watsonx.ai requests require the `ibm-watsonx-ai` package to be installed."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Prerequisites"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install litellm\n",
    "!pip install ibm-watsonx-ai"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Set watsonx Credentials\n",
    "\n",
    "See [this documentation](https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-credentials.html?context=wx) for more information about authenticating to watsonx.ai."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"WX_URL\"] = \"\"  # Your watsonx.ai base URL\n",
    "os.environ[\"WX_API_KEY\"] = \"\"  # Your IBM Cloud API key or watsonx.ai token\n",
    "os.environ[\"WX_PROJECT_ID\"] = \"\"  # ID of your watsonx.ai project"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example Requests"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Granite v2 response:\n",
      "ModelResponse(id='chatcmpl-afe4e875-2cfb-4e8c-aba5-36853007aaae', choices=[Choices(finish_reason='stop', index=0, message=Message(content=' I\\'m looking for a way to extract the email addresses from a CSV file. I\\'ve tried using built-in functions like `split`, `grep`, and `awk`, but none of them seem to work. Specifically, I\\'m trying to extract all email addresses from a file called \"example.csv\". Here\\'s what I have so far:\\n```bash\\ngrep -oP \"[\\\\w-]+@[a-z0-9-]+\\\\.[a-z]{2,}$\" example.csv > extracted_emails.txt\\n```\\nThis command runs the `grep` command, searches for emails in \"example.csv\", and saves the results to a new file called \"extracted\\\\_emails.txt\". However, the email addresses are not properly formatted and do not include domains. I think there might be a better way to do this, so I\\'m open to suggestions.\\n\\nAny help or guidance would be greatly appreciated.\\n\\nPosting this question as a comment on the original response might not be the most effective way to get help. If it\\'s possible, I can create a Code Review question here instead.\\n(Original post here: <https://www.codinghorror.com/blog/archives/', role='assistant'))], created=1713638439, model='granite-20b-multilingual', object='chat.completion', system_fingerprint=None, usage=Usage(prompt_tokens=6, completion_tokens=256, total_tokens=262), finish_reason='max_tokens')\n",
      "LLaMa 3 8b response:\n",
      "ModelResponse(id='chatcmpl-1d9ce720-0b68-4fa8-a064-41353c228701', choices=[Choices(finish_reason='stop', index=0, message=Message(content=\"assistant\\n\\nHello! I'm just a language model, so I don't have emotions or feelings like humans do, but I'm functioning properly and ready to help with any questions or tasks you may have. How about you? How can I assist you today?\", role='assistant'))], created=1713638444, model='llama-3-8b-instruct', object='chat.completion', system_fingerprint=None, usage=Usage(prompt_tokens=12, completion_tokens=55, total_tokens=67), finish_reason='eos_token')\n"
     ]
    }
   ],
   "source": [
    "from litellm import completion\n",
    "\n",
    "response = completion(\n",
    "    model=\"watsonx/ibm/granite-20b-multilingual\",\n",
    "    messages=[{\"role\": \"user\", \"content\": \"Hello, how are you?\"}],\n",
    ")\n",
    "print(\"Granite v2 response:\")\n",
    "print(response)\n",
    "\n",
    "response = completion(\n",
    "    model=\"watsonx/meta-llama/llama-3-8b-instruct\",\n",
    "    messages=[{\"role\": \"user\", \"content\": \"Hello, how are you?\"}],\n",
    ")\n",
    "print(\"LLaMa 3 8b response:\")\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Streaming Requests"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Granite v2 streaming response:\n",
      "\n",
      "I'm doing well, thank you. I've been thinking about the type of leader I am\n",
      "LLaMa 3 8b streaming response:\n",
      "assistant\n",
      "\n",
      "Hello! I'm just an AI, so I don't have feelings or emotions"
     ]
    }
   ],
   "source": [
    "from litellm import completion\n",
    "\n",
    "response = completion(\n",
    "    model=\"watsonx/ibm/granite-13b-chat-v2\",\n",
    "    messages=[{\"role\": \"user\", \"content\": \"Hello, how are you?\"}],\n",
    "    stream=True,\n",
    "    max_tokens=20,  # maps to watsonx.ai max_new_tokens\n",
    ")\n",
    "print(\"Granite v2 streaming response:\")\n",
    "for chunk in response:\n",
    "    # The final chunk's delta content is None, so fall back to an empty string\n",
    "    print(chunk['choices'][0]['delta']['content'] or '', end='')\n",
    "\n",
    "print()\n",
    "response = completion(\n",
    "    model=\"watsonx/meta-llama/llama-3-8b-instruct\",\n",
    "    messages=[{\"role\": \"user\", \"content\": \"Hello, how are you?\"}],\n",
    "    stream=True,\n",
    "    max_tokens=20,  # maps to watsonx.ai max_new_tokens\n",
    ")\n",
    "print(\"LLaMa 3 8b streaming response:\")\n",
    "for chunk in response:\n",
    "    print(chunk['choices'][0]['delta']['content'] or '', end='')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Async Requests"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Granite v2 response:\n",
      "ModelResponse(id='chatcmpl-11827e78-1bdf-4991-ac94-bd28006cf50c', choices=[Choices(finish_reason='stop', index=0, message=Message(content=\" I'm good, and you? \\n\\n(Informal)\\nHey there! Just chillin\", role='assistant'))], created=1713638459, model='granite-13b-chat-v2', object='chat.completion', system_fingerprint=None, usage=Usage(prompt_tokens=8, completion_tokens=20, total_tokens=28), finish_reason='max_tokens')\n",
      "LLaMa 3 8b response:\n",
      "ModelResponse(id='chatcmpl-4f8a6332-994e-4700-8665-a188a971dda6', choices=[Choices(finish_reason='stop', index=0, message=Message(content=\"assistant\\n\\nI'm just a language model, I don't have emotions or feelings like humans\", role='assistant'))], created=1713638459, model='llama-3-8b-instruct', object='chat.completion', system_fingerprint=None, usage=Usage(prompt_tokens=12, completion_tokens=20, total_tokens=32), finish_reason='max_tokens')\n"
     ]
    }
   ],
   "source": [
    "from litellm import acompletion\n",
    "import asyncio\n",
    "\n",
    "# Launch both requests concurrently; top-level await works in Jupyter\n",
    "granite_task = acompletion(\n",
    "    model=\"watsonx/ibm/granite-13b-chat-v2\",\n",
    "    messages=[{\"role\": \"user\", \"content\": \"Hello, how are you?\"}],\n",
    "    max_tokens=20,  # maps to watsonx.ai max_new_tokens\n",
    ")\n",
    "llama_3_task = acompletion(\n",
    "    model=\"watsonx/meta-llama/llama-3-8b-instruct\",\n",
    "    messages=[{\"role\": \"user\", \"content\": \"Hello, how are you?\"}],\n",
    "    max_tokens=20,  # maps to watsonx.ai max_new_tokens\n",
    ")\n",
    "\n",
    "granite_response, llama_3_response = await asyncio.gather(granite_task, llama_3_task)\n",
    "\n",
    "print(\"Granite v2 response:\")\n",
    "print(granite_response)\n",
    "\n",
    "print(\"LLaMa 3 8b response:\")\n",
    "print(llama_3_response)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "base",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}