(docs) simplify left nav names + use a section for making llm requests (#6799 )

* fix emojis on docs

* add section on making LLM requests

* docs simplify sidebar

2024-11-18 12:53:43 -08:00

5.1 KiB

Raw Blame History

Google AI Studio SDK

Pass-through endpoints for Google AI Studio - call provider-specific endpoint, in native format (no translation).

Just replace https://generativelanguage.googleapis.com with LITELLM_PROXY_BASE_URL/gemini 🚀

Example Usage

http://0.0.0.0:4000/gemini/v1beta/models/gemini-1.5-flash:countTokens?key=sk-anything' \
-H 'Content-Type: application/json' \
-d '{
    "contents": [{
        "parts":[{
          "text": "The quick brown fox jumps over the lazy dog."
          }]
        }]
}'

Supports ALL Google AI Studio Endpoints (including streaming).

See All Google AI Studio Endpoints

Quick Start

Let's call the Gemini /countTokens endpoint

Add Gemini API Key to your environment

export GEMINI_API_KEY=""

Start LiteLLM Proxy

litellm

# RUNNING on http://0.0.0.0:4000

Test it!

Let's call the Google AI Studio token counting endpoint

http://0.0.0.0:4000/gemini/v1beta/models/gemini-1.5-flash:countTokens?key=anything' \
-H 'Content-Type: application/json' \
-d '{
    "contents": [{
        "parts":[{
          "text": "The quick brown fox jumps over the lazy dog."
          }]
        }]
}'

Examples

Anything after http://0.0.0.0:4000/gemini is treated as a provider-specific route, and handled accordingly.

Key Changes:

Original Endpoint	Replace With
`https://generativelanguage.googleapis.com`	`http://0.0.0.0:4000/gemini` (LITELLM_PROXY_BASE_URL="http://0.0.0.0:4000")
`key=$GOOGLE_API_KEY`	`key=anything` (use `key=LITELLM_VIRTUAL_KEY` if Virtual Keys are setup on proxy)

Example 1: Counting tokens

LiteLLM Proxy Call

curl http://0.0.0.0:4000/gemini/v1beta/models/gemini-1.5-flash:countTokens?key=anything \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts":[{
          "text": "The quick brown fox jumps over the lazy dog."
          }],
        }],
      }'

Direct Google AI Studio Call

curl https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:countTokens?key=$GOOGLE_API_KEY \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts":[{
          "text": "The quick brown fox jumps over the lazy dog."
          }],
        }],
      }'

Example 2: Generate content

LiteLLM Proxy Call

curl "http://0.0.0.0:4000/gemini/v1beta/models/gemini-1.5-flash:generateContent?key=anything" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts":[{"text": "Write a story about a magic backpack."}]
        }]
       }' 2> /dev/null

Direct Google AI Studio Call

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key=$GOOGLE_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts":[{"text": "Write a story about a magic backpack."}]
        }]
       }' 2> /dev/null

Example 3: Caching

curl -X POST "http://0.0.0.0:4000/gemini/v1beta/models/gemini-1.5-flash-001:generateContent?key=anything" \
-H 'Content-Type: application/json' \
-d '{
      "contents": [
        {
          "parts":[{
            "text": "Please summarize this transcript"
          }],
          "role": "user"
        },
      ],
      "cachedContent": "'$CACHE_NAME'"
    }'

Direct Google AI Studio Call

curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-001:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
      "contents": [
        {
          "parts":[{
            "text": "Please summarize this transcript"
          }],
          "role": "user"
        },
      ],
      "cachedContent": "'$CACHE_NAME'"
    }'

Advanced - Use with Virtual Keys

Pre-requisites

Setup proxy with DB

Use this, to avoid giving developers the raw Google AI Studio key, but still letting them use Google AI Studio endpoints.

Usage

Setup environment

export DATABASE_URL=""
export LITELLM_MASTER_KEY=""
export GEMINI_API_KEY=""

litellm

# RUNNING on http://0.0.0.0:4000

Generate virtual key

curl -X POST 'http://0.0.0.0:4000/key/generate' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{}'

Expected Response

{
    ...
    "key": "sk-1234ewknldferwedojwojw"
}

Test it!

http://0.0.0.0:4000/gemini/v1beta/models/gemini-1.5-flash:countTokens?key=sk-1234ewknldferwedojwojw' \
-H 'Content-Type: application/json' \
-d '{
    "contents": [{
        "parts":[{
          "text": "The quick brown fox jumps over the lazy dog."
          }]
        }]
}'

5.1 KiB Raw Blame History

Google AI Studio SDK

Example Usage

Quick Start

Examples

Example 1: Counting tokens

LiteLLM Proxy Call

Direct Google AI Studio Call

Example 2: Generate content

LiteLLM Proxy Call

Direct Google AI Studio Call

Example 3: Caching

Direct Google AI Studio Call

Advanced - Use with Virtual Keys

Usage

5.1 KiB

Raw Blame History