(docs + fix) Add docs on Moderations endpoint, Text Completion (#6947)

* fix _pass_through_moderation_endpoint_factory

* fix route_llm_request

* doc moderations api

* docs on /moderations

* add e2e tests for moderations api

* docs moderations api

* test_pass_through_moderation_endpoint_factory

* docs text completion
This commit is contained in:
Ishaan Jaff 2024-11-27 16:30:48 -08:00 committed by GitHub
parent eba700a491
commit 4ebb7c8a7f
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
7 changed files with 390 additions and 6 deletions

View file

@ -0,0 +1,135 @@
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
# Moderation
### Usage
<Tabs>
<TabItem value="python" label="LiteLLM Python SDK">
```python
from litellm import moderation
response = moderation(
input="hello from litellm",
model="text-moderation-stable"
)
```
</TabItem>
<TabItem value="proxy" label="LiteLLM Proxy Server">
For `/moderations` endpoint, there is **no need to specify `model` in the request or on the litellm config.yaml**
Start litellm proxy server
```
litellm
```
<Tabs>
<TabItem value="python" label="OpenAI Python SDK">
```python
from openai import OpenAI
# set base_url to your proxy server
# set api_key to send to proxy server
client = OpenAI(api_key="<proxy-api-key>", base_url="http://0.0.0.0:4000")
response = client.moderations.create(
input="hello from litellm",
model="text-moderation-stable" # optional, defaults to `omni-moderation-latest`
)
print(response)
```
</TabItem>
<TabItem value="curl" label="Curl Request">
```shell
curl --location 'http://0.0.0.0:4000/moderations' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer sk-1234' \
--data '{"input": "Sample text goes here", "model": "text-moderation-stable"}'
```
</TabItem>
</Tabs>
</TabItem>
</Tabs>
## Input Params
LiteLLM accepts and translates the [OpenAI Moderation params](https://platform.openai.com/docs/api-reference/moderations) across all supported providers.
### Required Fields
- `input`: *string or array* - Input (or inputs) to classify. Can be a single string, an array of strings, or an array of multi-modal input objects similar to other models.
- If string: A string of text to classify for moderation
- If array of strings: An array of strings to classify for moderation
- If array of objects: An array of multi-modal inputs to the moderation model, where each object can be:
- An object describing an image to classify with:
- `type`: *string, required* - Always `image_url`
- `image_url`: *object, required* - Contains either an image URL or a data URL for a base64 encoded image
- An object describing text to classify with:
- `type`: *string, required* - Always `text`
- `text`: *string, required* - A string of text to classify
### Optional Fields
- `model`: *string (optional)* - The moderation model to use. Defaults to `omni-moderation-latest`.
## Output Format
Here's the exact json output and type you can expect from all moderation calls:
[**LiteLLM follows OpenAI's output format**](https://platform.openai.com/docs/api-reference/moderations/object)
```python
{
"id": "modr-AB8CjOTu2jiq12hp1AQPfeqFWaORR",
"model": "text-moderation-007",
"results": [
{
"flagged": true,
"categories": {
"sexual": false,
"hate": false,
"harassment": true,
"self-harm": false,
"sexual/minors": false,
"hate/threatening": false,
"violence/graphic": false,
"self-harm/intent": false,
"self-harm/instructions": false,
"harassment/threatening": true,
"violence": true
},
"category_scores": {
"sexual": 0.000011726012417057063,
"hate": 0.22706663608551025,
"harassment": 0.5215635299682617,
"self-harm": 2.227119921371923e-6,
"sexual/minors": 7.107352217872176e-8,
"hate/threatening": 0.023547329008579254,
"violence/graphic": 0.00003391829886822961,
"self-harm/intent": 1.646940972932498e-6,
"self-harm/instructions": 1.1198755256458526e-9,
"harassment/threatening": 0.5694745779037476,
"violence": 0.9971134662628174
}
}
]
}
```
## **Supported Providers**
| Provider |
|-------------|
| OpenAI |

View file

@ -0,0 +1,174 @@
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
# Text Completion
### Usage
<Tabs>
<TabItem value="python" label="LiteLLM Python SDK">
```python
from litellm import text_completion
response = text_completion(
model="gpt-3.5-turbo-instruct",
prompt="Say this is a test",
max_tokens=7
)
```
</TabItem>
<TabItem value="proxy" label="LiteLLM Proxy Server">
1. Define models on config.yaml
```yaml
model_list:
- model_name: gpt-3.5-turbo-instruct
litellm_params:
model: text-completion-openai/gpt-3.5-turbo-instruct # The `text-completion-openai/` prefix will call openai.completions.create
api_key: os.environ/OPENAI_API_KEY
- model_name: text-davinci-003
litellm_params:
model: text-completion-openai/text-davinci-003
api_key: os.environ/OPENAI_API_KEY
```
2. Start litellm proxy server
```
litellm --config config.yaml
```
<Tabs>
<TabItem value="python" label="OpenAI Python SDK">
```python
from openai import OpenAI
# set base_url to your proxy server
# set api_key to send to proxy server
client = OpenAI(api_key="<proxy-api-key>", base_url="http://0.0.0.0:4000")
response = client.completions.create(
model="gpt-3.5-turbo-instruct",
prompt="Say this is a test",
max_tokens=7
)
print(response)
```
</TabItem>
<TabItem value="curl" label="Curl Request">
```shell
curl --location 'http://0.0.0.0:4000/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer sk-1234' \
--data '{
"model": "gpt-3.5-turbo-instruct",
"prompt": "Say this is a test",
"max_tokens": 7
}'
```
</TabItem>
</Tabs>
</TabItem>
</Tabs>
## Input Params
LiteLLM accepts and translates the [OpenAI Text Completion params](https://platform.openai.com/docs/api-reference/completions) across all supported providers.
### Required Fields
- `model`: *string* - ID of the model to use
- `prompt`: *string or array* - The prompt(s) to generate completions for
### Optional Fields
- `best_of`: *integer* - Generates best_of completions server-side and returns the "best" one
- `echo`: *boolean* - Echo back the prompt in addition to the completion.
- `frequency_penalty`: *number* - Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency.
- `logit_bias`: *map* - Modify the likelihood of specified tokens appearing in the completion
- `logprobs`: *integer* - Include the log probabilities on the logprobs most likely tokens. Max value of 5
- `max_tokens`: *integer* - The maximum number of tokens to generate.
- `n`: *integer* - How many completions to generate for each prompt.
- `presence_penalty`: *number* - Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far.
- `seed`: *integer* - If specified, system will attempt to make deterministic samples
- `stop`: *string or array* - Up to 4 sequences where the API will stop generating tokens
- `stream`: *boolean* - Whether to stream back partial progress. Defaults to false
- `suffix`: *string* - The suffix that comes after a completion of inserted text
- `temperature`: *number* - What sampling temperature to use, between 0 and 2.
- `top_p`: *number* - An alternative to sampling with temperature, called nucleus sampling.
- `user`: *string* - A unique identifier representing your end-user
## Output Format
Here's the exact JSON output format you can expect from completion calls:
[**Follows OpenAI's output format**](https://platform.openai.com/docs/api-reference/completions/object)
<Tabs>
<TabItem value="non-streaming" label="Non-Streaming Response">
```python
{
"id": "cmpl-uqkvlQyYK7bGYrRHQ0eXlWi7",
"object": "text_completion",
"created": 1589478378,
"model": "gpt-3.5-turbo-instruct",
"system_fingerprint": "fp_44709d6fcb",
"choices": [
{
"text": "\n\nThis is indeed a test",
"index": 0,
"logprobs": null,
"finish_reason": "length"
}
],
"usage": {
"prompt_tokens": 5,
"completion_tokens": 7,
"total_tokens": 12
}
}
```
</TabItem>
<TabItem value="streaming" label="Streaming Response">
```python
{
"id": "cmpl-7iA7iJjj8V2zOkCGvWF2hAkDWBQZe",
"object": "text_completion",
"created": 1690759702,
"choices": [
{
"text": "This",
"index": 0,
"logprobs": null,
"finish_reason": null
}
],
"model": "gpt-3.5-turbo-instruct"
"system_fingerprint": "fp_44709d6fcb",
}
```
</TabItem>
</Tabs>
## **Supported Providers**
| Provider | Link to Usage |
|-------------|--------------------|
| OpenAI | [Usage](../docs/providers/text_completion_openai) |
| Azure OpenAI| [Usage](../docs/providers/azure) |

View file

@ -246,6 +246,7 @@ const sidebars = {
"completion/usage",
],
},
"text_completion",
"embedding/supported_embedding",
"image_generation",
{
@ -261,6 +262,7 @@ const sidebars = {
"batches",
"realtime",
"fine_tuning",
"moderation",
{
type: "link",
label: "Use LiteLLM Proxy with Vertex, Bedrock SDK",