Merge pull request #5035 from BerriAI/docs_add_example_doing_vtx_ft

Docs - Add example of Vertex AI fine tuning API
Ishaan Jaff 2024-08-03 09:36:37 -07:00 committed by GitHub
commit a70d112b98


@@ -13,6 +13,7 @@ This is an Enterprise only endpoint [Get Started with Enterprise here](https://c
## Supported Providers
- Azure OpenAI
- OpenAI
- Vertex AI
Add `finetune_settings` and `files_settings` to your litellm config.yaml to use the fine-tuning endpoints.
## Example config.yaml for `finetune_settings` and `files_settings`
@@ -32,6 +33,10 @@ finetune_settings:
api_version: "2023-03-15-preview"
- custom_llm_provider: openai
api_key: os.environ/OPENAI_API_KEY
- custom_llm_provider: "vertex_ai"
vertex_project: "adroit-crow-413218"
vertex_location: "us-central1"
vertex_credentials: "/Users/ishaanjaffer/Downloads/adroit-crow-413218-a956eef1a2a8.json"
# for /files endpoints
files_settings:
@@ -73,6 +78,9 @@ curl http://localhost:4000/v1/files \
## Create fine-tuning job
<Tabs>
<TabItem value="azure" label="Azure OpenAI">
<Tabs>
<TabItem value="openai" label="OpenAI Python SDK">
@@ -100,6 +108,130 @@ curl http://localhost:4000/v1/fine_tuning/jobs \
</TabItem>
</Tabs>
</TabItem>
<TabItem value="Vertex" label="VertexAI">
<Tabs>
<TabItem value="openai" label="OpenAI Python SDK">
```python
from openai import AsyncOpenAI

# point the OpenAI SDK at your litellm proxy
client = AsyncOpenAI(api_key="sk-1234", base_url="http://localhost:4000")

ft_job = await client.fine_tuning.jobs.create(
    model="gemini-1.0-pro-002",                                                              # Vertex model you want to fine-tune
    training_file="gs://cloud-samples-data/ai-platform/generative_ai/sft_train_data.jsonl",  # gs:// path to your JSONL training data
    extra_body={"custom_llm_provider": "vertex_ai"},                                         # tell litellm proxy which provider to use
)
```
</TabItem>
<TabItem value="curl" label="curl">
```shell
curl http://localhost:4000/v1/fine_tuning/jobs \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-1234" \
-d '{
"custom_llm_provider": "vertex_ai",
"model": "gemini-1.0-pro-002",
"training_file": "gs://cloud-samples-data/ai-platform/generative_ai/sft_train_data.jsonl"
}'
```
</TabItem>
</Tabs>
</TabItem>
</Tabs>
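
The `/v1/fine_tuning/jobs` endpoint returns an OpenAI-compatible fine-tuning job object, so you can read the job id and status straight off the SDK response. A minimal sketch (field values are illustrative; `ft_job` is the response from the Python example above):

```python
# inspect the OpenAI-compatible response returned by the litellm proxy
print(ft_job.id)      # job id - pass this to the cancel endpoint below if needed
print(ft_job.status)  # e.g. "queued" or "running" while the provider trains
print(ft_job.model)   # "gemini-1.0-pro-002"
```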
### Request Body
<Tabs>
<TabItem value="params" label="Supported Params">
* `model`
**Type:** string
**Required:** Yes
The name of the model to fine-tune
* `custom_llm_provider`
**Type:** `Literal["azure", "openai", "vertex_ai"]`
**Required:** Yes
The provider to route the fine-tuning job to. You can select one of the [**supported providers**](#supported-providers).
* `training_file`
**Type:** string
**Required:** Yes
The ID of an uploaded file that contains training data.
- See **upload file** for how to upload a file.
- Your dataset must be formatted as a JSONL file.
* `hyperparameters`
**Type:** object
**Required:** No
The hyperparameters used for the fine-tuning job.
> #### Supported `hyperparameters`
> #### batch_size
**Type:** string or integer
**Required:** No
Number of examples in each batch. A larger batch size means that model parameters are updated less frequently, but with lower variance.
> #### learning_rate_multiplier
**Type:** string or number
**Required:** No
Scaling factor for the learning rate. A smaller learning rate may be useful to avoid overfitting.
> #### n_epochs
**Type:** string or integer
**Required:** No
The number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.
* `suffix`
**Type:** string or null
**Required:** No
**Default:** null
A string of up to 18 characters that will be added to your fine-tuned model name.
Example: A `suffix` of "custom-model-name" would produce a model name like `ft:gpt-4o-mini:openai:custom-model-name:7p4lURel`.
* `validation_file`
**Type:** string or null
**Required:** No
The ID of an uploaded file that contains validation data.
- If provided, this data is used to generate validation metrics periodically during fine-tuning.
* `integrations`
**Type:** array or null
**Required:** No
A list of integrations to enable for your fine-tuning job.
* `seed`
**Type:** integer or null
**Required:** No
The seed controls the reproducibility of the job. Passing in the same seed and job parameters should produce the same results, but may differ in rare cases. If a seed is not specified, one will be generated for you.
</TabItem>
<TabItem value="example" label="Example Request Body">
```json
{
"model": "gpt-4o-mini",
"training_file": "file-abcde12345",
"hyperparameters": {
"batch_size": 4,
"learning_rate_multiplier": 0.1,
"n_epochs": 3
},
"suffix": "custom-model-v1",
"validation_file": "file-fghij67890",
"seed": 42
}
```
</TabItem>
</Tabs>
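
The same request body can also be sent through the OpenAI Python SDK pointed at your litellm proxy. This is a minimal sketch: the file ids and hyperparameter values mirror the illustrative JSON above, and `client` is assumed to be an `AsyncOpenAI` client configured with your proxy base url and key (as in the earlier examples).

```python
ft_job = await client.fine_tuning.jobs.create(
    model="gpt-4o-mini",
    training_file="file-abcde12345",    # illustrative file id from the upload file step
    hyperparameters={
        "batch_size": 4,
        "learning_rate_multiplier": 0.1,
        "n_epochs": 3,
    },
    suffix="custom-model-v1",
    validation_file="file-fghij67890",  # illustrative validation file id
    seed=42,
    extra_body={"custom_llm_provider": "openai"},  # pick one of the supported providers
)
```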
## Cancel fine-tuning job
<Tabs>