docs(simple_proxy.md): adding docs

This commit is contained in:
Krrish Dholakia 2023-11-03 14:03:48 -07:00
parent 7ed8f8dac8
commit 22fd8953c1


@@ -4,7 +4,7 @@ import TabItem from '@theme/TabItem';
# 💥 Evaluate LLMs - OpenAI Compatible Server
- LiteLLM Server, is a simple, fast, and lightweight **OpenAI-compatible server** to call 100+ LLM APIs in the OpenAI Input/Output format
+ A simple, fast, and lightweight **OpenAI-compatible server** to call 100+ LLM APIs.
LiteLLM Server supports:
@@ -149,7 +149,7 @@ $ litellm --model command-nightly
[**Jump to Code**](https://github.com/BerriAI/litellm/blob/fef4146396d5d87006259e00095a62e3900d6bb4/litellm/proxy.py#L36)
- # LM-Evaluation Harness with TGI
+ # [TUTORIAL] LM-Evaluation Harness with TGI
Evaluate LLMs 20x faster with TGI via litellm proxy's `/completions` endpoint.
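
For context, a minimal sketch of calling the proxy's `/completions` endpoint with the `openai` client - this assumes a proxy already running locally on port 8000, and the model name is whatever the proxy was started with (e.g. `command-nightly` above):

```python
import openai

openai.api_base = "http://0.0.0.0:8000"
openai.api_key = "anything"  # placeholder - the openai client requires a key to be set

# the proxy exposes an OpenAI-compatible /completions route,
# so the standard text-completion call works against it
response = openai.Completion.create(
    model="command-nightly",  # assumption: match the model the proxy was started with
    prompt="Hello world",
)
print(response.choices[0].text)
```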
@@ -209,6 +209,46 @@ model_list:
$ litellm --config /path/to/config.yaml
```
## Multiple Models
Evaluate multiple models from the same server.
If you have one model running on a local GPU and another that's hosted (e.g. on Runpod), you can call both through the same litellm server by listing them in your `config.yaml`.
```yaml
model_list:
- model_name: zephyr-alpha
litellm_params: # params for litellm.completion() - https://docs.litellm.ai/docs/completion/input#input---request-body
model: huggingface/HuggingFaceH4/zephyr-7b-alpha
api_base: http://0.0.0.0:8001
- model_name: zephyr-beta
litellm_params:
model: huggingface/HuggingFaceH4/zephyr-7b-beta
api_base: https://<my-hosted-endpoint>
```
### Evaluate model
If your eval repo lets you set the model name, you can call a specific model by passing in that model's name -
```python
import openai
openai.api_base = "http://0.0.0.0:8000"
openai.api_key = "anything"  # placeholder - the openai client requires a key to be set

completion = openai.ChatCompletion.create(model="zephyr-alpha", messages=[{"role": "user", "content": "Hello world"}])
print(completion.choices[0].message.content)
```
If your eval repo only lets you specify the api base, you can add the model name to the api base you pass in -
```python
import openai
openai.api_base = "http://0.0.0.0:8000/openai/deployments/zephyr-alpha/chat/completions" # zephyr-alpha will be used
openai.api_key = "anything"  # placeholder - the openai client requires a key to be set

completion = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hello world"}])
print(completion.choices[0].message.content)
```
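
To evaluate both models from the `config.yaml` above side by side, one option is to loop over the model names and send the same prompt to each - a minimal sketch, assuming the proxy is running on port 8000:

```python
import openai

openai.api_base = "http://0.0.0.0:8000"
openai.api_key = "anything"  # placeholder - the openai client requires a key to be set

messages = [{"role": "user", "content": "Hello world"}]

# send the same prompt to each model listed in config.yaml and compare outputs
for model in ["zephyr-alpha", "zephyr-beta"]:
    completion = openai.ChatCompletion.create(model=model, messages=messages)
    print(f"{model}: {completion.choices[0].message.content}")
```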
## Save Model-specific params (API Base, API Keys, Temperature, etc.)
Use the [router_config_template.yaml](https://github.com/BerriAI/litellm/blob/main/router_config_template.yaml) to save model-specific information like api_base, api_key, temperature, max_tokens, etc.
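
As a rough sketch of what such an entry might look like - the key, temperature, and max_tokens values below are placeholders, not defaults from the template:

```yaml
model_list:
  - model_name: zephyr-beta
    litellm_params:
      model: huggingface/HuggingFaceH4/zephyr-7b-beta
      api_base: https://<my-hosted-endpoint>
      api_key: <my-api-key>   # placeholder
      temperature: 0.2        # optional per-model defaults
      max_tokens: 256
```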