📣 1-click deploy your own LLM proxy server. Grab time to talk to us if you're interested!
LiteLLM manages:
- Translating inputs to the provider's completion and embedding endpoints

```python
from litellm import completion

messages = [{"content": "Hello, how are you?", "role": "user"}]

# Cohere call
response = completion(model="command-nightly", messages=messages)
print(response)
```

## Streaming
liteLLM supports streaming the model response back. Pass `stream=True` to get a streaming iterator in the response. Streaming is supported for OpenAI, Azure, Anthropic, and Huggingface models.

```python
result = completion('claude-2', messages, stream=True)
for chunk in result:
    print(chunk['choices'][0]['delta'])
```

## OpenAI Proxy Server
Spin up a local server to translate OpenAI API calls to any non-OpenAI model (e.g. Huggingface, TogetherAI, Ollama, etc.)

This works for async + streaming as well.
```shell
litellm --model
```
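Once the proxy is up, any OpenAI-compatible client can point at it. Below is a minimal sketch using the OpenAI Python SDK; the base URL `http://0.0.0.0:8000`, the dummy API key, and the model name are assumptions — adjust them to however you started your proxy.

```python
import openai

# Point the OpenAI client at the local LiteLLM proxy instead of api.openai.com.
# base_url, api_key, and model below are placeholders (assumptions), not fixed values.
client = openai.OpenAI(
    api_key="anything",               # a dummy key, assuming the proxy doesn't enforce auth
    base_url="http://0.0.0.0:8000",   # wherever your proxy is listening
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # the proxy maps this to the model you launched it with
    messages=[{"role": "user", "content": "Hello from the proxy!"}],
)
print(response.choices[0].message.content)
```

Because the proxy speaks the OpenAI wire format, passing `stream=True` to the client call streams responses through it the same way as in the Streaming section above.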