diff --git a/docs/my-website/docs/proxy/architecture.md b/docs/my-website/docs/proxy/architecture.md
new file mode 100644
index 000000000..d75ffc765
--- /dev/null
+++ b/docs/my-website/docs/proxy/architecture.md
@@ -0,0 +1,31 @@
+import Image from '@theme/IdealImage';
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+# Life of a Request
+
+## High-Level Architecture
+
+
+
+
+### Request Flow
+
+1. **User Sends Request**: The process begins when a user sends a request to the LiteLLM Proxy Server (Gateway).
+
+2. [**Virtual Keys**](../virtual_keys): The request first passes through the Virtual Keys component.
+
+3. **Rate Limiting**: The MaxParallelRequestsHandler applies rate limiting to manage the flow of requests.
+
+4. **Proxy Server Processing**: The request is then processed by the LiteLLM proxy_server.py, which handles the core logic of the proxy.
+
+5. [**LiteLLM Router**](../routing): The LiteLLM Router determines where to send the request based on the configuration and request parameters.
+
+6. **Model Interaction**: The request is sent to the appropriate model API (litellm.completion() or litellm.embedding()) for processing.
+
+7. **Response**: The model's response is sent back through the same components to the user.
+
+8. **Post-Request Processing**: After the response is sent, several asynchronous operations occur:
+   - The _PROXY_track_cost_callback updates spend in the database.
+   - The request is logged to LangFuse for analytics and monitoring.
+   - The MaxParallelRequestsHandler updates virtual key usage and performs post-request cleanup.
diff --git a/docs/my-website/img/litellm_gateway.png b/docs/my-website/img/litellm_gateway.png
new file mode 100644
index 000000000..f453a2bf9
Binary files /dev/null and b/docs/my-website/img/litellm_gateway.png differ
diff --git a/docs/my-website/sidebars.js b/docs/my-website/sidebars.js
index 7e88f419c..52a380147 100644
--- a/docs/my-website/sidebars.js
+++ b/docs/my-website/sidebars.js
@@ -31,7 +31,12 @@ const sidebars = {
         "proxy/quick_start",
         "proxy/docker_quick_start",
         "proxy/deploy",
-        "proxy/prod",
+        "proxy/prod",
+        {
+          type: "category",
+          label: "Architecture",
+          items: ["proxy/architecture"],
+        },
         {
           type: "link",
           label: "📖 All Endpoints (Swagger)",
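
The request flow documented in `architecture.md` can be illustrated with a small toy model. The sketch below is not LiteLLM code: `ToyGateway`, its fields, the fixed per-request cost, and the stubbed model call are all hypothetical names chosen for illustration. It only shows the ordering the page describes: key check, parallel-request limit, routing, model call, then post-request work that runs after the response has been returned.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ToyGateway:
    """Hypothetical stand-in for the gateway; none of this is LiteLLM's API."""
    valid_keys: set                   # accepted virtual keys
    routes: dict                      # model alias -> backend name
    max_parallel: int = 2             # per-key parallel request limit
    in_flight: dict = field(default_factory=dict)   # key -> active requests
    spend: dict = field(default_factory=dict)       # key -> accumulated cost
    logs: list = field(default_factory=list)        # stand-in for a logging sink
    background: list = field(default_factory=list)  # pending post-request tasks

    async def handle(self, key, model, prompt):
        # Step 2: reject requests that carry an unknown virtual key.
        if key not in self.valid_keys:
            raise PermissionError("unknown virtual key")
        # Step 3: enforce the per-key parallel request limit.
        if self.in_flight.get(key, 0) >= self.max_parallel:
            raise RuntimeError("parallel request limit reached")
        self.in_flight[key] = self.in_flight.get(key, 0) + 1
        try:
            # Steps 4-6: pick a backend for the model alias and call it.
            backend = self.routes[model]
            response = await self._call_model(backend, prompt)
        except Exception:
            # Release the slot if routing or the model call fails.
            self.in_flight[key] -= 1
            raise
        # Step 8: schedule post-request work without delaying the response.
        task = asyncio.create_task(self._post_request(key, backend))
        self.background.append(task)
        # Step 7: return the response to the caller.
        return response

    async def _call_model(self, backend, prompt):
        await asyncio.sleep(0)  # placeholder for a real network call
        return f"[{backend}] echo: {prompt}"

    async def _post_request(self, key, backend, cost=0.01):
        # Assumed fixed cost per request, purely for illustration.
        self.spend[key] = self.spend.get(key, 0.0) + cost
        self.logs.append((key, backend))
        self.in_flight[key] -= 1  # post-request cleanup frees the slot

    async def drain(self):
        # Wait for all scheduled post-request tasks to finish.
        await asyncio.gather(*self.background)
        self.background.clear()


async def demo():
    gw = ToyGateway(valid_keys={"sk-demo"}, routes={"gpt-alias": "backend-a"})
    reply = await gw.handle("sk-demo", "gpt-alias", "hello")
    await gw.drain()
    return reply, gw.spend["sk-demo"], gw.in_flight["sk-demo"]
```

Calling `asyncio.run(demo())` runs one request end to end. Spend tracking, logging, and slot release happen in a background task, which mirrors the page's statement that post-request processing is asynchronous and occurs after the response is sent.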