mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-08-03 01:03:59 +00:00)
docs: Add tips for debugging remote vLLM provider
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
This commit is contained in:
parent 9845631d51
commit ca43978809

1 changed file with 1 addition and 1 deletion
@@ -31,7 +31,7 @@ The following environment variables can be configured:
 In the following sections, we'll use AMD, NVIDIA or Intel GPUs to serve as hardware accelerators for the vLLM
 server, which acts as both the LLM inference provider and the safety provider. Note that vLLM also
 [supports many other hardware accelerators](https://docs.vllm.ai/en/latest/getting_started/installation.html) and
-that we only use GPUs here for demonstration purposes.
+that we only use GPUs here for demonstration purposes. Note that if you run into issues, there is a new environment variable `VLLM_DEBUG_LOG_API_SERVER_RESPONSE` (available in vLLM v0.8.3 and above) that enables logging of API server responses for debugging.
 
 ### Setting up vLLM server on AMD GPU
 
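The sentence added in the hunk above introduces `VLLM_DEBUG_LOG_API_SERVER_RESPONSE`. As a minimal sketch (not part of this commit), one way to use it is to set the variable before launching a standalone vLLM server; the model name and port below are illustrative placeholders.

```bash
# Sketch: enable API server response logging (vLLM v0.8.3+) before starting the server.
# Model and port are placeholders; substitute the values you actually use.
export VLLM_DEBUG_LOG_API_SERVER_RESPONSE=true
vllm serve meta-llama/Llama-3.2-3B-Instruct --port 8000
```

With the variable set, the vLLM API server logs the responses it returns, which can help when diagnosing issues with the remote vLLM provider.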
|
Loading…
Add table
Add a link
Reference in a new issue