llama-stack-mirror/docs/source/distributions/self_hosted_distro
Dmitry Rogozhkin 241a42bb26 docs: add example for intel gpu in vllm remote
This PR adds instructions for setting up a vLLM remote endpoint for the
vllm-remote Llama Stack distribution.

* Verified by manually testing the configured vllm-remote distribution
  against a vLLM endpoint running on a system with an Intel GPU.
* Also verified with the CI pytest suite (see the command line below). The
  tests pass to the same extent as on the Nvidia A10 setup; some tests do
  fail, which seems to be due to known issues with the vllm-remote Llama
  Stack distribution.

```
pytest -s -v tests/integration/inference/test_text_inference.py \
   --stack-config=http://localhost:5001 \
   --text-model=meta-llama/Llama-3.2-3B-Instruct
```

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2025-04-15 07:15:37 -07:00

| File | Last commit | Date |
| --- | --- | --- |
| bedrock.md | fix: Default to port 8321 everywhere (#1734) | 2025-03-20 15:50:41 -07:00 |
| cerebras.md | fix: Default to port 8321 everywhere (#1734) | 2025-03-20 15:50:41 -07:00 |
| dell-tgi.md | fix: docker run with --pull always to fetch the latest image (#1733) | 2025-03-20 15:35:48 -07:00 |
| dell.md | fix: docker run with --pull always to fetch the latest image (#1733) | 2025-03-20 15:35:48 -07:00 |
| fireworks.md | test: verification on provider's OAI endpoints (#1893) | 2025-04-07 23:06:28 -07:00 |
| groq.md | test: verification on provider's OAI endpoints (#1893) | 2025-04-07 23:06:28 -07:00 |
| meta-reference-gpu.md | fix: Default to port 8321 everywhere (#1734) | 2025-03-20 15:50:41 -07:00 |
| meta-reference-quantized-gpu.md | fix: Default to port 8321 everywhere (#1734) | 2025-03-20 15:50:41 -07:00 |
| nvidia.md | fix: Default to port 8321 everywhere (#1734) | 2025-03-20 15:50:41 -07:00 |
| ollama.md | fix: Default to port 8321 everywhere (#1734) | 2025-03-20 15:50:41 -07:00 |
| passthrough.md | fix: Default to port 8321 everywhere (#1734) | 2025-03-20 15:50:41 -07:00 |
| remote-vllm.md | docs: add example for intel gpu in vllm remote | 2025-04-15 07:15:37 -07:00 |
| sambanova.md | test: verification on provider's OAI endpoints (#1893) | 2025-04-07 23:06:28 -07:00 |
| tgi.md | fix: Default to port 8321 everywhere (#1734) | 2025-03-20 15:50:41 -07:00 |
| together.md | test: verification on provider's OAI endpoints (#1893) | 2025-04-07 23:06:28 -07:00 |