llama-stack/docs/source/distributions

Dmitry Rogozhkin 71ed47ea76
docs: add example for intel gpu in vllm remote (#1952)
# What does this PR do?

This PR adds instructions for setting up a vLLM remote endpoint for the vllm-remote llama
stack distribution.
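
For context, such an endpoint is served through vLLM's OpenAI-compatible API. Below is a minimal sketch of how it might be started; it assumes a vLLM build with Intel GPU (XPU) support and reuses the model name from the test command further down, so exact flags may differ on your system:

```
# Minimal sketch: serve the model over vLLM's OpenAI-compatible API on port 8000.
# Assumes a vLLM installation built with Intel GPU (XPU) support.
vllm serve meta-llama/Llama-3.2-3B-Instruct --port 8000
```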

## Test Plan

* Verified with manual tests of the configured vllm-remote distribution against a vLLM
endpoint running on a system with an Intel GPU
* Also verified with CI pytests (see the command line below). The tests pass in the
same capacity as they do on the Nvidia A10 setup (some tests do fail,
which appears to be a known issue with the vllm-remote llama stack
distribution)

```
pytest -s -v tests/integration/inference/test_text_inference.py \
   --stack-config=http://localhost:5001 \
   --text-model=meta-llama/Llama-3.2-3B-Instruct
```
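
For reference, the stack endpoint targeted above (http://localhost:5001) can be brought up against the vLLM server. A minimal sketch, assuming the remote-vllm distribution template and the VLLM_URL/INFERENCE_MODEL environment variables from its documentation; adjust host, port, and model to your setup:

```
# Minimal sketch: run the vllm-remote llama stack against the vLLM endpoint.
# Template name and environment variables are taken from the remote-vllm
# distribution docs and may differ in your installation.
llama stack run remote-vllm \
   --port 5001 \
   --env INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct \
   --env VLLM_URL=http://localhost:8000/v1
```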

CC: @ashwinb

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2025-04-15 07:56:23 -07:00
| Name | Last commit | Date |
| --- | --- | --- |
| ondevice_distro | docs: Fix trailing whitespace error (#1669) | 2025-03-17 08:53:30 -07:00 |
| remote_hosted_distro | docs: resync missing nvidia doc (#1947) | 2025-04-14 15:09:16 +02:00 |
| self_hosted_distro | docs: add example for intel gpu in vllm remote (#1952) | 2025-04-15 07:56:23 -07:00 |
| building_distro.md | fix: misleading help text for 'llama stack build' and 'llama stack run' (#1910) | 2025-04-12 01:19:11 -07:00 |
| configuration.md | docs: Update quickstart page to structure things a little more for the novices (#1873) | 2025-04-10 14:09:00 -07:00 |
| importing_as_library.md | docs: update importing_as_library.md (#1863) | 2025-04-07 12:31:04 +02:00 |
| index.md | docs: Updated documentation and Sphinx configuration (#1845) | 2025-03-31 13:08:05 -07:00 |
| kubernetes_deployment.md | docs: fix errors in kubernetes deployment guide (#1914) | 2025-04-11 13:04:13 +02:00 |
| list_of_distributions.md | docs: Updated documentation and Sphinx configuration (#1845) | 2025-03-31 13:08:05 -07:00 |
| starting_llama_stack_server.md | docs: Update quickstart page to structure things a little more for the novices (#1873) | 2025-04-10 14:09:00 -07:00 |