llama-stack-mirror/llama_stack/templates
Dmitry Rogozhkin 241a42bb26 docs: add example for intel gpu in vllm remote
This PR adds instructions for setting up a remote vLLM endpoint for the remote-vllm
llama stack distribution.
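
For reference, a minimal sketch of the endpoint side (assuming a vLLM build with
Intel GPU (XPU) support is installed on the host; the port is illustrative, and the
model matches the one used in the tests below):

```
# Serve an OpenAI-compatible endpoint with vLLM.
# Assumes vLLM was installed/built with Intel GPU (XPU) support.
vllm serve meta-llama/Llama-3.2-3B-Instruct --port 8000
```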

* Verified with manual tests of the configured remote-vllm distribution against a
  vLLM endpoint running on a system with an Intel GPU.
* Also verified with the CI pytests (see the command line below). Tests pass to the
  same extent as on the Nvidia A10 setup (some tests fail, which appear to be
  known issues with the remote-vllm llama stack distribution).

```
pytest -s -v tests/integration/inference/test_text_inference.py \
   --stack-config=http://localhost:5001 \
   --text-model=meta-llama/Llama-3.2-3B-Instruct
```
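
Here `--stack-config=http://localhost:5001` points pytest at a running llama stack
server. A minimal sketch of starting that server (assuming the remote-vllm
distribution; the port and VLLM_URL are illustrative and must match the vLLM
endpoint above):

```
# Launch the remote-vllm distribution against the vLLM endpoint.
# Port, model, and URL are illustrative; adjust to the actual setup.
llama stack run remote-vllm \
   --port 5001 \
   --env INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct \
   --env VLLM_URL=http://localhost:8000/v1
```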

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2025-04-15 07:15:37 -07:00
| Name | Last commit | Date |
|------|-------------|------|
| bedrock | chore: Revert "chore(telemetry): remove service_name entirely" (#1785) | 2025-03-25 14:42:05 -07:00 |
| cerebras | chore: Revert "chore(telemetry): remove service_name entirely" (#1785) | 2025-03-25 14:42:05 -07:00 |
| ci-tests | test: verification on provider's OAI endpoints (#1893) | 2025-04-07 23:06:28 -07:00 |
| dell | chore: Revert "chore(telemetry): remove service_name entirely" (#1785) | 2025-03-25 14:42:05 -07:00 |
| dev | test: verification on provider's OAI endpoints (#1893) | 2025-04-07 23:06:28 -07:00 |
| experimental-post-training | fix: fix experimental-post-training template (#1740) | 2025-03-20 23:07:19 -07:00 |
| fireworks | test: verification on provider's OAI endpoints (#1893) | 2025-04-07 23:06:28 -07:00 |
| groq | test: verification on provider's OAI endpoints (#1893) | 2025-04-07 23:06:28 -07:00 |
| hf-endpoint | chore: Revert "chore(telemetry): remove service_name entirely" (#1785) | 2025-03-25 14:42:05 -07:00 |
| hf-serverless | chore: Revert "chore(telemetry): remove service_name entirely" (#1785) | 2025-03-25 14:42:05 -07:00 |
| meta-reference-gpu | refactor: move all llama code to models/llama out of meta reference (#1887) | 2025-04-07 15:03:58 -07:00 |
| nvidia | chore: Revert "chore(telemetry): remove service_name entirely" (#1785) | 2025-03-25 14:42:05 -07:00 |
| ollama | chore: Revert "chore(telemetry): remove service_name entirely" (#1785) | 2025-03-25 14:42:05 -07:00 |
| open-benchmark | chore: Revert "chore(telemetry): remove service_name entirely" (#1785) | 2025-03-25 14:42:05 -07:00 |
| passthrough | chore: Revert "chore(telemetry): remove service_name entirely" (#1785) | 2025-03-25 14:42:05 -07:00 |
| remote-vllm | docs: add example for intel gpu in vllm remote | 2025-04-15 07:15:37 -07:00 |
| sambanova | test: verification on provider's OAI endpoints (#1893) | 2025-04-07 23:06:28 -07:00 |
| tgi | chore: Revert "chore(telemetry): remove service_name entirely" (#1785) | 2025-03-25 14:42:05 -07:00 |
| together | test: verification on provider's OAI endpoints (#1893) | 2025-04-07 23:06:28 -07:00 |
| verification | fix: type (#1898) | 2025-04-08 09:07:25 -07:00 |
| vllm-gpu | chore: Revert "chore(telemetry): remove service_name entirely" (#1785) | 2025-03-25 14:42:05 -07:00 |
| __init__.py | Auto-generate distro yamls + docs (#468) | 2024-11-18 14:57:06 -08:00 |
| dependencies.json | fix: use torchao 0.8.0 for inference (#1925) | 2025-04-10 13:39:20 -07:00 |
| template.py | feat(api): (1/n) datasets api clean up (#1573) | 2025-03-17 16:55:45 -07:00 |