llama-stack/llama_stack/providers/remote/inference
Botao Chen 2b995c22eb
feat: inference passthrough provider (#1166)
## What does this PR do?
This PR implements a passthrough inference provider that works with
any endpoint that respects the Llama Stack inference API definition.
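
To illustrate the idea, here is a minimal sketch of what "passthrough" could look like. It is not the adapter added in this PR. The class name `PassthroughInference`, the `transport` hook, the payload field names, and the `/v1/inference/chat-completion` route are all assumptions made for the example. The request is forwarded unchanged to a downstream endpoint and the decoded response is returned as-is.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Hypothetical type: (url, headers, body) -> raw response bytes.
Transport = Callable[[str, Dict[str, str], bytes], bytes]


def http_post(url: str, headers: Dict[str, str], body: bytes) -> bytes:
    """Default transport: POST the body and return the raw response bytes."""
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(request) as response:
        return response.read()


@dataclass
class PassthroughInference:
    """Illustrative passthrough client; every name here is hypothetical."""

    base_url: str  # downstream endpoint assumed to speak the inference API
    api_key: str = ""
    transport: Transport = http_post  # injectable so the sketch runs offline

    def _forward(self, route: str, payload: Dict[str, Any]) -> Dict[str, Any]:
        headers = {"Content-Type": "application/json"}
        if self.api_key:
            headers["Authorization"] = f"Bearer {self.api_key}"
        url = self.base_url.rstrip("/") + route
        raw = self.transport(url, headers, json.dumps(payload).encode("utf-8"))
        # No translation step: the downstream response is returned unchanged.
        return json.loads(raw)

    def chat_completion(
        self, model_id: str, messages: List[Dict[str, Any]], **params: Any
    ) -> Dict[str, Any]:
        # Route and field names are assumptions, not taken from the PR.
        payload = {"model_id": model_id, "messages": messages, **params}
        return self._forward("/v1/inference/chat-completion", payload)
```

Because the downstream endpoint already speaks the same API, the provider needs no request or response conversion, which is what distinguishes it from the vendor-specific adapters listed below.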

## Test Plan
Configured an endpoint that respects the Llama Stack inference API definition
and got inference results successfully.

<img width="1268" alt="Screenshot 2025-02-19 at 8 52 51 PM"
src="https://github.com/user-attachments/assets/447816e4-ea7a-4365-b90c-386dc7dcf4a1"
/>
2025-02-19 21:47:00 -08:00
bedrock chore: remove llama_models.llama3.api imports from providers (#1107) 2025-02-19 19:01:29 -08:00
cerebras chore: remove llama_models.llama3.api imports from providers (#1107) 2025-02-19 19:01:29 -08:00
databricks chore: remove llama_models.llama3.api imports from providers (#1107) 2025-02-19 19:01:29 -08:00
fireworks chore: remove llama_models.llama3.api imports from providers (#1107) 2025-02-19 19:01:29 -08:00
groq chore: move all Llama Stack types from llama-models to llama-stack (#1098) 2025-02-14 09:10:59 -08:00
nvidia fix: Get distro_codegen.py working with default deps and enabled in pre-commit hooks (#1123) 2025-02-19 18:39:20 -08:00
ollama chore: remove llama_models.llama3.api imports from providers (#1107) 2025-02-19 19:01:29 -08:00
passthrough feat: inference passthrough provider (#1166) 2025-02-19 21:47:00 -08:00
runpod chore: remove llama_models.llama3.api imports from providers (#1107) 2025-02-19 19:01:29 -08:00
sambanova chore: remove llama_models.llama3.api imports from providers (#1107) 2025-02-19 19:01:29 -08:00
sample build: format codebase imports using ruff linter (#1028) 2025-02-13 10:06:21 -08:00
tgi chore: remove llama_models.llama3.api imports from providers (#1107) 2025-02-19 19:01:29 -08:00
together chore: remove llama_models.llama3.api imports from providers (#1107) 2025-02-19 19:01:29 -08:00
vllm chore: remove llama_models.llama3.api imports from providers (#1107) 2025-02-19 19:01:29 -08:00
__init__.py impls -> inline, adapters -> remote (#381) 2024-11-06 14:54:05 -08:00