llama-stack/llama_stack/providers/remote/inference
Ashwin Bharambe ab54b8cd58
feat(providers): support non-llama models for inference providers (#1200)
This PR begins the process of supporting non-llama models within Llama
Stack. We start simple by adding support for this functionality within a
few existing providers: fireworks, together and ollama.

## Test Plan

```bash
LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/inference/test_text_inference.py \
  --inference-model accounts/fireworks/models/phi-3-vision-128k-instruct
```

The command above passes most of the tests but, as expected, fails the
tool-calling tests, since those are very specific to Llama models.

```
inference/test_text_inference.py::test_text_completion_streaming[accounts/fireworks/models/phi-3-vision-128k-instruct] PASSED
inference/test_text_inference.py::test_completion_log_probs_non_streaming[accounts/fireworks/models/phi-3-vision-128k-instruct] PASSED
inference/test_text_inference.py::test_completion_log_probs_streaming[accounts/fireworks/models/phi-3-vision-128k-instruct] PASSED
inference/test_text_inference.py::test_text_completion_structured_output[accounts/fireworks/models/phi-3-vision-128k-instruct-completion-01] PASSED
inference/test_text_inference.py::test_text_chat_completion_non_streaming[accounts/fireworks/models/phi-3-vision-128k-instruct-Which planet do humans live on?-Earth] PASSED
inference/test_text_inference.py::test_text_chat_completion_non_streaming[accounts/fireworks/models/phi-3-vision-128k-instruct-Which planet has rings around it with a name starting with letter S?-Saturn] PASSED
inference/test_text_inference.py::test_text_chat_completion_streaming[accounts/fireworks/models/phi-3-vision-128k-instruct-What's the name of the Sun in latin?-Sol] PASSED
inference/test_text_inference.py::test_text_chat_completion_streaming[accounts/fireworks/models/phi-3-vision-128k-instruct-What is the name of the US captial?-Washington] PASSED
inference/test_text_inference.py::test_text_chat_completion_with_tool_calling_and_non_streaming[accounts/fireworks/models/phi-3-vision-128k-instruct] FAILED
inference/test_text_inference.py::test_text_chat_completion_with_tool_calling_and_streaming[accounts/fireworks/models/phi-3-vision-128k-instruct] FAILED
inference/test_text_inference.py::test_text_chat_completion_with_tool_choice_required[accounts/fireworks/models/phi-3-vision-128k-instruct] FAILED
inference/test_text_inference.py::test_text_chat_completion_with_tool_choice_none[accounts/fireworks/models/phi-3-vision-128k-instruct] PASSED
inference/test_text_inference.py::test_text_chat_completion_structured_output[accounts/fireworks/models/phi-3-vision-128k-instruct] ERROR
inference/test_text_inference.py::test_text_chat_completion_tool_calling_tools_not_in_request[accounts/fireworks/models/phi-3-vision-128k-instruct-True] PASSED
inference/test_text_inference.py::test_text_chat_completion_tool_calling_tools_not_in_request[accounts/fireworks/models/phi-3-vision-128k-instruct-False] PASSED
```
2025-02-21 13:21:28 -08:00
..
bedrock feat(api): Add options for supporting various embedding models (#1192) 2025-02-20 22:27:12 -08:00
cerebras feat(api): Add options for supporting various embedding models (#1192) 2025-02-20 22:27:12 -08:00
databricks feat(api): Add options for supporting various embedding models (#1192) 2025-02-20 22:27:12 -08:00
fireworks feat(providers): support non-llama models for inference providers (#1200) 2025-02-21 13:21:28 -08:00
groq feat(api): Add options for supporting various embedding models (#1192) 2025-02-20 22:27:12 -08:00
nvidia test(client-sdk): Update embedding test types to use latest imports (#1203) 2025-02-21 08:09:17 -08:00
ollama feat(providers): support non-llama models for inference providers (#1200) 2025-02-21 13:21:28 -08:00
passthrough feat(api): Add options for supporting various embedding models (#1192) 2025-02-20 22:27:12 -08:00
runpod feat(api): Add options for supporting various embedding models (#1192) 2025-02-20 22:27:12 -08:00
sambanova feat(api): Add options for supporting various embedding models (#1192) 2025-02-20 22:27:12 -08:00
sample build: format codebase imports using ruff linter (#1028) 2025-02-13 10:06:21 -08:00
tgi feat(api): Add options for supporting various embedding models (#1192) 2025-02-20 22:27:12 -08:00
together feat(providers): support non-llama models for inference providers (#1200) 2025-02-21 13:21:28 -08:00
vllm feat(api): Add options for supporting various embedding models (#1192) 2025-02-20 22:27:12 -08:00
__init__.py impls -> inline, adapters -> remote (#381) 2024-11-06 14:54:05 -08:00