llama-stack/llama_stack/providers/utils/inference
ehhuang 0b695538af
fix: chat completion with more than one choice (#2288)
# What does this PR do?
Fix a bug in `openai_compat.py` where the choices of a multi-choice chat completion were not indexed correctly.
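
A minimal sketch of the bug class this PR addresses (hypothetical code, not the actual llama-stack implementation): when an OpenAI-compatible response carries several choices, each choice must carry its own 0-based `index`; the fix amounts to enumerating the completions instead of reusing a single index.

```python
# Hypothetical illustration of per-choice indexing in an
# OpenAI-compatible chat completion response.
from dataclasses import dataclass


@dataclass
class Choice:
    index: int
    text: str


def build_choices(completions: list[str]) -> list[Choice]:
    # enumerate() assigns each choice its own position; reusing a
    # fixed index (e.g. always 0) is the bug being fixed here.
    return [Choice(index=i, text=text) for i, text in enumerate(completions)]


choices = build_choices(["first answer", "second answer"])
assert [c.index for c in choices] == [0, 1]
```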

## Test Plan
Added a new test.

Rerun the failed inference_store tests:

```shell
llama stack run fireworks --image-type conda
pytest -s -v tests/integration/ --stack-config http://localhost:8321 -k 'test_inference_store' --text-model meta-llama/Llama-3.3-70B-Instruct --count 10
```
2025-05-27 15:39:15 -07:00
| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| `embedding_mixin.py` | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| `inference_store.py` | feat: implement get chat completions APIs (#2200) | 2025-05-21 22:21:52 -07:00 |
| `litellm_openai_mixin.py` | feat: introduce APIs for retrieving chat completion requests (#2145) | 2025-05-18 21:43:19 -07:00 |
| `model_registry.py` | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| `openai_compat.py` | fix: chat completion with more than one choice (#2288) | 2025-05-27 15:39:15 -07:00 |
| `prompt_adapter.py` | chore: more mypy fixes (#2029) | 2025-05-06 09:52:31 -07:00 |
| `stream_utils.py` | feat: implement get chat completions APIs (#2200) | 2025-05-21 22:21:52 -07:00 |