llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-31 01:13:53 +00:00

Author	SHA1	Message	Date
Ben Browning	8f5cd49159	vllm prompt_logprobs can also be 0. This adjusts the vllm openai_completion endpoint to also pass a value of 0 for prompt_logprobs, instead of only passing values greater than zero to the backend. The existing test_openai_completion_prompt_logprobs was parameterized to test this case as well. Signed-off-by: Ben Browning <bbrownin@redhat.com>	2025-04-09 17:32:03 -04:00
Ben Browning	8d10556ce3	Add basic tests for OpenAI Chat Completions API. Signed-off-by: Ben Browning <bbrownin@redhat.com>	2025-04-09 16:18:13 -04:00
Ben Browning	ac5dc8fae2	Add prompt_logprobs and guided_choice to OpenAI completions. This adds the vLLM-specific extra_body parameters of prompt_logprobs and guided_choice to our openai_completion inference endpoint. The plan here would be to expand this to support all common optional parameters of any of the OpenAI providers, allowing each provider to use or ignore these parameters based on whether their server supports them. Signed-off-by: Ben Browning <bbrownin@redhat.com>	2025-04-09 15:47:02 -04:00
Ben Browning	ef684ff178	Fix openai_completion tests for ollama. When called via the OpenAI API, ollama is responding with more brief responses than when called via its native API. This adjusts the prompting for its OpenAI calls to ask it to be more verbose.	2025-04-09 15:47:02 -04:00
Ben Browning	52b4766949	Start some integration tests with an OpenAI client. This starts to stub in some integration tests for the OpenAI-compatible server APIs using an OpenAI client. Signed-off-by: Ben Browning <bbrownin@redhat.com>	2025-04-09 15:47:02 -04:00

5 commits
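
Commits ac5dc8fae2 and 8f5cd49159 describe forwarding the vLLM-specific prompt_logprobs and guided_choice parameters, including a prompt_logprobs value of 0. The sketch below illustrates the kind of check that distinguishes "not set" from 0. It is not the code from the repository: the helper name build_extra_body and its signature are hypothetical, and the only details taken from the commit messages are the two parameter names and the rule that 0 must be passed through.

```python
from typing import Any, Dict, List, Optional


def build_extra_body(
    prompt_logprobs: Optional[int] = None,
    guided_choice: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: collect vLLM-specific parameters for extra_body."""
    extra_body: Dict[str, Any] = {}
    # Compare against None rather than relying on truthiness, so that an
    # explicit 0 is forwarded. A bare `if prompt_logprobs:` would drop it.
    if prompt_logprobs is not None and prompt_logprobs >= 0:
        extra_body["prompt_logprobs"] = prompt_logprobs
    # An empty or missing list of choices is simply omitted.
    if guided_choice:
        extra_body["guided_choice"] = guided_choice
    return extra_body


print(build_extra_body(prompt_logprobs=0))   # {'prompt_logprobs': 0}
print(build_extra_body(guided_choice=["joy", "sadness"]))
print(build_extra_body())                    # {}
```

A dictionary built this way could then be handed to an OpenAI-compatible client as its extra_body argument, leaving each backend free to use or ignore the keys it receives.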