llama-stack-mirror/llama_stack/providers/remote/inference
Ben Browning 48fdbf7188 fix: ollama chat completion needs unique ids
The chat completion ids generated by Ollama are not unique enough to
use as keys for stored chat completions: they contain only three
digits of randomness, e.g. `chatcmpl-373`. This causes frequent
collisions between Ollama chat completion ids, which breaks our SQL
storage of chat completions, where the id column is expected to
actually be unique.
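A quick sketch of why this fails: with ids of the form `chatcmpl-NNN` there are only ~1000 possible values, so the birthday bound predicts a collision after roughly sqrt(1000) ≈ 32 completions. This is an illustrative simulation, not Ollama's actual id generator:

```python
import random

def draws_until_collision(id_space: int = 1000, seed: int = 0) -> int:
    """Draw ids like 'chatcmpl-NNN' until one repeats; return the draw count."""
    rng = random.Random(seed)
    seen: set[str] = set()
    while True:
        new_id = f"chatcmpl-{rng.randrange(id_space)}"
        if new_id in seen:
            return len(seen) + 1
        seen.add(new_id)

# A collision typically shows up after only a few dozen draws,
# far fewer than the 1000 possible ids.
```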

So, this adjusts Ollama responses to use UUIDs as unique ids. This
does mean we're replacing the ids generated natively by Ollama. If we
don't wish to do that, we'll either need to relax the unique
constraint on the chat completion id field in the inference storage
or convince Ollama upstream to use something closer to UUID values
here.
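The replacement amounts to overwriting the response id before storage. A minimal sketch, where `with_unique_id` and the plain-dict response shape are illustrative rather than the actual adapter code:

```python
import uuid

def with_unique_id(response: dict) -> dict:
    """Return a copy of an OpenAI-style chat completion response with the
    low-entropy Ollama id (e.g. 'chatcmpl-373') replaced by a UUID-based one."""
    return {**response, "id": f"chatcmpl-{uuid.uuid4()}"}
```

Because `uuid4` draws 122 random bits, collisions in the stored id column become practically impossible.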

Closes #2315

I tested by running the openai completion / chat completion
integration tests in a loop. Without this change, I regularly hit
id collisions; with this change, I do not.

```
INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \
llama stack run llama_stack/templates/ollama/run.yaml

while true; do \
  INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \
  pytest -s -v \
    tests/integration/inference/test_openai_completion.py \
    --stack-config=http://localhost:8321 \
    --text-model="meta-llama/Llama-3.2-3B-Instruct"; \
done
```

Signed-off-by: Ben Browning <bbrownin@redhat.com>
2025-06-02 19:07:42 -04:00
anthropic chore: enable pyupgrade fixes (#1806) 2025-05-01 14:23:50 -07:00
bedrock feat: New OpenAI compat embeddings API (#2314) 2025-05-31 22:11:47 -07:00
cerebras feat: New OpenAI compat embeddings API (#2314) 2025-05-31 22:11:47 -07:00
cerebras_openai_compat feat: introduce APIs for retrieving chat completion requests (#2145) 2025-05-18 21:43:19 -07:00
databricks feat: New OpenAI compat embeddings API (#2314) 2025-05-31 22:11:47 -07:00
fireworks fix: fireworks provider for openai compat inference endpoint (#2335) 2025-06-02 14:11:15 -07:00
fireworks_openai_compat feat: introduce APIs for retrieving chat completion requests (#2145) 2025-05-18 21:43:19 -07:00
gemini chore: enable pyupgrade fixes (#1806) 2025-05-01 14:23:50 -07:00
groq chore: enable pyupgrade fixes (#1806) 2025-05-01 14:23:50 -07:00
groq_openai_compat feat: introduce APIs for retrieving chat completion requests (#2145) 2025-05-18 21:43:19 -07:00
llama_openai_compat feat: introduce APIs for retrieving chat completion requests (#2145) 2025-05-18 21:43:19 -07:00
nvidia feat: New OpenAI compat embeddings API (#2314) 2025-05-31 22:11:47 -07:00
ollama fix: ollama chat completion needs unique ids 2025-06-02 19:07:42 -04:00
openai feat: New OpenAI compat embeddings API (#2314) 2025-05-31 22:11:47 -07:00
passthrough feat: New OpenAI compat embeddings API (#2314) 2025-05-31 22:11:47 -07:00
runpod feat: New OpenAI compat embeddings API (#2314) 2025-05-31 22:11:47 -07:00
sambanova fix(providers): update sambanova json schema mode (#2306) 2025-05-29 09:54:23 -07:00
sambanova_openai_compat feat: introduce APIs for retrieving chat completion requests (#2145) 2025-05-18 21:43:19 -07:00
tgi feat: New OpenAI compat embeddings API (#2314) 2025-05-31 22:11:47 -07:00
together feat: New OpenAI compat embeddings API (#2314) 2025-05-31 22:11:47 -07:00
together_openai_compat feat: introduce APIs for retrieving chat completion requests (#2145) 2025-05-18 21:43:19 -07:00
vllm feat: New OpenAI compat embeddings API (#2314) 2025-05-31 22:11:47 -07:00
watsonx feat: New OpenAI compat embeddings API (#2314) 2025-05-31 22:11:47 -07:00
__init__.py impls -> inline, adapters -> remote (#381) 2024-11-06 14:54:05 -08:00