regenerate markdown with eom tags and tool role

Suraj Subramanian 2025-04-07 11:04:34 -07:00
parent 8a76fb32f3
commit 7cf289ca03


@@ -10,6 +10,7 @@ Here is a list of special tokens that are supported by Llama 4:
 - at the end of a direct interaction between the model and the user
 - at the end of multiple interactions between the model and any available tools
 This token signals to the executor that the model has finished generating a response.
+- `<|eom|>`: End of message. This tag is used with the `tool` role, and marks the end of the response from the executor.
 - `<|image_start|>` and `<|image_end|>`: These tokens enclose the image data in the prompt.
 - `<|patch|>`: This token represents a piece of the tile/image.
 - `<|tile_y_separator|>` and `<|tile_x_separator|>`: These tokens are used to separate the y and x tiles of an image.
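
As a rough sketch of how the image tokens might compose, here is an illustrative serializer. The tile-grid shape, the number of `<|patch|>` tokens per tile, and the overall ordering are assumptions for illustration; the diff above only names the tokens themselves:

```python
# Illustrative only: tile-grid shape, patches-per-tile, and token ordering
# are assumptions; the hunk above only names the tokens themselves.
def render_image_tokens(tiles_y: int, tiles_x: int, patches_per_tile: int) -> str:
    row_sep = "<|tile_y_separator|>"  # assumed: separates rows of tiles
    col_sep = "<|tile_x_separator|>"  # assumed: separates tiles within a row
    tile = "<|patch|>" * patches_per_tile  # each tile is a run of patch tokens
    rows = [col_sep.join(tile for _ in range(tiles_x)) for _ in range(tiles_y)]
    return "<|image_start|>" + row_sep.join(rows) + "<|image_end|>"

print(render_image_tokens(tiles_y=2, tiles_x=2, patches_per_tile=4))
```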
@@ -17,10 +18,11 @@ Here is a list of special tokens that are supported by Llama 4:
-There are 3 different roles that are supported by Llama 4
+There are 4 different roles that are supported by Llama 4
 - `system`: Sets the context in which to interact with the AI model. It typically includes rules, guidelines, or necessary information that helps the model respond effectively.
 - `user`: Represents the human interacting with the model. It includes the inputs, commands, and questions to the model.
 - `assistant`: Represents the response generated by the AI model based on the context provided in the `system`, `tool` and `user` prompts.
+- `tool`: Represents the output of a tool call when sent back to the model from the executor. (The actual token used by the model is `<|ipython|>`.)
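
To make the four roles concrete, a multi-turn exchange could be represented along these lines. This is a minimal sketch: only the role names come from the list above, and the message contents and tool-call/result formats are invented for illustration:

```python
# Minimal sketch: only the four role names come from the list above;
# the contents and the tool-call/result formats are invented.
messages = [
    {"role": "system", "content": "You are a helpful assistant with access to a get_weather tool."},
    {"role": "user", "content": "What's the weather in Menlo Park right now?"},
    {"role": "assistant", "content": '{"name": "get_weather", "arguments": {"city": "Menlo Park"}}'},
    {"role": "tool", "content": '{"temperature_c": 18, "condition": "sunny"}'},
]
```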
 # Llama 4 Instruct Model
@@ -319,3 +321,5 @@ The top 2 latest trending songs are:
+- Tool outputs should be passed back to the model in the `tool` role, which uses the `<|ipython|>` tag.
+- The model parses the tool output contents until it encounters the `<|eom|>` tag. It uses this to synthesize an appropriate response to the query.
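
Putting those two points together, the executor's reply might be serialized as follows. This is a sketch, not the exact wire format: the `<|ipython|>` and `<|eom|>` tokens come from the notes above, while JSON as the payload format and the surrounding layout are assumptions:

```python
import json

# Sketch only: <|ipython|> and <|eom|> come from the notes above; JSON as
# the payload format and the exact surrounding layout are assumptions.
def render_tool_turn(result: dict) -> str:
    return "<|ipython|>" + json.dumps(result) + "<|eom|>"

print(render_tool_turn({"trending_songs": ["Song A", "Song B"]}))
```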