diff --git a/llama_stack/models/llama/llama4/prompt_format.md b/llama_stack/models/llama/llama4/prompt_format.md
index 44568dc84..9deff1d1c 100644
--- a/llama_stack/models/llama/llama4/prompt_format.md
+++ b/llama_stack/models/llama/llama4/prompt_format.md
@@ -10,6 +10,7 @@ Here is a list of special tokens that are supported by Llama 4:
   - at the end of a direct interaction between the model and the user
   - at the end of multiple interactions between the model and any available tools
   This token signals to the executor that the model has finished generating a response.
+- `<|eom|>`: End of message. This token is used with the `tool` role and marks the end of the executor's response.
 - `<|image_start|>` and `<|image_end|>`: These tokens enclose the image data in the prompt.
 - `<|patch|>`: This token represents a piece of the tile/
 - `<|tile_y_separator|>` and `<|tile_x_separator|>`: These tokens are used to separate the y and x tiles of an image
@@ -17,10 +18,11 @@ Here is a list of special tokens that are supported by Llama 4:
 
-There are 3 different roles that are supported by Llama 4
+There are 4 different roles that are supported by Llama 4
 - `system`: Sets the context in which to interact with the AI model. It typically includes rules, guidelines, or necessary information that helps the model respond effectively.
 - `user`: Represents the human interacting with the model. It includes the inputs, commands, and questions to the model.
 - `assistant`: Represents the response generated by the AI model based on the context provided in the `system`, `tool` and `user` prompts.
+- `tool`: Represents the output of a tool call when sent back to the model from the executor. (The actual token used by the model is `<|ipython|>`.)
 
 # Llama 4 Instruct Model
@@ -74,7 +76,7 @@ Notice the structure of the image section:
 <|image_start|><|image|><|patch|>...<|patch|><|image_end|>
 ```
 This is due to the image being smaller than the tile size.
-
+
 
 ## Single image prompt format
 
@@ -100,7 +102,7 @@ With a bigger image, the image will include the tile separator tokens. Additiona
 ```
 <|image_start|><|patch|>...<|patch|><|tile_x_separator|><|patch|>...<|patch|><|tile_y_separator|><|patch|>...<|patch|><|image|><|patch|>...<|patch|><|image_end|>
 ```
-
+
 
 ## Multiple images prompt format
 
@@ -319,3 +321,5 @@ The top 2 latest trending songs are:
 
 - Tool outputs should be passed back to the model in the `tool` role, which uses the `<|ipython|>` tag.
 - The model parses the tool output contents until it encounters the `<|eom|>` tag. It uses this to synthesize an appropriate response to the query.
+
+
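As a rough illustration of how the roles and special tokens documented in this patch fit together, here is a minimal sketch of serializing a tool-call round trip into a raw prompt string. This is not the official Llama 4 API: the `<|header_start|>`/`<|header_end|>` delimiters, the `<|eot|>` end-of-turn token, and the `render_turn` helper are assumptions for this sketch; only `<|ipython|>` and `<|eom|>` come directly from the text above.

```python
# Sketch only: compose a raw prompt string from role-tagged turns.
# Assumptions (not confirmed by this patch): <|header_start|>/<|header_end|>
# delimiters and an <|eot|> end-of-turn token for non-tool turns.

# The `tool` role is serialized with the <|ipython|> token per the docs above.
ROLE_TOKENS = {
    "system": "system",
    "user": "user",
    "assistant": "assistant",
    "tool": "ipython",
}

def render_turn(role: str, content: str) -> str:
    # Tool output ends with <|eom|>; other turns end with <|eot|>.
    terminator = "<|eom|>" if role == "tool" else "<|eot|>"
    header = ROLE_TOKENS[role]
    return f"<|header_start|>{header}<|header_end|>\n\n{content}{terminator}"

# Hypothetical tool-call round trip: the executor sends the tool result
# back under the `tool` role, terminated by <|eom|>.
prompt = "".join([
    render_turn("user", "What are the top 2 trending songs?"),
    render_turn("assistant", "[get_trending_songs(n=2)]"),
    render_turn("tool", '["Song A", "Song B"]'),
])
```

The model would then consume everything up to the `<|eom|>` terminator of the `tool` turn and synthesize its final answer, per the notes at the end of the patch.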