moving rfc->docs

Xi Yan 2024-09-18 16:54:24 -07:00
parent 2c1ad10710
commit 5ec64ac68c
6 changed files with 4 additions and 4 deletions

Binary image file removed (128 KiB); contents not shown.

File diff suppressed because it is too large.

File diff suppressed because it is too large.

Binary image file removed (71 KiB); contents not shown.

Binary image file removed (17 KiB); contents not shown.


@@ -21,7 +21,7 @@ Meta releases weights of both the pretrained and instruction fine-tuned Llama mo
### Model Lifecycle
-![Figure 1: Model Life Cycle](RFC-0001-llama-stack-assets/model-lifecycle.png)
+![Figure 1: Model Life Cycle](../docs/resources/model-lifecycle.png)
For each of the operations that need to be performed during the model life cycle (e.g. fine-tuning, inference, evals, etc.), we identified the capabilities that are needed as toolchain APIs. Some of these capabilities are primitive operations, like inference, while others, like synthetic data generation, are composed of other capabilities. The list of APIs we have identified to support the lifecycle of Llama models is below:
@@ -35,7 +35,7 @@ For each of the operations that need to be performed (e.g. fine tuning, inferenc
### Agentic System
-![Figure 2: Agentic System](RFC-0001-llama-stack-assets/agentic-system.png)
+![Figure 2: Agentic System](../docs/resources/agentic-system.png)
In addition to the model lifecycle, we considered the different components involved in an agentic system, specifically tool calling and shields. Since the model may decide to call tools, a single model inference call is not enough. What's needed is an agentic loop consisting of tool calls and inference. The model provides separate tokens representing end-of-message and end-of-turn. A message represents a possible stopping point for execution, where the model can inform the execution environment that a tool call needs to be made. The execution environment, upon executing the tool, adds the result back to the context window and makes another inference call. This process is repeated until an end-of-turn token is generated.
Note that, as of today in the OSS world, such a “loop” is often coded explicitly via elaborate prompt engineering, typically using a ReAct pattern or a preconstructed execution graph. Llama 3.1 (and future Llamas) attempts to absorb this multi-step reasoning loop inside the main model itself.
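As a minimal sketch of such an agentic loop (the `Step` shape and the `generate`/`run_tool` callables below are hypothetical stand-ins, not the actual llama-stack interfaces):

```python
# Sketch of the agentic loop described above: run inference, execute any
# requested tool call, feed the result back, and stop at end-of-turn.
from dataclasses import dataclass, field

@dataclass
class Step:
    stop_reason: str                      # "end_of_message" or "end_of_turn"
    content: str = ""
    tool_name: str = ""
    tool_args: dict = field(default_factory=dict)

def agentic_turn(generate, run_tool, messages):
    while True:
        step = generate(messages)         # one model inference call
        if step.stop_reason == "end_of_turn":
            return step.content           # model has finished the turn
        # "end_of_message": the model paused so a tool call can be made
        result = run_tool(step.tool_name, step.tool_args)
        # add the tool result back to the context window and infer again
        messages.append({"role": "tool", "content": result})
```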
@@ -60,12 +60,12 @@ The sequence diagram that details the steps is [here](https://github.com/meta-ll
We define the Llama Stack as a layer cake shown below.
-![Figure 3: Llama Stack](RFC-0001-llama-stack-assets/llama-stack.png)
+![Figure 3: Llama Stack](../docs/resources/llama-stack.png)
-The API is defined in the [YAML](RFC-0001-llama-stack-assets/llama-stack-spec.yaml) and [HTML](RFC-0001-llama-stack-assets/llama-stack-spec.html) files. These files were generated using the Pydantic definitions in the api/datatypes.py and api/endpoints.py files in the llama-models, llama-stack, and llama-agentic-system repositories.
+The API is defined in the [YAML](../docs/llama-stack-spec.yaml) and [HTML](../docs/llama-stack-spec.html) files. These files were generated using the Pydantic definitions in the api/datatypes.py and api/endpoints.py files in the llama-models, llama-stack, and llama-agentic-system repositories.
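To illustrate how Pydantic definitions can drive a generated spec like this, here is a minimal sketch using Pydantic v2's JSON Schema export (`CompletionRequest` and its fields are invented for the example, not the real llama-stack datatypes):

```python
# Sketch: Pydantic models can emit JSON Schema, which a generator can
# assemble into OpenAPI YAML/HTML documents like the spec files above.
import json
from pydantic import BaseModel

class CompletionRequest(BaseModel):
    model: str            # illustrative field names only
    prompt: str
    max_tokens: int = 512

# Emit the JSON Schema for this datatype; a spec generator would stitch
# such schemas together with endpoint definitions into the full spec.
print(json.dumps(CompletionRequest.model_json_schema(), indent=2))
```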