Add documentation for building applications, with some content on the agentic loop

Ashwin Bharambe 2024-12-08 14:56:03 -08:00
parent a29013112f
commit 1274fa4c0d
5 changed files with 424 additions and 16 deletions

@@ -19,16 +19,17 @@ export LLAMA_STACK_PORT=5001
ollama run $OLLAMA_INFERENCE_MODEL --keepalive 60m
```
-By default, Ollama keeps the model loaded in memory for 5 minutes which can be too short. We set the `--keepalive` flag to 60 minutes to enspagents/agenure the model remains loaded for sometime.
+By default, Ollama keeps the model loaded in memory for 5 minutes which can be too short. We set the `--keepalive` flag to 60 minutes to ensure the model remains loaded for some time.
### 2. Start the Llama Stack server
Llama Stack is based on a client-server architecture. It consists of a server which can be configured very flexibly so you can mix-and-match various providers for its individual API components -- beyond Inference, these include Memory, Agents, Telemetry, Evals and so forth.
To get started quickly, we provide various Docker images for the server component that work with different inference providers out of the box. For this guide, we will use `llamastack/distribution-ollama` as the Docker image.
```bash
-docker run \
-  -it \
+docker run -it \
-p $LLAMA_STACK_PORT:$LLAMA_STACK_PORT \
-v ~/.llama:/root/.llama \
llamastack/distribution-ollama \
@@ -42,8 +43,7 @@ Configuration for this is available at `distributions/ollama/run.yaml`.
### 3. Use the Llama Stack client SDK
-You can interact with the Llama Stack server using the `llama-stack-client` CLI or via the Python SDK.
+You can interact with the Llama Stack server using various client SDKs. We will use the Python SDK which you can install using:
```bash
pip install llama-stack-client
```
@@ -123,7 +123,6 @@ async def run_main():
agent = Agent(client, agent_config)
session_id = agent.create_session("test-session")
print(f"Created session_id={session_id} for Agent({agent.agent_id})")
user_prompts = [
(
"I am attaching documentation for Torchtune. Help me answer questions I will ask next.",
@@ -154,3 +153,10 @@ if __name__ == "__main__":
- Learn how to [Build Llama Stacks](../distributions/index.md)
- See [References](../references/index.md) for more details about the llama CLI and Python SDK
- For example applications and more detailed tutorials, visit our [llama-stack-apps](https://github.com/meta-llama/llama-stack-apps/tree/main/examples) repository.
+## Thinking out loud here in terms of what to write in the docs
+- how to get a Llama Stack server running
+- what are all the different client SDKs
+- what are the components of building agents