mirror of https://github.com/meta-llama/llama-stack.git synced 2025-06-28 02:53:30 +00:00

Dinesh Yeduguru a5c57cd381

# What does this PR do?

PR #639 introduced the Tools API and the ability to invoke tools
through the API like any other resource. This PR changes the Agents to
use the Tools API to invoke tools. Major changes include:
1) Ability to specify tool groups with AgentConfig
2) The agent gets the corresponding tool definitions for the specified
tools and passes them along to the model
3) Attachments are now named Documents, and their behavior is mostly
unchanged from the user's perspective
4) You can specify args to be injected into a tool call through the
Agent config. This is especially useful for the memory tool, where
you want the tool to operate on a specific memory bank.
5) You can also register tool groups with args, which lets the agent
inject these into the tool call as well.
6) All tests have been migrated to use the new Tools API and fixtures,
including the client SDK tests
7) Telemetry works with the Tools API without extra changes because of
our trace protocol decorator
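
The argument injection described in items 4 and 5 can be sketched as follows. This is a minimal illustration, not the code from this PR: the function names, the `{"name": ..., "args": ...}` tool group shape, and the precedence rules (agent-config args override registered args, and injected args override model-supplied ones) are all assumptions made for the example.

```python
from typing import Any, Dict, List, Union

# Hypothetical shape: a tool group is either a bare name or a dict with
# "name" and optional "args". The real AgentConfig schema may differ.
ToolGroupSpec = Union[str, Dict[str, Any]]


def resolve_injected_args(
    toolgroups: List[ToolGroupSpec],
    registered_args: Dict[str, Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
    """Return the args to inject for each tool group.

    Assumption: args given in the agent config override args supplied
    when the tool group was registered.
    """
    resolved: Dict[str, Dict[str, Any]] = {}
    for spec in toolgroups:
        if isinstance(spec, str):
            name, config_args = spec, {}
        else:
            name, config_args = spec["name"], spec.get("args", {})
        resolved[name] = {**registered_args.get(name, {}), **config_args}
    return resolved


def build_tool_call_args(
    model_args: Dict[str, Any], injected: Dict[str, Any]
) -> Dict[str, Any]:
    """Merge model-produced args with injected ones.

    Assumption: injected args win, so a configured memory bank cannot be
    replaced by whatever the model generates.
    """
    return {**model_args, **injected}
```

With this sketch, a memory tool group configured with a hypothetical `memory_bank_id` arg would receive that arg on every call, in addition to the query the model produced.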


## Test Plan
```
pytest -s -v -k fireworks llama_stack/providers/tests/agents/test_agents.py  \
   --safety-shield=meta-llama/Llama-Guard-3-8B \
   --inference-model=meta-llama/Llama-3.1-8B-Instruct

pytest -s -v -k together  llama_stack/providers/tests/tools/test_tools.py \
   --safety-shield=meta-llama/Llama-Guard-3-8B \
   --inference-model=meta-llama/Llama-3.1-8B-Instruct

LLAMA_STACK_CONFIG="/Users/dineshyv/.llama/distributions/llamastack-together/together-run.yaml" pytest -v tests/client-sdk/agents/test_agents.py
```
run.yaml:
https://gist.github.com/dineshyv/0365845ad325e1c2cab755788ccc5994

Notebook:
https://colab.research.google.com/drive/1ck7hXQxRl6UvT-ijNRZ-gMZxH1G3cN2d?usp=sharing

2025-01-08 19:01:00 -08:00

2.5 KiB



# Fireworks Distribution


The llamastack/distribution-fireworks distribution consists of the following provider configurations.

| API | Provider(s) |
|-----|-------------|
| agents | `inline::meta-reference` |
| datasetio | `remote::huggingface`, `inline::localfs` |
| eval | `inline::meta-reference` |
| inference | `remote::fireworks` |
| memory | `inline::faiss`, `remote::chromadb`, `remote::pgvector` |
| safety | `inline::llama-guard` |
| scoring | `inline::basic`, `inline::llm-as-judge`, `inline::braintrust` |
| telemetry | `inline::meta-reference` |
| tool_runtime | `remote::brave-search`, `remote::tavily-search`, `inline::code-interpreter`, `inline::memory-runtime` |

## Environment Variables

The following environment variables can be configured:

- `LLAMASTACK_PORT`: Port for the Llama Stack distribution server (default: `5001`)
- `FIREWORKS_API_KEY`: Fireworks.AI API Key (default: empty)
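
As a rough illustration of how these two variables and their defaults might be consumed, here is a small sketch. The `load_settings` helper and the keys of the returned dict are hypothetical; only the variable names and default values come from the list above.

```python
import os
from typing import Dict, Mapping, Optional, Union


def load_settings(
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Union[int, str]]:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        # Documented default port is 5001.
        "port": int(env.get("LLAMASTACK_PORT", "5001")),
        # Documented default is an empty string.
        "fireworks_api_key": env.get("FIREWORKS_API_KEY", ""),
    }


if __name__ == "__main__":
    settings = load_settings()
    if not settings["fireworks_api_key"]:
        print("FIREWORKS_API_KEY is not set")
    print(f"port: {settings['port']}")
```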

## Models

The following models are available by default:

- `meta-llama/Llama-3.1-8B-Instruct` (`fireworks/llama-v3p1-8b-instruct`)
- `meta-llama/Llama-3.1-70B-Instruct` (`fireworks/llama-v3p1-70b-instruct`)
- `meta-llama/Llama-3.1-405B-Instruct-FP8` (`fireworks/llama-v3p1-405b-instruct`)
- `meta-llama/Llama-3.2-1B-Instruct` (`fireworks/llama-v3p2-1b-instruct`)
- `meta-llama/Llama-3.2-3B-Instruct` (`fireworks/llama-v3p2-3b-instruct`)
- `meta-llama/Llama-3.2-11B-Vision-Instruct` (`fireworks/llama-v3p2-11b-vision-instruct`)
- `meta-llama/Llama-3.2-90B-Vision-Instruct` (`fireworks/llama-v3p2-90b-vision-instruct`)
- `meta-llama/Llama-3.3-70B-Instruct` (`fireworks/llama-v3p3-70b-instruct`)
- `meta-llama/Llama-Guard-3-8B` (`fireworks/llama-guard-3-8b`)
- `meta-llama/Llama-Guard-3-11B-Vision` (`fireworks/llama-guard-3-11b-vision`)
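
The list above pairs each Llama Stack model name with a name in parentheses. Assuming the parenthesized name is the identifier used on the Fireworks side, the pairing can be expressed as a lookup table. The `MODEL_ALIASES` dict and `to_fireworks_id` helper below are hypothetical names for illustration and are not part of Llama Stack.

```python
# Pairs copied from the list above: Llama Stack name -> assumed Fireworks name.
MODEL_ALIASES = {
    "meta-llama/Llama-3.1-8B-Instruct": "fireworks/llama-v3p1-8b-instruct",
    "meta-llama/Llama-3.1-70B-Instruct": "fireworks/llama-v3p1-70b-instruct",
    "meta-llama/Llama-3.1-405B-Instruct-FP8": "fireworks/llama-v3p1-405b-instruct",
    "meta-llama/Llama-3.2-1B-Instruct": "fireworks/llama-v3p2-1b-instruct",
    "meta-llama/Llama-3.2-3B-Instruct": "fireworks/llama-v3p2-3b-instruct",
    "meta-llama/Llama-3.2-11B-Vision-Instruct": "fireworks/llama-v3p2-11b-vision-instruct",
    "meta-llama/Llama-3.2-90B-Vision-Instruct": "fireworks/llama-v3p2-90b-vision-instruct",
    "meta-llama/Llama-3.3-70B-Instruct": "fireworks/llama-v3p3-70b-instruct",
    "meta-llama/Llama-Guard-3-8B": "fireworks/llama-guard-3-8b",
    "meta-llama/Llama-Guard-3-11B-Vision": "fireworks/llama-guard-3-11b-vision",
}


def to_fireworks_id(model_id: str) -> str:
    """Look up the paired name, failing loudly for models not in the list."""
    try:
        return MODEL_ALIASES[model_id]
    except KeyError:
        raise ValueError(f"{model_id!r} is not a default model of this distribution")
```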

## Prerequisite: API Keys

Make sure you have access to a Fireworks API Key. You can get one by visiting fireworks.ai.

## Running Llama Stack with Fireworks

You can run the distribution either via Conda, which builds the distribution code locally, or via Docker, which uses a pre-built image.

### Via Docker

This method allows you to get started quickly without having to build the distribution code.

LLAMA_STACK_PORT=5001
docker run \
  -it \
  -p $LLAMA_STACK_PORT:$LLAMA_STACK_PORT \
  llamastack/distribution-fireworks \
  --port $LLAMA_STACK_PORT \
  --env FIREWORKS_API_KEY=$FIREWORKS_API_KEY

### Via Conda

LLAMA_STACK_PORT=5001
llama stack build --template fireworks --image-type conda
llama stack run ./run.yaml \
  --port $LLAMA_STACK_PORT \
  --env FIREWORKS_API_KEY=$FIREWORKS_API_KEY
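
After starting the server with either method, it can be useful to confirm that something is listening on the chosen port. The sketch below only checks that a TCP connection can be opened. It assumes the server is reachable on `localhost` and does not call any Llama Stack endpoint. The `port_is_open` helper is a hypothetical name.

```python
import os
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Same variable name and default as the run commands above.
    port = int(os.environ.get("LLAMA_STACK_PORT", "5001"))
    state = "accepting connections" if port_is_open("localhost", port) else "not reachable"
    print(f"localhost:{port} is {state}")
```

An open port only shows that a process is listening there. It does not verify that the distribution started correctly or that the API key is valid.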
