Composable building blocks to build Llama Apps
Find a file
Yuan Tang 743f434860
fix: Ensure a tool call can be converted before adding to buffer (#1119)
# What does this PR do?

This fixes an issue when running the e2e agent example:
https://github.com/meta-llama/llama-stack-apps/blob/main/examples/agents/e2e_loop_with_client_tools.py

```
    |   File "/home/yutang/repos/llama-stack/llama_stack/providers/remote/inference/vllm/vllm.py", line 175, in _process_vllm_chat_completion_stream_response
    |     tool_call = convert_tool_call(choice.delta.tool_calls[0])
    |   File "/home/yutang/repos/llama-stack/llama_stack/providers/utils/inference/openai_compat.py", line 441, in convert_tool_call
    |     return ToolCall(
    |   File "/home/yutang/.conda/envs/distribution-myenv/lib/python3.10/site-packages/pydantic/main.py", line 214, in __init__
    |     validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
    | pydantic_core._pydantic_core.ValidationError: 4 validation errors for ToolCall
    | call_id
    |   Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
    |     For further information visit https://errors.pydantic.dev/2.10/v/string_type
    | tool_name.enum[BuiltinTool]
    |   Input should be 'brave_search', 'wolfram_alpha', 'photogen' or 'code_interpreter' [type=enum, input_value=None, input_type=NoneType]
    |     For further information visit https://errors.pydantic.dev/2.10/v/enum
    | tool_name.str
    |   Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
    |     For further information visit https://errors.pydantic.dev/2.10/v/string_type
    | arguments
    |   Input should be a valid dictionary [type=dict_type, input_value=202, input_type=int]
    |     For further information visit https://errors.pydantic.dev/2.10/v/dict_type
```

This issue happened because not all arguments have been appended to the
tool call buffer yet. The current code assumes that we are ready to
convert the tool call whenever args can be converted to JSON
successfully. In this case, `json.loads("202")` would succeed but the
rest of the arguments have not been properly parsed yet.

[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])

## Test Plan

The e2e example worked successfully (although note that I ran the script
twice with each function call separately due to
https://github.com/meta-llama/llama-stack/issues/1120):
```
tool_execution> Tool:get_ticker_data Args:{'ticker_symbol': 'GOOG', 'start': '2023-01-01', 'end': '2023-12-31'}
tool_execution> Tool:get_ticker_data Response:"[{\"('Year', '')\":2023,\"('Close', 'GOOG')\":140.4254455566}]"

tool_execution> Tool:web_search Args:{'query': '42nd president of the United States'}
tool_execution> Tool:web_search Response:"{\"query\": \"42nd president of the United States\", \"top_k\": [{\"title\": \"William J. Clinton | whitehouse.gov\", \"url\": \"https://obamawhitehouse.archives.gov/1600/presidents/williamjclinton\", \"description\": \"<strong>Bill Clinton</strong> is an American politician from Arkansas who served as the 42nd President of the United States (1993-2001). He took office at the end of the Cold War, and was the first baby-boomer generation President.\", \"type\": \"search_result\"}, {\"title\": \"Bill Clinton - Wikipedia\", \"url\": \"https://en.wikipedia.org/wiki/Bill_Clinton\", \"description\": \"<strong>William Jefferson Clinton</strong> (n\\u00e9 Blythe; born August 19, 1946) is an American politician and lawyer who served as the 42nd president of the United States from 1993 to 2001. A member of the Democratic Party, he previously served as the attorney general of Arkansas from 1977 to 1979 and as the ...\", \"type\": \"search_result\"}, [{\"type\": \"video_result\", \"url\": \"https://www.youtube.com/watch?v=eR2z_1-v87Y\", \"title\": \"A Conversation with Bill Clinton, 42nd President of the United ...\", \"description\": \"William Jefferson Clinton, the first Democratic president in six decades to be elected twice, led the United States to the longest economic expansion in Amer...\"}, {\"type\": \"video_result\", \"url\": \"4484174096/\", \"title\": \"January 20, 1993, President Clinton was sworn in as the 42nd ...\", \"description\": \"WATCH: On January 20, 1993, President Bill Clinton was sworn in as the 42nd President of the United States. #InaugurationDay Video courtesy of the...\"}, {\"type\": \"video_result\", \"url\": \"https://www.youtube.com/watch?v=vI0HGQqEJh0\", \"title\": \"42nd President of the United States, Bill Clinton, shared thoughts ...\", \"description\": \"AboutPressCopyrightContact usCreatorsAdvertiseDevelopersTermsPrivacyPolicy & SafetyHow YouTube worksTest new features \\u00b7 \\u00a9 2024 Google LLC\"}, {\"type\": \"video_result\", \"url\": \"https://www.youtube.com/shorts/vI0HGQqEJh0\", \"title\": \"42nd President of the United States, Bill Clinton, shared ...\", \"description\": \"Enjoy the videos and music you love, upload original content, and share it all with friends, family, and the world on YouTube.\"}, {\"type\": \"video_result\", \"url\": \"https://www.youtube.com/watch?v=PHihhihVth0\", \"title\": \"Bill & Hillary Clinton returning to Little Rock for 20th ...\", \"description\": \"Enjoy the videos and music you love, upload original content, and share it all with friends, family, and the world on YouTube.\"}]]}"
```

All text inference tests passed.

[//]: # (## Documentation)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-02-15 00:19:16 -05:00
.github docs: remove changelog mention from PR template (#1049) 2025-02-11 13:24:53 -05:00
distributions fix: Gaps in doc codegen (#1035) 2025-02-10 13:24:15 -08:00
docs Update index.md to refer to v0.1.3 2025-02-14 14:29:17 -08:00
llama_stack fix: Ensure a tool call can be converted before adding to buffer (#1119) 2025-02-15 00:19:16 -05:00
rfcs docs: Fix url to the llama-stack-spec yaml/html files (#1081) 2025-02-13 12:39:26 -08:00
tests/client-sdk feat: log start, complete time to Agent steps (#1116) 2025-02-14 17:48:06 -08:00
.gitignore github: ignore non-hidden python virtual environments (#939) 2025-02-03 11:53:05 -08:00
.gitmodules impls -> inline, adapters -> remote (#381) 2024-11-06 14:54:05 -08:00
.pre-commit-config.yaml chore: move all Llama Stack types from llama-models to llama-stack (#1098) 2025-02-14 09:10:59 -08:00
.readthedocs.yaml first version of readthedocs (#278) 2024-10-22 10:15:58 +05:30
CODE_OF_CONDUCT.md Initial commit 2024-07-23 08:32:33 -07:00
CONTRIBUTING.md docs: Mention convential commits format in CONTRIBUTING.md (#1075) 2025-02-13 10:57:30 -05:00
LICENSE Update LICENSE (#47) 2024-08-29 07:39:50 -07:00
MANIFEST.in Move to use pyproject.toml so it is uv compatible 2025-01-31 21:28:08 -08:00
pyproject.toml Bump version to 0.1.3 2025-02-14 19:57:18 +00:00
README.md docs: Updating wording and nits in the README.md (#992) 2025-02-11 09:53:26 -05:00
requirements.txt build: resync uv and deps on 0.1.3 (#1108) 2025-02-14 12:26:04 -08:00
SECURITY.md Create SECURITY.md 2024-10-08 13:30:40 -04:00
uv.lock build: resync uv and deps on 0.1.3 (#1108) 2025-02-14 12:26:04 -08:00

Llama Stack

PyPI version PyPI - Downloads License Discord

Quick Start | Documentation | Colab Notebook

Llama Stack standardizes the core building blocks that simplify AI application development. It codifies best practices across the Llama ecosystem. More specifically, it provides

  • Unified API layer for Inference, RAG, Agents, Tools, Safety, Evals, and Telemetry.
  • Plugin architecture to support the rich ecosystem of different API implementations in various environments, including local development, on-premises, cloud, and mobile.
  • Prepackaged verified distributions which offer a one-stop solution for developers to get started quickly and reliably in any environment.
  • Multiple developer interfaces like CLI and SDKs for Python, Typescript, iOS, and Android.
  • Standalone applications as examples for how to build production-grade AI applications with Llama Stack.
Llama Stack

Llama Stack Benefits

  • Flexible Options: Developers can choose their preferred infrastructure without changing APIs and enjoy flexible deployment choices.
  • Consistent Experience: With its unified APIs, Llama Stack makes it easier to build, test, and deploy AI applications with consistent application behavior.
  • Robust Ecosystem: Llama Stack is already integrated with distribution partners (cloud providers, hardware vendors, and AI-focused companies) that offer tailored infrastructure, software, and services for deploying Llama models.

By reducing friction and complexity, Llama Stack empowers developers to focus on what they do best: building transformative generative AI applications.

API Providers

Here is a list of the various API providers and available distributions that can help developers get started easily with Llama Stack.

API Provider Builder Environments Agents Inference Memory Safety Telemetry
Meta Reference Single Node
SambaNova Hosted
Cerebras Hosted
Fireworks Hosted
AWS Bedrock Hosted
Together Hosted
Groq Hosted
Ollama Single Node
TGI Hosted and Single Node
NVIDIA NIM Hosted and Single Node
Chroma Single Node
PG Vector Single Node
PyTorch ExecuTorch On-device iOS
vLLM Hosted and Single Node

Distributions

A Llama Stack Distribution (or "distro") is a pre-configured bundle of provider implementations for each API component. Distributions make it easy to get started with a specific deployment scenario - you can begin with a local development setup (eg. ollama) and seamlessly transition to production (eg. Fireworks) without changing your application code. Here are some of the distributions we support:

Distribution Llama Stack Docker Start This Distribution
Meta Reference llamastack/distribution-meta-reference-gpu Guide
Meta Reference Quantized llamastack/distribution-meta-reference-quantized-gpu Guide
SambaNova llamastack/distribution-sambanova Guide
Cerebras llamastack/distribution-cerebras Guide
Ollama llamastack/distribution-ollama Guide
TGI llamastack/distribution-tgi Guide
Together llamastack/distribution-together Guide
Fireworks llamastack/distribution-fireworks Guide
vLLM llamastack/distribution-remote-vllm Guide

Installation

You have two ways to install this repository:

  • Install as a package: You can install the repository directly from PyPI by running the following command:

    pip install llama-stack
    
  • Install from source: If you prefer to install from the source code, make sure you have conda installed. Then, run the following commands:

     mkdir -p ~/local
     cd ~/local
     git clone git@github.com:meta-llama/llama-stack.git
    
     conda create -n stack python=3.10
     conda activate stack
    
     cd llama-stack
     pip install -e .
    

Documentation

Please checkout our Documentation page for more details.

Llama Stack Client SDKs

Language Client SDK Package
Python llama-stack-client-python PyPI version
Swift llama-stack-client-swift Swift Package Index
Typescript llama-stack-client-typescript NPM version
Kotlin llama-stack-client-kotlin Maven version

Check out our client SDKs for connecting to a Llama Stack server in your preferred language, you can choose from python, typescript, swift, and kotlin programming languages to quickly build your applications.

You can find more example scripts with client SDKs to talk with the Llama Stack server in our llama-stack-apps repo.