llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-03 09:53:45 +00:00

Hardik Shah 8efa53daf1 fix: Agent telemetry inputs/outputs should be structured (#1302)

Original telemetry outputs for agent turns look like this. Note how the output was a `str(message)`, which made it difficult to read back for downstream tasks (e.g. building eval datasets).

```
{
    'input': [
        '{"role":"system","content":"You are a helpful assistant. Use search tool to answer the questions. "}',
        '{"role":"user","content":"Which teams played in the NBA western conference finals of 2024","context":null}'
    ],
    'output': "content: tool_calls: [ToolCall(call_id='8b7294ec-a83f-4798-ad8f-6bed662f08b6', tool_name=<BuiltinTool.brave_search: 'brave_search'>, arguments={'query': 'NBA Western Conference Finals 2024 teams'})]"
},
```

Updated the outputs to be structured.

## Test

```python
import uuid

from llama_stack_client.lib.agents.agent import Agent
from llama_stack_client.lib.agents.event_logger import EventLogger
from llama_stack_client.types.agent_create_params import AgentConfig
from rich.pretty import pprint  # assumption: the output below matches rich's pprint

# `client` is assumed to be an already-initialized LlamaStackClient.
model_id = "meta-llama/Llama-3.1-8B-Instruct"
agent_config = AgentConfig(
    model=model_id,
    instructions="You are a helpful assistant who will use the web search tools to help with answering questions.\nOnly provide final answer in short without writing full sentences. Use web search",
    toolgroups=["builtin::websearch"],
    enable_session_persistence=True,
)

agent = Agent(client, agent_config)
session_id = agent.create_session(uuid.uuid4().hex)

# Run a single non-streaming turn so the full Turn object is returned.
response = agent.create_turn(
    messages=[
        {
            "role": "user",
            "content": "latest news about llama stack",
        }
    ],
    session_id=session_id,
    stream=False,
)
pprint(response)
```

Output:

```
Turn(
    input_messages=[UserMessage(content='latest news about llama stack', role='user', context=None)],
    output_message=CompletionMessage(
        content="The latest news about Llama Stack is that Meta has released Llama 3.2, which includes small and medium-sized vision LLMs (11B and 90B) and lightweight, text-only models (1B and 3B) that fit onto select edge and mobile devices. Additionally, Llama Stack distributions have been released to simplify the way developers work with Llama models in different environments. However, a critical vulnerability has been discovered in Meta's Llama-Stack, which puts AI applications at risk.",
        role='assistant',
        stop_reason='end_of_turn',
        tool_calls=[]
    ),
    session_id='77379546-4598-485a-b4f4-84e5da28c513',
    started_at=datetime.datetime(2025, 2, 27, 11, 2, 43, 915243, tzinfo=TzInfo(-08:00)),
    steps=[
        InferenceStep(
            api_model_response=CompletionMessage(
                content='',
                role='assistant',
                stop_reason='end_of_turn',
                tool_calls=[
                    ToolCall(
                        arguments={'query': 'latest news llama stack'},
                        call_id='84c0fa10-e24a-4f91-a9ff-415a9ec0bb0b',
                        tool_name='brave_search'
                    )
                ]
            ),
            step_id='81c16bd3-eb00-4721-8edc-f386e07391a3',
            step_type='inference',
            turn_id='2c6b5273-4b16-404f-bed2-c0025fd63b45',
            completed_at=datetime.datetime(2025, 2, 27, 11, 2, 44, 637149, tzinfo=TzInfo(-08:00)),
            started_at=datetime.datetime(2025, 2, 27, 11, 2, 43, 915831, tzinfo=TzInfo(-08:00))
        ),
        ToolExecutionStep(
            step_id='4782d609-a62e-45f5-8d2a-25a43db46288',
            step_type='tool_execution',
            tool_calls=[
                ToolCall(
                    arguments={'query': 'latest news llama stack'},
                    call_id='84c0fa10-e24a-4f91-a9ff-415a9ec0bb0b',
                    tool_name='brave_search'
                )
            ],
            tool_responses=[
                ToolResponse(
                    call_id='84c0fa10-e24a-4f91-a9ff-415a9ec0bb0b',
                    content='{"query": "latest news llama stack", "top_k": [{"title": "Llama 3.2: Revol. ....... Hacker News.", "score": 0.6186197, "raw_content": null}]}',
                    tool_name='brave_search',
                    metadata=None
                )
            ],
            turn_id='2c6b5273-4b16-404f-bed2-c0025fd63b45',
            completed_at=datetime.datetime(2025, 2, 27, 11, 2, 46, 272176, tzinfo=TzInfo(-08:00)),
            started_at=datetime.datetime(2025, 2, 27, 11, 2, 44, 640743, tzinfo=TzInfo(-08:00))
        ),
        InferenceStep(
            api_model_response=CompletionMessage(
                content="The latest news about Llama Stack is that Meta has released Llama 3.2, which includes small and medium-sized vision LLMs (11B and 90B) and lightweight, text-only models (1B and 3B) that fit onto select edge and mobile devices. Additionally, Llama Stack distributions have been released to simplify the way developers work with Llama models in different environments. However, a critical vulnerability has been discovered in Meta's Llama-Stack, which puts AI applications at risk.",
                role='assistant',
                stop_reason='end_of_turn',
                tool_calls=[]
            ),
            step_id='37994419-5da3-4e84-a010-8d9b85366262',
            step_type='inference',
            turn_id='2c6b5273-4b16-404f-bed2-c0025fd63b45',
            completed_at=datetime.datetime(2025, 2, 27, 11, 2, 48, 961275, tzinfo=TzInfo(-08:00)),
            started_at=datetime.datetime(2025, 2, 27, 11, 2, 46, 273168, tzinfo=TzInfo(-08:00))
        )
    ],
    turn_id='2c6b5273-4b16-404f-bed2-c0025fd63b45',
    completed_at=datetime.datetime(2025, 2, 27, 11, 2, 48, 962318, tzinfo=TzInfo(-08:00)),
    output_attachments=[]
)
```

## Check for Telemetry

```python
import json

# Collect the input/output attributes of every span recorded for this session.
agent_logs = []
for span in client.telemetry.query_spans(
    attribute_filters=[
        {"key": "session_id", "op": "eq", "value": session_id},
    ],
    attributes_to_return=['input', 'output'],
):
    agent_logs.append(span.attributes)

# The output attribute is now a JSON string, so it can be parsed directly.
pprint(json.loads(agent_logs[-1]['output']))
```

```
{
    'content': "The latest news about Llama Stack is that Meta has released Llama 3.2, which includes small and medium-sized vision LLMs (11B and 90B) and lightweight, text-only models (1B and 3B) that fit onto select edge and mobile devices. Additionally, Llama Stack distributions have been released to simplify the way developers work with Llama models in different environments. However, a critical vulnerability has been discovered in Meta's Llama-Stack, which puts AI applications at risk.",
    'tool_calls': []
}
```

2025-02-27 23:06:37 -08:00
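The difference the commit message describes, a `str(message)` output versus a structured one, can be illustrated with a minimal standard-library sketch. The `ToolCall` and `CompletionMessage` dataclasses and both `serialize_*` helpers below are hypothetical stand-ins written for this illustration; they are not the llama-stack types or the code changed in the commit.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ToolCall:
    # Hypothetical stand-in, not the llama-stack ToolCall type.
    call_id: str
    tool_name: str
    arguments: dict


@dataclass
class CompletionMessage:
    # Hypothetical stand-in, not the llama-stack CompletionMessage type.
    content: str
    tool_calls: list = field(default_factory=list)


def serialize_unstructured(message: CompletionMessage) -> str:
    # Free-form string in the spirit of the old output: easy to print, hard to parse.
    return f"content: {message.content} tool_calls: {message.tool_calls}"


def serialize_structured(message: CompletionMessage) -> str:
    # JSON string: a downstream consumer can load it back into dicts and lists.
    return json.dumps(asdict(message))


message = CompletionMessage(
    content="",
    tool_calls=[ToolCall("call-1", "brave_search", {"query": "latest news llama stack"})],
)

structured = serialize_structured(message)
recovered = json.loads(structured)
print(recovered["tool_calls"][0]["arguments"]["query"])

try:
    json.loads(serialize_unstructured(message))
except json.JSONDecodeError:
    print("unstructured output cannot be parsed as JSON")
```

With the structured form, a downstream task such as building an eval dataset can index directly into `recovered` instead of pattern-matching on a printed representation.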
..
apis	ci: add mypy for static type checking (#1101)	2025-02-21 13:15:40 -08:00
cli	fix: Incorrect import path for print_subcommand_description() (#1315)	2025-02-27 18:50:41 -08:00
distribution	fix: ensure ollama embedding model is registered properly in the template	2025-02-27 22:49:06 -08:00
models/llama	feat: update the default system prompt for 3.2/3.3 models (#1310)	2025-02-27 23:05:42 -08:00
providers	fix: Agent telemetry inputs/outputs should be structured (#1302)	2025-02-27 23:06:37 -08:00
scripts	ci: add mypy for static type checking (#1101)	2025-02-21 13:15:40 -08:00
strong_typing	Ensure that deprecations for fields follow through to OpenAPI	2025-02-19 13:54:04 -08:00
templates	fix: ensure ollama embedding model is registered properly in the template	2025-02-27 22:49:06 -08:00
__init__.py	export LibraryClient	2024-12-13 12:08:00 -08:00
schema_utils.py	ci: add mypy for static type checking (#1101)	2025-02-21 13:15:40 -08:00