llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-31 13:10:01 +00:00

Author	SHA1	Message	Date
Xi Yan	cf24e9073f	scoring function back	2025-03-13 15:32:11 -07:00
Xi Yan	0c37951395	Merge branch 'pr1573' into api_2	2025-03-13 14:49:04 -07:00
Xi Yan	a6095820af	docs	2025-03-13 14:48:11 -07:00
Xi Yan	025d173606	Merge branch 'pr1573' into api_2	2025-03-13 11:05:16 -07:00
Xi Yan	78ec3d98f6	Merge branch 'main' into pr1573	2025-03-13 11:05:04 -07:00
Xi Yan	c87b7006fc	docs	2025-03-13 00:03:06 -07:00
Xi Yan	10f6528164	scoring dataset schemas	2025-03-12 23:56:19 -07:00
Xi Yan	c5f2861a7e	Merge branch 'pr1573' into api_2	2025-03-12 23:51:04 -07:00
Xi Yan	8b80a77fae	docs	2025-03-12 23:50:52 -07:00
Xi Yan	ce0784be0c	Merge branch 'pr1573' into api_2	2025-03-12 23:44:34 -07:00
Xi Yan	8a6fa41a93	more purposes	2025-03-12 23:44:18 -07:00
Xi Yan	b328db4f60	do	2025-03-12 23:41:25 -07:00
Xi Yan	f90dcd2a69	Merge branch 'pr1573' into api_2	2025-03-12 23:36:03 -07:00
Xi Yan	0df33049e3	update doc	2025-03-12 23:32:54 -07:00
Xi Yan	b4d118fc5c	update doc	2025-03-12 23:30:47 -07:00
Xi Yan	4f6f0f6a91	update doc	2025-03-12 23:27:01 -07:00
Xi Yan	25710c3b8a	scoring updates	2025-03-12 21:58:49 -07:00
ehhuang	0a0d6cb96e	fix: openapi spec gen (#1602) Summary: Test Plan: sh docs/openapi_generator/run_openapi_generator.sh	2025-03-12 21:55:05 -07:00
Xi Yan	3a87562e8d	scoring updates	2025-03-12 21:54:12 -07:00
Xi Yan	7b50fdb2b1	Merge branch 'pr1573' into api_2	2025-03-12 21:42:00 -07:00
Xi Yan	4cc1958af9	huggingface obey consistency	2025-03-12 21:37:13 -07:00
Xi Yan	a7abe6df74	better params fields	2025-03-12 21:31:22 -07:00
Xi Yan	93c131ed5f	purpose	2025-03-12 21:23:35 -07:00
Xi Yan	d7dbc8cf64	Merge branch 'pr1573' into api_2	2025-03-12 21:02:30 -07:00
ehhuang	a505bf45a3	feat(api): remove tool_name from ToolResponseMessage (#1599) Summary: This is not used anywhere. closes #1421 Test Plan: LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/integration/agents/test_agents.py --safety-shield meta-llama/Llama-Guard-3-8B --text-model meta-llama/Llama-3.1-8B-Instruct --record-responses	2025-03-12 19:41:48 -07:00
Xi Yan	09039eca57	source	2025-03-12 18:52:05 -07:00
Xi Yan	a3173e8284	update	2025-03-12 18:46:40 -07:00
Xi Yan	8942071b3b	Merge branch 'main' into pr1573	2025-03-12 18:23:39 -07:00
Dinesh Yeduguru	99bbe0e70b	feat: Add new compact MetricInResponse type (#1593) # What does this PR do? This change adds a compact type to include metrics in response as opposed to the full MetricEvent which is relevant for internal logging purposes. ## Test Plan ``` LLAMA_STACK_CONFIG=~/.llama/distributions/fireworks/fireworks-run.yaml pytest -s -v agents/test_agents.py --safety-shield meta-llama/Llama-Guard-3-8B --text-model meta-llama/Llama-3.1-8B-Instruct llama stack run ~/.llama/distributions/fireworks/fireworks-run.yaml curl --request POST \ --url http://localhost:8321/v1/inference/chat-completion \ --header 'content-type: application/json' \ --data '{ "model_id": "meta-llama/Llama-3.1-70B-Instruct", "messages": [ { "role": "user", "content": { "type": "text", "text": "where do humans live" } } ], "stream": false }' { "metrics": [ { "metric": "prompt_tokens", "value": 10, "unit": null }, { "metric": "completion_tokens", "value": 522, "unit": null }, { "metric": "total_tokens", "value": 532, "unit": null } ], "completion_message": { "role": "assistant", "content": "Humans live in various parts of the world...............", "stop_reason": "out_of_tokens", "tool_calls": [] }, "logprobs": null } ```	2025-03-12 15:45:44 -07:00
ehhuang	b7a9c45477	chore: deprecate ToolResponseMessage in agent.resume API (#1566) # Summary: closes #1431 # Test Plan: LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/integration/agents/test_agents.py --safety-shield meta-llama/Llama-Guard-3-8B --text-model meta-llama/Llama-3.1-8B-Instruct	2025-03-12 12:10:21 -07:00
Xi Yan	b4d868a1e5	include benchmarks	2025-03-12 00:43:24 -07:00
Xi Yan	124040af77	params -> fn	2025-03-12 00:20:41 -07:00
Xi Yan	bb86aaf787	update	2025-03-12 00:19:48 -07:00
Xi Yan	af4216f34f	Merge branch 'pr1573' into api_2	2025-03-12 00:19:25 -07:00
Xi Yan	1d80ec7f81	upgrade doc	2025-03-12 00:17:58 -07:00
Xi Yan	0abedd070c	comment	2025-03-12 00:13:27 -07:00
Xi Yan	bec5a46915	single type	2025-03-11 23:20:16 -07:00
Xi Yan	58d9cb1276	docs	2025-03-11 22:46:52 -07:00
Xi Yan	f9ea90c4f7	docs	2025-03-11 22:45:48 -07:00
Xi Yan	e477164448	remove json_schema_type decorator	2025-03-11 22:08:30 -07:00
Xi Yan	98dfc99584	docs	2025-03-11 22:06:55 -07:00
Xi Yan	2bb6ca818a	scoring api update	2025-03-11 21:53:47 -07:00
Xi Yan	0e47c65051	update	2025-03-11 18:29:55 -07:00
Xi Yan	0e8a53ab69	openapi	2025-03-11 15:03:48 -07:00
Sébastien Han	83a2c78615	feat(api): list agents / sessions and get agent (#1410) # What does this PR do? Add support for listing agents, describing an agent, and retrieving session IDs for a given agent. This is only the API definition; the implementations will come separately. Closes: https://github.com/meta-llama/llama-stack/issues/1294 Signed-off-by: Sébastien Han <seb@redhat.com>	2025-03-11 10:33:46 -07:00
ehhuang	6cf79437b3	feat: support ClientTool output metadata (#1426) # Summary: Client side change in https://github.com/meta-llama/llama-stack-client-python/pull/180 Changes the resume_turn API to accept `ToolResponse` instead of `ToolResponseMessage`: 1. `ToolResponse` contains `metadata` 2. `ToolResponseMessage` is a concept for model inputs. Here we are just submitting the outputs of tool execution. # Test Plan: Ran integration tests with newly added test using client tool with metadata LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/integration/agents/test_agents.py --safety-shield meta-llama/Llama-Guard-3-8B --record-responses	2025-03-05 14:30:27 -08:00
Xi Yan	3d9331840e	docs: api documentation for agents/eval/scoring/datasets (#1400) # What does this PR do? - add some docs to OpenAPI for agents/eval/scoring/datasetio ## Test Plan - read	2025-03-05 09:40:24 -08:00
Xi Yan	e9a37bad63	chore: rename task_config to benchmark_config (#1397) # What does this PR do? - This was missed from previous deprecation: https://github.com/meta-llama/llama-stack/pull/1186 - Part of https://github.com/meta-llama/llama-stack/issues/1396 ## Test Plan ``` pytest -v -s --nbval-lax ./llama-stack/docs/notebooks/Llama_Stack_Benchmark_Evals.ipynb ```	2025-03-04 12:44:04 -08:00
Xi Yan	158b6dc404	chore: deprecate allow_turn_resume (#1377) # What does this PR do? - Deprecate allow_turn_resume flag as this is used for staying backward compat. - Closes https://github.com/meta-llama/llama-stack/issues/1363 ## Test Plan ``` LLAMA_STACK_CONFIG=fireworks pytest -v tests/api/agents/test_agents.py --inference-model "meta-llama/Llama-3.3-70B-Instruct" --record-responses ```	2025-03-04 12:22:11 -08:00
Ashwin Bharambe	5547ef953c	feat: enhance OpenAPI spec to include Error types (#1320) # What does this PR do? An API spec must talk about Error handling. This was a pretty glaring omission so far. This PR begins to address it by adding a set of standard error responses we can attach to all our API calls. At a future point, we can add specific error types where necessary (although we should not hurry to do that; it is best done very late.) ## Test Plan Checked that Stainless SDK generation succeeds.	2025-02-28 11:16:12 -08:00


68 commits