llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-31 13:49:59 +00:00

Author	SHA1	Message	Date
Ashwin Bharambe	cfaf9e0e8b	revert some unintentional changes by copying source of truth to llama-models	2025-04-07 11:01:24 -07:00
Ashwin Bharambe	53a8086e37	several fixes	2025-04-07 10:31:20 -07:00
Ashwin Bharambe	e2e2820c9a	refactor: move all llama code to models/llama out of meta reference	2025-04-06 22:42:32 -07:00
Hardik Shah	28e262ecdc	feat: make multi-turn tool call tests work with llama4 (#1886) Running full Tool Calling required some updates to work e2e. - Remove `python_start` and `python_end` tags - Tool Call messages and Tool Response messages should end with `<|eom|>` - System prompt needed updates ``` You are a helpful assisant who can can answer general questions or invoke tools when necessary. In addition to tool calls, you should also augment your responses by using the tool outputs. ``` ### Test Plan - Start server with meta-reference ``` LLAMA_STACK_DISABLE_VERSION_CHECK=1 LLAMA_MODELS_DEBUG=1 INFERENCE_MODEL=meta-llama/$MODEL llama stack run meta-reference-gpu ``` - Added NEW tests with 5 test cases for multi-turn tool calls ``` pytest -s -v --stack-config http://localhost:8321 tests/integration/inference/test_text_inference.py --text-model meta-llama/Llama-4-Scout-17B-16E-Instruct ``` - Also verified all vision and agent tests pass	2025-04-06 19:14:21 -07:00
Ashwin Bharambe	3f92b2bf85	fix: kill the usage of python_start and python_end tokens	2025-04-05 19:00:26 -07:00
Ashwin Bharambe	b8f1561956	feat: introduce llama4 support (#1877) As title says. Details in README, elsewhere.	2025-04-05 11:53:35 -07:00

6 commits