llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-27 23:11:59 +00:00

Author	SHA1	Message	Date
Sébastien Han	b45cc42202	chore: remove usage of load_tiktoken_bpe The `load_tiktoken_bpe()` function depends on blobfile to load tokenizer.model files. However, blobfile brings in pycryptodomex, which is primarily used for JWT signing in GCP - functionality we don’t require, as we always load tokenizers from local files. pycryptodomex implements its own cryptographic primitives, which are known to be problematic and insecure. While blobfile could potentially switch to the more secure PyCA cryptography library, the project appears inactive, so this transition may not happen soon. Fortunately, `load_tiktoken_bpe()` is a simple function that just reads a BPE file and returns a dictionary mapping byte sequences to their mergeable ranks. It’s straightforward enough for us to implement ourselves. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-05-27 20:47:06 +02:00
ehhuang	ad86a68a32	feat: support '-' in tool names (#1807 ) # What does this PR do? titled ## Test Plan added new unit tests pytest -s -v tests/unit/models/llama/llama3/test_tool_utils.py	2025-04-12 14:23:03 -07:00
ehhuang	af8b4484a3	fix: update default tool call system prompt (#1712 ) # What does this PR do? closes #1584 This should be a rather innocuous change. ## Test Plan Verify that there's no more tool call parsing error for example in issue <img width="1216" alt="image" src="https://github.com/user-attachments/assets/a5a6f4e8-2093-4ca2-bc06-794b707a0429" /> LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/integration/agents/test_agents.py --safety-shield meta-llama/Llama-Guard-3-8B --text-model meta-llama/Llama-3.1-8B-Instruct	2025-03-19 22:49:24 -07:00
Hardik Shah	65ca85ba6b	fix: Updating `ToolCall.arguments` to allow for json strings that can be decoded on client side (#1685 ) ### What does this PR do? Currently, `ToolCall.arguments` is a `Dict[str, RecursiveType]`. However, on the client SDK side -- the `RecursiveType` gets deserialized into a number ( both int and float get collapsed ) and hence when params are `int` they get converted to float which might break client side tools that might be doing type checking. Closes: https://github.com/meta-llama/llama-stack/issues/1683 ### Test Plan Stainless changes -- https://github.com/meta-llama/llama-stack-client-python/pull/204 ``` pytest -s -v --stack-config=fireworks tests/integration/agents/test_agents.py --text-model meta-llama/Llama-3.1-8B-Instruct ```	2025-03-19 10:36:19 -07:00
Ihar Hrachyshka	a64021bb47	fix: Disable async loop warning messages during test run (#1526 ) # What does this PR do? The test class by default enables debug mode, which produces some unexpected warnings like: ``` tests/unit/models/test_prompt_adapter.py::PrepareMessagesTests::test_completion_message_encoding WARNING 2025-03-10 20:41:48,577 asyncio:1904 uncategorized: Executing <Task pending name='Task-1' coro=<IsolatedAsyncioTestCase._asyncioLoopRunner() running at /home/ec2-user/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/unittest/async_case.py:95 > wait_for=<Future pending cb=[Task.task_wakeup()] created at /home/ec2-user/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/asyncio/base_events.py:42 9> created at /home/ec2-user/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/unittest/async_case.py:11 7> took 0.231 seconds PASSED ``` I suggest we disable these since they are not very useful and can confuse other developers. [//]: # (If resolving an issue, uncomment and update the line below) [//]: # (Closes #[issue-number]) ## Test Plan Run tests. The warnings are no longer seen. [//]: # (## Documentation) Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>	2025-03-10 15:29:08 -07:00
Ashwin Bharambe	4ca58eb987	refactor: tests/unittests -> tests/unit; tests/api -> tests/integration	2025-03-04 09:57:00 -08:00

6 commits