llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-08-13 05:17:26 +00:00

Author	SHA1	Message	Date
Xi Yan	b1b45ed320	add comment	2025-02-20 22:46:17 -08:00
Xi Yan	fa4a56cf6c	refactor	2025-02-20 22:41:23 -08:00
Xi Yan	2c06704d63	refactor	2025-02-20 22:40:51 -08:00
Xi Yan	99bc54b033	fix duplicate tool msg	2025-02-20 22:16:37 -08:00
Xi Yan	0de38a2b48	Merge branch 'agents-unify-tools-2' into agents-unify-tools-3	2025-02-20 21:49:56 -08:00
Xi Yan	e2bfd165d2	add flag allow_turn_resume	2025-02-20 21:49:06 -08:00
Xi Yan	9c40529e93	fix tool execution step from tool response	2025-02-20 21:36:50 -08:00
Xi Yan	97f9580b1a	rename	2025-02-20 19:49:50 -08:00
Xi Yan	9a07e709ee	rename	2025-02-20 19:48:54 -08:00
Xi Yan	6d08a935ba	merge	2025-02-20 19:48:01 -08:00
Xi Yan	702e74da8e	Merge branch 'agents-unify-tools' into agents-unify-tools-2	2025-02-20 19:46:59 -08:00
Xi Yan	fa0dfdeac2	resume request	2025-02-20 19:46:43 -08:00
Xi Yan	9f2f6c9b30	Merge branch 'agents-unify-tools-2' into agents-unify-tools-3	2025-02-20 19:45:01 -08:00
Xi Yan	b14854943f	Merge branch 'agents-unify-tools' into agents-unify-tools-2	2025-02-20 19:43:52 -08:00
Xi Yan	122b20c142	continue to resume	2025-02-20 19:42:56 -08:00
Xi Yan	5e00e9f260	persist pending tool execution	2025-02-20 19:33:21 -08:00
Xi Yan	4923270122	continue turn	2025-02-20 18:00:57 -08:00
Xi Yan	22355e3b1f	add back 2/n	2025-02-20 17:53:29 -08:00
Xi Yan	157cf320d9	add back 2/n	2025-02-20 17:52:01 -08:00
Xi Yan	ee3c174bb3	add back 2/n	2025-02-20 17:40:39 -08:00
Xi Yan	cd36a77e20	3/n	2025-02-20 17:38:21 -08:00
Xi Yan	01f90dfe0c	Merge branch 'agents-unify-tools-2' into agents-unify-tools-3	2025-02-20 17:27:27 -08:00
Xi Yan	7677f01beb	Merge branch 'agents-unify-tools' into agents-unify-tools-2	2025-02-20 17:27:11 -08:00
Xi Yan	c7e84253e7	Merge branch 'agents-unify-tools' into agents-unify-tools-3	2025-02-20 17:26:58 -08:00
Xi Yan	8fe38d128d	streaming flag	2025-02-20 16:58:45 -08:00
Xi Yan	5fbb159cf6	fix test	2025-02-20 16:48:17 -08:00
Xi Yan	96c521ada6	temp debug	2025-02-20 16:30:27 -08:00
Xi Yan	fb0d992f99	temp debuug	2025-02-20 15:48:55 -08:00
Xi Yan	4dbe3fd9e6	Merge branch 'agents-unify-tools' into agents-unify-tools-2	2025-02-20 15:29:11 -08:00
Xi Yan	5644d10c82	remove usermessages	2025-02-20 15:26:37 -08:00
Xi Yan	beea9ac133	Merge branch 'agents-unify-tools' into agents-unify-tools-2	2025-02-20 15:07:50 -08:00
Xi Yan	afee71604f	api	2025-02-20 15:07:18 -08:00
Xi Yan	07c9222b6f	debug	2025-02-20 14:54:45 -08:00
Xi Yan	5eea2bc44d	Merge branch 'agents-unify-tools' into agents-unify-tools-2	2025-02-20 14:41:47 -08:00
Xi Yan	57ca2c6365	Merge branch 'main' into agents-unify-tools	2025-02-20 14:41:29 -08:00
Ashwin Bharambe	736560ceba	Remove os.getenv() from ollama config	2025-02-20 14:30:32 -08:00
Xi Yan	7b0ff5718e	Merge branch 'agents-unify-tools' into agents-unify-tools-2	2025-02-20 14:19:52 -08:00
Xi Yan	9c9a607b41	merge	2025-02-20 14:17:31 -08:00
LESSuseLESS	2cbe9395b0	feat: D69478008 [llama-stack] turning tests into data-driven (#1180) # What does this PR do? We have several places running tests for different purposes. - oss llama stack - provider tests - e2e tests - provider llama stack - unit tests - e2e tests It would be nice if they could share the same set of test data, so we maintain consistency between spec and implementation. This is what this diff is about: isolating test data from test code, so that we can reuse the same data in different places by writing different test code. ## Test Plan == Set up Ollama local server == Run a provider test conda activate stack OLLAMA_URL="http://localhost:8321" \ pytest -v -s -k "ollama" --inference-model="llama3.2:3b-instruct-fp16" \ llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_completion_structured_output // test_structured_output should also work == Run an e2e test conda activate sherpa with-proxy pip install llama-stack export INFERENCE_MODEL=llama3.2:3b-instruct-fp16 export LLAMA_STACK_PORT=8322 with-proxy llama stack build --template ollama with-proxy llama stack run --env OLLAMA_URL=http://localhost:8321 ollama - Run test client, LLAMA_STACK_PORT=8322 LLAMA_STACK_BASE_URL="http://localhost:8322" \ pytest -v -s --inference-model="llama3.2:3b-instruct-fp16" \ tests/client-sdk/inference/test_text_inference.py::test_text_completion_structured_output // test_text_chat_completion_structured_output should also work ## Notes - This PR was automatically generated by oss_sync - Please refer to D69478008 for more details.	2025-02-20 14:13:06 -08:00
ehhuang	1166afdf76	fix: some telemetry APIs don't currently work (#1188) Summary: This bug is surfaced by using the HTTP LS client. The issue is that non-scalar values in a 'GET' method are `body` params in FastAPI, but our spec generation script doesn't respect that. We fix it by making them POST methods instead. Test Plan: Test API call with newly sync'd client (https://github.com/meta-llama/llama-stack-client-python/pull/149)	2025-02-20 14:09:25 -08:00
Xi Yan	ea1faae50e	chore!: deprecate eval/tasks (#1186) # What does this PR do? - Fully deprecate eval/tasks Closes #1088 NOTE: this will be a breaking change. We have introduced the new API in 0.1.3. Notebook has been updated to use the new endpoints. ## Test Plan ``` pytest -v -s --nbval-lax ./docs/notebooks/Llama_Stack_Benchmark_Evals.ipynb ``` cc @SLR722 for awareness	2025-02-20 14:06:21 -08:00
Xi Yan	7676756778	merge	2025-02-20 14:04:25 -08:00
Ashwin Bharambe	07ccf908f7	ModelAlias -> ProviderModelEntry	2025-02-20 14:02:36 -08:00
Xi Yan	a44d230676	rename	2025-02-20 14:02:17 -08:00
Xi Yan	ff87677102	rename	2025-02-20 14:00:08 -08:00
Xi Yan	82109749ea	rename	2025-02-20 13:56:47 -08:00
Kevin Cogan	561295af76	docs: Fix Links, Add Podman Instructions, Vector DB Unregister, and Example Script (#1129) # What does this PR do? This PR improves the documentation in several ways: - Fixed an incorrect link in `tools.md` to ensure all references point to the correct resources. - Added instructions for running the `code-interpreter` agent in a Podman container, helping users configure and execute the tool in containerized environments. - Introduced an unregister command for single and multiple vector databases, making it easier to manage vector DBs. - Provided a simple example script for using the `code-interpreter` agent, giving users a practical reference for implementation. These updates enhance the clarity, usability, and completeness of the documentation. ## Test Plan The following steps were performed to verify the accuracy of the changes: 1. Validated all fixed links by checking their destinations to ensure correctness. 2. Ran the `code-interpreter` agent in a Podman container following the new instructions to confirm functionality. 3. Executed the vector database unregister commands and verified that both single and multiple databases were correctly removed. 4. Tested the new example script for `code-interpreter`, ensuring it runs without errors. All changes were reviewed and tested successfully, improving the documentation's accuracy and ease of use.	2025-02-20 13:52:14 -08:00
Xi Yan	7a111e39f6	Merge branch 'main' into agents-unify-tools	2025-02-20 13:51:32 -08:00
Vladimir Ivić	f7161611c6	feat: adding endpoints for files and uploads (#1070) Summary: Adds spec definitions for file upload operations. This API centers on two high-level operations: * Initiating and managing an upload session * Accessing uploaded file information Usage examples: To start a file upload session: ``` curl -X POST https://localhost:8321/v1/files \ -d '{ "key": "image123.jpg", "bucket": "images", "mime_type": "image/jpg", "size": 12345 }' # Returns { "id": <session_id>, "url": "https://localhost:8321/v1/files/session:<session_id>", "offset": 0, "size": 12345 } ``` To upload file content to an existing session ``` curl -i -X POST "https://localhost:8321/v1/files/session:<session_id>" \ --data-binary @<path_to_local_file> # Returns { "key": "image123.jpg", "bucket": "images", "mime_type": "image/jpg", "bytes": 12345, "created_at": 1737492240 } # Implementing on server side (Flask example for simplicity): @app.route('/uploads/<upload_id>', methods=['POST']) def upload_content_to_session(upload_id): try: # Get the binary file data from the request body file_data = request.data # Save the file to disk save_path = f"./uploads/{upload_id}" with open(save_path, 'wb') as f: f.write(file_data) return {__uploaded_file_json__}, 200 except Exception as e: return "", 500 ``` To read information about an existing upload session ``` curl -i -X GET "https://localhost:8321/v1/files/session:<session_id>" # Returns { "id": <session_id>, "url": "https://localhost:8321/v1/files/session:<session_id>", "offset": 1024, "size": 12345 } ``` To list buckets ``` GET /files # Returns { "data": [ {"name": "bucket1"}, {"name": "bucket2"} ] } ``` To list all files in a bucket ``` GET /files/{bucket} # Returns { "data": [ { "key": "shiba.jpg", "bucket": "dogs", "mime_type": "image/jpg", "bytes": 82334, "created_at": 1737492240 }, { "key": "persian_cat.jpg", "mime_type": "image/jpg", "bucket": "cats", "bytes": 39924, "created_at": 1727493440 } ] } ``` To get specific file info ``` GET /files/{bucket}/{key} # Returns { "key": "shiba.jpg", "bucket": "dogs", "mime_type": "image/jpg", "bytes": 82334, "created_at": 1737492240 } ``` To delete specific file ``` DELETE /files/{bucket}/{key} # Returns { "key": "shiba.jpg", "bucket": "dogs", "mime_type": "image/jpg", "bytes": 82334, "created_at": 1737492240 } ```	2025-02-20 13:09:00 -08:00
Xi Yan	7dae81cb68	tmp	2025-02-20 12:57:18 -08:00
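
Note on commit f7161611c6 (files and uploads): the upload-session flow in that commit message can be sketched in Python as below. This is only an illustration built from the JSON shapes quoted in the message. The helper names `build_upload_session_request` and `remaining_bytes` are hypothetical and are not part of llama-stack, and reading `offset` as "bytes received so far" is an assumption, since the message shows the field without defining it.

```python
import json


def build_upload_session_request(key, bucket, mime_type, size):
    """Build the JSON body for POST /v1/files, using the fields shown in the commit message.

    Hypothetical helper for illustration; not a llama-stack API.
    """
    if size < 0:
        raise ValueError("size must be non-negative")
    return json.dumps(
        {"key": key, "bucket": bucket, "mime_type": mime_type, "size": size}
    )


def remaining_bytes(session):
    """Return how many bytes are still to be uploaded for a session-info dict.

    Assumes 'offset' counts bytes already received (an assumption, see above).
    """
    return session["size"] - session["offset"]


# Values taken from the examples quoted in the commit message.
body = build_upload_session_request("image123.jpg", "images", "image/jpg", 12345)
session_info = {"offset": 1024, "size": 12345}
print(body)
print(remaining_bytes(session_info))
```

Keeping the payload construction separate from any HTTP call makes the request shape easy to check without a running server.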

1252 commits