Commit graph

807 commits

Author SHA1 Message Date
Xi Yan
a6091fa158 server 2024-12-26 18:35:06 -08:00
Xi Yan
74de9bebd1 registry 2024-12-26 18:34:00 -08:00
Xi Yan
27da763af9 more fixes 2024-12-26 18:30:42 -08:00
Xi Yan
6596caed55 vllm 2024-12-26 18:25:28 -08:00
Xi Yan
206554e853 stack imports 2024-12-26 18:23:40 -08:00
Xi Yan
3c84f491ec imports 2024-12-26 18:21:53 -08:00
Xi Yan
7c12cda244 llama guard 2024-12-26 18:18:01 -08:00
Xi Yan
f58e92f8d3 prompt guard 2024-12-26 18:15:55 -08:00
Xi Yan
61be406b49 scoring 2024-12-26 18:14:53 -08:00
Xi Yan
fcac7cfafa braintrust 2024-12-26 18:13:43 -08:00
Xi Yan
71d50ab368 telemetry & sample 2024-12-26 18:12:51 -08:00
Xi Yan
c4b9b3cb52 huggingface 2024-12-26 18:11:10 -08:00
Xi Yan
d40e527471 bedrock 2024-12-26 18:10:23 -08:00
Xi Yan
28428c320a databricks 2024-12-26 18:08:50 -08:00
Xi Yan
6f7f02fbad fireworks 2024-12-26 18:08:08 -08:00
Xi Yan
f97638a323 ollama import remove 2024-12-26 18:07:18 -08:00
Xi Yan
165777a181 impls imports remove 2024-12-26 18:05:19 -08:00
Xi Yan
b641902bfa impls imports remove 2024-12-26 18:01:45 -08:00
Xi Yan
c1ef055f39 test prompt adapter 2024-12-26 17:49:17 -08:00
Xi Yan
2fe4acd64d text inference 2024-12-26 17:45:25 -08:00
Xi Yan
16cfe1014e vision inference 2024-12-26 17:31:42 -08:00
Xi Yan
3b1f20ac00 memory tests fix 2024-12-26 17:27:01 -08:00
Xi Yan
3f86c19150 builds 2024-12-26 17:21:23 -08:00
Xi Yan
8a8550fe9b cli imports 2024-12-26 17:19:40 -08:00
Xi Yan
21a6bd57ea fix imports 2024-12-26 17:17:03 -08:00
Xi Yan
c6d3fc6fb6 datatypes 2024-12-26 17:00:56 -08:00
Xi Yan
6c6b5fb091 openai_compat 2024-12-26 16:59:06 -08:00
Xi Yan
9ab0730294 kvstore 2024-12-26 16:55:40 -08:00
Xi Yan
30fee82407 vector_store 2024-12-26 16:54:33 -08:00
Xi Yan
b7bc1c6297 telemetry 2024-12-26 16:48:54 -08:00
Xi Yan
bb0a3f5c8e remove more imports 2024-12-26 16:43:30 -08:00
Xi Yan
93ed8aa814 remove more imports 2024-12-26 16:39:31 -08:00
Xi Yan
0a0c01fbc2 test agents imports 2024-12-26 16:32:23 -08:00
Xi Yan
9bdb7236b2 Merge branch 'main' into remove_import_stars 2024-12-26 15:50:12 -08:00
Xi Yan
88c967a3e2 fix client-sdk memory/safety test 2024-12-26 15:49:15 -08:00
Xi Yan
b05d8fd956 fix client-sdk agents/inference test 2024-12-26 15:49:14 -08:00
Xi Yan
19c99e36a0 update playground doc video 2024-12-26 15:49:14 -08:00
Xi Yan
70db039ff4 fix client-sdk memory/safety test 2024-12-26 15:48:28 -08:00
Xi Yan
b6aca4c8bb fix client-sdk agents/inference test 2024-12-26 15:44:34 -08:00
Xi Yan
da26d22f90 remove imports 1/n 2024-12-26 15:19:06 -08:00
Xi Yan
4e1d0a2fc5 update playground doc video 2024-12-26 14:50:19 -08:00
Xi Yan
28ce511986 fix --endpoint docs 2024-12-26 14:32:07 -08:00
Ikko Eltociear Ashimine
7ba95a8e74
docs: update evals_reference/index.md (#675)
# What does this PR do?

minor fix

## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2024-12-26 11:32:37 -08:00
Aidan Do
21fb92d7cf
Add 3.3 70B to Ollama inference provider (#681)
# What does this PR do?

Adds Llama 3.3 70B support to the Ollama inference provider

## Test Plan

<details>
<summary>Manual</summary>

```bash
# 42GB to download
ollama pull llama3.3:70b

ollama run llama3.3:70b --keepalive 60m

export LLAMA_STACK_PORT=5000
pip install -e . \
  && llama stack build --template ollama --image-type conda \
  && llama stack run ./distributions/ollama/run.yaml \
  --port $LLAMA_STACK_PORT \
  --env INFERENCE_MODEL=Llama3.3-70B-Instruct \
  --env OLLAMA_URL=http://localhost:11434

export LLAMA_STACK_PORT=5000
llama-stack-client --endpoint http://localhost:$LLAMA_STACK_PORT \
  inference chat-completion \
  --model-id Llama3.3-70B-Instruct \
  --message "hello, what model are you?"
```

<img width="1221" alt="image"
src="https://github.com/user-attachments/assets/dcffbdd9-94c8-4d47-9f95-4ef6c3756294"
/>

</details>
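For reference, the `llama-stack-client` CLI call in the test plan boils down to a single JSON request body. A minimal Python sketch of that body (the `/alpha/inference/chat-completion` path is an assumption based on the server's alpha routes; the field names mirror the CLI flags above):

```python
# Hypothetical sketch of the request the CLI sends; assumes the stack from
# the shell commands above is running on localhost:5000.
import json

payload = {
    "model_id": "Llama3.3-70B-Instruct",
    "messages": [{"role": "user", "content": "hello, what model are you?"}],
}
body = json.dumps(payload)

# To actually send it (requires a running stack):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:5000/alpha/inference/chat-completion",
#     data=body.encode(), headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read().decode())
print(body)
```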

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2024-12-25 22:15:58 -08:00
Yuan Tang
fa371fdc9e
Removed unnecessary CONDA_PREFIX env var in installation guide (#683)
This is not needed since `conda activate stack` has already been
executed.
2024-12-23 13:17:30 -08:00
Yuan Tang
987e651755
Add missing venv option in --image-type (#677)
"venv" option is supported but not mentioned in the prompt.

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2024-12-21 21:10:13 -08:00
Botao Chen
bae197c37e
Fix post training apis broken by torchtune release (#674)
There was a torchtune release this morning
(https://github.com/pytorch/torchtune/releases/tag/v0.5.0) that breaks the
post-training APIs.

## Test
Spun up the server; post training works again after the fix.
<img width="1314" alt="Screenshot 2024-12-20 at 4 08 54 PM"
src="https://github.com/user-attachments/assets/dfae724d-ebf0-4846-9715-096efa060cee"
/>


## Note
We need to think hard about how to avoid this happening again, and fast-follow
on this after the holidays.
2024-12-20 16:12:02 -08:00
Botao Chen
06cb0c837e
[torchtune integration] post training + eval (#670)
## What does this PR do?

- Add related APIs in the experimental-post-training template to enable eval
on the finetuned checkpoint in the template
- A small bug fix on meta reference eval
- A small error-handling improvement on post training


## Test Plan
From the client side, issued an E2E post-training request
(https://github.com/meta-llama/llama-stack-client-python/pull/70) and got
eval results successfully.

<img width="1315" alt="Screenshot 2024-12-20 at 12 06 59 PM"
src="https://github.com/user-attachments/assets/a09bd524-59ae-490c-908f-2e36ccf27c0a"
/>
2024-12-20 13:43:13 -08:00
Dinesh Yeduguru
c8be0bf1c9
Tools API with brave and MCP providers (#639)
This PR adds a new Tools API and two tool runtime providers: Brave Search
and MCP.

Test plan:
```
curl -X POST 'http://localhost:5000/alpha/toolgroups/register' \
-H 'Content-Type: application/json' \
-d '{ "tool_group_id": "simple_tool",
  "tool_group": {
    "type": "model_context_protocol",
    "endpoint": {"uri": "http://localhost:56000/sse"}
  },
  "provider_id": "model-context-protocol"
}'

 curl -X POST 'http://localhost:5000/alpha/toolgroups/register' \
-H 'Content-Type: application/json' \
-d '{
  "tool_group_id": "search", "provider_id": "brave-search",
  "tool_group": {
    "type": "user_defined",
    "tools": [
      {
        "name": "brave_search",
        "description": "A web search tool",
        "parameters": [
          {
            "name": "query",
            "parameter_type": "string",
            "description": "The query to search"
          }
        ],
        "metadata": {},
        "tool_prompt_format": "json"
      }
    ]
  }
}'

 curl -X GET http://localhost:5000/alpha/tools/list | jq .
[
  {
    "identifier": "brave_search",
    "provider_resource_id": "brave_search",
    "provider_id": "brave-search",
    "type": "tool",
    "tool_group": "search",
    "description": "A web search tool",
    "parameters": [
      {
        "name": "query",
        "parameter_type": "string",
        "description": "The query to search"
      }
    ],
    "metadata": {},
    "tool_prompt_format": "json"
  },
  {
    "identifier": "fetch",
    "provider_resource_id": "fetch",
    "provider_id": "model-context-protocol",
    "type": "tool",
    "tool_group": "simple_tool",
    "description": "Fetches a website and returns its content",
    "parameters": [
      {
        "name": "url",
        "parameter_type": "string",
        "description": "URL to fetch"
      }
    ],
    "metadata": {
      "endpoint": "http://localhost:56000/sse"
    },
    "tool_prompt_format": "json"
  }
]

curl -X POST 'http://localhost:5000/alpha/tool-runtime/invoke' \
-H 'Content-Type: application/json' \
-d '{
    "tool_name": "fetch",
    "args": {
        "url": "http://google.com/"
    }
}'

 curl -X POST 'http://localhost:5000/alpha/tool-runtime/invoke' \
-H 'Content-Type: application/json' -H 'X-LlamaStack-ProviderData: {"api_key": "<KEY>"}' \
-d '{
    "tool_name": "brave_search",
    "args": {
        "query": "who is meta ceo"
    }
}'
```
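The curl payloads above are the whole interface; building them programmatically is sometimes easier to read. A hypothetical Python sketch of the `brave_search` toolgroup registration body, using only the endpoint path and field names shown in the curl examples (everything else is an assumption):

```python
# Builds the same registration payload as the second curl command above.
import json

registration = {
    "tool_group_id": "search",
    "provider_id": "brave-search",
    "tool_group": {
        "type": "user_defined",
        "tools": [
            {
                "name": "brave_search",
                "description": "A web search tool",
                "parameters": [
                    {
                        "name": "query",
                        "parameter_type": "string",
                        "description": "The query to search",
                    }
                ],
                "metadata": {},
                "tool_prompt_format": "json",
            }
        ],
    },
}
body = json.dumps(registration)

# To send it (requires a running stack on localhost:5000):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:5000/alpha/toolgroups/register",
#     data=body.encode(), headers={"Content-Type": "application/json"})
# urllib.request.urlopen(req)
print(body)
```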
2024-12-19 21:25:17 -08:00
Aidan Do
17fdb47e5e
Add Llama 70B 3.3 to fireworks (#654)
# What does this PR do?

- Makes Llama 3.3 70B available on Fireworks

## Test Plan

```shell
pip install -e . \
&& llama stack build --config distributions/fireworks/build.yaml --image-type conda \
&& llama stack run distributions/fireworks/run.yaml \
  --port 5000
```

```python
# Assumes the stack built and run by the shell commands above is listening
# on port 5000.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5000")
response = client.inference.chat_completion(
    model_id="Llama3.3-70B-Instruct",
    messages=[
        {"role": "user", "content": "hello world"},
    ],
)
```

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2024-12-19 17:32:49 -08:00