PR #201 made several changes while trying to fix issues with the stream=False branches of the inference and agents APIs. As part of this, it made a change that was slightly gratuitous: making chat_completion() and its brethren "def" instead of "async def".
The rationale was that this allowed callers within llama-stack to use it as:
```
async for chunk in api.chat_completion(params)
```
However, it caused unnecessary confusion for several folks. Given that clients (e.g., llama-stack-apps) use the SDK methods (which are completely isolated) anyway, this choice was not ideal. Let's revert so the call now looks like:
```
async for chunk in await api.chat_completion(params)
```
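For illustration, here is a minimal sketch of the shape this reverts to. `InferenceApi`, `_stream`, and the params dict are hypothetical stand-ins, not the actual llama-stack interfaces; the point is only that with chat_completion() back to "async def", the caller first awaits the coroutine and then iterates the async generator it returns:
```
# Minimal sketch (hypothetical names, not the real llama-stack classes):
# chat_completion() is "async def", so calling it returns a coroutine that
# must be awaited; the awaited result is an async iterator of streamed chunks.
import asyncio
from typing import AsyncIterator


class InferenceApi:
    async def chat_completion(self, params: dict) -> AsyncIterator[str]:
        async def _stream() -> AsyncIterator[str]:
            for token in ["Hello", ", ", "world"]:
                await asyncio.sleep(0)  # stand-in for real model/server latency
                yield token

        # awaiting chat_completion() hands back the async generator to iterate
        return _stream()


async def main() -> None:
    api = InferenceApi()
    # note the extra `await`: first await the coroutine, then iterate the stream
    async for chunk in await api.chat_completion({"stream": True}):
        print(chunk, end="")
    print()


asyncio.run(main())
```
With this shape, forgetting the inner `await` fails loudly (you cannot `async for` over a coroutine object), which is arguably less confusing than a plain "def" that quietly returns a generator.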
Bonus: Added a completion() implementation for the meta-reference provider. Technically should have been another PR :)
* tgi docker compose
* path
* wait for tgi server to start before starting server (see the sketch after this list)
* update provider-id
* move scripts to distribution/ folder
* add readme
* readme
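As a rough illustration of the "wait for tgi server to start" step above, here is a Python sketch that polls a TGI-style health endpoint before letting the rest of the stack come up. The host, port, endpoint path, and timeout values are assumptions for illustration, not the actual script added in this PR:
```
# Hypothetical readiness check: poll the inference server's health endpoint
# until it responds, then proceed. URL, path, and timeouts are illustrative.
import time
import urllib.request

TGI_HEALTH_URL = "http://localhost:8080/health"  # assumed host/port and path


def wait_for_tgi(url: str = TGI_HEALTH_URL, timeout_s: float = 120.0) -> None:
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    print("TGI server is up")
                    return
        except OSError:
            pass  # server not accepting connections (or not healthy) yet
        time.sleep(2)
    raise TimeoutError(f"TGI server did not become healthy within {timeout_s}s")


if __name__ == "__main__":
    wait_for_tgi()
```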