llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-03 09:53:45 +00:00

Author	SHA1	Message	Date
Ashwin Bharambe	ec4fc800cc	[API Updates] Model / shield / memory-bank routing + agent persistence + support for private headers (#92) This is yet another of those large PRs (hopefully we will have less and less of them as things mature fast). This one introduces substantial improvements and some simplifications to the stack. Most important bits: * Agents reference implementation now has support for session / turn persistence. The default implementation uses sqlite but there's also support for using Redis. * We have re-architected the structure of the Stack APIs to allow for more flexible routing. The motivating use cases are: - routing model A to ollama and model B to a remote provider like Together - routing shield A to local impl while shield B to a remote provider like Bedrock - routing a vector memory bank to Weaviate while routing a keyvalue memory bank to Redis * Support for provider specific parameters to be passed from the clients. A client can pass data using `x_llamastack_provider_data` parameter which can be type-checked and provided to the Adapter implementations.	2024-09-23 14:22:22 -07:00
Hardik Shah	8bf8c07eb3	Respect user sent instructions in agent config and add them to system prompt	2024-09-21 16:46:10 -07:00
Xi Yan	06abd7e6c8	update MemoryToolDefinition	2024-09-20 17:51:53 -07:00
Ashwin Bharambe	942cb87a3c	remove apis/stack.py	2024-09-20 09:37:08 -07:00
Hardik Shah	33db4d2e45	ignore config dir	2024-09-20 00:24:49 -07:00
Hardik Shah	7e9e6117e3	do not assume CONDA_PREFIX exists during configuration	2024-09-19 23:39:34 -07:00
Hardik Shah	8fa49593e0	Allow TGI adaptor to have non-standard llama model names (#84) Co-authored-by: Hardik Shah <hjshah@fb.com>	2024-09-19 21:42:15 -07:00
Hardik Shah	42d29f3a5a	Allow TGI adaptor to have non-standard llama model names	2024-09-19 21:37:02 -07:00
Xi Yan	59af1c8fec	fix memory url parsing (#81)	2024-09-19 13:35:03 -07:00
Ashwin Bharambe	132f9429b1	Add a test for CLI, but not fully done so disabled	2024-09-19 13:27:07 -07:00
Ashwin Bharambe	8b3ffa33de	Add another test case	2024-09-19 13:02:57 -07:00
Ashwin Bharambe	abb43936ab	Add a test runner and 2 very simple tests for agents	2024-09-19 12:22:48 -07:00
Xi Yan	543222ac39	update inference prompt msg	2024-09-19 12:03:24 -07:00
Xi Yan	a30b919ae1	update inference prompt msg	2024-09-19 12:03:24 -07:00
Ashwin Bharambe	9eb01dd664	Add DOCKER_BINARY / DOCKER_OPTS to all scripts	2024-09-19 10:26:41 -07:00
Xi Yan	ca4b87aa05	fix memory client	2024-09-19 09:29:40 -07:00
Xi Yan	6302a1ee90	fix prompt with name args (#80)	2024-09-18 23:48:31 -07:00
Ashwin Bharambe	c63d6cbd08	list(...keys()) so dict_keys does not show up	2024-09-18 23:24:07 -07:00
Xi Yan	880ed37026	Update cli_reference.md	2024-09-18 23:05:24 -07:00
Xi Yan	5c4a2dc0e1	Update getting_started.md	2024-09-18 23:03:14 -07:00
Ashwin Bharambe	f5eda1decf	Add default for max_seq_len	2024-09-18 21:59:10 -07:00
Ashwin Bharambe	9ab27e852b	Bug fixes for memory	2024-09-18 21:54:02 -07:00
Ashwin Bharambe	8cdc2f0cfb	No RunShieldRequest	2024-09-18 20:38:21 -07:00
Xi Yan	f3f5873e9e	regenerate openapi spec	2024-09-18 19:28:05 -07:00
Xi Yan	9f1be108ce	Bump version to 0.0.20	2024-09-18 19:06:07 -07:00
Xi Yan	455a6e4bb9	update MANIFEST	2024-09-18 18:58:50 -07:00
Ashwin Bharambe	dff9eab48f	Remove "APIs to serve" prompt	2024-09-18 18:26:26 -07:00
Xi Yan	f5d5e32d62	fix docker configure	2024-09-18 17:23:37 -07:00
Xi Yan	5ec64ac68c	moving rfc->docs	2024-09-18 16:54:24 -07:00
Xi Yan	2c1ad10710	move openapi from rfcs->docs	2024-09-18 16:09:17 -07:00
Xi Yan	21058be0c1	Bump version to 0.0.19	2024-09-18 15:48:38 -07:00
Xi Yan	45e20ff431	update getting started	2024-09-18 15:40:48 -07:00
Xi Yan	2f9e952813	update getting started guide	2024-09-18 15:35:54 -07:00
Hardik Shah	29ce73ff7a	update requirements, added prompt-toolkit	2024-09-18 15:21:45 -07:00
Xi Yan	1128f69674	CLI: add build templates support, move imports (#77) * list templates implementation * relative path * finalize templates * remove imports * remove templates from name, name templates * fix docker * fix docker	2024-09-18 14:25:53 -07:00
Xi Yan	6b21523c28	CLI - add back build wizard, configure with name instead of build.yaml (#74) * add back wizard for build * conda build path move * polish message * run with name only * prompt for build * improve comments * update msgs * add new lines * move build.yaml * address comments * validator for providers * move imports * Please enter -> enter * comments, get started guide * nits * fix cprint import * fix imports	2024-09-18 11:41:56 -07:00
Xi Yan	e6fdb9df29	fix context retriever (#75)	2024-09-18 08:24:36 -07:00
Ashwin Bharambe	055770a791	Stop asking for "apis to serve" as part of configure	2024-09-17 22:41:10 -07:00
Dalton Flanagan	eea0a83bd1	Update getting_started.md config is now a positional argument	2024-09-18 00:47:41 -04:00
Ashwin Bharambe	9fd431e710	make shield imports more lazy	2024-09-17 21:27:37 -07:00
Ashwin Bharambe	81ff7476d3	Bump version to 0.0.18	2024-09-17 20:08:04 -07:00
Ashwin Bharambe	3e27131a69	Don't import `pkg_resources` until you need it	2024-09-17 20:01:22 -07:00
Ashwin Bharambe	25adc83de8	Fix for safety	2024-09-17 19:56:58 -07:00
Ashwin Bharambe	9487ad8294	API Updates (#73) * API Keys passed from Client instead of distro configuration * delete distribution registry * Rename the "package" word away * Introduce a "Router" layer for providers Some providers need to be factorized and considered as thin routing layers on top of other providers. Consider two examples: - The inference API should be a routing layer over inference providers, routed using the "model" key - The memory banks API is another instance where various memory bank types will be provided by independent providers (e.g., a vector store is served by Chroma while a keyvalue memory can be served by Redis or PGVector) This commit introduces a generalized routing layer for this purpose. * update `apis_to_serve` * llama_toolchain -> llama_stack * Codemod from llama_toolchain -> llama_stack - added providers/registry - cleaned up api/ subdirectories and moved impls away - restructured api/api.py - from llama_stack.apis.<api> import foo should work now - update imports to do llama_stack.apis.<api> - update many other imports - added __init__, fixed some registry imports - updated registry imports - create_agentic_system -> create_agent - AgenticSystem -> Agent * Moved some stuff out of common/; re-generated OpenAPI spec * llama-toolchain -> llama-stack (hyphens) * add control plane API * add redis adapter + sqlite provider * move core -> distribution * Some more toolchain -> stack changes * small naming shenanigans * Removing custom tool and agent utilities and moving them client side * Move control plane to distribution server for now * Remove control plane from API list * no codeshield dependency randomly plzzzzz * Add "fire" as a dependency * add back event loggers * stack configure fixes * use brave instead of bing in the example client * add init file so it gets packaged * add init files so it gets packaged * Update MANIFEST * bug fix --------- Co-authored-by: Hardik Shah <hjshah@fb.com> Co-authored-by: Xi Yan <xiyan@meta.com> Co-authored-by: Ashwin Bharambe <ashwin@meta.com>	2024-09-17 19:51:35 -07:00
Xi Yan	f294eac5f5	Bump version to 0.0.17	2024-09-16 13:10:05 -07:00
Xi Yan	5839c61002	stage back models api	2024-09-16 13:00:39 -07:00
Xi Yan	82b5c0460e	models api	2024-09-16 12:57:05 -07:00
Ashwin Bharambe	a36699cd11	Rename the "package" word away	2024-09-16 12:22:47 -07:00
Xi Yan	98c55b63b4	delete distribution registry	2024-09-16 12:11:59 -07:00
Ashwin Bharambe	6f5d9a3df8	provider_type -> provider_id ... less confusing	2024-09-16 12:10:13 -07:00
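
Commits ec4fc800cc and 9487ad8294 above describe a routing layer in which a key such as the model name selects the provider that serves a request. The sketch below illustrates that idea only. `RoutingTable`, `register`, `resolve`, and the key and provider strings are hypothetical names chosen for illustration; they are not taken from the llama-stack code base.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RoutingTable:
    """Hypothetical router: maps a routing key (e.g. a model name) to a provider name."""

    routes: Dict[str, str] = field(default_factory=dict)

    def register(self, routing_key: str, provider: str) -> None:
        # Associate one routing key with exactly one provider.
        self.routes[routing_key] = provider

    def resolve(self, routing_key: str) -> str:
        # Look up the provider for a key; fail loudly when nothing is registered.
        try:
            return self.routes[routing_key]
        except KeyError:
            raise LookupError(f"no provider registered for {routing_key!r}") from None


# Illustrative setup mirroring the "model A to ollama, model B to Together" use case.
inference_router = RoutingTable()
inference_router.register("model-a", "ollama")
inference_router.register("model-b", "together")
```

A caller would use `inference_router.resolve("model-a")` to pick the provider before dispatching a request. The same table shape could in principle be reused for shields or memory banks by keying on a shield or bank identifier instead of a model name.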

2107 commits