2107 commits

Author SHA1 Message Date
Ashwin Bharambe
ec4fc800cc
[API Updates] Model / shield / memory-bank routing + agent persistence + support for private headers (#92)
This is yet another of those large PRs (hopefully we will have fewer and fewer of them as things mature). This one introduces substantial improvements and some simplifications to the stack.

Most important bits:

* The Agents reference implementation now supports session / turn persistence. The default implementation uses SQLite, and Redis is also supported.

* We have re-architected the structure of the Stack APIs to allow for more flexible routing. The motivating use cases are:
  - routing model A to ollama and model B to a remote provider like Together
  - routing shield A to a local implementation and shield B to a remote provider like Bedrock
  - routing a vector memory bank to Weaviate while routing a key-value memory bank to Redis

* Support for provider-specific parameters passed from clients. A client can pass data using the `x_llamastack_provider_data` parameter, which can be type-checked and provided to the Adapter implementations.
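
A minimal sketch of what type-checking client-supplied provider data could look like. It assumes the `x_llamastack_provider_data` value arrives as a JSON-encoded string; `ExampleProviderData`, its `api_key` field, and `parse_provider_data` are hypothetical names used only for illustration and are not taken from the stack's code.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


# Hypothetical schema for one provider's private data (assumption).
@dataclass
class ExampleProviderData:
    api_key: str


def parse_provider_data(raw: Optional[str], schema: type):
    """Decode a JSON string and validate it against the dataclass `schema`."""
    if raw is None:
        return None  # the client sent no provider data
    payload = json.loads(raw)
    if not isinstance(payload, dict):
        raise ValueError("provider data must be a JSON object")

    expected = {f.name: f.type for f in fields(schema)}
    unknown = set(payload) - set(expected)
    missing = set(expected) - set(payload)
    if unknown or missing:
        raise ValueError(
            f"unexpected keys {sorted(unknown)}, missing keys {sorted(missing)}"
        )
    for name, typ in expected.items():
        if not isinstance(payload[name], typ):
            raise ValueError(f"{name} must be of type {typ.__name__}")
    return schema(**payload)


# Usage: validate what a client sent before handing it to an adapter.
data = parse_provider_data(json.dumps({"api_key": "secret"}), ExampleProviderData)
```

Rejecting unknown keys as well as missing ones is a design choice in this sketch; a more lenient variant could ignore extra keys instead.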
2024-09-23 14:22:22 -07:00
Hardik Shah
8bf8c07eb3 Respect user sent instructions in agent config and add them to system prompt 2024-09-21 16:46:10 -07:00
Xi Yan
06abd7e6c8 update MemoryToolDefinition 2024-09-20 17:51:53 -07:00
Ashwin Bharambe
942cb87a3c remove apis/stack.py 2024-09-20 09:37:08 -07:00
Hardik Shah
33db4d2e45 ignore config dir 2024-09-20 00:24:49 -07:00
Hardik Shah
7e9e6117e3 do not assume CONDA_PREFIX exists during configuration 2024-09-19 23:39:34 -07:00
Hardik Shah
8fa49593e0
Allow TGI adaptor to have non-standard llama model names (#84)
Co-authored-by: Hardik Shah <hjshah@fb.com>
2024-09-19 21:42:15 -07:00
Hardik Shah
42d29f3a5a Allow TGI adaptor to have non-standard llama model names 2024-09-19 21:37:02 -07:00
Xi Yan
59af1c8fec
fix memory url parsing (#81) 2024-09-19 13:35:03 -07:00
Ashwin Bharambe
132f9429b1 Add a test for CLI, but not fully done so disabled 2024-09-19 13:27:07 -07:00
Ashwin Bharambe
8b3ffa33de Add another test case 2024-09-19 13:02:57 -07:00
Ashwin Bharambe
abb43936ab Add a test runner and 2 very simple tests for agents 2024-09-19 12:22:48 -07:00
Xi Yan
543222ac39 update inference prompt msg 2024-09-19 12:03:24 -07:00
Xi Yan
a30b919ae1 update inference prompt msg 2024-09-19 12:03:24 -07:00
Ashwin Bharambe
9eb01dd664 Add DOCKER_BINARY / DOCKER_OPTS to all scripts 2024-09-19 10:26:41 -07:00
Xi Yan
ca4b87aa05 fix memory client 2024-09-19 09:29:40 -07:00
Xi Yan
6302a1ee90
fix prompt with name args (#80) 2024-09-18 23:48:31 -07:00
Ashwin Bharambe
c63d6cbd08 list(...keys()) so dict_keys does not show up 2024-09-18 23:24:07 -07:00
Xi Yan
880ed37026
Update cli_reference.md 2024-09-18 23:05:24 -07:00
Xi Yan
5c4a2dc0e1
Update getting_started.md 2024-09-18 23:03:14 -07:00
Ashwin Bharambe
f5eda1decf Add default for max_seq_len 2024-09-18 21:59:10 -07:00
Ashwin Bharambe
9ab27e852b Bug fixes for memory 2024-09-18 21:54:02 -07:00
Ashwin Bharambe
8cdc2f0cfb No RunShieldRequest 2024-09-18 20:38:21 -07:00
Xi Yan
f3f5873e9e regenerate openapi spec 2024-09-18 19:28:05 -07:00
Xi Yan
9f1be108ce Bump version to 0.0.20 2024-09-18 19:06:07 -07:00
Xi Yan
455a6e4bb9 update MANIFEST 2024-09-18 18:58:50 -07:00
Ashwin Bharambe
dff9eab48f Remove "APIs to serve" prompt 2024-09-18 18:26:26 -07:00
Xi Yan
f5d5e32d62 fix docker configure 2024-09-18 17:23:37 -07:00
Xi Yan
5ec64ac68c moving rfc->docs 2024-09-18 16:54:24 -07:00
Xi Yan
2c1ad10710 move openapi from rfcs->docs 2024-09-18 16:09:17 -07:00
Xi Yan
21058be0c1 Bump version to 0.0.19 2024-09-18 15:48:38 -07:00
Xi Yan
45e20ff431 update getting started 2024-09-18 15:40:48 -07:00
Xi Yan
2f9e952813 update getting started guide 2024-09-18 15:35:54 -07:00
Hardik Shah
29ce73ff7a update requirements, added prompt-toolkit 2024-09-18 15:21:45 -07:00
Xi Yan
1128f69674
CLI: add build templates support, move imports (#77)
* list templates implementation

* relative path

* finalize templates

* remove imports

* remove templates from name, name templates

* fix docker

* fix docker
2024-09-18 14:25:53 -07:00
Xi Yan
6b21523c28
CLI - add back build wizard, configure with name instead of build.yaml (#74)
* add back wizard for build

* conda build path move

* polish message

* run with name only

* prompt for build

* improve comments

* update msgs

* add new lines

* move build.yaml

* address comments

* validator for providers

* move imports

* Please enter -> enter

* comments, get started guide

* nits

* fix cprint import

* fix imports
2024-09-18 11:41:56 -07:00
Xi Yan
e6fdb9df29
fix context retriever (#75) 2024-09-18 08:24:36 -07:00
Ashwin Bharambe
055770a791 Stop asking for "apis to serve" as part of configure 2024-09-17 22:41:10 -07:00
Dalton Flanagan
eea0a83bd1
Update getting_started.md
config is now a positional argument
2024-09-18 00:47:41 -04:00
Ashwin Bharambe
9fd431e710 make shield imports more lazy 2024-09-17 21:27:37 -07:00
Ashwin Bharambe
81ff7476d3 Bump version to 0.0.18 2024-09-17 20:08:04 -07:00
Ashwin Bharambe
3e27131a69 Don't import pkg_resources until you need it 2024-09-17 20:01:22 -07:00
Ashwin Bharambe
25adc83de8 Fix for safety 2024-09-17 19:56:58 -07:00
Ashwin Bharambe
9487ad8294
API Updates (#73)
* API Keys passed from Client instead of distro configuration

* delete distribution registry

* Rename the "package" word away

* Introduce a "Router" layer for providers

Some providers need to be factorized and considered as thin routing
layers on top of other providers. Consider two examples:

- The inference API should be a routing layer over inference providers,
  routed using the "model" key
- The memory banks API is another instance where various memory bank
  types will be provided by independent providers (e.g., a vector store
  is served by Chroma while a keyvalue memory can be served by Redis or
  PGVector)

This commit introduces a generalized routing layer for this purpose.

* update `apis_to_serve`

* llama_toolchain -> llama_stack

* Codemod from llama_toolchain -> llama_stack

- added providers/registry
- cleaned up api/ subdirectories and moved impls away
- restructured api/api.py
- from llama_stack.apis.<api> import foo should work now
- update imports to do llama_stack.apis.<api>
- update many other imports
- added __init__, fixed some registry imports
- updated registry imports
- create_agentic_system -> create_agent
- AgenticSystem -> Agent

* Moved some stuff out of common/; re-generated OpenAPI spec

* llama-toolchain -> llama-stack (hyphens)

* add control plane API

* add redis adapter + sqlite provider

* move core -> distribution

* Some more toolchain -> stack changes

* small naming shenanigans

* Removing custom tool and agent utilities and moving them client side

* Move control plane to distribution server for now

* Remove control plane from API list

* no codeshield dependency randomly plzzzzz

* Add "fire" as a dependency

* add back event loggers

* stack configure fixes

* use brave instead of bing in the example client

* add init file so it gets packaged

* add init files so it gets packaged

* Update MANIFEST

* bug fix
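
The "Router" layer item above describes the inference API as a thin routing layer keyed on "model". A rough sketch of that idea follows; the class names, the `complete` method, and the provider labels are assumptions made for illustration and are not the stack's actual interfaces.

```python
from typing import Dict, Protocol


class InferenceProvider(Protocol):
    """Hypothetical interface that every inference provider implements."""

    def complete(self, model: str, prompt: str) -> str: ...


class EchoProvider:
    """Stand-in provider that tags its output with its own name."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, model: str, prompt: str) -> str:
        return f"[{self.name}:{model}] {prompt}"


class InferenceRouter:
    """Forwards each call to the provider registered for the routing key."""

    def __init__(self, routing_table: Dict[str, InferenceProvider]) -> None:
        self.routing_table = routing_table

    def complete(self, model: str, prompt: str) -> str:
        try:
            provider = self.routing_table[model]
        except KeyError:
            raise ValueError(f"no provider registered for model {model!r}") from None
        return provider.complete(model, prompt)


# Usage: two models served by two different (stand-in) providers.
router = InferenceRouter({
    "model-a": EchoProvider("local"),
    "model-b": EchoProvider("remote"),
})
print(router.complete("model-a", "hello"))  # prints "[local:model-a] hello"
```

The router exposes the same method as the providers it wraps, so callers do not need to know which provider ends up serving a given model.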

---------

Co-authored-by: Hardik Shah <hjshah@fb.com>
Co-authored-by: Xi Yan <xiyan@meta.com>
Co-authored-by: Ashwin Bharambe <ashwin@meta.com>
2024-09-17 19:51:35 -07:00
Xi Yan
f294eac5f5 Bump version to 0.0.17 2024-09-16 13:10:05 -07:00
Xi Yan
5839c61002 stage back models api 2024-09-16 13:00:39 -07:00
Xi Yan
82b5c0460e models api 2024-09-16 12:57:05 -07:00
Ashwin Bharambe
a36699cd11 Rename the "package" word away 2024-09-16 12:22:47 -07:00
Xi Yan
98c55b63b4 delete distribution registry 2024-09-16 12:11:59 -07:00
Ashwin Bharambe
6f5d9a3df8 provider_type -> provider_id ... less confusing 2024-09-16 12:10:13 -07:00