Commit graph

123 commits

Author SHA1 Message Date
Ashwin Bharambe
a0bf20f19a Move control plane to distribution server for now 2024-09-17 12:55:50 -07:00
Ashwin Bharambe
099ac81bc7 Removing custom tool and agent utilities and moving them client side 2024-09-17 12:38:28 -07:00
Ashwin Bharambe
fa864f70da small naming shenanigans 2024-09-17 12:09:18 -07:00
Ashwin Bharambe
4c045d9ed9 Some more toolchain -> stack changes 2024-09-17 11:58:10 -07:00
Ashwin Bharambe
17172a8bf9 move core -> distribution 2024-09-17 11:38:38 -07:00
Ashwin Bharambe
bbf0b59ae4 add redis adapter + sqlite provider 2024-09-16 22:02:42 -07:00
Ashwin Bharambe
b31f0f6e4e add control plane API 2024-09-16 21:41:24 -07:00
Ashwin Bharambe
d1959e6889 llama-toolchain -> llama-stack (hyphens) 2024-09-16 21:33:20 -07:00
Ashwin Bharambe
6665d31cdf Moved some stuff out of common/; re-generated OpenAPI spec 2024-09-16 21:21:11 -07:00
Ashwin Bharambe
76b354a081 Codemod from llama_toolchain -> llama_stack
- added providers/registry
- cleaned up api/ subdirectories and moved impls away
- restructured api/api.py
- from llama_stack.apis.<api> import foo should work now
- update imports to do llama_stack.apis.<api>
- update many other imports
- added __init__, fixed some registry imports
- updated registry imports
- create_agentic_system -> create_agent
- AgenticSystem -> Agent
2024-09-16 20:03:25 -07:00
Ashwin Bharambe
2cf731faea llama_toolchain -> llama_stack 2024-09-16 17:21:08 -07:00
Ashwin Bharambe
f372355409 update apis_to_serve 2024-09-16 17:19:32 -07:00
Ashwin Bharambe
b6a3ef51da Introduce a "Router" layer for providers
Some providers need to be factorized and considered as thin routing
layers on top of other providers. Consider two examples:

- The inference API should be a routing layer over inference providers,
  routed using the "model" key
- The memory banks API is another instance where various memory bank
  types will be provided by independent providers (e.g., a vector store
  is served by Chroma while a keyvalue memory can be served by Redis or
  PGVector)

This commit introduces a generalized routing layer for this purpose.
2024-09-16 17:04:45 -07:00
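The routing idea in the commit message above can be sketched roughly as follows. This is a minimal illustration only, not the code from the commit: the `InferenceRouter` class, the `Provider` signature, and the model names are all hypothetical. It shows a thin layer that dispatches each request to a provider chosen by the "model" key.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical provider signature (an assumption for this sketch):
# takes a prompt string and returns a completion string.
Provider = Callable[[str], str]


@dataclass
class InferenceRouter:
    """Thin routing layer that picks a provider by the request's "model" key."""

    routing_table: Dict[str, Provider] = field(default_factory=dict)

    def register(self, model: str, provider: Provider) -> None:
        # Associate a routing key (here, the model name) with a provider.
        self.routing_table[model] = provider

    def chat_completion(self, model: str, prompt: str) -> str:
        # Look up the provider for this model and delegate the call to it.
        try:
            provider = self.routing_table[model]
        except KeyError:
            raise ValueError(f"no provider registered for model {model!r}") from None
        return provider(prompt)


# Two stand-in providers; real ones would call an inference backend.
router = InferenceRouter()
router.register("model-a", lambda prompt: f"[provider-a] {prompt}")
router.register("model-b", lambda prompt: f"[provider-b] {prompt}")

print(router.chat_completion("model-a", "hello"))
```

The same shape would apply to memory banks, with the bank type as the routing key instead of the model name.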
Ashwin Bharambe
5c1f2616b5 Rename the "package" word away 2024-09-16 12:23:56 -07:00
Xi Yan
33030c8926 delete distribution registry 2024-09-16 12:23:56 -07:00
Ashwin Bharambe
6f5d9a3df8 provider_type -> provider_id ... less confusing 2024-09-16 12:10:13 -07:00
Xi Yan
ce6c868499 Update cli_reference.md 2024-09-16 12:02:46 -07:00
Xi Yan
ed4272e31e Update getting_started.md 2024-09-16 11:55:10 -07:00
Xi Yan
d9147f3184 CLI Update: build -> configure -> run (#69)
* remove configure from build

* remove config from build

* configure to regenerate file

* update memory providers

* remove comments

* update build script

* add readme

* update doc

* rename getting started

* update build cli

* update docker build script

* configure update

* clean up configure

* [tmp fix] hardware requirement tmp fix

* clean up build

* fix configure

* add example build files for conda & docker

* remove resolve_distribution_spec

* remove available_distribution_specs

* example build files

* update example build files

* more clean up on build

* add name args to override name

* move distribution to yaml files

* generate distribution specs

* getting started guide

* getting started

* add build yaml to Dockerfile

* cleanup distribution_dependencies

* configure from docker image name

* build relative paths

* minor comment

* getting started

* Update getting_started.md

* Update getting_started.md

* address comments, configure within docker file

* remove distribution types!

* update getting started

* update documentation

* remove listing distribution

* minor heading

* address nits, remove docker_image=null

* gitignore
2024-09-16 11:02:26 -07:00
Ashwin Bharambe
73b71d9689 Handle Annotated types more correctly 2024-09-14 14:12:35 -07:00
Ashwin Bharambe
53ab18d6bb Bump version to 0.0.16 2024-09-14 08:09:45 -07:00
Ashwin Bharambe
49ce36426f Make llama model download error message a bit better 2024-09-14 08:06:55 -07:00
Ashwin Bharambe
7a283ea076 Bump version to 0.0.15 2024-09-13 17:23:12 -07:00
Ashwin Bharambe
498cf03617 add pypdf 2024-09-13 17:04:43 -07:00
Ashwin Bharambe
19a14cd273 Nuke hardware_requirements from SKUs 2024-09-13 16:39:02 -07:00
raghotham
d8b3fdbd54 Update README.md 2024-09-13 08:56:47 -07:00
Xi Yan
6a863f9b78 Bump version to 0.0.14 2024-09-12 21:24:07 -07:00
Xi Yan
16635508bd Bump version to 0.0.14 2024-09-12 15:11:15 -07:00
Xi Yan
5712566061 Remove request wrapper migration (#64)
* [1/n] migrate inference/chat_completion

* migrate inference/completion

* inference/completion

* inference regenerate openapi spec

* safety api

* migrate agentic system

* migrate apis without implementations

* re-generate openapi spec

* remove hack from openapi generator

* fix inference

* fix inference

* openapi generator rerun

* Simplified Telemetry API and tying it to logger (#57)

* Simplified Telemetry API and tying it to logger

* small update which adds a METRIC type

* move span events one level down into structured log events

---------

Co-authored-by: Ashwin Bharambe <ashwin@meta.com>

* fix api to work with openapi generator

* fix agentic calling inference

* together adapter inference

* update inference adapters

---------

Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
Co-authored-by: Ashwin Bharambe <ashwin@meta.com>
2024-09-12 15:03:49 -07:00
Hardik Shah
1d0e91d802 Support data: in URL for memory. Add ootb support for pdfs (#67)
* support data: in URL for memory. Add ootb support for pdfs

* moved utility to common and updated data_url parsing logic

---------

Co-authored-by: Hardik Shah <hjshah@fb.com>
2024-09-12 13:00:21 -07:00
Celina Hanouti
736092f6bc [Inference] Use huggingface_hub inference client for TGI adapter (#53)
* Use huggingface_hub inference client for TGI inference

* Update the default value for TGI URL

* Use InferenceClient.text_generation for TGI inference

* Fixes post-review and split TGI adapter into local and Inference Endpoints ones

* Update CLI reference and add typing

* Rename TGI Adapter class

* Use HfApi to get the namespace when not provided in the hf endpoint name

* Remove unnecessary method argument

* Improve TGI adapter initialization condition

* Move helper into impl file + fix merging conflicts
2024-09-12 09:11:35 -07:00
Ashwin Bharambe
191cd28831 Simplified Telemetry API and tying it to logger (#57)
* Simplified Telemetry API and tying it to logger

* small update which adds a METRIC type

* move span events one level down into structured log events

---------

Co-authored-by: Ashwin Bharambe <ashwin@meta.com>
2024-09-11 14:25:37 -07:00
Xi Yan
1433aaf9f7 add CODEOWNERS file 2024-09-11 11:40:37 -07:00
Xi Yan
89300df5dc Add config file based CLI (#60)
* config file for build

* fix build command

* configure script with config

* fix configure script to work with config file

* update build.sh

* update readme

* distribution_type -> distribution

* fix run-config/config-file to config

* move import to inline

* only consume config as argument

* update configure to only consume config

* update readme

* update readme
2024-09-11 11:39:46 -07:00
Xi Yan
58def874a9 add safety to openapi spec (#62) 2024-09-10 17:47:13 -07:00
Hardik Shah
a11d92601b Enable Bing search (#59)
* add tool for bing search

* simplify search tool and enable configuration for search engine

* dropped commented code

---------

Co-authored-by: Hardik Shah <hjshah@fb.com>
2024-09-10 12:34:29 -07:00
Dalton Flanagan
2b63074676 add /inference/chat_completion to SSE special case 2024-09-10 01:14:11 -04:00
Xi Yan
4f021de10f API spec update, client demo with Stainless SDK (#58)
* [wip] client w/ stainless sdk

* update generator & yaml spec

* update wrapper request

* update script

* agentic system client sdk

* add comment todos

* remove client sdk examples
2024-09-09 13:09:47 -07:00
Ashwin Bharambe
741310f78e rename observability -> Telemetry; regen Spec 2024-09-07 15:23:53 -07:00
Ashwin Bharambe
70e682fbdf Update distribution_id -> distribution_type, provider_id -> provider_type 2024-09-07 08:42:28 -07:00
Ashwin Bharambe
3f090d1975 Add Chroma and PGVector adapters (#56)
Co-authored-by: Ashwin Bharambe <ashwin@meta.com>
2024-09-06 18:53:17 -07:00
Hardik Shah
5de6ed946e Query generators for RAG query (#54)
* Query generators for rag query

* use agent.inference_api instead of passing host/port again

* drop classes for functions

---------

Co-authored-by: Hardik Shah <hjshah@fb.com>
2024-09-06 13:10:39 -07:00
Yufei (Benny) Chen
406c3b24d4 upgrade llama_models (#55) 2024-09-06 12:03:13 -07:00
Ashwin Bharambe
dd1e1ceb13 Add bubblewrap to the container 2024-09-05 16:45:58 -07:00
Ashwin Bharambe
f6b5e394ab Remove dependence on os.environ["USER"] 2024-09-05 15:37:30 -07:00
Ashwin Bharambe
6c69e09c6a Bump version to 0.0.13 2024-09-04 23:10:38 -07:00
Ashwin Bharambe
21bedc1596 [inference] Add a TGI adapter (#52)
* TGI adapter and some refactoring of other inference adapters

* Use the lower-level `generate_stream()` method for correct tool calling

---------

Co-authored-by: Ashwin Bharambe <ashwin@meta.com>
2024-09-04 22:49:33 -07:00
Ashwin Bharambe
6ad7365676 A little clean up for the Fireworks and Together adapters 2024-09-04 22:34:15 -07:00
raghotham
225cd75074 Update cli_reference.md
Made it easier to follow along with numbered steps
2024-09-04 18:50:10 -07:00
Ashwin Bharambe
bfee50aa83 A few more fixes to the OpenAPI generator 2024-09-04 10:29:20 -07:00