Commit graph

53 commits

Author SHA1 Message Date
Ashwin Bharambe
d6fcdefec7 Bump version to 0.0.63 2024-12-17 23:15:27 -08:00
Ashwin Bharambe
eea478618d Bump version to 0.0.62 2024-12-17 18:19:47 -08:00
Ashwin Bharambe
02b43be9d7 Bump version to 0.0.61 2024-12-10 10:18:44 -08:00
Ashwin Bharambe
1ad691bb04 Bump version to 0.0.60 2024-12-09 22:19:51 -08:00
Ashwin Bharambe
baae4f7b51 Bump version to 0.0.59 2024-12-09 21:22:20 -08:00
Ashwin Bharambe
2c5c73f7ca Bump version to 0.0.58 2024-12-06 08:36:00 -08:00
dltn
4c7b1a8fb3 Bump version to 0.0.57 2024-12-02 19:48:46 -08:00
Dinesh Yeduguru
fe48b9fb8c Bump version to 0.0.56 2024-11-30 12:27:31 -08:00
Ashwin Bharambe
45fd73218a Bump version to 0.0.55 2024-11-23 09:03:58 -08:00
Ashwin Bharambe
2137b0af40 Bump version to 0.0.54 2024-11-21 16:28:30 -08:00
Ashwin Bharambe
dd5466e17d Bump version to 0.0.53 2024-11-19 16:44:15 -08:00
Ashwin Bharambe
394519d68a Add llama-stack-client as a legitimate dependency for llama-stack 2024-11-19 11:44:35 -08:00
Xi Yan
f6aaa9c708 Bump version to 0.0.50 2024-11-08 17:28:39 -08:00
Ashwin Bharambe
3ca294c359 Bump version to 0.0.49 2024-11-04 20:38:00 -08:00
Xi Yan
4d60ab8531 Bump version to 0.0.48 2024-11-04 17:37:32 -08:00
Ashwin Bharambe
8a3b64d1be Bump version to 0.0.47 2024-10-27 22:30:38 -07:00
Ashwin Bharambe
426d821e7f Bump version to 0.0.46 2024-10-25 13:10:55 -07:00
Ashwin Bharambe
0538cc297e Bump version to 0.0.45 2024-10-24 12:14:18 -07:00
Ashwin Bharambe
8aa8847b4a Bump version to 0.0.44 2024-10-24 08:41:39 -07:00
Xi Yan
dbb5ce43fc Bump version to 0.0.43 2024-10-21 19:10:01 -07:00
Xi Yan
209cd3d35e Bump version to 0.0.42 2024-10-14 11:13:04 -07:00
Ashwin Bharambe
89d24a07f0 Bump version to 0.0.41 2024-10-10 10:27:03 -07:00
Ashwin Bharambe
bfb0e92034 Bump version to 0.0.40 2024-10-04 09:33:43 -07:00
Ashwin Bharambe
dc75aab547 Add setuptools dependency 2024-10-04 09:30:54 -07:00
Dalton Flanagan
441052b0fd avoid jq since non-standard on macOS 2024-10-04 10:11:43 -04:00
Dalton Flanagan
9bf2e354ae CLI now requires jq 2024-10-04 10:05:59 -04:00
Ashwin Bharambe
8d41e6caa9 Bump version to 0.0.39 2024-10-03 11:31:03 -07:00
Ashwin Bharambe
c02a90e4c8 Bump version to 0.0.38 2024-10-03 05:42:47 -07:00
Ashwin Bharambe
9b93ee2c2b Bump version to 0.0.37 2024-10-02 10:15:08 -07:00
Ashwin Bharambe
a80b707ff8 Ensure we always ask for pydantic>=2 2024-10-02 06:29:06 -07:00
Ashwin Bharambe
c8fa26482d Bump version to 0.0.36 2024-09-25 11:58:15 -07:00
Ashwin Bharambe
a227edb480 Bump version to 0.0.35 2024-09-25 10:34:59 -07:00
Ashwin Bharambe
56aed59eb4 Support for Llama3.2 models and Swift SDK (#98) 2024-09-25 10:29:58 -07:00
Ashwin Bharambe
7b35a4c827 Bump version to 0.0.24 2024-09-24 10:15:20 -07:00
Ashwin Bharambe
cd850c16de Bump version to 0.0.23 2024-09-24 09:08:40 -07:00
Ashwin Bharambe
9eb5ec3e4b Bump version to 0.0.21 2024-09-23 14:23:21 -07:00
Xi Yan
21058be0c1 Bump version to 0.0.19 2024-09-18 15:48:38 -07:00
Hardik Shah
29ce73ff7a update requirements, added prompt-toolkit 2024-09-18 15:21:45 -07:00
Ashwin Bharambe
81ff7476d3 Bump version to 0.0.18 2024-09-17 20:08:04 -07:00
Ashwin Bharambe
9487ad8294 API Updates (#73)
* API Keys passed from Client instead of distro configuration

* delete distribution registry

* Rename the "package" word away

* Introduce a "Router" layer for providers

Some providers need to be factorized and considered as thin routing
layers on top of other providers. Consider two examples:

- The inference API should be a routing layer over inference providers,
  routed using the "model" key
- The memory banks API is another instance where various memory bank
  types will be provided by independent providers (e.g., a vector store
  is served by Chroma while a keyvalue memory can be served by Redis or
  PGVector)

This commit introduces a generalized routing layer for this purpose.
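
The routing idea sketches out in a few lines. The following is a minimal illustration, using hypothetical names (InferenceProvider, InferenceRouter) rather than the actual classes introduced here:

    from typing import Dict, Protocol

    class InferenceProvider(Protocol):
        """Any concrete inference backend (hypothetical interface)."""
        def chat_completion(self, model: str, prompt: str) -> str: ...

    class InferenceRouter:
        """Thin routing layer: dispatches on the "model" key."""
        def __init__(self, routes: Dict[str, InferenceProvider]) -> None:
            self.routes = routes  # model name -> provider serving it

        def chat_completion(self, model: str, prompt: str) -> str:
            provider = self.routes.get(model)
            if provider is None:
                raise ValueError(f"no provider registered for model {model!r}")
            return provider.chat_completion(model, prompt)

The same shape generalizes to memory banks: the routing key becomes the bank type, with each type backed by an independent provider (Chroma, Redis, PGVector, and so on).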

* update `apis_to_serve`

* llama_toolchain -> llama_stack

* Codemod from llama_toolchain -> llama_stack

- added providers/registry
- cleaned up api/ subdirectories and moved impls away
- restructured api/api.py
- from llama_stack.apis.<api> import foo should work now
- update imports to do llama_stack.apis.<api>
- update many other imports
- added __init__, fixed some registry imports
- updated registry imports
- create_agentic_system -> create_agent
- AgenticSystem -> Agent

* Moved some stuff out of common/; re-generated OpenAPI spec

* llama-toolchain -> llama-stack (hyphens)

* add control plane API

* add redis adapter + sqlite provider

* move core -> distribution

* Some more toolchain -> stack changes

* small naming shenanigans

* Removing custom tool and agent utilities and moving them client side

* Move control plane to distribution server for now

* Remove control plane from API list

* no codeshield dependency randomly plzzzzz

* Add "fire" as a dependency

* add back event loggers

* stack configure fixes

* use brave instead of bing in the example client

* add init file so it gets packaged

* add init files so it gets packaged

* Update MANIFEST

* bug fix

---------

Co-authored-by: Hardik Shah <hjshah@fb.com>
Co-authored-by: Xi Yan <xiyan@meta.com>
Co-authored-by: Ashwin Bharambe <ashwin@meta.com>
2024-09-17 19:51:35 -07:00
Xi Yan
f294eac5f5 Bump version to 0.0.17 2024-09-16 13:10:05 -07:00
Ashwin Bharambe
53ab18d6bb Bump version to 0.0.16 2024-09-14 08:09:45 -07:00
Ashwin Bharambe
7a283ea076 Bump version to 0.0.15 2024-09-13 17:23:12 -07:00
Xi Yan
6a863f9b78 Bump version to 0.0.14 2024-09-12 21:24:07 -07:00
Yufei (Benny) Chen
406c3b24d4 upgrade llama_models (#55) 2024-09-06 12:03:13 -07:00
Ashwin Bharambe
7bc7785b0d API Updates: fleshing out RAG APIs, introduce "llama stack" CLI command (#51)
* add tools to chat completion request

* use templates for generating system prompts

* Moved ToolPromptFormat and jinja templates to llama_models.llama3.api

* <WIP> memory changes

- inlined AgenticSystemInstanceConfig so API feels more ergonomic
- renamed it to AgentConfig, AgentInstance -> Agent
- added a MemoryConfig and `memory` parameter
- added `attachments` to input and `output_attachments` to the response

- some naming changes

* InterleavedTextAttachment -> InterleavedTextMedia, introduce memory tool

* flesh out memory banks API

* agentic loop has a RAG implementation

* faiss provider implementation

* memory client works

* re-work tool definitions, fix FastAPI issues, fix tool regressions

* fix agentic_system utils

* basic RAG seems to work

* small bug fixes for inline attachments

* Refactor custom tool execution utilities

* Bug fix, show memory retrieval steps in EventLogger

* No need for api_key for Remote providers

* add special unicode character ↵ to showcase newlines in model prompt templates

* remove api.endpoints imports

* combine datatypes.py and endpoints.py into api.py

* Attachment / add TTL api

* split batch_inference from inference

* minor import fixes

* use a single impl for ChatFormat.decode_assistant_message

* use interleaved_text_media_as_str() utility

* Fix api.datatypes imports

* Add blobfile for tiktoken

* Add ToolPromptFormat to ChatFormat.encode_message so that tools are encoded properly

* templates take optional --format={json,function_tag}

* Rag Updates

* Add `api build` subcommand -- WIP

* fix

* build + run image seems to work

* <WIP> adapters

* bunch more work to make adapters work

* api build works for conda now

* ollama remote adapter works

* Several smaller fixes to make adapters work

Also, reorganized the pattern of __init__ inside providers so
configuration can stay lightweight
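
One way to keep configuration lightweight is to limit a provider package's __init__ to a config class plus a factory that imports the heavy implementation lazily. A sketch of that pattern (hypothetical module and class names, not the exact code in this commit):

    # providers/ollama/__init__.py  (hypothetical layout)
    from pydantic import BaseModel

    class OllamaConfig(BaseModel):
        """Importing the config pulls in no heavy dependencies."""
        url: str = "http://localhost:11434"

    async def get_provider_impl(config: OllamaConfig):
        # Defer the expensive import until the provider is instantiated.
        from .ollama import OllamaInferenceImpl  # hypothetical module

        impl = OllamaInferenceImpl(config)
        await impl.initialize()
        return impl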

* llama distribution -> llama stack + containers (WIP)

* All the new CLI for api + stack work

* Make Fireworks and Together into the Adapter format

* Some quick fixes to the CLI behavior to make it consistent

* Updated README phew

* Update cli_reference.md

* llama_toolchain/distribution -> llama_toolchain/core

* Add termcolor

* update paths

* Add a log just for consistency

* chmod +x scripts

* Fix api dependencies not getting added to configuration

* missing import lol

* Delete utils.py; move to agentic system

* Support downloading of URLs for attachments for code interpreter

* Simplify and generalize `llama api build` yay

* Update `llama stack configure` to be very simple also

* Fix stack start

* Allow building an "adhoc" distribution

* Remove `llama api []` subcommands

* Fixes to llama stack commands and update docs

* Update documentation again and add error messages to llama stack start

* llama stack start -> llama stack run

* Change name of build for less confusion

* Add pyopenapi fork to the repository, update RFC assets

* Remove conflicting annotation

* Added a "--raw" option for model template printing

---------

Co-authored-by: Hardik Shah <hjshah@fb.com>
Co-authored-by: Ashwin Bharambe <ashwin@meta.com>
Co-authored-by: Dalton Flanagan <6599399+dltn@users.noreply.github.com>
2024-09-03 22:39:39 -07:00
Ashwin Bharambe
870cd7bb8b Add blobfile for tiktoken 2024-08-26 14:50:53 -07:00
Hardik Shah
37da47ef8e upgrade pydantic to latest 2024-08-12 15:14:21 -07:00
Ashwin Bharambe
e830814399 Introduce Llama stack distributions (#22)
* Add distribution CLI scaffolding

* More progress towards `llama distribution install`

* getting closer to a distro definition, distro install + configure works

* Distribution server now functioning

* read existing configuration, save enums properly

* Remove inference uvicorn server entrypoint and llama inference CLI command

* updated dependency and client model name

* Improved exception handling

* local imports for faster cli

* undo a typo, add a passthrough distribution

* implement full-passthrough in the server

* add safety adapters, configuration handling, server + clients

* cleanup, moving stuff to common, nuke utils

* Add a Path() wrapper at the earliest place

* fixes

* Bring agentic system api to toolchain

Add adapter dependencies and resolve adapters using a topological sort
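
Resolving in dependency order is a standard topological sort; a minimal sketch with a hypothetical dependency map (Python's graphlib does the ordering):

    from graphlib import TopologicalSorter

    # Hypothetical adapter -> dependencies map: e.g. the agentic system
    # needs inference and safety resolved before it can start.
    deps = {
        "agentic_system": {"inference", "safety"},
        "safety": {"inference"},
        "inference": set(),
    }

    # static_order() yields each adapter only after all its dependencies.
    for api in TopologicalSorter(deps).static_order():
        print(f"resolving adapter: {api}")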

* refactor to reduce size of `agentic_system`

* move straggler files and fix some important existing bugs

* ApiSurface -> Api

* refactor a method out

* Adapter -> Provider

* Make each inference provider into its own subdirectory

* installation fixes

* Rename Distribution -> DistributionSpec, simplify RemoteProviders

* dict key instead of attr

* update inference config to take model and not model_dir

* Fix passthrough streaming, send headers properly not part of body :facepalm

* update safety to use model sku ids and not model dirs

* Update cli_reference.md

* minor fixes

* add DistributionConfig, fix a bug in model download

* Make install + start scripts do proper configuration automatically

* Update CLI_reference

* Nuke fp8_requirements, fold fbgemm into common requirements

* Update README, add newline between API surface configurations

* Refactor download functionality out of the Command so can be reused

* Add `llama model download` alias for `llama download`

* Show message about checksum file so users can check themselves

* Simpler intro statements

* get ollama working

* Reduce a bunch of dependencies from toolchain

Some improvements to the distribution install script

* Avoid using `conda run` since it buffers everything

* update dependencies and rely on LLAMA_TOOLCHAIN_DIR for dev purposes

* add validation for configuration input

* resort imports

* make optional subclasses default to yes for configuration

* Remove additional_pip_packages; move deps to providers

* for inline make 8b model the default

* Add scripts to MANIFEST

* allow installing from test.pypi.org

* Fix #2 to help with testing packages

* Must install llama-models at that same version first

* fix PIP_ARGS

---------

Co-authored-by: Hardik Shah <hjshah@fb.com>
Co-authored-by: Hardik Shah <hjshah@meta.com>
2024-08-08 13:38:41 -07:00
Hardik Shah
156bfa0e15 Added Ollama as an inference impl (#20)
* fix non-streaming api in inference server

* unit test for inline inference

* Added non-streaming ollama inference impl

* add streaming support for ollama inference with tests

* addressing comments

---------

Co-authored-by: Hardik Shah <hjshah@fb.com>
2024-07-31 22:08:37 -07:00