Commit graph

  • 5e972ece13 refactor to reduce size of agentic_system Ashwin Bharambe 2024-08-04 18:17:56 -07:00
  • be19b22391 Bring agentic system api to toolchain Ashwin Bharambe 2024-08-04 10:53:38 -07:00
  • b0e5340645 fixes Ashwin Bharambe 2024-08-03 22:16:24 -07:00
  • 750202ddd5 Add a Path() wrapper at the earliest place Ashwin Bharambe 2024-08-03 21:25:48 -07:00
  • 803976df26 cleanup, moving stuff to common, nuke utils Ashwin Bharambe 2024-08-03 20:32:57 -07:00
  • fe582a739d add safety adapters, configuration handling, server + clients Ashwin Bharambe 2024-08-03 19:46:59 -07:00
  • 9dafa6ad94 implement full-passthrough in the server Ashwin Bharambe 2024-08-03 14:15:20 -07:00
  • 38fd76f85c undo a typo, add a passthrough distribution Ashwin Bharambe 2024-08-02 20:48:53 -07:00
  • 67229f23a4 local imports for faster cli Hardik Shah 2024-08-02 16:34:29 -07:00
  • af4710c959 Improved exception handling Ashwin Bharambe 2024-08-02 14:54:06 -07:00
  • 493f0d99b2 updated dependency and client model name Hardik Shah 2024-08-02 15:37:40 -07:00
  • d7a4cdd70d added options to ollama inference Hardik Shah 2024-08-02 14:44:22 -07:00
  • d3e269fcf2 Remove inference uvicorn server entrypoint and llama inference CLI command Ashwin Bharambe 2024-08-02 14:18:25 -07:00
  • 3bc827cd5f read existing configuration, save enums properly Ashwin Bharambe 2024-08-02 13:55:29 -07:00
  • 2cf9915806 Distribution server now functioning Ashwin Bharambe 2024-08-02 13:37:40 -07:00
  • 041cafbee3 getting closer to a distro definition, distro install + configure works Ashwin Bharambe 2024-08-01 22:59:11 -07:00
  • dac2b5a1ed More progress towards llama distribution install Ashwin Bharambe 2024-08-01 16:40:43 -07:00
  • 5a583cf16e Add distribution CLI scaffolding Ashwin Bharambe 2024-08-01 14:44:57 -07:00
  • 09cf3fe78b Use new definitions of Model / SKU Ashwin Bharambe 2024-07-31 11:36:16 -07:00
  • 156bfa0e15 Added Ollama as an inference impl (#20) Hardik Shah 2024-07-31 22:08:37 -07:00
  • fd8adc1e50 addressing comments Hardik Shah 2024-07-31 22:07:45 -07:00
  • c253c1c9ad Begin adding a /safety/run_shield API Ashwin Bharambe 2024-07-31 21:57:10 -07:00
  • 0e985648f5 add streaming support for ollama inference with tests Hardik Shah 2024-07-31 19:33:36 -07:00
  • 0e75e73fa7 Added non-streaming ollama inference impl Hardik Shah 2024-07-30 18:11:44 -07:00
  • 1bc81eae7b update toolchain to work with updated imports from llama_models Ashwin Bharambe 2024-07-30 17:52:57 -07:00
  • 5b9c05c5dd unit test for inline inference Hardik Shah 2024-07-30 16:23:47 -07:00
  • cc98fbb058 fix non-streaming api in inference server Hardik Shah 2024-07-30 14:25:50 -07:00
  • 23014ea4d1 Add hacks because Cloudfront config limits on the 405b model files Ashwin Bharambe 2024-07-30 13:46:20 -07:00
  • 404af06e02 Bump version to 0.0.2 Ashwin Bharambe 2024-07-29 23:56:41 -07:00
  • 7306e6b167 show sampling params in model describe Ashwin Bharambe 2024-07-29 23:44:07 -07:00
  • 040c30ee54 added resumable downloader for downloading models Ashwin Bharambe 2024-07-29 07:41:07 -07:00
  • 59574924de model template --template -> model template --name Ashwin Bharambe 2024-07-29 18:21:05 -07:00
  • 45b8a7ffcd Add model describe subcommand Ashwin Bharambe 2024-07-29 18:19:53 -07:00
  • 9d7f283722 Add model list subcommand Ashwin Bharambe 2024-07-29 16:39:53 -07:00
  • a789c47ec9 Update cli_reference.md Dalton Flanagan 2024-07-29 16:31:56 -04:00
  • dd6c1f1e64 Add links to shields Dalton Flanagan 2024-07-27 11:28:46 -04:00
  • b5d7cec11e Add shields to README Dalton Flanagan 2024-07-27 11:02:50 -04:00
  • 3583cf2d51 update model template output to be prettier, more consumable Ashwin Bharambe 2024-07-26 15:39:46 -07:00
  • 51f8049c7a Update fp8_requirements, we don't need nightly torch anymore Ashwin Bharambe 2024-07-26 08:25:44 -07:00
  • ec433448f2 Add CLI reference docs (#14) Dalton Flanagan 2024-07-25 16:56:29 -04:00
  • b1f02cc654 add helptext for download dltn 2024-07-25 13:50:29 -07:00
  • 86924fd7b1 touchups dltn 2024-07-25 12:43:44 -07:00
  • 142b36c7c5 Add CLI reference doc dltn 2024-07-25 12:37:05 -07:00
  • ad6c889cca Update README.md Yuan-Man 2024-07-25 15:38:40 +08:00
  • b8aa99b034 Update fbgemm version (#12) Jianyu Huang 2024-07-24 23:48:44 -07:00
  • 1b8bc38d04 Update fbgemm version Jianyu Huang 2024-07-24 23:45:19 -07:00
  • 378a2077dd Update download command (#9) Lucain 2024-07-25 01:50:40 +02:00
  • c5843cd2f6 Update download command Lucain 2024-07-24 10:13:16 +02:00
  • 17bd1d876c Canonical package name for the dependency Ashwin Bharambe 2024-07-23 13:30:33 -07:00
  • f7e053e3ba Updates to setup and requirements for PyPI Ashwin Bharambe 2024-07-23 13:25:40 -07:00
  • d802d0f051 add requirements to MANIFEST.in Ashwin Bharambe 2024-07-23 12:59:28 -07:00
  • 5d5acc8ed5 Initial commit Ashwin Bharambe 2024-06-25 15:47:57 -07:00
  • 05f47d848b RFC-0001-llama-stack rsm 2024-07-23 07:53:40 -07:00
  • 9fb50bbd99 Initial commit Hardik Shah 2024-06-25 15:47:57 -07:00
  • 8030fbd82e Create CONTRIBUTING.md Joseph Spisak 2024-07-23 06:08:26 -07:00
  • 81d50b9d3d Create CODE_OF_CONDUCT.md Joseph Spisak 2024-07-23 06:07:25 -07:00
  • f89b4b451d Initial commit - yes! Hardik Shah 2024-06-25 15:47:57 -07:00
  • ab829b0557 revert excluded cat defaults Kate Plawiak 2024-07-22 22:09:44 -07:00
  • ab8a220faa add missing license part Kate Plawiak 2024-07-22 22:03:05 -07:00
  • 16fe0e4594 clean up and add license Kate Plawiak 2024-07-22 21:59:57 -07:00
  • 7a8b5c1604 Merge branch 'main' into fix_llama_guard_inference Kate Plawiak 2024-07-22 21:31:18 -07:00
  • 138b92ae69 llama_guard inference fix Kate Plawiak 2024-07-22 21:26:03 -07:00
  • a14daf5829 Update license Ashwin Bharambe 2024-07-22 20:47:32 -07:00
  • dae6357e49 nit update cli message Hardik Shah 2024-07-22 20:45:49 -07:00
  • aca6bfe0df drop custom classes to manage hydra Hardik Shah 2024-07-22 20:40:50 -07:00
  • 86fff23a9e updating license for toolchain Ashwin Bharambe 2024-07-22 20:31:42 -07:00
  • 0e2fc9966a Reduce loading time for non-fp8 Ashwin Bharambe 2024-07-22 19:21:04 -07:00
  • fef679bb34 Don't load as bf16 on CPU unless fp8 is active Ashwin Bharambe 2024-07-22 19:09:32 -07:00
  • 8cd2e4164c Merge pull request #2 from meta-llama/revert-1-fix_llama_guard Kate Plawiak 2024-07-22 17:28:13 -07:00
  • 5228bdc0f3 Revert "Update llama guard file to latest version" Kate Plawiak 2024-07-22 17:27:19 -07:00
  • dfe0173b58 Merge pull request #1 from meta-llama/fix_llama_guard Kate Plawiak 2024-07-22 16:27:47 -07:00
  • 9b51b4edd8 update batch completion endpoint Ashwin Bharambe 2024-07-22 16:08:28 -07:00
  • 1e573843ce added pre-commit to toolchain Ashwin Bharambe 2024-07-22 16:04:31 -07:00
  • acb2a91872 Remove configurations Ashwin Bharambe 2024-07-22 16:03:37 -07:00
  • bbfd8a587e add EventLogger for inference Ashwin Bharambe 2024-07-22 15:11:34 -07:00
  • 7574ffb25f added __init__ Hardik Shah 2024-07-22 14:49:26 -07:00
  • 441e5da6ed no special casign for original Hardik Shah 2024-07-22 14:42:38 -07:00
  • 4d3b226275 check original folder Hardik Shah 2024-07-22 14:35:09 -07:00
  • 91b43600f7 increase max_new_tokens Kate Plawiak 2024-07-22 13:58:51 -07:00
  • cb5829901f redo and fix only specific lines Kate Plawiak 2024-07-22 13:46:43 -07:00
  • d5019cf3b3 update llama guard file to latest version Kate Plawiak 2024-07-22 13:36:11 -07:00
  • 74442e88b1 add yaml to manifest Hardik Shah 2024-07-22 13:34:08 -07:00
  • 6f0d348b1c add init for common Hardik Shah 2024-07-22 11:50:54 -07:00
  • 54a22e288a requirements Ashwin Bharambe 2024-07-22 11:39:42 -07:00
  • c38d638340 sku -> family Ashwin Bharambe 2024-07-22 11:15:04 -07:00
  • f0e0903270 add llama model subcommand Ashwin Bharambe 2024-07-22 11:07:11 -07:00
  • 4417407652 agentic_system --> llama_agentic_system Hardik Shah 2024-07-22 01:20:32 -07:00
  • 1eac470045 add __init__ Hardik Shah 2024-07-22 01:17:41 -07:00
  • 2e7978fa39 update import for quantization format from models Ashwin Bharambe 2024-07-21 23:56:04 -07:00
  • f9111652ef rename toolchain/ --> llama_toolchain/ Hardik Shah 2024-07-21 23:48:38 -07:00
  • d95f5f863d use default_config file to configure inference Hardik Shah 2024-07-21 19:26:11 -07:00
  • c64b8cba22 from models.llama3_1 --> from llama_models.llama3_1 Hardik Shah 2024-07-21 19:07:02 -07:00
  • c6ef16f6bd consol_scripts for toolchain Hardik Shah 2024-07-21 17:39:47 -07:00
  • 7c69675b79 added pypi package rsm 2024-07-21 13:43:36 -07:00
  • b0f3406a08 deleting bash script as this is not done via cli Hardik Shah 2024-07-21 12:55:49 -07:00
  • 6bcd826b32 enable import of subcommands from llama-agentic-system Hardik Shah 2024-07-21 12:54:38 -07:00
  • 67f0510edd rename ModelInference to Inference rsm 2024-07-21 12:19:52 -07:00
  • 245461620d make sure scripts always have pipefail Ashwin Bharambe 2024-07-21 12:18:49 -07:00
  • c9f33d8f68 cli updates Hardik Shah 2024-07-21 01:51:54 -07:00