Commit graph

  • 0e985648f5 add streaming support for ollama inference with tests Hardik Shah 2024-07-31 19:33:36 -07:00
  • 0e75e73fa7 Added non-streaming ollama inference impl Hardik Shah 2024-07-30 18:11:44 -07:00
  • 1bc81eae7b update toolchain to work with updated imports from llama_models Ashwin Bharambe 2024-07-30 17:52:57 -07:00
  • 5b9c05c5dd unit test for inline inference Hardik Shah 2024-07-30 16:23:47 -07:00
  • cc98fbb058 fix non-streaming api in inference server Hardik Shah 2024-07-30 14:25:50 -07:00
  • 23014ea4d1 Add hacks because Cloudfront config limits on the 405b model files Ashwin Bharambe 2024-07-30 13:46:20 -07:00
  • 404af06e02 Bump version to 0.0.2 Ashwin Bharambe 2024-07-29 23:56:41 -07:00
  • 7306e6b167 show sampling params in model describe Ashwin Bharambe 2024-07-29 23:44:07 -07:00
  • 040c30ee54 added resumable downloader for downloading models Ashwin Bharambe 2024-07-29 07:41:07 -07:00
  • 59574924de model template --template -> model template --name Ashwin Bharambe 2024-07-29 18:21:05 -07:00
  • 45b8a7ffcd Add model describe subcommand Ashwin Bharambe 2024-07-29 18:19:53 -07:00
  • 9d7f283722 Add model list subcommand Ashwin Bharambe 2024-07-29 16:39:53 -07:00
  • a789c47ec9 Update cli_reference.md Dalton Flanagan 2024-07-29 16:31:56 -04:00
  • dd6c1f1e64 Add links to shields Dalton Flanagan 2024-07-27 11:28:46 -04:00
  • b5d7cec11e Add shields to README Dalton Flanagan 2024-07-27 11:02:50 -04:00
  • 3583cf2d51 update model template output to be prettier, more consumable Ashwin Bharambe 2024-07-26 15:39:46 -07:00
  • 51f8049c7a Update fp8_requirements, we don't need nightly torch anymore Ashwin Bharambe 2024-07-26 08:25:44 -07:00
  • ec433448f2 Add CLI reference docs (#14) Dalton Flanagan 2024-07-25 16:56:29 -04:00
  • b1f02cc654 add helptext for download dltn 2024-07-25 13:50:29 -07:00
  • 86924fd7b1 touchups dltn 2024-07-25 12:43:44 -07:00
  • 142b36c7c5 Add CLI reference doc dltn 2024-07-25 12:37:05 -07:00
  • ad6c889cca Update README.md Yuan-Man 2024-07-25 15:38:40 +08:00
  • b8aa99b034 Update fbgemm version (#12) Jianyu Huang 2024-07-24 23:48:44 -07:00
  • 1b8bc38d04 Update fbgemm version Jianyu Huang 2024-07-24 23:45:19 -07:00
  • 378a2077dd Update download command (#9) Lucain 2024-07-25 01:50:40 +02:00
  • fe7477f55f Rename fp8_requirements.txt to ZXV-ONLINE-MARKET-PLATFORM.com Jahin9999 2024-07-24 22:13:25 +04:00
  • c5843cd2f6 Update download command Lucain 2024-07-24 10:13:16 +02:00
  • 17bd1d876c Canonical package name for the dependency Ashwin Bharambe 2024-07-23 13:30:33 -07:00
  • f7e053e3ba Updates to setup and requirements for PyPI Ashwin Bharambe 2024-07-23 13:25:40 -07:00
  • d802d0f051 add requirements to MANIFEST.in Ashwin Bharambe 2024-07-23 12:59:28 -07:00
  • 5d5acc8ed5 Initial commit Ashwin Bharambe 2024-06-25 15:47:57 -07:00
  • 05f47d848b RFC-0001-llama-stack rsm 2024-07-23 07:53:40 -07:00
  • 9fb50bbd99 Initial commit Hardik Shah 2024-06-25 15:47:57 -07:00
  • 8030fbd82e Create CONTRIBUTING.md Joseph Spisak 2024-07-23 06:08:26 -07:00
  • 81d50b9d3d Create CODE_OF_CONDUCT.md Joseph Spisak 2024-07-23 06:07:25 -07:00
  • f89b4b451d Initial commit - yes! Hardik Shah 2024-06-25 15:47:57 -07:00
  • ab829b0557 revert excluded cat defaults Kate Plawiak 2024-07-22 22:09:44 -07:00
  • ab8a220faa add missing license part Kate Plawiak 2024-07-22 22:03:05 -07:00
  • 16fe0e4594 clean up and add license Kate Plawiak 2024-07-22 21:59:57 -07:00
  • 7a8b5c1604 Merge branch 'main' into fix_llama_guard_inference Kate Plawiak 2024-07-22 21:31:18 -07:00
  • 138b92ae69 llama_guard inference fix Kate Plawiak 2024-07-22 21:26:03 -07:00
  • a14daf5829 Update license Ashwin Bharambe 2024-07-22 20:47:32 -07:00
  • dae6357e49 nit update cli message Hardik Shah 2024-07-22 20:45:49 -07:00
  • aca6bfe0df drop custom classes to manage hydra Hardik Shah 2024-07-22 20:40:50 -07:00
  • 86fff23a9e updating license for toolchain Ashwin Bharambe 2024-07-22 20:31:42 -07:00
  • 0e2fc9966a Reduce loading time for non-fp8 Ashwin Bharambe 2024-07-22 19:21:04 -07:00
  • fef679bb34 Don't load as bf16 on CPU unless fp8 is active Ashwin Bharambe 2024-07-22 19:09:32 -07:00
  • 8cd2e4164c Merge pull request #2 from meta-llama/revert-1-fix_llama_guard Kate Plawiak 2024-07-22 17:28:13 -07:00
  • 5228bdc0f3 Revert "Update llama guard file to latest version" Kate Plawiak 2024-07-22 17:27:19 -07:00
  • dfe0173b58 Merge pull request #1 from meta-llama/fix_llama_guard Kate Plawiak 2024-07-22 16:27:47 -07:00
  • 9b51b4edd8 update batch completion endpoint Ashwin Bharambe 2024-07-22 16:08:28 -07:00
  • 1e573843ce added pre-commit to toolchain Ashwin Bharambe 2024-07-22 16:04:31 -07:00
  • acb2a91872 Remove configurations Ashwin Bharambe 2024-07-22 16:03:37 -07:00
  • bbfd8a587e add EventLogger for inference Ashwin Bharambe 2024-07-22 15:11:34 -07:00
  • 7574ffb25f added __init__ Hardik Shah 2024-07-22 14:49:26 -07:00
  • 441e5da6ed no special casign for original Hardik Shah 2024-07-22 14:42:38 -07:00
  • 4d3b226275 check original folder Hardik Shah 2024-07-22 14:35:09 -07:00
  • 91b43600f7 increase max_new_tokens Kate Plawiak 2024-07-22 13:58:51 -07:00
  • cb5829901f redo and fix only specific lines Kate Plawiak 2024-07-22 13:46:43 -07:00
  • d5019cf3b3 update llama guard file to latest version Kate Plawiak 2024-07-22 13:36:11 -07:00
  • 74442e88b1 add yaml to manifest Hardik Shah 2024-07-22 13:34:08 -07:00
  • 6f0d348b1c add init for common Hardik Shah 2024-07-22 11:50:54 -07:00
  • 54a22e288a requirements Ashwin Bharambe 2024-07-22 11:39:42 -07:00
  • c38d638340 sku -> family Ashwin Bharambe 2024-07-22 11:15:04 -07:00
  • f0e0903270 add llama model subcommand Ashwin Bharambe 2024-07-22 11:07:11 -07:00
  • 4417407652 agentic_system --> llama_agentic_system Hardik Shah 2024-07-22 01:20:32 -07:00
  • 1eac470045 add __init__ Hardik Shah 2024-07-22 01:17:41 -07:00
  • 2e7978fa39 update import for quantization format from models Ashwin Bharambe 2024-07-21 23:56:04 -07:00
  • f9111652ef rename toolchain/ --> llama_toolchain/ Hardik Shah 2024-07-21 23:48:38 -07:00
  • d95f5f863d use default_config file to configure inference Hardik Shah 2024-07-21 19:26:11 -07:00
  • c64b8cba22 from models.llama3_1 --> from llama_models.llama3_1 Hardik Shah 2024-07-21 19:07:02 -07:00
  • c6ef16f6bd consol_scripts for toolchain Hardik Shah 2024-07-21 17:39:47 -07:00
  • 7c69675b79 added pypi package rsm 2024-07-21 13:43:36 -07:00
  • b0f3406a08 deleting bash script as this is not done via cli Hardik Shah 2024-07-21 12:55:49 -07:00
  • 6bcd826b32 enable import of subcommands from llama-agentic-system Hardik Shah 2024-07-21 12:54:38 -07:00
  • 67f0510edd rename ModelInference to Inference rsm 2024-07-21 12:19:52 -07:00
  • 245461620d make sure scripts always have pipefail Ashwin Bharambe 2024-07-21 12:18:49 -07:00
  • c9f33d8f68 cli updates Hardik Shah 2024-07-21 01:51:54 -07:00
  • 23fe353e4a cli -- llama inference configure Hardik Shah 2024-07-21 01:16:44 -07:00
  • 0df57c4447 fix bad merge with injection shield? Ashwin Bharambe 2024-07-20 23:54:32 -07:00
  • 2408bd81c8 easy script to create config Hardik Shah 2024-07-20 23:51:35 -07:00
  • 7c9ed3e58e update README a bit Ashwin Bharambe 2024-07-20 23:26:50 -07:00
  • d73fed5cc3 cleanup for fp8 and requirements etc Ashwin Bharambe 2024-07-20 23:21:41 -07:00
  • 2428701951 download inside model_name directory Hardik Shah 2024-07-20 23:16:10 -07:00
  • 0746a0f62b fp8 inference Ashwin Bharambe 2024-07-20 23:13:47 -07:00
  • ad62e2e1f3 make inference server load checkpoints for fp8 inference Ashwin Bharambe 2024-07-20 21:10:17 -07:00
  • 7d2c0b14b8 Changes from the main repo Ashwin Bharambe 2024-07-19 16:11:17 -07:00
  • 9c9b834c0f update prompt-shield to reflect latest changes in agentic Hardik Shah 2024-07-19 18:12:09 -07:00
  • ce0804556b update requirements for running standalone Hardik Shah 2024-07-19 18:11:25 -07:00
  • 2ed2881a21 fixed imports models.llama3. --> models.llama3_1.api. Hardik Shah 2024-07-19 17:42:14 -07:00
  • f94efcf2ee kill older junk Ashwin Bharambe 2024-07-19 12:32:22 -07:00
  • 95781ec85d Add toolchain from agentic system here Ashwin Bharambe 2024-07-19 12:30:35 -07:00
  • f6b2b2fb39 cleanup Ashwin Bharambe 2024-07-11 10:04:56 -07:00
  • 6d6c07b882 added more docs Raghotham Murthy 2024-07-11 03:12:28 -07:00
  • 8631d90f1e added more docs Raghotham Murthy 2024-07-11 03:11:45 -07:00
  • e657e71446 added more docs Raghotham Murthy 2024-07-11 03:10:30 -07:00
  • ab44e9c862 added more docs Raghotham Murthy 2024-07-11 03:09:13 -07:00
  • 62f2db8f62 saving the spec changes raghotham 2024-07-11 05:02:16 -04:00
  • 0e4b9efedf added more docs Raghotham Murthy 2024-07-11 01:54:03 -07:00
  • 9070d45629 added more docs Raghotham Murthy 2024-07-11 01:48:13 -07:00