Commit graph

  • e33f402046 Merge remote-tracking branch 'origin/main' into distros Ashwin Bharambe 2024-08-07 21:55:29 -07:00
  • f27d629fe8 Reduce a bunch of dependencies from toolchain Ashwin Bharambe 2024-08-07 18:02:35 -07:00
  • da4645a27a hide non-featured (older) models from model list command without show-all flag (#23) Dalton Flanagan 2024-08-07 23:31:30 -04:00
  • f5629cc131 hide non-featured (older) models from model list command without show-all flag dltn 2024-08-07 19:52:22 -07:00
  • 171a178783 get ollama working Hardik Shah 2024-08-07 17:52:49 -07:00
  • ea50086190 Simpler intro statements Ashwin Bharambe 2024-08-07 16:29:22 -07:00
  • 9ec46d718d Show message about checksum file so users can check themselves Ashwin Bharambe 2024-08-07 16:17:51 -07:00
  • 57402c1a19 Add llama model download alias for llama download Ashwin Bharambe 2024-08-07 16:10:26 -07:00
  • fddaf5c929 Refactor download functionality out of the Command so can be reused Ashwin Bharambe 2024-08-07 15:27:00 -07:00
  • 68654460f8 Update README, add newline between API surface configurations Ashwin Bharambe 2024-08-07 15:14:59 -07:00
  • 66412b932b Nuke fp8_requirements, fold fbgemm into common requirements Ashwin Bharambe 2024-08-07 13:58:13 -07:00
  • cc697c59e5 Update CLI_reference Ashwin Bharambe 2024-08-06 22:18:02 -07:00
  • e1a7aa4773 Make install + start scripts do proper configuration automatically Ashwin Bharambe 2024-08-06 21:34:09 -07:00
  • 9e1ca4eeb1 add DistributionConfig, fix a bug in model download Ashwin Bharambe 2024-08-06 19:24:52 -07:00
  • ade574a0ef minor fixes Hardik Shah 2024-08-06 18:56:22 -07:00
  • f83b97992c Update cli_reference.md Hardik Shah 2024-08-06 18:55:31 -07:00
  • 2a9bdb208b update safety to use model sku ids and not model dirs Hardik Shah 2024-08-06 17:10:01 -07:00
  • a0e61a3c7a Fix passthrough streaming, send headers properly not part of body :facepalm Ashwin Bharambe 2024-08-06 16:39:38 -07:00
  • 039861f1c7 update inference config to take model and not model_dir Hardik Shah 2024-08-06 15:02:41 -07:00
  • 08c3802f45 dict key instead of attr Ashwin Bharambe 2024-08-06 14:30:57 -07:00
  • 7cc0445517 Rename Distribution -> DistributionSpec, simplify RemoteProviders Ashwin Bharambe 2024-08-06 10:45:06 -07:00
  • 0a67f3d3e6 installation fixes Hardik Shah 2024-08-05 18:04:36 -07:00
  • 0de5a807c7 Make each inference provider into its own subdirectory Ashwin Bharambe 2024-08-05 15:13:52 -07:00
  • f64668319c Merge remote-tracking branch 'origin/main' into distros Ashwin Bharambe 2024-08-05 14:31:06 -07:00
  • 65a9e40174 Adapter -> Provider Ashwin Bharambe 2024-08-05 13:26:29 -07:00
  • db3e6dda07 refactor a method out Ashwin Bharambe 2024-08-05 13:14:15 -07:00
  • 125fdb1b2a ApiSurface -> Api Ashwin Bharambe 2024-08-05 12:44:56 -07:00
  • 7664d5701d update tests and formatting Hardik Shah 2024-08-05 12:34:16 -07:00
  • 7890921e5c move straggler files and fix some important existing bugs Ashwin Bharambe 2024-08-05 09:24:45 -07:00
  • 5e972ece13 refactor to reduce size of agentic_system Ashwin Bharambe 2024-08-04 18:17:56 -07:00
  • be19b22391 Bring agentic system api to toolchain Ashwin Bharambe 2024-08-04 10:53:38 -07:00
  • b0e5340645 fixes Ashwin Bharambe 2024-08-03 22:16:24 -07:00
  • 750202ddd5 Add a Path() wrapper at the earliest place Ashwin Bharambe 2024-08-03 21:25:48 -07:00
  • 803976df26 cleanup, moving stuff to common, nuke utils Ashwin Bharambe 2024-08-03 20:32:57 -07:00
  • fe582a739d add safety adapters, configuration handling, server + clients Ashwin Bharambe 2024-08-03 19:46:59 -07:00
  • 9dafa6ad94 implement full-passthrough in the server Ashwin Bharambe 2024-08-03 14:15:20 -07:00
  • 38fd76f85c undo a typo, add a passthrough distribution Ashwin Bharambe 2024-08-02 20:48:53 -07:00
  • 67229f23a4 local imports for faster cli Hardik Shah 2024-08-02 16:34:29 -07:00
  • af4710c959 Improved exception handling Ashwin Bharambe 2024-08-02 14:54:06 -07:00
  • 493f0d99b2 updated dependency and client model name Hardik Shah 2024-08-02 15:37:40 -07:00
  • d7a4cdd70d added options to ollama inference Hardik Shah 2024-08-02 14:44:22 -07:00
  • d3e269fcf2 Remove inference uvicorn server entrypoint and llama inference CLI command Ashwin Bharambe 2024-08-02 14:18:25 -07:00
  • 3bc827cd5f read existing configuration, save enums properly Ashwin Bharambe 2024-08-02 13:55:29 -07:00
  • 2cf9915806 Distribution server now functioning Ashwin Bharambe 2024-08-02 13:37:40 -07:00
  • 041cafbee3 getting closer to a distro definition, distro install + configure works Ashwin Bharambe 2024-08-01 22:59:11 -07:00
  • dac2b5a1ed More progress towards llama distribution install Ashwin Bharambe 2024-08-01 16:40:43 -07:00
  • 5a583cf16e Add distribution CLI scaffolding Ashwin Bharambe 2024-08-01 14:44:57 -07:00
  • 09cf3fe78b Use new definitions of Model / SKU Ashwin Bharambe 2024-07-31 11:36:16 -07:00
  • 156bfa0e15 Added Ollama as an inference impl (#20) Hardik Shah 2024-07-31 22:08:37 -07:00
  • fd8adc1e50 addressing comments Hardik Shah 2024-07-31 22:07:45 -07:00
  • c253c1c9ad Begin adding a /safety/run_shield API Ashwin Bharambe 2024-07-31 21:57:10 -07:00
  • 0e985648f5 add streaming support for ollama inference with tests Hardik Shah 2024-07-31 19:33:36 -07:00
  • 0e75e73fa7 Added non-streaming ollama inference impl Hardik Shah 2024-07-30 18:11:44 -07:00
  • 1bc81eae7b update toolchain to work with updated imports from llama_models Ashwin Bharambe 2024-07-30 17:52:57 -07:00
  • 5b9c05c5dd unit test for inline inference Hardik Shah 2024-07-30 16:23:47 -07:00
  • cc98fbb058 fix non-streaming api in inference server Hardik Shah 2024-07-30 14:25:50 -07:00
  • 23014ea4d1 Add hacks because Cloudfront config limits on the 405b model files Ashwin Bharambe 2024-07-30 13:46:20 -07:00
  • 404af06e02 Bump version to 0.0.2 Ashwin Bharambe 2024-07-29 23:56:41 -07:00
  • 7306e6b167 show sampling params in model describe Ashwin Bharambe 2024-07-29 23:44:07 -07:00
  • 040c30ee54 added resumable downloader for downloading models Ashwin Bharambe 2024-07-29 07:41:07 -07:00
  • 59574924de model template --template -> model template --name Ashwin Bharambe 2024-07-29 18:21:05 -07:00
  • 45b8a7ffcd Add model describe subcommand Ashwin Bharambe 2024-07-29 18:19:53 -07:00
  • 9d7f283722 Add model list subcommand Ashwin Bharambe 2024-07-29 16:39:53 -07:00
  • a789c47ec9 Update cli_reference.md Dalton Flanagan 2024-07-29 16:31:56 -04:00
  • dd6c1f1e64 Add links to shields Dalton Flanagan 2024-07-27 11:28:46 -04:00
  • b5d7cec11e Add shields to README Dalton Flanagan 2024-07-27 11:02:50 -04:00
  • 3583cf2d51 update model template output to be prettier, more consumable Ashwin Bharambe 2024-07-26 15:39:46 -07:00
  • 51f8049c7a Update fp8_requirements, we don't need nightly torch anymore Ashwin Bharambe 2024-07-26 08:25:44 -07:00
  • ec433448f2 Add CLI reference docs (#14) Dalton Flanagan 2024-07-25 16:56:29 -04:00
  • b1f02cc654 add helptext for download dltn 2024-07-25 13:50:29 -07:00
  • 86924fd7b1 touchups dltn 2024-07-25 12:43:44 -07:00
  • 142b36c7c5 Add CLI reference doc dltn 2024-07-25 12:37:05 -07:00
  • ad6c889cca Update README.md Yuan-Man 2024-07-25 15:38:40 +08:00
  • b8aa99b034 Update fbgemm version (#12) Jianyu Huang 2024-07-24 23:48:44 -07:00
  • 1b8bc38d04 Update fbgemm version Jianyu Huang 2024-07-24 23:45:19 -07:00
  • 378a2077dd Update download command (#9) Lucain 2024-07-25 01:50:40 +02:00
  • fe7477f55f Rename fp8_requirements.txt to ZXV-ONLINE-MARKET-PLATFORM.com Jahin9999 2024-07-24 22:13:25 +04:00
  • c5843cd2f6 Update download command Lucain 2024-07-24 10:13:16 +02:00
  • 17bd1d876c Canonical package name for the dependency Ashwin Bharambe 2024-07-23 13:30:33 -07:00
  • f7e053e3ba Updates to setup and requirements for PyPI Ashwin Bharambe 2024-07-23 13:25:40 -07:00
  • d802d0f051 add requirements to MANIFEST.in Ashwin Bharambe 2024-07-23 12:59:28 -07:00
  • 5d5acc8ed5 Initial commit Ashwin Bharambe 2024-06-25 15:47:57 -07:00
  • 05f47d848b RFC-0001-llama-stack rsm 2024-07-23 07:53:40 -07:00
  • 9fb50bbd99 Initial commit Hardik Shah 2024-06-25 15:47:57 -07:00
  • 8030fbd82e Create CONTRIBUTING.md Joseph Spisak 2024-07-23 06:08:26 -07:00
  • 81d50b9d3d Create CODE_OF_CONDUCT.md Joseph Spisak 2024-07-23 06:07:25 -07:00
  • f89b4b451d Initial commit - yes! Hardik Shah 2024-06-25 15:47:57 -07:00
  • ab829b0557 revert excluded cat defaults Kate Plawiak 2024-07-22 22:09:44 -07:00
  • ab8a220faa add missing license part Kate Plawiak 2024-07-22 22:03:05 -07:00
  • 16fe0e4594 clean up and add license Kate Plawiak 2024-07-22 21:59:57 -07:00
  • 7a8b5c1604 Merge branch 'main' into fix_llama_guard_inference Kate Plawiak 2024-07-22 21:31:18 -07:00
  • 138b92ae69 llama_guard inference fix Kate Plawiak 2024-07-22 21:26:03 -07:00
  • a14daf5829 Update license Ashwin Bharambe 2024-07-22 20:47:32 -07:00
  • dae6357e49 nit update cli message Hardik Shah 2024-07-22 20:45:49 -07:00
  • aca6bfe0df drop custom classes to manage hydra Hardik Shah 2024-07-22 20:40:50 -07:00
  • 86fff23a9e updating license for toolchain Ashwin Bharambe 2024-07-22 20:31:42 -07:00
  • 0e2fc9966a Reduce loading time for non-fp8 Ashwin Bharambe 2024-07-22 19:21:04 -07:00
  • fef679bb34 Don't load as bf16 on CPU unless fp8 is active Ashwin Bharambe 2024-07-22 19:09:32 -07:00
  • 8cd2e4164c Merge pull request #2 from meta-llama/revert-1-fix_llama_guard Kate Plawiak 2024-07-22 17:28:13 -07:00
  • 5228bdc0f3 Revert "Update llama guard file to latest version" Kate Plawiak 2024-07-22 17:27:19 -07:00