Commit graph

  • 05e93bd2f7 together default Xi Yan 2024-11-18 22:39:45 -08:00
  • 7693786322 Use HF names for registering fireworks and together models Ashwin Bharambe 2024-11-18 22:34:26 -08:00
  • 6765fd76ff fix llama stack build for together & llama stack build from templates (#479) Xi Yan 2024-11-18 22:29:16 -08:00
  • f967a740be new build msg from template Xi Yan 2024-11-18 22:22:58 -08:00
  • ea52a3ee1c minor enhancement for test fixtures Ashwin Bharambe 2024-11-18 22:20:59 -08:00
  • d25cae3995 fix together build Xi Yan 2024-11-18 22:06:10 -08:00
  • e6ae344eb4 fix together build Xi Yan 2024-11-18 21:42:18 -08:00
  • a042f41d03 fix core model ids for ollama Dinesh Yeduguru 2024-11-18 20:55:54 -08:00
  • fcc2132e6f remove pydantic namespace warnings using model_config (#470) Matthew Farrellee 2024-11-18 22:24:14 -05:00
  • 2108a779f2 Update kotlin client docs (#476) Riandy 2024-11-18 19:13:20 -08:00
  • d2b7c5aeae add quantized model ollama support (#471) Kai Wu 2024-11-18 18:55:23 -08:00
  • 14c75c3f21 Update CONTRIBUTING to include info about pre-commit Ashwin Bharambe 2024-11-18 18:17:41 -08:00
  • fe19076838 get stack run config based on template name (#477) Dinesh Yeduguru 2024-11-18 18:05:05 -08:00
  • edd946a207 remove async Dinesh Yeduguru 2024-11-18 18:02:32 -08:00
  • 50d539e6d7 update tests --inference-model to hf id Xi Yan 2024-11-18 17:36:58 -08:00
  • 939056e265 More documentation fixes Ashwin Bharambe 2024-11-18 17:06:13 -08:00
  • e40404625b Update to docs Ashwin Bharambe 2024-11-18 16:52:48 -08:00
  • 41069b4b05 fix on rebase Dinesh Yeduguru 2024-11-18 16:22:33 -08:00
  • 511346cd41 get stack run config based on template name Dinesh Yeduguru 2024-11-18 16:07:11 -08:00
  • 91f3009c67 No more built_at Ashwin Bharambe 2024-11-18 16:38:51 -08:00
  • afa4f0b19f Update remote vllm docs Ashwin Bharambe 2024-11-18 16:34:33 -08:00
  • fb15ff4a97 Move to use argparse, fix issues with multiple --env cmdline options Ashwin Bharambe 2024-11-18 16:31:59 -08:00
  • b87f3ac499 Allow server to accept --env key pairs Ashwin Bharambe 2024-11-18 16:17:59 -08:00
  • 1fb61137ad Add conda_env Ashwin Bharambe 2024-11-18 16:08:03 -08:00
  • b822149098 Update start conda Ashwin Bharambe 2024-11-18 16:07:27 -08:00
  • 47c37fd831 Fixes Ashwin Bharambe 2024-11-18 16:03:20 -08:00
  • 3aedde2ab4 Add a pre-commit for distro_codegen but it does not work yet Ashwin Bharambe 2024-11-18 15:20:49 -08:00
  • 9d794d8ee7 Update index.md Riandy 2024-11-18 15:20:55 -08:00
  • 2b93d3d224 Update README.md Riandy 2024-11-18 15:20:01 -08:00
  • 8ad282a46d Update index.md Riandy 2024-11-18 15:06:12 -08:00
  • 57a9b4d57f Allow models to be registered as long as llama model is provided (#472) Dinesh Yeduguru 2024-11-18 15:05:29 -08:00
  • 19bf8cc1f9 Update README.md Riandy 2024-11-18 15:05:22 -08:00
  • 8595b2af85 take hugging face repo Dinesh Yeduguru 2024-11-18 14:57:56 -08:00
  • 2a31163178 Auto-generate distro yamls + docs (#468) Ashwin Bharambe 2024-11-18 14:57:06 -08:00
  • 5dce17668c Move run-*.yaml to templates/ so they can be packaged Ashwin Bharambe 2024-11-18 14:54:20 -08:00
  • dd732f037f Docs for meta-reference-gpu Ashwin Bharambe 2024-11-18 13:58:12 -08:00
  • 38563d7c00 accept huggingface repo IDs as shield ids for llama guard Ashwin Bharambe 2024-11-18 13:42:13 -08:00
  • c62187c4a2 Adding memory provider test fakes Vladimir Ivic 2024-11-18 13:37:01 -08:00
  • eaec1b132a formated Kai Wu 2024-11-18 13:24:34 -08:00
  • a562668dcd Update Fireworks + Togther documentation Ashwin Bharambe 2024-11-18 12:52:23 -08:00
  • b2630901c3 Fix incorrect ollama port in ollama run.yaml template Vladimir Ivic 2024-11-18 12:39:02 -08:00
  • acf9af841b more validation Dinesh Yeduguru 2024-11-18 12:11:53 -08:00
  • ccb5445d2a Allow models to be registered as long as llama model is provided Dinesh Yeduguru 2024-11-18 11:58:32 -08:00
  • 1ecaf2cb3c Add ollama/pull-models.sh Ashwin Bharambe 2024-11-18 10:57:20 -08:00
  • 0784284ab5 [Agentic Eval] add ability to run agents generation (#469) Xi Yan 2024-11-18 11:43:03 -08:00
  • ba5d755848 rename Xi Yan 2024-11-18 11:42:02 -08:00
  • 2edfda97e9 add quantized model ollama support Kai Wu 2024-11-18 10:22:50 -08:00
  • 0649cf0af7 remove pydantic namespace warnings using model_config Matthew Farrellee 2024-11-18 09:51:37 -05:00
  • 0d8de1c768 condition Xi Yan 2024-11-17 21:16:10 -08:00
  • 4cb2243b10 add benchmark scoring fn, agent->agents Xi Yan 2024-11-17 21:11:41 -08:00
  • 436afa68be agent w/ search Xi Yan 2024-11-17 20:42:23 -08:00
  • fa1d29cfdc kill built_at field in run config Ashwin Bharambe 2024-11-17 20:42:11 -08:00
  • 8fbbea8c43 refactor Xi Yan 2024-11-17 20:18:26 -08:00
  • b1d119466e Allow setting environment variables from llama stack run and fix ollama Ashwin Bharambe 2024-11-17 19:33:48 -08:00
  • a061f3f8c1 Convert ollama to the new model Ashwin Bharambe 2024-11-17 15:19:55 -08:00
  • 028530546f Convert TGI Ashwin Bharambe 2024-11-17 14:49:41 -08:00
  • 9bb07ce298 Run the script to produce vllm outputs Ashwin Bharambe 2024-11-17 14:09:36 -08:00
  • 0218e68849 Write a script to perform the codegen Ashwin Bharambe 2024-11-17 14:01:04 -08:00
  • f38e76ee98 Adding docker-compose.yaml, starting to simplify Ashwin Bharambe 2024-11-16 10:56:38 -08:00
  • 43262df033 Merge branch 'main' into add-nvidia-inference-adapter Matthew Farrellee 2024-11-15 14:09:12 -05:00
  • f1b9578f8d Extend shorthand support for the llama stack run command (#465) Vladimir Ivić 2024-11-15 23:16:42 -08:00
  • a171832152 Shorthand version of the stack run command Vladimir Ivic 2024-11-15 18:36:17 -08:00
  • 57bafd0f8c fix faiss serialize and serialize of index (#464) Dinesh Yeduguru 2024-11-15 18:02:48 -08:00
  • fe40cdf28e fix faiss serialize and serialize of index Dinesh Yeduguru 2024-11-15 16:36:34 -08:00
  • ff99025875 await initialize in faiss (#463) Dinesh Yeduguru 2024-11-15 14:21:31 -08:00
  • e4509cb568 more progress on auto-generation Ashwin Bharambe 2024-11-15 09:35:38 -08:00
  • cfa913fdd5 Start auto-generating { build, run, doc.md } for distributions Ashwin Bharambe 2024-11-14 17:44:45 -08:00
  • 4db7c4d909 await initialize in faiss Dinesh Yeduguru 2024-11-15 12:32:40 -08:00
  • 20bf2f50c2 No more model_id warnings Ashwin Bharambe 2024-11-15 12:20:18 -08:00
  • dbe665ed19 enable streaming support, use openai-python instead of httpx Matthew Farrellee 2024-11-04 10:22:29 -05:00
  • cf3f0b0a33 Add Ollama GPU run file Martin Hickey 2024-11-15 15:50:27 +00:00
  • 73e33fb747 Fix conda env names in distribution example run template Martin Hickey 2024-11-15 15:32:52 +00:00
  • e8112b31ab move hf addapter->remote (#459) Xi Yan 2024-11-14 22:41:19 -05:00
  • 3ff80929c1 move hf addapter->remote Xi Yan 2024-11-14 22:36:46 -05:00
  • 788411b680 categorical score for llm as judge Xi Yan 2024-11-14 22:33:20 -05:00
  • 0850ad656a unregister for memory banks and remove update API (#458) Dinesh Yeduguru 2024-11-14 17:12:11 -08:00
  • aa93eeb2b7 nuke updates Dinesh Yeduguru 2024-11-14 17:05:09 -08:00
  • 690e525a36 add openapi spec Dinesh Yeduguru 2024-11-14 16:18:12 -08:00
  • 428995286d model update and delete for provider Dinesh Yeduguru 2024-11-14 16:16:44 -08:00
  • e8b699797c add support for provider update and unregister for memory banks Dinesh Yeduguru 2024-11-14 16:08:24 -08:00
  • 2eab3b7ed9 skip aggregation for llm_as_judge Xi Yan 2024-11-14 17:50:46 -05:00
  • 9b75e92852 add update and delete for memory banks Dinesh Yeduguru 2024-11-14 14:44:37 -08:00
  • bba6edd06b Fix OpenAPI generation to have text/event-stream for streamable methods Ashwin Bharambe 2024-11-14 12:51:38 -08:00
  • acbecbf8b3 Add a verify-download command to llama CLI (#457) Ashwin Bharambe 2024-11-14 11:47:51 -08:00
  • 0ad49a7af1 small fixes Ashwin Bharambe 2024-11-14 11:47:23 -08:00
  • ba8d2369b8 Add a verify-download command to llama CLI Ashwin Bharambe 2024-11-14 09:57:21 -08:00
  • 0713607b68 Support parallel downloads for llama model download (#448) Ashwin Bharambe 2024-11-14 09:56:22 -08:00
  • 0c750102c6 Fix build configure deprecation message (#456) Martin Hickey 2024-11-14 17:56:03 +00:00
  • 232d3ee33a Fix build configure deprecation message Martin Hickey 2024-11-14 17:20:08 +00:00
  • 58381dbe78 local persistence for eval tasks (#453) Xi Yan 2024-11-14 10:36:23 -05:00
  • 46f0b6606a init registry once (#450) Dinesh Yeduguru 2024-11-13 22:20:57 -08:00
  • efe791bab7 Support model resource updates and deletes (#452) Dinesh Yeduguru 2024-11-13 21:55:41 -08:00
  • 05535698e2 fix tests Dinesh Yeduguru 2024-11-13 21:54:43 -08:00
  • 43af05d851 add tests Dinesh Yeduguru 2024-11-13 21:36:24 -08:00
  • 89342d352c remove comment Dinesh Yeduguru 2024-11-13 21:09:18 -08:00
  • 0ba11b82be make update a POST Dinesh Yeduguru 2024-11-13 21:06:16 -08:00
  • 9e68ed3f36 registery to handle updates and deletes Dinesh Yeduguru 2024-11-13 20:50:26 -08:00
  • 4b1b196251 add model update and delete Dinesh Yeduguru 2024-11-13 15:30:17 -08:00
  • 5e6f14f043 merge Xi Yan 2024-11-14 00:12:03 -05:00
  • 4253cfcd7f local persistent for hf dataset provider (#451) Xi Yan 2024-11-14 00:08:37 -05:00