Commit graph

  • 803bf0e029 fix: solve ruff B008 warnings (#1444) Sébastien Han 2025-03-07 01:48:35 +01:00
  • ebc8258038 Merge remote-tracking branch 'origin/main' into benchmark_eval Botao Chen 2025-03-06 15:59:03 -08:00
  • 81ff6ecff4 rag eval notebook Xi Yan 2025-03-06 15:50:43 -08:00
  • 3a454be9b2 docs: add back eval concept doc (#1456) Xi Yan 2025-03-06 15:47:20 -08:00
  • 2dc2101b73 eval concept Xi Yan 2025-03-06 15:43:24 -08:00
  • 39f0636c6c eval concept Xi Yan 2025-03-06 15:41:13 -08:00
  • 9e2aeae3ea refine Botao Chen 2025-03-06 15:25:37 -08:00
  • ca2910d27a docs: update test_agents to use new Agent SDK API (#1402) ehhuang 2025-03-06 15:21:12 -08:00
  • 198158a7fb docs: update test_agents to use new Agent SDK API Eric Huang 2025-03-06 15:12:52 -08:00
  • 3d71e5a036 test: recordable mocks use json only (#1443) ehhuang 2025-03-06 14:46:29 -08:00
  • 564977c646 docs: update eval doc (#1453) Xi Yan 2025-03-06 14:14:10 -08:00
  • db4ee7a9ff docs: improve rag doc (#1411) Reid 2025-03-07 06:03:52 +08:00
  • 1a95271fab fix: notebook vision inference (#1423) Xi Yan 2025-03-06 13:40:21 -08:00
  • d342a53ae0 use library client throughout Xi Yan 2025-03-06 12:58:30 -08:00
  • b464575a1e more fix Xi Yan 2025-03-06 12:54:03 -08:00
  • 000569b003 benchmark Xi Yan 2025-03-06 12:43:25 -08:00
  • 275fdbc23f Fixed a few issues in the docling provider. ilya-kolchinsky 2025-03-06 20:51:37 +01:00
  • 47fea967a7 update doc Xi Yan 2025-03-06 11:48:40 -08:00
  • e53bdc929a test: use json only Eric Huang 2025-03-06 11:43:26 -08:00
  • 46bc5f4a7a chore: log exception (#1452) ehhuang 2025-03-06 11:42:51 -08:00
  • 0db524cc26 add test cases for customizer Ubuntu 2025-03-06 19:34:02 +00:00
  • 4bbb4ddeae fix: resolve pydantic warning on .dict() usage (#1445) Sébastien Han 2025-03-06 20:27:47 +01:00
  • da2971005a chore: log exception Eric Huang 2025-03-06 11:25:00 -08:00
  • e8071b54dc fix: no skip_logger_removal for non-library client Ashwin Bharambe 2025-03-06 11:04:56 -08:00
  • 14c9ebbae5 docs: Add CHANGELOG.md (#1440) Yuan Tang 2025-03-06 13:57:24 -05:00
  • 8d86137ab2 docs: add information on how to set log level before running (#1430) Charlie Doern 2025-03-06 13:54:14 -05:00
  • bcb13c492f test: revamp eval related integration tests (#1433) Xi Yan 2025-03-06 10:51:35 -08:00
  • 103a3b1a4f add nvidia distribution Ubuntu 2025-03-06 18:26:53 +00:00
  • b5c6a80b2e linting fix for templates Ubuntu 2025-03-06 16:42:57 +00:00
  • a799d96a2c add latest code Ubuntu 2025-03-06 16:33:51 +00:00
  • f10a412898 Fixed multiple bugs. ilya-kolchinsky 2025-03-06 16:46:59 +01:00
  • fcb52fa3a4 fix: Import chardet and pypdf only when actually needed Ihar Hrachyshka 2025-03-06 10:25:24 -05:00
  • 5540c1a956 chore: update the config file name reidliu 2025-03-06 21:41:59 +08:00
  • 6cbc298edb Added the preprocessing chain parameter to the RAG tool insert API. ilya-kolchinsky 2025-03-06 14:22:19 +01:00
  • 4c81a72214 Added output type to PreprocessorResponse. ilya-kolchinsky 2025-03-06 14:05:05 +01:00
  • 5524210a2d docs: add information on how to set log level before running Charlie Doern 2025-03-05 16:41:01 -05:00
  • 3fc37b7be3 fix: resolve pydantic warning on .dict() usage Sébastien Han 2025-03-06 11:41:03 +01:00
  • e622799f38 fix: solve ruff B008 warnings Sébastien Han 2025-03-06 11:28:13 +01:00
  • 4599ee68cd fix: remove ruff N999 Sébastien Han 2025-03-05 09:56:50 +01:00
  • 82e94fe22f ci: add Github workflow which runs unittests in PR (#1442) Ashwin Bharambe 2025-03-05 18:23:28 -08:00
  • cde74939a9 ci: add Github workflow which runs unittests in PR Ashwin Bharambe 2025-03-05 18:17:23 -08:00
  • e6ae557661 fix: update testing documentation Ashwin Bharambe 2025-03-05 17:41:13 -08:00
  • 72dee96300 merge Xi Yan 2025-03-05 17:40:32 -08:00
  • 9066b2ac12 fix eval Xi Yan 2025-03-05 17:37:19 -08:00
  • 20fc6d4267 docs: Add CHANGELOG.md Yuan Tang 2025-03-05 20:36:52 -05:00
  • 62a844c614 fix eval Xi Yan 2025-03-05 17:36:37 -08:00
  • 2541dcc162 fix eval Xi Yan 2025-03-05 17:36:02 -08:00
  • 6e65b9282d work eval Xi Yan 2025-03-05 17:12:47 -08:00
  • 2fe976ed0a refactor(test): introduce --stack-config and simplify options (#1404) Ashwin Bharambe 2025-03-05 17:02:02 -08:00
  • fd68b0dc9a tmp eval Xi Yan 2025-03-05 16:41:37 -08:00
  • 54abeeebce default text model Xi Yan 2025-03-05 16:24:43 -08:00
  • 5d43b9157e fix scoring Xi Yan 2025-03-05 16:20:11 -08:00
  • 546a417b09 fix scoring Xi Yan 2025-03-05 16:05:39 -08:00
  • f1e4588b0a fix report to at least not barf Ashwin Bharambe 2025-03-05 15:55:25 -08:00
  • f2464050c7 add registeration test Xi Yan 2025-03-05 15:47:20 -08:00
  • a0d6b165b0 chore: remove unused build dir (#1379) Reid 2025-03-06 07:40:00 +08:00
  • 4d4be03176 fix: don't import from llama_models (#1436) Ihar Hrachyshka 2025-03-05 18:30:38 -05:00
  • 4f82d361a8 update README.md Ashwin Bharambe 2025-03-05 15:22:53 -08:00
  • c19350f4ed support multiple model ids for testing Ashwin Bharambe 2025-03-05 11:21:53 -08:00
  • 113b17679d kill safety conftest Ashwin Bharambe 2025-03-04 16:54:29 -08:00
  • 8d49a10c8e remove some code from report.py which has been disabled for now Ashwin Bharambe 2025-03-04 16:53:41 -08:00
  • cd9d278d12 refactor(test): introduce --stack-config and simplify options Ashwin Bharambe 2025-03-04 16:35:53 -08:00
  • 5b0ec561dc fix: don't import from llama_models Ihar Hrachyshka 2025-03-05 18:23:19 -05:00
  • 1806de366f feat: add list built stacks reidliu 2025-03-06 07:20:33 +08:00
  • 0385309862 update provider id Xi Yan 2025-03-05 15:18:00 -08:00
  • 2091585843 unregister fix Xi Yan 2025-03-05 15:10:29 -08:00
  • 7f34968b73 datasetio pass Xi Yan 2025-03-05 14:59:03 -08:00
  • 6cf79437b3 feat: support ClientTool output metadata (#1426) ehhuang 2025-03-05 14:30:27 -08:00
  • 28400dd205 feat: ClientTool output metadata Eric Huang 2025-03-05 14:29:32 -08:00
  • 3f105a8e38 RFC: Provider Configuration API Charlie Doern 2025-03-03 14:31:51 -05:00
  • ac717f38dc chore: Reduce flakes in test_text_inference on smaller models (#1428) Ben Browning 2025-03-05 16:05:30 -05:00
  • b8535417e0 feat: record token usage for inference API (#1300) Dinesh Yeduguru 2025-03-05 12:41:45 -08:00
  • 0992fbc59a dont emit metrics when no metrics store configured Dinesh Yeduguru 2025-03-05 12:39:34 -08:00
  • 52ef06b9a8 chore: Reduce flakes in test_text_inference on smaller models Ben Browning 2025-03-05 14:48:16 -05:00
  • 9c4074ed49 fix: Gracefully handle no choices in remote vLLM response (#1424) Ben Browning 2025-03-05 15:07:54 -05:00
  • bcc5370d2e feat: effective agent workflow notebook (#1372) Xi Yan 2025-03-05 11:53:25 -08:00
  • 1c6fbd95a5 fix: regex parser to support more answer formats (#1425) yyymeta 2025-03-05 11:52:07 -08:00
  • 67071969a2 update Xi Yan 2025-03-05 11:36:01 -08:00
  • 306a2d2bff refactor Dinesh Yeduguru 2025-03-05 11:26:29 -08:00
  • 1bf40af2ab update notebook not use together's endpoint Xi Yan 2025-03-05 11:18:37 -08:00
  • 2c30f6eb1d update notebook not use together's endpoint Xi Yan 2025-03-05 11:07:37 -08:00
  • 7dd43d2862 fix: Gracefully handle no choices in remote vLLM response Ben Browning 2025-03-05 14:02:07 -05:00
  • d2c1162021 use encode_content for completion token usage Dinesh Yeduguru 2025-03-05 11:00:00 -08:00
  • 1952ffa410 feat: record token usage for inference API Dinesh Yeduguru 2025-02-27 10:57:08 -08:00
  • 7f92f308d2 fix: regex parser to support more answer formats Yang Yang 2025-03-05 10:21:31 -08:00
  • 00570fde31 chore: Get sqlite_vec and vector_store unit tests passing (#1413) Ben Browning 2025-03-05 13:20:13 -05:00
  • 77d323c2f8 docs: fix typo (#1416) Reid 2025-03-06 02:02:32 +08:00
  • d3508c4c76 feat(1/n): scoring function registration for llm-as-judge (#1405) Xi Yan 2025-03-05 10:00:34 -08:00
  • f55b19e0d0 update registration Xi Yan 2025-03-05 09:53:40 -08:00
  • 3d9331840e docs: api documentation for agents/eval/scoring/datasets (#1400) Xi Yan 2025-03-05 09:40:24 -08:00
  • 0d18274d34 chore: update hf source for eval notebook (#1403) Xi Yan 2025-03-05 09:38:30 -08:00
  • 24a27baf7c chore: Make README code blocks more easily copy pastable (#1420) Ellis Tarn 2025-03-05 09:11:01 -08:00
  • 62e0745772 chore: Make README code blocks more easily copy pastable Ellis Tarn 2025-03-05 09:01:59 -08:00
  • b981181b25 Added a draft implementation of the preprocessor chain. ilya-kolchinsky 2025-03-05 17:17:17 +01:00
  • 7a3fc77737 doc: fix typo reidliu 2025-03-05 23:09:16 +08:00
  • 2762404910 fix: Fix unit tests of datasetio providers Josh Salomon 2025-03-05 15:34:17 +02:00
  • cdac705165 chore: Get sqlite_vec and vector_store unit tests passing Ben Browning 2025-03-05 09:04:45 -05:00
  • b5321924a5 docs: improve rag example code in doc reidliu 2025-03-05 21:41:59 +08:00
  • b52a265a51 Merge branch 'meta-llama:main' into main Francisco Arceo 2025-03-05 08:23:47 -05:00
  • 16764a2f06 Initial implementation of RAG operator using the preprocessing endpoint. ilya-kolchinsky 2025-03-05 13:43:26 +01:00