Commit graph

  • e9d9f01b8b docs: Add OpenAI API compatibility page (#2316) Ben Browning 2025-06-04 06:51:52 -04:00
  • db2fb7e3c4 feat: Add experimental integration tests with cached providers Derek Higgins 2025-05-14 14:41:25 +01:00
  • c4f644a1ea Remove remote::openai from openai_completion support Derek Higgins 2025-06-04 09:08:30 +01:00
  • 8afad3f63a refactor: unify stream and non-stream impls for responses Ashwin Bharambe 2025-06-03 16:51:06 -07:00
  • c4c67ac775 chore: cleanups from review feedback on openai api docs Ben Browning 2025-06-03 20:42:56 -04:00
  • ed69c1b3cc feat(responses): add more streaming response types (#2375) Ashwin Bharambe 2025-06-03 15:48:41 -07:00
  • 252249cf91 feat(responses): add more streaming response types Ashwin Bharambe 2025-06-03 15:41:07 -07:00
  • d96f6ec763 chore(ui): use proxy server for backend API calls; simplified k8s deployment (#2350) ehhuang 2025-06-03 14:57:10 -07:00
  • 7c1998db25 feat: fine grained access control policy (#2264) grs 2025-06-03 17:51:12 -04:00
  • 8bee2954be feat: Structured output for Responses API (#2324) Ben Browning 2025-06-03 17:43:00 -04:00
  • c70ca8344f fix: resolve template name to config path in llama stack run (#2361) Ignas Baranauskas 2025-06-03 22:39:12 +01:00
  • 7cd2a1c031 k8s Eric Huang 2025-06-03 12:09:01 -07:00
  • cba55808ab feat(distro): add more providers to starter distro, prefix conflicting models (#2362) Ashwin Bharambe 2025-06-03 12:10:46 -07:00
  • e45e4f947a bugfix Ashwin Bharambe 2025-06-03 12:00:51 -07:00
  • 471d40e80c kill verif template Ashwin Bharambe 2025-06-03 11:56:07 -07:00
  • 96cd51a0c8 Changes to access rule conditions: Gordon Sim 2025-05-29 20:21:20 +01:00
  • 528a391c5f feat(distro): add more providers to starter distro, prefix conflicting models Ashwin Bharambe 2025-06-03 11:43:19 -07:00
  • a8ba160852 fix: resolve template name to config path in llama stack run Ignas Baranauskas 2025-06-03 19:18:38 +01:00
  • b380cb463f feat: add postgres deps to starter distro (#2360) Ashwin Bharambe 2025-06-03 11:04:23 -07:00
  • 032f92b3e1 feat: add postgres deps to starter distro Ashwin Bharambe 2025-06-03 10:57:08 -07:00
  • e743257d1d docs: Add missing dependencies in quickstart demo command (#2347) Jorge 2025-06-03 18:01:36 +02:00
  • 7e30b5a466 fix: remove sentence-transformers from remote vllm Sébastien Han 2025-06-03 18:00:27 +02:00
  • 643c0bb747 fix: Add fictional liquids to get_boiling_point Derek Higgins 2025-05-14 12:14:37 +01:00
  • 73ca0fb37a chore: remove torch dep from sentence-transformers Sébastien Han 2025-06-03 15:13:11 +02:00
  • 59830e5a22 chore: return NotImplementedError instead of ValueError Sébastien Han 2025-06-03 15:11:54 +02:00
  • 9e757c433a Add missing dependencies in quickstart command Jorge Garcia Oncins 2025-06-03 14:28:08 +02:00
  • 3c9a10d2fe feat: reference implementation for files API (#2330) ehhuang 2025-06-02 21:54:24 -07:00
  • ba25c5e7e1 docs(k8s): add UI template (#2343) Ashwin Bharambe 2025-06-02 17:55:18 -07:00
  • 0ea429c163 fix Ashwin Bharambe 2025-06-02 17:53:57 -07:00
  • e92f571f47 fix: ollama chat completion needs unique ids (#2344) Ben Browning 2025-06-02 20:43:20 -04:00
  • badf8594d1 feat: Structured output for Responses API Ben Browning 2025-05-31 13:44:20 -04:00
  • c754d9af7a fixes Ashwin Bharambe 2025-06-02 16:31:24 -07:00
  • 48fdbf7188 fix: ollama chat completion needs unique ids Ben Browning 2025-06-02 18:59:30 -04:00
  • 375546ade3 fix Ashwin Bharambe 2025-06-02 16:07:13 -07:00
  • 4540c9b3e5 chore: revert llama-stack-client dep (#2342) ehhuang 2025-06-02 16:05:21 -07:00
  • fd54727aef docs(k8s): add UI template Ashwin Bharambe 2025-06-02 16:04:06 -07:00
  • e7ab5a3649 chore: revert llama-stack-client dep Eric Huang 2025-06-02 16:03:46 -07:00
  • dbe4e84aca feat(responses): implement full multi-turn support (#2295) Ashwin Bharambe 2025-06-02 15:35:49 -07:00
  • 8779f32e59 update api Ashwin Bharambe 2025-06-02 15:20:13 -07:00
  • dd9e0ec23b multi turn Eric Huang 2025-06-02 15:19:28 -07:00
  • 9011a156a7 files impl Eric Huang 2025-06-02 15:18:09 -07:00
  • cac7d404a2 fix: remove openai dep (#2337) ehhuang 2025-06-02 15:15:12 -07:00
  • 2d40ce2271 many fixes Ashwin Bharambe 2025-06-02 15:03:52 -07:00
  • 17e9b14ccf fix: remove openai dep Eric Huang 2025-06-02 14:38:26 -07:00
  • 021976713b fix is_function_tool_call Ashwin Bharambe 2025-06-02 14:36:37 -07:00
  • 4a7bdf1b87 Merge 71caa271ad into 76dcf47320 Charlie Doern 2025-06-02 17:32:30 -04:00
  • fd15a6832c feat(responses): implement full multi-turn support Ashwin Bharambe 2025-05-27 14:32:21 -07:00
  • 76dcf47320 docs(mcp): add a few lines for how to specify Auth headers in MCP tools (#2336) Ashwin Bharambe 2025-06-02 14:28:38 -07:00
  • 8dcdce317d docs(mcp): add a few lines for how to specify Auth headers in MCP tools Ashwin Bharambe 2025-06-02 13:58:58 -07:00
  • 6bb174bb05 revert: "chore: Remove zero-width space characters from OTEL service" (#2331) Sébastien Han 2025-06-02 23:21:35 +02:00
  • 3511af7c33 fix: fireworks provider for openai compat inference endpoint (#2335) Hardik Shah 2025-06-02 14:11:15 -07:00
  • 3f43ad7c9e fix fireworks open ai compat endpoint Hardik Shah 2025-06-02 13:57:05 -07:00
  • 7fb4bdabea docs(kubernetes): add more fleshed-out example of a Demo Kubernetes cluster (#2329) Ashwin Bharambe 2025-06-02 13:07:08 -07:00
  • f427e3092f add ingres Ashwin Bharambe 2025-06-02 13:06:44 -07:00
  • 31a3ae60f4 feat: openai files api (#2321) ehhuang 2025-06-02 11:45:53 -07:00
  • 44401f0a88 fix ruff pre-commit Sumit Jaiswal 2025-06-02 23:49:55 +05:30
  • 3840ef7a98 update the code with aysnc iterator as suggested by Ben Sumit Jaiswal 2025-06-02 23:49:08 +05:30
  • c69e52c262 feat: openai files api, api, response to string Eric Huang 2025-06-02 11:04:40 -07:00
  • 697338ec57 kill gp2 Ashwin Bharambe 2025-06-02 09:41:24 -07:00
  • 17f4414be9 fix: remote-vllm event loop blocking unit test on Mac (#2332) Ben Browning 2025-06-02 11:24:12 -04:00
  • 1c0c6e1e17 chore: remove usage of load_tiktoken_bpe (#2276) Sébastien Han 2025-06-02 16:33:37 +02:00
  • af65207ebd chore: help setuptools finding the project path (#2333) Sébastien Han 2025-06-02 16:20:46 +02:00
  • 6e9e870cca Ensure use of string representation of event.name Michael Anstis 2025-06-02 14:58:36 +01:00
  • 1dcffac3fd chore: help setuptools finding the project path Sébastien Han 2025-06-02 15:26:45 +02:00
  • f586bdd912 fix: remote-vllm event loop blocking unit test on Mac Ben Browning 2025-06-02 08:33:23 -04:00
  • 88edf74b6f fix: use unicode escape sequence for zero-width Sébastien Han 2025-06-02 10:12:49 +02:00
  • 365b896b38 revert: "chore: Remove zero-width space characters from OTEL service name env var defaults" Sébastien Han 2025-06-02 10:05:42 +02:00
  • c7be73fb16 refactor: remove container from list of run image types (#2178) Mark Campbell 2025-06-02 08:57:55 +01:00
  • b413c7562b fix review cosmetic comment Sumit Jaiswal 2025-06-02 12:45:17 +05:30
  • afa9db5a6b fix pre-commit issues Sumit Jaiswal 2025-06-01 16:00:18 +05:30
  • ae85dd6182 fix unit tc failure due to updated logic Sumit Jaiswal 2025-05-31 08:05:04 +05:30
  • 9c42598aee fix review around /models api call Sumit Jaiswal 2025-05-30 16:14:31 +05:30
  • 6a96b6c264 update the API Sumit Jaiswal 2025-05-30 00:23:56 +05:30
  • 6d1cf140ba to add health status check for remote vllm Sumit Jaiswal 2025-05-29 02:10:13 +05:30
  • 6cbb3366f2 more fixes, gah Ashwin Bharambe 2025-06-01 17:07:18 -07:00
  • 6f4f51f8d9 apply anti affinity and separate PVCs for the models so the two vllms can be mapped to two nodes and avoid causing unnecessary memory pressure Ashwin Bharambe 2025-06-01 16:54:36 -07:00
  • 4121166784 split off safety so it can be applied one at a time Ashwin Bharambe 2025-06-01 15:59:00 -07:00
  • d93f6c9e5b play around with util Ashwin Bharambe 2025-06-01 15:34:43 -07:00
  • a36b0c5fe3 docs(kubernetes): add a more fleshed out example of a Demo Kubernetes cluster Ashwin Bharambe 2025-06-01 14:25:54 -07:00
  • 319300fe24 updates to fix pre-commit checks Sumit Jaiswal 2025-06-01 17:51:02 +05:30
  • 6ec2ed4196 feat: New OpenAI compat embeddings API (#2314) Hardik Shah 2025-05-31 22:11:47 -07:00
  • 455939e63c fix: Responses streaming tools don't concatenate None and str (#2326) Ben Browning 2025-05-31 21:24:04 -04:00
  • 2818e444f2 feat: Enable ingestion of precomputed embeddings (#2317) Francisco Arceo 2025-05-31 04:03:37 -06:00
  • dfdf854865 fix: Fix requirements from broken github-actions[bot] (#2323) Francisco Arceo 2025-05-30 20:05:47 -06:00
  • 769d8f5428 build: Bump version to 0.2.9 github-actions[bot] 2025-05-30 19:43:09 +00:00
  • fbc8fc6eb5 feat: support postgresql inference store (#2310) ehhuang 2025-05-29 14:33:09 -07:00
  • b21050935e feat: New OpenAI compat embeddings API (#2314) Hardik Shah 2025-05-31 22:11:47 -07:00
  • 277f8690ef fix: Responses streaming tools don't concatenate None and str (#2326) Ben Browning 2025-05-31 21:24:04 -04:00
  • 0529cfe6c3 Release candidate 0.0.0.dev20250601003125 rc-0.0.0.dev20250601003125 github-actions[bot] 2025-06-01 00:32:09 +00:00
  • a6d8e6831b fix: Responses streaming tools don't concatenate None and str Ben Browning 2025-05-31 15:40:01 -04:00
  • f328436831 feat: Enable ingestion of precomputed embeddings (#2317) Francisco Arceo 2025-05-31 04:03:37 -06:00
  • be4d924ffe skip ollama openai_embeddings test Hardik Shah 2025-05-30 21:30:03 -07:00
  • a3d83ea653 Merge branch 'main' into precomputed-embeddings Francisco Arceo 2025-05-30 20:05:59 -06:00
  • 31ce208bda fix: Fix requirements from broken github-actions[bot] (#2323) Francisco Arceo 2025-05-30 20:05:47 -06:00
  • 1dd4c06d42 fix: Fix requirements from broken github-actions[bot] Francisco Javier Arceo 2025-05-30 21:53:29 -04:00
  • a9daf358c4 Merge branch 'main' into precomputed-embeddings Francisco Arceo 2025-05-30 19:36:55 -06:00
  • 09bdaf07a1 update dqdrant test Francisco Javier Arceo 2025-05-30 21:35:13 -04:00
  • 1e2d1643fe build: Bump version to 0.2.9 github-actions[bot] 2025-05-30 19:43:09 +00:00
  • 57b52d67f9 Release candidate 0.0.0.dev20250531002418 rc-0.0.0.dev20250531002418 github-actions[bot] 2025-05-31 00:25:06 +00:00
  • 681e697fff updated tests and refactored the validation for readability Francisco Javier Arceo 2025-05-30 17:07:20 -04:00