Ashwin Bharambe
9dafa6ad94
implement full-passthrough in the server
2024-08-03 14:15:20 -07:00
Ashwin Bharambe
af4710c959
Improved exception handling
2024-08-02 15:52:15 -07:00
Hardik Shah
493f0d99b2
updated dependency and client model name
2024-08-02 15:37:40 -07:00
Ashwin Bharambe
d3e269fcf2
Remove inference uvicorn server entrypoint and llama inference CLI command
2024-08-02 14:18:25 -07:00
Ashwin Bharambe
2cf9915806
Distribution server now functioning
2024-08-02 13:37:40 -07:00
Ashwin Bharambe
041cafbee3
getting closer to a distro definition, distro install + configure works
2024-08-01 23:12:43 -07:00
Ashwin Bharambe
09cf3fe78b
Use new definitions of Model / SKU
2024-07-31 22:44:35 -07:00
Hardik Shah
156bfa0e15
Added Ollama as an inference impl (#20)
* fix non-streaming api in inference server
* unit test for inline inference
* Added non-streaming ollama inference impl
* add streaming support for ollama inference with tests
* addressing comments
---------
Co-authored-by: Hardik Shah <hjshah@fb.com>
2024-07-31 22:08:37 -07:00
Ashwin Bharambe
5d5acc8ed5
Initial commit
2024-07-23 08:32:33 -07:00