api | Added non-streaming ollama inference impl | 2024-07-30 18:11:44 -07:00
quantization | Initial commit | 2024-07-23 08:32:33 -07:00
__init__.py | Initial commit | 2024-07-23 08:32:33 -07:00
client.py | fix non-streaming api in inference server | 2024-07-30 14:25:50 -07:00
generation.py | Initial commit | 2024-07-23 08:32:33 -07:00
inference.py | fix non-streaming api in inference server | 2024-07-30 14:25:50 -07:00
model_parallel.py | Initial commit | 2024-07-23 08:32:33 -07:00
ollama.py | Added non-streaming ollama inference impl | 2024-07-30 18:11:44 -07:00
parallel_utils.py | Initial commit | 2024-07-23 08:32:33 -07:00
server.py | Initial commit | 2024-07-23 08:32:33 -07:00