| Name | Last commit message | Last commit date |
| --- | --- | --- |
| api | Added non-streaming ollama inference impl | 2024-07-30 18:11:44 -07:00 |
| quantization | Initial commit | 2024-07-23 08:32:33 -07:00 |
| __init__.py | Initial commit | 2024-07-23 08:32:33 -07:00 |
| api_instance.py | addressing comments | 2024-07-31 22:07:45 -07:00 |
| client.py | fix non-streaming api in inference server | 2024-07-30 14:25:50 -07:00 |
| generation.py | Initial commit | 2024-07-23 08:32:33 -07:00 |
| model_parallel.py | Initial commit | 2024-07-23 08:32:33 -07:00 |
| parallel_utils.py | Initial commit | 2024-07-23 08:32:33 -07:00 |
| server.py | Initial commit | 2024-07-23 08:32:33 -07:00 |