llama-stack-mirror/llama_toolchain/inference (latest commit: 2024-07-31 19:33:36 -07:00)
Name                 Last commit message                                      Date
api/                 Added non-streaming ollama inference impl                2024-07-30 18:11:44 -07:00
quantization/        Initial commit                                           2024-07-23 08:32:33 -07:00
__init__.py          Initial commit                                           2024-07-23 08:32:33 -07:00
api_instance.py      Added non-streaming ollama inference impl                2024-07-30 18:11:44 -07:00
client.py            fix non-streaming api in inference server                2024-07-30 14:25:50 -07:00
event_logger.py      fix non-streaming api in inference server                2024-07-30 14:25:50 -07:00
generation.py        Initial commit                                           2024-07-23 08:32:33 -07:00
inference.py         add streaming support for ollama inference with tests    2024-07-31 19:33:36 -07:00
model_parallel.py    Initial commit                                           2024-07-23 08:32:33 -07:00
ollama.py            add streaming support for ollama inference with tests    2024-07-31 19:33:36 -07:00
parallel_utils.py    Initial commit                                           2024-07-23 08:32:33 -07:00
server.py            Initial commit                                           2024-07-23 08:32:33 -07:00