llama-stack-mirror/toolchain/inference
2024-07-21 19:07:02 -07:00
Name               Last commit message                                  Last commit date
api                from models.llama3_1 --> from llama_models.llama3_1  2024-07-21 19:07:02 -07:00
quantization       from models.llama3_1 --> from llama_models.llama3_1  2024-07-21 19:07:02 -07:00
__init__.py        Add toolchain from agentic system here               2024-07-19 12:30:35 -07:00
api_instance.py    rename ModelInference to Inference                   2024-07-21 12:20:32 -07:00
client.py          rename ModelInference to Inference                   2024-07-21 12:20:32 -07:00
generation.py      from models.llama3_1 --> from llama_models.llama3_1  2024-07-21 19:07:02 -07:00
inference.py       from models.llama3_1 --> from llama_models.llama3_1  2024-07-21 19:07:02 -07:00
model_parallel.py  from models.llama3_1 --> from llama_models.llama3_1  2024-07-21 19:07:02 -07:00
parallel_utils.py  Add toolchain from agentic system here               2024-07-19 12:30:35 -07:00
server.py          rename ModelInference to Inference                   2024-07-21 12:20:32 -07:00