mirror of https://github.com/meta-llama/llama-stack.git
inference_config:
  impl_type: "inline"
  inline_config:
    checkpoint_type: "pytorch"
    checkpoint_dir: {checkpoint_dir}/
    tokenizer_path: {checkpoint_dir}/tokenizer.model
    model_parallel_size: {model_parallel_size}
    max_seq_len: 2048
    max_batch_size: 1
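The curly-brace fields ({checkpoint_dir}, {model_parallel_size}) indicate this file is a template rather than a literal config, presumably filled in by Python string formatting before use. A minimal sketch of how such a template might be rendered; the file names and values here are illustrative assumptions, not llama-stack's actual tooling:

    # Hypothetical rendering of the template above; paths and values are examples only.
    from pathlib import Path

    template = Path("inline_inference.yaml").read_text()  # assumed template filename

    rendered = template.format(
        checkpoint_dir="/home/user/.llama/checkpoints/Llama-3-8B",  # illustrative checkpoint path
        model_parallel_size=1,  # single-GPU inference in this example
    )

    # Write the concrete config that the inline inference implementation would load.
    Path("run_config.yaml").write_text(rendered)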