A Bi-layered Parallel Training Architecture for Large-Scale Convolutional Neural Networks
Abstract: Benefiting from large-scale training datasets and complex network architectures, Convolutional Neural Networks (CNNs) are widely applied in various fields with high accuracy. However, the training process of CNNs is very time-consuming, since large numbers of training samples and iterative operations are required to obtain high-quality weight parameters. In this paper, we focus on the time-consuming training process of large-scale CNNs and propose a Bi-layered Parallel Training (BPT-CNN) architecture for distributed computing environments. BPT-CNN consists of two main components: (a) an outer-layer parallel training scheme that trains multiple CNN subnetworks on separate data subsets, and (b) an inner-layer parallel training scheme for each subnetwork. In the outer-layer parallelism, we address critical issues of distributed and parallel