Large-scale deep learning and simulation enabled precision medicine for cancer

A presentation by Sam Jacobs, Ph.D.



Despite remarkable progress in deep learning, training computationally intensive neural networks on large data sets still takes substantial time (e.g., a state-of-the-art neural network trained on 3.5 billion weakly labeled Instagram images took 22 days using 336 GPUs). Scientific applications of interest to the US national laboratories, the Department of Energy (DOE), and the National Cancer Institute (NCI) pose challenges of equal or greater scale. Leveraging the HPC resources available at the US national laboratories, we can train on large-scale data sets and/or increasingly complex models with either improved time to solution or the opportunity to achieve a higher-quality solution via more extensive training. In this talk, I will present our ongoing work in large-scale training of deep neural networks. In particular, I will discuss an HPC-centric deep learning toolkit, the Livermore Big Artificial Neural Network (LBANN), and a new multi-level tournament voting algorithm, Livermore Tournament Fast Batching (LTFB), for data-parallel training, hyperparameter exploration, and uncertainty quantification of neural networks at scale. I will start with a brief overview of parallel and distributed deep learning, discuss the LBANN and LTFB abstractions and components, and present experimental studies of the LTFB algorithm and its application to drug discovery for cancer.