Deep Learning for the Life Sciences: Applying Deep Learning to Genomics, Microscopy, Drug Discovery & More
We review this book, chapter-by-chapter.
1. Why Life Science?
Noting the vast increase in the amount of data available in chemistry and biology, the authors explain why they believe that neural networks (so-called 'deep learning', although most of the models in the book do not have many layers) are particularly useful for analysis. The chapter ends with a description of the chapters to follow.
2. Introduction to Deep Learning
A cursory glance at the book excerpts available at the publisher’s website would clearly indicate that this book is not a machine learning primer. The book quite rightly does not provide a description of Python, as it is presumed that knowledge of the language is a prerequisite. The same should be true of neural networks. This chapter is filler.
3. Machine Learning with DeepChem
Chapter 3 covers DeepChem datasets, building a multilayer perceptron to predict toxicity of an unnamed molecule, and creating a convolutional neural network for MNIST using DeepChem. Two of the book's main problems are demonstrated in this chapter. The book heavily promotes DeepChem, a library that combines a thin wrapper around TensorFlow's neural network implementation with utilities for converting molecular features into matrices of real numbers and domain-specific post-processing of results. The natural question to ask is why not simply provide the featurization and post-processing utilities while avoiding the obvious duplicated effort of the wrapper? The authors attempt to make their case for including the TensorFlow wrapper, but the CNN MNIST example clearly shows that it is not needed. What possible reason would someone have to use DeepChem for this instead of TensorFlow, Keras, PyTorch, etc.? The authors even argue that other neural network frameworks lack support for certain architectures, such as graph convolutional networks. A simple search shows that Python packages exist to do just this in Keras; since Keras allows users to create custom layers, it is not surprising that people have built and released them. The value of DeepChem lies in converting molecular information to numerical matrices and in its post-processing routines. The creators should have focused on this, and it should have been a much larger part of the book, as it would have been very helpful for non-chemists and students getting up to speed with molecular data sets.
There should have been additional examples, and the explanations should have been expanded. Unfortunately, this issue applies to most of the book. Consider: the book has only 209 pages (excluding the index) and contains plenty of filler, so one wonders why there are so few examples.
4. Machine Learning for Molecules
Various aspects of molecules are presented, as well as methods to describe them in text (SMILES), extended-connectivity fingerprints, and RDKit molecular descriptors. An example of a graph convolutional network for predicting solubility from the Delaney data set in MoleculeNet is shown. SMARTS string searching is covered.
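The core idea behind fingerprints such as ECFP is folding variable-sized substructures into a fixed-length bit vector that machine learning models can consume. As a minimal sketch (not RDKit's actual algorithm, which hashes circular atom environments from the molecular graph), one can hash character n-grams of a SMILES string:

```python
from hashlib import md5

def toy_fingerprint(smiles, n_bits=64):
    """Toy hashed-substructure fingerprint: fold overlapping character
    n-grams of a SMILES string into a fixed-length bit vector. Real
    extended-connectivity fingerprints hash circular atom environments
    from the molecular graph, but the folding idea is the same."""
    bits = [0] * n_bits
    for width in (1, 2, 3):
        for i in range(len(smiles) - width + 1):
            h = int(md5(smiles[i:i + width].encode()).hexdigest(), 16)
            bits[h % n_bits] = 1  # set the bit this fragment hashes to
    return bits

fp = toy_fingerprint("CCO")  # ethanol
print(len(fp), sum(fp))      # fixed length, a handful of bits set
```

Because the output length is fixed regardless of molecule size, such vectors can be stacked directly into the matrices that the book's models expect.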
The authors mention that DeepChem has built-in functionality for Weave models, message-passing networks, deep tensor networks, and more, yet no examples are provided.
5. Biophysical Machine Learning
This chapter focuses on building a protein-ligand binding affinity prediction model. There is a useful discussion about experimental methods for determining protein structure, as well as the implications of their limitations for use in machine learning algorithms. Similar remarks are made regarding data quality for the Protein Data Bank.
Helpful coverage of basic aspects of proteins and peptides, their dynamic nature, amino acid components, and types of chemical bonds is provided. The authors then discuss featurization techniques: how to transform biophysical information into matrices of real numbers for use in machine learning algorithms. They do a good job describing the severe difficulties involved, as well as the limitations of methods such as grid and atomic featurization. Also stressed is the fact that proteins cannot be modeled with two-dimensional matrices, as the small molecules of Chapter 4 could, thus introducing the concept of voxels, the three-dimensional equivalent of pixels.
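The voxel idea reduces to mapping 3D atom coordinates into a cubic occupancy grid. A minimal sketch of such a grid featurizer (real implementations add one channel per atom type and Gaussian smoothing, which this toy version omits):

```python
import numpy as np

def voxelize(coords, grid_size=16, resolution=1.0):
    """Map 3D atom coordinates (in angstroms) into a cubic occupancy
    grid: a cell is 1.0 if any atom falls inside it. This is the
    simplest form of grid featurization; production code adds per-atom-
    type channels and smoothing."""
    grid = np.zeros((grid_size,) * 3, dtype=np.float32)
    origin = coords.min(axis=0)                  # shift so all atoms are non-negative
    idx = ((coords - origin) / resolution).astype(int)
    idx = np.clip(idx, 0, grid_size - 1)         # atoms past the edge land on it
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return grid

atoms = np.array([[0.0, 0.0, 0.0], [1.5, 0.2, 0.0], [3.1, 1.0, 2.4]])
grid = voxelize(atoms)
print(grid.shape, int(grid.sum()))  # (16, 16, 16) with 3 occupied voxels
```

The resulting 3D array is what a 3D CNN, like the ones the authors mention trying, would consume.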
Next is a description of random forest and multilayer perceptron models to predict binding affinities for the 2D3U protein-ligand complex. As the data file (2d3u.pdb) is large, the reader is referred to the book's code repository to examine it. Here we fail to understand why the authors did not expand upon their work. No hyperparameter optimization was attempted. Early stopping, dropout, and regularization were not used for the MLP. The correlation coefficient was reported as the error metric, but not the mean squared error or even a brief table of actual versus predicted binding values. The correlation coefficients for the MLP were 0.99 and 0.359 for training and testing, respectively. It is mentioned offhandedly that 3D CNNs and graph convolutional nets were tried with similar results, but those results were not shown. One wonders what the authors think they demonstrated with such poor results and an unwillingness either to improve upon them or to discuss in detail why their methods failed.
The chapter ends with a brief and rather pointless discussion of antibody-antigen interactions, noting that there are no machine learning results available.
6. Deep Learning for Genomics
Chapter 6 commences with cartoon and realistic descriptions of various aspects of DNA, RNA, transcription, and gene regulation. The necessity of including more than simple sequences as input to machine learning algorithms is emphasized. This is demonstrated by using a 1D CNN to predict transcription factor binding for JUND in HepG2 cells using only 101-base sequence segments, then repeating the analysis with chromatin accessibility values included, resulting in improved performance. The problem is severely imbalanced (it is a binary classification problem with 99% of labels being "no"), which is handled by a weighted error scheme. Since such imbalanced datasets are common in biology, this issue should have been addressed in more detail (perhaps by exploring the imbalanced-learn package).
Next, a 1D CNN is used to predict the effectiveness of siRNA in silencing a target gene.
The problems in this chapter are clearly described, and the results are better than those in Chapter 5. Dropout is even used, and we infer that there was limited hyperparameter searching. However, it would have been nice to see an RNN and a CNN-RNN multimodal-input network as comparisons.
7. Machine Learning for Microscopy
We found this chapter to be the most interesting of the book. Here are two quotes (from pages 106 and 111) that shed light on our opinion.
It seems intuitively obvious that deep learning can make an impact in microscopy, since deep learning excels at image handling and microscopy is all about image capture. But it’s worth asking: what parts of microscopy can’t deep learning do much for now? As we see later in this chapter, preparing a sample for microscopic imaging can require considerable sophistication. In addition, sample preparation requires considerable physical dexterity, because the experimenter must be capable of fixing the sample as a physical object. How can we possibly automate or speed up this process?
Recent research has started to leverage the power of deep learning techniques to reconstruct super-resolution views. These techniques claim orders of magnitude improvements in the speed of super-resolution microscopy by enabling reconstructions from sparse, rapidly acquired images. While still in its infancy, this shows promise as a future application area for deep learning.
In a field dominated by hype, it is rare and refreshing to see acknowledgement of limitations and measured speculation about applications.
There are two interesting examples in the chapter. The first is computing cell counts from images in the Broad Bioimage Benchmark Collection. The second is cell segmentation, separating cells from background in images from the same database; this model is a U-Net, which is similar to a CNN autoencoder.
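The autoencoder comparison can be made concrete at the level of tensor shapes: a U-Net's encoder downsamples, its decoder upsamples back to full resolution, and skip connections concatenate encoder features with decoder features at the same resolution. A shape-only sketch (real U-Nets use learned convolutions rather than these fixed operations):

```python
import numpy as np

def downsample(x):
    """2x2 mean pooling -- stand-in for the encoder half."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(x):
    """Nearest-neighbour upsampling -- stand-in for the decoder half."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

img = np.arange(16.0).reshape(4, 4)
encoded = downsample(img)        # coarse features, half resolution
decoded = upsample(encoded)      # back to full resolution
skip = np.stack([decoded, img])  # skip connection: stack same-size features
print(encoded.shape, decoded.shape, skip.shape)
```

The skip connections are what let a U-Net recover the pixel-accurate boundaries that segmentation needs and that a plain autoencoder bottleneck tends to blur.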
8. Deep Learning for Medicine
The authors cover past failed attempts to apply machine learning to clinical medicine (expert systems and Bayesian networks), mention electronic health records, and then somehow conclude:
… but machine learning healthcare systems will soon change your personal healthcare experiences, along with the experiences of millions if not billions of others.
Nothing in this chapter supports such a conclusion. Whether this assertion is reasonable or not, the authors should have provided an example from the clinic or refrained from such a pronouncement.
A working example of detecting diabetic retinopathy disease progression from images is provided. However, this is a rather trivial problem taken from Kaggle.
The chapter concludes with a couple of pages of subjective opinions of the authors.
Deleting the first part of the chapter, adding a more challenging image problem in which a CNN performs poorly, and eliminating the pontification at the end would have benefited the reader.
9. Generative Models
Chapter 9 commences with useful descriptions of variational autoencoders (VAEs) and generative adversarial networks. This is followed by a description of how such generative models could be used in the process of discovering new molecules in industrial chemistry and medicine.
Later in the chapter there is a detailed example of applying a VAE to SMILES representation of small molecules, generating new molecules, and applying various filters so that only “drug like” molecules remain. The authors point out the promise of such an approach as well as current limitations. By implication and hints at what Insilico Medicine is doing, it appears that domain knowledge can be encoded as part of a reinforcement learning strategy to assist generative models in producing molecules of interest.
While this was our favorite chapter, it was marred by some wild hype and pointless speculation, par for the course in deep learning discussions.
10. Interpretation of Deep Models
Gleaning domain-specific insights from neural network models is notoriously difficult, as is the related problem of determining feature importances: which features have the greatest impact on predictions. The authors use saliency mapping to address these issues. Saliency is defined as the derivative of the outputs with respect to the inputs, where a larger derivative implies a more salient feature.
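The definition can be sketched directly. In a deep learning framework the derivative comes from backpropagation; here, as a framework-free illustration with a toy stand-in for a trained model, we approximate it by finite differences:

```python
import numpy as np

def model(x):
    """Stand-in for a trained network: a fixed nonlinear function of
    three inputs, where only the first two actually matter."""
    return np.tanh(3.0 * x[0]) + 0.1 * x[1] ** 2 + 0.0 * x[2]

def saliency(f, x, eps=1e-5):
    """Numerical gradient of the output with respect to each input.
    A real framework would get this via backpropagation; the magnitude
    of each entry is that feature's saliency."""
    grad = np.zeros_like(x)
    for i in range(len(x)):
        xp, xm = x.copy(), x.copy()
        xp[i] += eps
        xm[i] -= eps
        grad[i] = (f(xp) - f(xm)) / (2 * eps)
    return np.abs(grad)

x = np.array([0.1, 1.0, 5.0])
print(saliency(model, x))  # first input dominates; third has no effect
```

Ranking inputs by these magnitudes is exactly how the saliency map flags important features.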
In an interesting application of saliency mapping, the authors revisit the transcription factor binding problem of Chapter 6 and show how known binding motifs, specific base sequences, can be extracted from the CNN.
Using a dropout mask, that is, randomly applying dropout when making predictions from a trained neural network so as to generate many different predictions, the authors demonstrate one method of estimating the uncertainty of model outputs. This provides some measure of how confident one should be in the model's predictions. As noted, this method is used when constructing even a single model is computationally expensive; otherwise, one could simply use an ensemble.
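The mechanics of this Monte Carlo dropout scheme are simple: keep dropout active at prediction time, run many stochastic forward passes, and report the mean and spread. A minimal sketch with a toy one-layer "trained" model standing in for a real network:

```python
import numpy as np

rng = np.random.default_rng(0)

# a toy "trained" linear model: fixed weights and a fixed input
W = rng.normal(size=(1, 20))
x = rng.normal(size=20)

def predict_with_dropout(x, p=0.5):
    """One stochastic forward pass: randomly zero input units with
    probability p, rescaling survivors by 1/(1-p) as dropout does."""
    mask = rng.random(20) > p
    return float(W @ (x * mask) / (1 - p))

# many stochastic passes -> a distribution over predictions
samples = [predict_with_dropout(x) for _ in range(1000)]
mean, std = np.mean(samples), np.std(samples)
print(f"prediction {mean:.2f} +/- {std:.2f}")
```

The standard deviation across passes is the uncertainty estimate; an ensemble would replace the random masks with independently trained models.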
11. A Virtual Screening Workflow Example
In a ligand-based virtual screen, we search for molecules that function similarly to one or more known molecules.
A ligand-based virtual screen typically starts with a set of known molecules identified through any of a variety of experimental methods. Computational methods are then used to develop a model based on experimental data, and this model is used to virtually screen a large set of molecules to find new chemical starting points.
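The ranking step of such a screen can be sketched with fingerprint similarity. The chapter's actual workflow scores molecules with a trained graph convolutional net; here, as an illustrative stand-in, we rank randomly generated bit-vector fingerprints by Tanimoto similarity to one known active:

```python
import numpy as np

def tanimoto(a, b):
    """Tanimoto similarity between two fingerprint bit vectors:
    shared on-bits divided by total on-bits."""
    both = np.sum(a & b)
    either = np.sum(a | b)
    return both / either if either else 0.0

rng = np.random.default_rng(1)
library = rng.integers(0, 2, size=(10_000, 128))  # screening library (random stand-ins)
known_active = rng.integers(0, 2, size=128)       # fingerprint of a known hit

# rank the whole library against the known active, keep the top 5
scores = np.array([tanimoto(known_active, m) for m in library])
top = np.argsort(scores)[::-1][:5]
for i in top:
    print(i, round(scores[i], 3))
```

In a real screen the fingerprints would come from actual molecules and the top-ranked compounds would become the new chemical starting points the passage describes.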
A complete example of this process, using a graph convolutional net, is shown in this chapter. This is the level of detail that we expected for the entire book. We note that there is plenty of room for using this example to experiment with different featurization methods, network architectures, parameter optimization, and ensembling.
12. Prospects and Perspectives
Hype and wild speculation characterize the last chapter. One wonders what prompted the authors to include it in the book.
While our review is certainly not very positive, it is not entirely negative. Our impression is that the book was a missed opportunity. Given the knowledge and experience of the authors, we expected a book filled with the kind of detail found in Chapter 11. Additionally, we hoped that they would refrain from the never-ending hype and unbridled speculation that characterize so much commentary about machine learning.
However, we are satisfied with our purchase as the book did show many interesting avenues for the application of machine learning in the life sciences.