In addition to using the tree-structured Parzen estimator via Optuna to find hyperparameters for a Keras CNN on the MNIST handwritten digits classification problem, we add asynchronous successive halving, a pruning algorithm, to halt training when preliminary results are unpromising.
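The idea behind successive halving can be sketched in a few lines of plain Python. This is a simplified, synchronous variant, not Optuna's asynchronous implementation, and `toy_validation_score` is a hypothetical stand-in for training a model for a given number of epochs. Each round gives the surviving configurations a larger budget and keeps only the best fraction.

```python
import math
import random


def toy_validation_score(learning_rate, epochs):
    """Hypothetical stand-in for training: best near lr = 0.01, improves with epochs."""
    quality = 1.0 - abs(math.log10(learning_rate) + 2.0) / 3.0
    return quality * (1.0 - 0.5 ** epochs)


def successive_halving(configs, min_epochs=1, reduction_factor=2):
    """Keep the top 1/reduction_factor of configs each round, growing the budget."""
    survivors = list(configs)
    epochs = min_epochs
    while len(survivors) > 1:
        ranked = sorted(
            survivors,
            key=lambda lr: toy_validation_score(lr, epochs),
            reverse=True,
        )
        keep = max(1, len(ranked) // reduction_factor)
        survivors = ranked[:keep]      # unpromising configs stop training here
        epochs *= reduction_factor     # survivors get a larger training budget
    return survivors[0]


rng = random.Random(0)
candidates = [10 ** rng.uniform(-4, -1) for _ in range(16)]
best = successive_halving(candidates)
print(f"best learning rate: {best:.5f}")
```

With 16 candidates and a reduction factor of 2, only one configuration is ever trained for the full budget; the rest are dropped after 1, 2, or 4 epochs.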
Paper: A self-attention based message passing neural network for predicting molecular lipophilicity and aqueous solubility – Tang et al 2020
In this paper, we describe a self-attention-based message-passing neural network (SAMPN) model, which is a modification of Deepchem’s MPN and is state-of-the-art in deep learning. It directly learns the most relevant features of each QSAR/QSAPR task in the learning process and assigns the degree of importance for substructures to improve the interpretability of prediction.
This is a very interesting paper in which the authors use a message passing neural network on a carefully selected data set to predict antibacterial activity against E. coli. Then they apply their model to other data sets, while also prioritizing molecules that are different (via minimum Tanimoto similarity) from existing antibiotics, to find candidates for new antibiotics.
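A minimal sketch of a Tanimoto-based novelty filter follows. It shows one reading of the selection step: keep a candidate only if its highest similarity to any known antibiotic stays below a threshold. The fingerprints are hypothetical toy sets of "on" bit indices, and the names and the 0.4 threshold are illustrative assumptions, not values from the paper.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def novel_candidates(candidates, known, max_similarity=0.4):
    """Return candidate names whose nearest known molecule is still dissimilar."""
    novel = []
    for name, bits in candidates.items():
        nearest = max(tanimoto(bits, known_bits) for known_bits in known.values())
        if nearest <= max_similarity:
            novel.append(name)
    return novel


# Hypothetical toy fingerprints (sets of on-bit indices).
known = {"known_a": {1, 2, 3, 4}, "known_b": {3, 4, 5, 6}}
candidates = {"cand_close": {1, 2, 3, 5}, "cand_far": {1, 7, 8, 9}}

print(novel_candidates(candidates, known))  # ['cand_far']
```

In practice the bit sets would come from a cheminformatics toolkit's fingerprint function; the set arithmetic stays the same.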
Optuna is a Python package for general function optimization. It also integrates with many popular machine learning packages, which lets pruning algorithms make hyperparameter searches more efficient. In this article we use Optuna to optimize hyperparameters for scikit-learn machine learning algorithms.
Paper: DeepSMILES: An Adaptation of SMILES for Use in Machine-Learning of Chemical Structures – O’Boyle and Dalke 2018
SMILES (Simplified Molecular Input Line Entry System) representations of molecules have found many uses in machine learning algorithms, especially those derived from natural language processing techniques. However, they were not designed for machine learning and thus suffer from various syntax issues that can hamper machine learning methods, especially generative methods. DeepSMILES is a modification of SMILES explicitly designed to address these issues.
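Two SMILES syntax constraints that a generative model can easily violate are balanced branch parentheses and paired ring-closure digits. The sketch below is a deliberately simplified checker for those two constraints only. It is not a SMILES parser and not part of DeepSMILES; the function name is hypothetical, and it ignores the contents of bracket atoms.

```python
def smiles_syntax_problems(smiles):
    """Report unbalanced parentheses and unpaired ring-closure digits."""
    problems = []
    depth = 0
    open_rings = set()
    in_bracket = False
    for ch in smiles:
        if in_bracket:
            # Digits inside [...] are isotopes, charges, or H counts, not rings.
            if ch == "]":
                in_bracket = False
            continue
        if ch == "[":
            in_bracket = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                problems.append("unmatched ')'")
                depth = 0
        elif ch.isdigit():
            open_rings ^= {ch}  # first occurrence opens a ring, second closes it
    if depth > 0:
        problems.append("unclosed '('")
    if open_rings:
        problems.append(
            "unpaired ring-closure digit(s): " + ", ".join(sorted(open_rings))
        )
    return problems


print(smiles_syntax_problems("c1ccccc1"))  # []
print(smiles_syntax_problems("c1ccccc"))   # ['unpaired ring-closure digit(s): 1']
print(smiles_syntax_problems("CC(C"))      # ["unclosed '('"]
```

A string that is a single character short, as in the last two examples, is no longer a valid molecule, which is the kind of fragility that matters when strings are generated token by token.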
Paper: Augmenting Genetic Algorithms with Deep Neural Networks for Exploring the Chemical Space – Nigam et al 2020
In this paper, the authors use a genetic algorithm operating on the SELFIES (SELF-referencIng Embedded Strings) representation of molecules to explore the vast space of small molecules. A neural network is used to guide the exploration process. Also, fitness functions are constructed to generate molecules with specific properties.
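The loop structure of such a search can be sketched as a minimal mutation-only genetic algorithm over SELFIES-style token lists. This is not the authors' method: there is no neural network guidance, the five-symbol alphabet is a small illustrative subset, and `toy_fitness` (counting nitrogen tokens) is a hypothetical placeholder for a real property score.

```python
import random

ALPHABET = ["[C]", "[N]", "[O]", "[F]", "[=C]"]  # small illustrative subset


def toy_fitness(tokens):
    """Hypothetical fitness: number of nitrogen tokens in the string."""
    return tokens.count("[N]")


def mutate(tokens, rng):
    """Replace one randomly chosen token with a random alphabet symbol."""
    child = list(tokens)
    child[rng.randrange(len(child))] = rng.choice(ALPHABET)
    return child


def evolve(generations=30, population_size=20, length=8, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.choice(ALPHABET) for _ in range(length)]
        for _ in range(population_size)
    ]
    best_per_generation = []
    for _ in range(generations):
        population.sort(key=toy_fitness, reverse=True)
        best_per_generation.append(toy_fitness(population[0]))
        parents = population[: population_size // 2]  # keep the fitter half
        children = [
            mutate(rng.choice(parents), rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    population.sort(key=toy_fitness, reverse=True)
    return "".join(population[0]), best_per_generation


best_string, history = evolve()
print(best_string, history[-1])
```

Random token replacement is a reasonable mutation here because SELFIES is designed so that any sequence of its symbols decodes to a valid molecule, so no validity check is needed after mutation.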
This paper introduces Optuna, a Python package for performing hyperparameter optimization and pruning for machine learning algorithms.
In Hyperparameter Search With Bayesian Optimization for Scikit-learn Classification and Ensembling we applied the Bayesian Optimization (BO) package to the Scikit-learn ExtraTreesClassifier algorithm. Here we do the same for XGBoost.
Paper: Automatic Machine Learning by Pipeline Synthesis using Model-Based Reinforcement Learning and a Grammar – Drori et al 2019
We formulate the AutoML problem of pipeline synthesis as a single-player game, in which the player starts from an empty pipeline, and in each step is allowed to perform edit operations to add, remove, or replace pipeline components according to a pipeline grammar.
Variational autoencoders provide a principled framework for learning deep latent-variable models and corresponding inference models. In this work, we provide an introduction to variational autoencoders and some important extensions.
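Two standard ingredients of a variational autoencoder can be written out directly: the reparameterization trick for sampling the latent vector, and the closed-form KL term of the training objective. The sketch assumes a diagonal Gaussian encoder and a standard normal prior, the most common setup; the function names are illustrative.

```python
import math
import random


def kl_to_standard_normal(mean, log_var):
    """KL( N(mean, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mean, log_var)
    )


def sample_latent(mean, log_var, rng):
    """Reparameterization trick: z = mean + sigma * eps, with eps ~ N(0, 1)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mean, log_var)
    ]


rng = random.Random(0)
mean, log_var = [0.5, -1.0], [0.0, -2.0]
z = sample_latent(mean, log_var, rng)
kl = kl_to_standard_normal(mean, log_var)
print(f"z = {z}, KL = {kl:.4f}")
```

Writing the sample as a deterministic function of the encoder outputs plus independent noise is what lets gradients flow through the sampling step during training.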