Hyperparameter Search With GPyOpt: Part 2 – XGBoost Classification and Ensembling

GPyOpt is an open-source Python library for Bayesian optimization developed by the Machine Learning group at the University of Sheffield. It is built on GPy, a Python framework for Gaussian process modelling.

In this article, we demonstrate how to use this package to perform hyperparameter search for a classification problem with XGBoost.
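As a rough illustration of what such a search involves, the sketch below defines a search space in the list-of-dicts format that GPyOpt accepts for its `domain` argument, plus a helper that turns one candidate point into keyword arguments for an XGBoost classifier. The hyperparameter names and ranges, and the helper `row_to_params`, are assumptions chosen for illustration and are not taken from the article.

```python
# Hypothetical search space (names and ranges are illustrative assumptions),
# written in the list-of-dicts format GPyOpt uses for `domain`.
SEARCH_SPACE = [
    {"name": "learning_rate", "type": "continuous", "domain": (0.01, 0.3)},
    {"name": "max_depth", "type": "discrete", "domain": (3, 4, 5, 6, 7, 8)},
    {"name": "subsample", "type": "continuous", "domain": (0.5, 1.0)},
]


def row_to_params(row):
    """Convert one candidate point (values in SEARCH_SPACE order) to a dict.

    Discrete dimensions are cast to int because the optimizer hands back floats.
    """
    params = {}
    for spec, value in zip(SEARCH_SPACE, row):
        if spec["type"] == "discrete":
            params[spec["name"]] = int(round(value))
        else:
            params[spec["name"]] = float(value)
    return params


# Sketch of how this would plug into GPyOpt (not executed here):
#   GPyOpt calls the objective with a 2D array of shape (1, n_dims), so the
#   objective would build the model from row_to_params(x[0]), evaluate it
#   (for example with cross-validation), and return a value to MINIMIZE,
#   such as the negative accuracy.
#
#   opt = GPyOpt.methods.BayesianOptimization(
#       f=objective, domain=SEARCH_SPACE, acquisition_type="EI")
#   opt.run_optimization(max_iter=30)
#   best = row_to_params(opt.x_opt)

print(row_to_params([0.1, 5.0, 0.8]))
```

Casting discrete values back to integers matters because parameters such as `max_depth` must be integers, while the optimizer works with floating-point arrays.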

Paper: Seq2seq Fingerprint: An Unsupervised Deep Molecular Embedding for Drug Discovery – Xu et al 2017

In this paper, we propose a novel unsupervised molecular embedding method, providing a continuous feature vector for each molecule to perform further tasks, e.g., solubility classification. In the proposed method, a multi-layered Gated Recurrent Unit (GRU) network is used to map the input molecule into a continuous feature vector of fixed dimensionality, and then another deep GRU network is employed to decode the continuous vector back to the original molecule.

Paper: CheMixNet: Mixed DNN Architectures for Predicting Chemical Properties using Multiple Molecular Representations – Paul et al 2018

In this interesting paper, the authors use a multi-input neural network to predict various small-molecule properties. One branch is a multilayer perceptron fed with MACCS fingerprints; the other is an RNN, a 1D CNN, or a 1D CNN-RNN operating on SMILES strings.

Paper: A Tutorial on Bayesian Optimization – Frazier 2018

In this tutorial, we describe how Bayesian optimization works, including Gaussian process regression and three common acquisition functions: expected improvement, entropy search, and knowledge gradient. We conclude with a discussion of Bayesian optimization software and future research directions in the field.
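To make the first of those acquisition functions concrete, here is a minimal sketch of expected improvement for a maximization problem, assuming the Gaussian process posterior at a candidate point is summarized by a mean `mu` and standard deviation `sigma`. The function and argument names are illustrative and do not come from the tutorial.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best_so_far):
    """E[max(f(x) - best_so_far, 0)] when f(x) ~ Normal(mu, sigma^2).

    Assumes we are maximizing; for minimization, flip the sign of the
    improvement term.
    """
    improvement = mu - best_so_far
    if sigma <= 0.0:
        # No posterior uncertainty: the improvement is deterministic.
        return max(improvement, 0.0)
    z = improvement / sigma
    return improvement * normal_cdf(z) + sigma * normal_pdf(z)


# Two candidates with the same posterior mean: the more uncertain one
# receives the larger expected improvement.
print(expected_improvement(mu=0.0, sigma=0.1, best_so_far=0.0))
print(expected_improvement(mu=0.0, sigma=1.0, best_so_far=0.0))
```

The closed form has two terms: one rewards points whose predicted mean already beats the incumbent, the other rewards points where the model is uncertain, which is how the acquisition function trades off exploitation against exploration.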

Paper: Augmenting Genetic Algorithms with Deep Neural Networks for Exploring the Chemical Space – Nigam et al 2020

In this paper, the authors use a genetic algorithm operating on the SELFIES (SELF-referencIng Embedded Strings) representation of molecules to explore the vast space of small molecules. A neural network is used to guide the exploration process. Also, fitness functions are constructed to generate molecules with specific properties.

Paper: Auto-Keras: An Efficient Neural Architecture Search System – Jin et al 2019

Network morphism, which keeps the functionality of a neural network while changing its neural architecture, could be helpful for NAS by enabling more efficient training during the search. In this paper, we propose a novel framework enabling Bayesian optimization to guide the network morphism for efficient neural architecture search. The framework develops a neural network kernel and a tree-structured acquisition function optimization algorithm to efficiently explore the search space.
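The idea of a function-preserving architecture change can be shown with a toy example. The sketch below is not the Auto-Keras implementation; it assumes a small fully connected network in which every layer is followed by ReLU, and it deepens the network by inserting an identity layer. Because ReLU outputs are non-negative, applying an identity layer and a second ReLU leaves them unchanged. All names here are hypothetical.

```python
def relu(vector):
    return [max(0.0, x) for x in vector]


def dense(weights, bias, vector):
    """Multiply by a weight matrix (list of rows) and add a bias."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def forward(layers, vector):
    """Run the input through each (weights, bias) layer, with ReLU after each."""
    for weights, bias in layers:
        vector = relu(dense(weights, bias, vector))
    return vector


def identity_layer(width):
    """A layer that copies its input: identity weights and zero bias."""
    weights = [[1.0 if i == j else 0.0 for j in range(width)]
               for i in range(width)]
    return weights, [0.0] * width


def deepen(layers, position):
    """Insert an identity layer after layer `position - 1` (position >= 1)."""
    width = len(layers[position - 1][1])  # output width of the preceding layer
    return layers[:position] + [identity_layer(width)] + layers[position:]


network = [
    ([[1.0, -2.0], [0.5, 0.5], [-1.0, 1.0]], [0.0, 0.1, -0.2]),  # 2 -> 3
    ([[1.0, 1.0, 1.0]], [0.0]),                                   # 3 -> 1
]
deeper = deepen(network, 1)

x = [0.3, -0.7]
print(len(network), len(deeper))              # the architecture changed
print(forward(network, x), forward(deeper, x))  # the outputs match
```

The deeper network starts from the same function as its parent, so a search procedure can continue training it rather than starting from scratch, which is the efficiency gain the abstract refers to.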

Paper: MolGAN: An implicit generative model for small molecular graphs – De Cao and Kipf 2018

We introduce MolGAN, an implicit, likelihood-free generative model for small molecular graphs that circumvents the need for expensive graph matching procedures or node ordering heuristics of previous likelihood-based methods. Our method adapts generative adversarial networks (GANs) to operate directly on graph-structured data. We combine our approach with a reinforcement learning objective to encourage the generation of molecules with specific desired chemical properties.
