## Statistical Projections Of Asset Returns Via Exact Pattern Matches And The K Nearest Neighbors Algorithm – Using Historical Calibration Data

We use machine learning algorithms to match current input data with historical data. Because the prices that followed each historical match are known, we use those subsequent prices to form percent returns, z-score (standard deviation units) returns, and up/down...

## Calendar Returns

Calendar returns is an analysis of close-to-close returns for specific calendar periods: year, quarter, month, and week. Results are computed for: Minimum, 25th Percentile, Median, 75th Percentile, and Maximum. They are available as CSV files and for each asset there...

## Portfolio Diversification Via Hierarchical Clustering

In this article, we cluster stock price time series with hierarchical clustering and Euclidean, correlation, and Jensen-Shannon distances to answer two questions regarding portfolio diversification. How diversified is a given portfolio? How can a diversified portfolio be constructed?

## Portfolio Diversification Via K-means

We use the K-means algorithm to answer two questions regarding portfolio diversification. How diversified is a given portfolio? How can a diversified portfolio be constructed? Additionally, we use the multidimensional scaling (MDS) algorithm to visualize...

## Mechanical Trading System: Entry = Array Of Dual Moving Averages, Exit = Fixed Period

In this article, we present a mechanical trading system that generalizes a dual moving average crossover system and exits after a fixed time period.
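As a rough illustration of the building block being generalized, the sketch below implements a single dual moving average crossover entry with a fixed-period exit. It is not the article's system: the function names `sma` and `crossover_trades`, the long-only rule, the window lengths, and the holding period are all assumptions chosen for the example.

```python
def sma(prices, window):
    """Simple moving average; entries are None until `window` prices exist."""
    out = []
    running = 0.0
    for i, price in enumerate(prices):
        running += price
        if i >= window:
            running -= prices[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


def crossover_trades(prices, fast=3, slow=5, hold=4):
    """Hypothetical long-only rule: enter when the fast SMA crosses above
    the slow SMA, exit `hold` bars later (or at the last bar).

    Returns a list of (entry_index, exit_index, simple_return) tuples.
    """
    if fast >= slow:
        raise ValueError("fast window must be shorter than slow window")
    fast_ma, slow_ma = sma(prices, fast), sma(prices, slow)
    trades = []
    i = slow  # first bar where the previous bar's slow SMA is defined
    while i < len(prices):
        crossed_up = fast_ma[i - 1] <= slow_ma[i - 1] and fast_ma[i] > slow_ma[i]
        if crossed_up:
            exit_i = min(i + hold, len(prices) - 1)
            trades.append((i, exit_i, prices[exit_i] / prices[i] - 1.0))
            i = exit_i + 1  # no overlapping positions in this sketch
        else:
            i += 1
    return trades


# Toy price series (made up for illustration only).
toy_prices = [10, 9, 8, 7, 6, 7, 8, 9, 10, 11, 12, 13]
toy_trades = crossover_trades(toy_prices, fast=3, slow=5, hold=4)
```

An array of such systems could be formed by looping over several `(fast, slow)` pairs, though how the article combines them is not stated in this excerpt.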

## Visualizing Correlations Among Dow 30 Stocks Via NetworkX

NetworkX is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.

Using daily adjusted close data from 2020-11-18 to 2020-12-18 for Dow 30 stocks, we compute correlation coefficients, apply a threshold of 0.8 to find similar stocks, and produce two types of graphs with NetworkX.
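The thresholding step can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the tickers `AAA`, `BBB`, and `CCC` and their values are made up, the helper names `pearson` and `similar_pairs` are hypothetical, and the excerpt does not say whether the correlations are computed on prices or on returns, so the series are correlated as given.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def similar_pairs(series, threshold=0.8):
    """Return (ticker_a, ticker_b, correlation) for every pair whose
    correlation is at or above the threshold."""
    edges = []
    for a, b in combinations(sorted(series), 2):
        r = pearson(series[a], series[b])
        if r >= threshold:
            edges.append((a, b, r))
    return edges


# Made-up data standing in for adjusted close series.
toy_series = {
    "AAA": [1, 2, 3, 4, 5],
    "BBB": [2, 4, 6, 8, 11],
    "CCC": [5, 3, 4, 1, 2],
}
toy_edges = similar_pairs(toy_series, threshold=0.8)
```

Each tuple is a weighted edge, so a list like `toy_edges` can be loaded into a NetworkX graph with `Graph.add_weighted_edges_from` before drawing.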

## Paper: XPySom: High-Performance Self-Organizing Maps – Mancini et al 2020

In this paper, we introduce XPySom, a new open-source Python implementation of the well-known Self-Organizing Maps (SOM) technique. It is designed to achieve high performance on a single node, exploiting widely available Python libraries for vector processing on multi-core CPUs and GP-GPUs. We present results from an extensive experimental evaluation of XPySom in comparison to widely used open-source SOM implementations, showing that it outperforms the other available alternatives.

## Finding Similar Stocks Via Fast GPU Based Nearest Neighbors with Faiss

There are many ways to find stocks with similar behavior, depending on how one defines similarity and which data are used. In this article we use a 12-period channel where, for each period, the value is (current adjusted close price − minimum value) / (maximum value − minimum value). The maximum and minimum values are computed over the adjusted close prices of the past 21 trading days (representing a trading month), then 42 days, …, 252 days. This normalizes the channel so that all values lie in the interval [0, 1]. We use Euclidean distance as our similarity measure.

After transforming our data into normalized channels, our task then becomes finding the K nearest neighbors. We will use the Faiss Python library.
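The two steps described above, building the channel vector and searching for nearest neighbors, can be sketched in plain Python. This is an illustration rather than the article's code: the names `channel_vector` and `k_nearest` are hypothetical, the lookback windows are assumed to include the current day, and a flat window (maximum equal to minimum) is mapped to 0.5 by assumption. The brute-force search stands in for Faiss, whose `IndexFlatL2` index performs the same kind of exact L2 search on much larger sets.

```python
import math

# 21, 42, ..., 252 trading days: twelve lookback periods.
LOOKBACKS = [21 * k for k in range(1, 13)]


def channel_vector(closes, lookbacks=LOOKBACKS):
    """Map adjusted closes (oldest first) to one channel value per lookback.

    Each value is (current - min) / (max - min) over the last `n` closes,
    so it lies in [0, 1].
    """
    if len(closes) < max(lookbacks):
        raise ValueError("not enough history for the longest lookback")
    current = closes[-1]
    vector = []
    for n in lookbacks:
        window = closes[-n:]
        low, high = min(window), max(window)
        # Assumption: a flat window carries no information, so use 0.5.
        vector.append(0.5 if high == low else (current - low) / (high - low))
    return vector


def k_nearest(query, candidates, k):
    """Exact K nearest neighbors by Euclidean distance (brute force).

    `candidates` maps a name to its channel vector; returns (name, distance)
    pairs sorted from nearest to farthest.
    """
    ranked = sorted((math.dist(query, vec), name) for name, vec in candidates.items())
    return [(name, dist) for dist, name in ranked[:k]]


# Tiny made-up example with short lookbacks so it fits on screen.
toy_vector = channel_vector([1, 5, 2, 4, 3], lookbacks=[2, 4])
toy_neighbors = k_nearest([0.0, 0.0], {"X": [0.0, 0.1], "Y": [1.0, 1.0], "Z": [0.3, 0.4]}, k=2)
```

With real data, one channel vector per stock would be computed from at least 252 adjusted closes, and the query stock would be excluded from its own candidate set.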

## Fast GPU Based Nearest Neighbors with Faiss

Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning. Faiss is written in C++ with complete wrappers for Python/numpy. Some of the most useful algorithms are implemented on the GPU. It is developed by Facebook AI Research.

## Hyperparameter Search With GPyOpt: Part 3 – Keras (CNN) Classification and Ensembling

GPyOpt is a Python open-source library for Bayesian Optimization developed by the Machine Learning group of the University of Sheffield. It is based on GPy, a Python framework for Gaussian process modelling.

In this article, we demonstrate how to use this package to perform hyperparameter search for a classification problem with Keras.

## Hyperparameter Search With GPyOpt: Part 2 – XGBoost Classification and Ensembling

GPyOpt is a Python open-source library for Bayesian Optimization developed by the Machine Learning group of the University of Sheffield. It is based on GPy, a Python framework for Gaussian process modelling.

In this article, we demonstrate how to use this package to perform hyperparameter search for a classification problem with XGBoost.

## Paper: Seq2seq Fingerprint: An Unsupervised Deep Molecular Embedding for Drug Discovery – Xu et al 2017

In this paper, we propose a novel unsupervised molecular embedding method, providing a continuous feature vector for each molecule to perform further tasks, e.g., solubility classification. In the proposed method, a multi-layered Gated Recurrent Unit (GRU) network is used to map the input molecule into a continuous feature vector of fixed dimensionality, and then another deep GRU network is employed to decode the continuous vector back to the original molecule.

## Paper: CheMixNet: Mixed DNN Architectures for Predicting Chemical Properties using Multiple Molecular Representations – Paul et al 2018

In this interesting paper, the authors use a multi-input neural network to predict various small-molecule properties. One branch is a multilayer perceptron that takes MACCS fingerprints; the other is an RNN, a 1D CNN, or a 1D CNN-RNN that takes SMILES strings.

## Paper: A Tutorial on Bayesian Optimization – Frazier 2018

In this tutorial, we describe how Bayesian optimization works, including Gaussian process regression and three common acquisition functions: expected improvement, entropy search, and knowledge gradient. We conclude with a discussion of Bayesian optimization software and future research directions in the field.
