Quantocracy: This is a curated mashup of quantitative trading links.

Portfolio Management: A Deep Distributional RL Approach – David Pacheco Aznar

This thesis presents the development and implementation of a novel Deep Distributional Reinforcement Learning (DDRL) approach in quantitative finance: the Distributional Soft Actor-Critic (DSAC) with an LSTM embedding. The model is designed to further stabilize the widely used Soft Actor-Critic (SAC) deep reinforcement learning algorithm and is compared against traditional baselines such as Hierarchical Risk Parity, the Minimum Variance Portfolio, the DJIA, and an equal-weight portfolio. The results show higher returns with lower associated risk and greater stability than both SAC and the traditional baselines, in random-path validation as well as in a daily-frequency backtest. The distributional component gives the model an inherent sense of risk, the LSTM embedding sharpens its awareness of temporal dependencies, and the observation space is composed of multiple features derived from past returns. This thesis thus opens the door to further research on deep distributional reinforcement learning models in finance.
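
For readers who want a concrete picture of the architecture described above, the sketch below shows one plausible way to combine an LSTM embedding of past-return features with a quantile-based distributional critic in PyTorch. This is not the thesis code: the layer sizes, the quantile parameterization, and the names LSTMEmbedding and QuantileCritic are illustrative assumptions.

    # Minimal sketch (not the thesis code): an LSTM embedding over a window of
    # past-return features feeding a quantile-based distributional critic.
    import torch
    import torch.nn as nn

    class LSTMEmbedding(nn.Module):
        """Encodes a (batch, window, n_features) tensor of past-return features."""
        def __init__(self, n_features, hidden_dim=64):
            super().__init__()
            self.lstm = nn.LSTM(n_features, hidden_dim, batch_first=True)

        def forward(self, obs_window):
            _, (h_n, _) = self.lstm(obs_window)   # final hidden state summarizes the window
            return h_n[-1]                        # shape: (batch, hidden_dim)

    class QuantileCritic(nn.Module):
        """Outputs n_quantiles quantiles of the return distribution for a state-action pair."""
        def __init__(self, hidden_dim, n_assets, n_quantiles=32):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(hidden_dim + n_assets, 128), nn.ReLU(),
                nn.Linear(128, n_quantiles),
            )

        def forward(self, state_emb, weights):
            # weights are the portfolio weights proposed by the actor
            return self.net(torch.cat([state_emb, weights], dim=-1))

    # Example shapes: 30 assets, 3 features per asset, a 60-day observation window
    embed = LSTMEmbedding(n_features=90)
    critic = QuantileCritic(hidden_dim=64, n_assets=30)
    obs = torch.randn(8, 60, 90)                   # batch of observation windows
    w = torch.softmax(torch.randn(8, 30), dim=-1)  # batch of portfolio weights
    quantiles = critic(embed(obs), w)              # (8, 32) estimated return quantiles

The spread of these quantiles is what gives the agent an explicit, learnable measure of risk rather than a single expected value.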

Pyro – Deep Universal Probabilistic Programming

Pyro is a universal probabilistic programming language (PPL) written in Python and supported by PyTorch on the backend. Pyro enables flexible and expressive deep probabilistic modeling, unifying the best of modern deep learning and Bayesian modeling. It was designed with these key principles:

Universal: Pyro can represent any computable probability distribution.
Scalable: Pyro scales to large data sets with little overhead.
Minimal: Pyro is implemented with a small core of powerful, composable abstractions.
Flexible: Pyro aims for automation when you want it, control when you need it.
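
As a concrete illustration of these principles, here is a minimal Pyro sketch that infers the latent bias of a coin from observed flips using stochastic variational inference; the model, prior, and hyperparameters are illustrative choices, not taken from the Pyro documentation.

    import torch
    import pyro
    import pyro.distributions as dist
    from pyro.infer import SVI, Trace_ELBO
    from pyro.infer.autoguide import AutoNormal

    data = torch.tensor([1., 1., 0., 1., 1., 0., 1., 1., 1., 0.])  # observed coin flips

    def model(data):
        p = pyro.sample("p", dist.Beta(2.0, 2.0))      # prior over the coin's bias
        with pyro.plate("flips", len(data)):
            pyro.sample("obs", dist.Bernoulli(p), obs=data)

    guide = AutoNormal(model)                          # automatic variational family
    svi = SVI(model, guide, pyro.optim.Adam({"lr": 0.05}), loss=Trace_ELBO())

    for step in range(1000):
        svi.step(data)

    print(guide.median()["p"])                         # approximate posterior median of the bias

Because the model is ordinary Python code with named sample statements, the same pattern scales from this toy example to deep generative models whose distributions are parameterized by PyTorch networks.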

Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning – Felipe Petroski Such, Vashisht Madhavan, Edoardo Conti, Joel Lehman, Kenneth O. Stanley, Jeff Clune

Deep artificial neural networks (DNNs) are typically trained via gradient-based learning algorithms, namely backpropagation. Evolution strategies (ES) can rival backprop-based algorithms such as Q-learning and policy gradients on challenging deep reinforcement learning (RL) problems. However, ES can be considered a gradient-based algorithm because it performs stochastic gradient descent via an operation similar to a finite-difference approximation of the gradient. That raises the question of whether non-gradient-based evolutionary algorithms can work at DNN scales. Here we demonstrate they can: we evolve the weights of a DNN with a simple, gradient-free, population-based genetic algorithm (GA) and it performs well on hard deep RL problems, including Atari and humanoid locomotion. The Deep GA successfully evolves networks with over four million free parameters, the largest neural networks ever evolved with a traditional evolutionary algorithm. These results (1) expand our sense of the scale at which GAs can operate, (2) suggest intriguingly that in some cases following the gradient is not the best choice for optimizing performance, and (3) make immediately available the multitude of neuroevolution techniques that improve performance on RL problems. We demonstrate the latter by showing that combining DNNs with novelty search, which encourages exploration on tasks with deceptive or sparse reward functions, can solve a high-dimensional problem on which reward-maximizing algorithms (e.g. DQN, A3C, ES, and the GA) fail. Additionally, the Deep GA is faster than ES, A3C, and DQN (it can train Atari in ∼4 hours on one desktop or ∼1 hour distributed on 720 cores), and enables a state-of-the-art, up to 10,000-fold compact encoding technique.
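
To make the core mechanism concrete, below is a minimal sketch of this kind of gradient-free, population-based GA (truncation selection plus Gaussian mutation, no crossover), applied to a toy fitness function rather than Atari or humanoid locomotion; the population size, mutation scale, and toy objective are illustrative assumptions, not the paper's settings.

    # Gradient-free GA sketch: truncation selection + Gaussian mutation, no crossover.
    import numpy as np

    rng = np.random.default_rng(0)
    n_params = 1_000             # stands in for the flattened weights of a policy DNN
    pop_size, n_parents, sigma = 64, 8, 0.02

    def fitness(theta):
        """Toy stand-in for episode reward: prefer weights close to a fixed target."""
        return -np.mean((theta - 0.5) ** 2)

    population = [rng.standard_normal(n_params) * 0.1 for _ in range(pop_size)]
    for generation in range(100):
        scores = np.array([fitness(theta) for theta in population])
        parents = [population[i] for i in np.argsort(scores)[-n_parents:]]
        elite = parents[-1]      # best individual is carried over unchanged
        # Offspring = randomly chosen parent + Gaussian noise; no gradients anywhere.
        population = [elite] + [
            parents[rng.integers(n_parents)] + sigma * rng.standard_normal(n_params)
            for _ in range(pop_size - 1)
        ]

    print("best fitness after evolution:", fitness(elite))

In the paper, each individual is additionally stored as the sequence of random seeds that produced its mutations rather than as raw weights, which is what enables the up to 10,000-fold compact encoding mentioned above.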