In this paper the authors present XPySom, a Python implementation of the Self-Organizing Map (SOM) algorithm that runs in parallel on multi-core CPUs and on GPUs (via CuPy). They also speed up the algorithm in two ways:

First, they rely heavily on higher-dimensional operations (e.g., matrix-matrix products) that process a whole batch of input samples in each native call, so the number of native calls is proportional to the number of batches. Second, neurons are updated only once per epoch: each batch only needs to accumulate a numerator and a denominator, which results in far fewer operations overall.
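To make the numerator/denominator idea concrete, here is a minimal NumPy sketch of one batch-SOM epoch. It is an illustration written for this summary, not code taken from XPySom: the function name `batch_som_epoch`, the Gaussian neighborhood, and the flat `(n_units, 2)` grid layout are all assumptions. Because the arrays only use NumPy-style operations, the same logic could in principle run on a GPU by substituting CuPy arrays.

```python
import numpy as np

def batch_som_epoch(weights, grid, data, sigma, batch_size):
    """One epoch of a batch SOM update (illustrative sketch, not XPySom's code).

    weights: (n_units, dim) neuron weights, held fixed during the epoch
    grid:    (n_units, 2) map coordinates of each neuron
    data:    (n_samples, dim) input samples
    """
    numerator = np.zeros_like(weights)
    denominator = np.zeros(weights.shape[0])
    w_sq = (weights ** 2).sum(axis=1)  # constant for the whole epoch

    for start in range(0, len(data), batch_size):
        x = data[start:start + batch_size]
        # Squared distances for the whole batch via one matrix-matrix product.
        d2 = (x ** 2).sum(axis=1)[:, None] - 2.0 * (x @ weights.T) + w_sq[None, :]
        bmu = d2.argmin(axis=1)  # best-matching unit per sample
        # Gaussian neighborhood between each sample's BMU and every neuron.
        grid_d2 = ((grid[bmu][:, None, :] - grid[None, :, :]) ** 2).sum(axis=2)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))  # (batch, n_units)
        # Per-batch work is only this accumulation.
        numerator += h.T @ x
        denominator += h.sum(axis=0)

    # Single weight update at the end of the epoch.
    new_weights = weights.copy()
    mask = denominator > 0
    new_weights[mask] = numerator[mask] / denominator[mask][:, None]
    return new_weights

# Small usage example on random data (all values here are made up).
rng = np.random.default_rng(0)
rows, cols, dim = 4, 5, 3
grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
weights = rng.random((rows * cols, dim))
data = rng.random((32, dim))
small = batch_som_epoch(weights, grid, data, sigma=1.5, batch_size=8)
full = batch_som_epoch(weights, grid, data, sigma=1.5, batch_size=32)
```

Since the weights do not change within an epoch, the result in this sketch does not depend on the batch size (up to floating-point rounding), so the batch size can be chosen purely for throughput and memory.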

One useful aspect of this work is that it was forked from MiniSom, which includes some nice visualization features built on Matplotlib.

Code can be found at https://github.com/Manciukic/xpysom.

Below is the abstract of XPySom: High-Performance Self-Organizing Maps.

In this paper, we introduce XPySom, a new open-source Python implementation of the well-known Self-Organizing Maps (SOM) technique. It is designed to achieve high performance on a single node, exploiting widely available Python libraries for vector processing on multi-core CPUs and GP-GPUs. We present results from an extensive experimental evaluation of XPySom in comparison to widely used open-source SOM implementations, showing that it outperforms the other available alternatives. Indeed, our experimentation carried out using the Extended MNIST open data set shows a speed-up of about 7x and 100x when compared to the best open-source multi-core implementations we could find with multi-core and GP-GPU acceleration, respectively, achieving the same accuracy levels in terms of quantization error.