Open Source Machine Learning

SOM Plus Clustering

An extended self-organizing map library with smarter initialization options and built-in cluster quality metrics, designed to make unsupervised topology learning more rigorous and reproducible.

What is it?

Standard SOM implementations leave too much to chance: random initialization can produce wildly different maps across runs, and most libraries provide no way to evaluate whether the resulting clusters are actually meaningful.

SOM Plus extends the base algorithm with PCA-guided initialization for deterministic, data-aware starting weights, alongside silhouette score and Davies-Bouldin index evaluation so you can objectively compare runs and configurations. It grew out of the GDP trajectory clustering project and the functional group analysis work, where reliable clustering was critical to interpretation.


What it does

Extended SOM

Full self-organizing map implementation with configurable topology, neighbourhood functions, and learning rate decay.
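To make the moving parts concrete, here is a minimal NumPy sketch of an online SOM training loop on a rectangular grid with a Gaussian neighbourhood and exponential decay of both learning rate and radius. The function name train_som and all of its parameters are hypothetical and chosen for illustration only; they are not the library's API.

```python
import numpy as np

def train_som(data, rows=5, cols=5, n_iter=500, lr0=0.5, sigma0=2.0, seed=0):
    """Illustrative SOM training loop (hypothetical helper, not the library API)."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every neuron, shape (rows, cols, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    for t in range(n_iter):
        # Exponential decay of learning rate and neighbourhood radius.
        decay = np.exp(-3.0 * t / n_iter)
        lr, sigma = lr0 * decay, sigma0 * decay
        # Pick one sample and find its best-matching unit (BMU).
        x = data[rng.integers(len(data))]
        dists = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        # Gaussian neighbourhood centred on the BMU, measured on the grid.
        grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
        h = np.exp(-grid_d2 / (2.0 * sigma**2))
        # Pull each neuron toward the sample, weighted by its neighbourhood value.
        weights += lr * h[..., None] * (x - weights)
    return weights
```

Calling train_som(data, rows=4, cols=6) on an array of shape (n_samples, n_features) returns a weight array of shape (4, 6, n_features). Swapping the Gaussian for another neighbourhood function only changes the line that computes h.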

PCA Initialization

Seeds neuron weights along the principal components of the data, for faster convergence and reproducible maps.
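One common way to do this, shown here as a sketch and not necessarily how the library does it, is to spread the grid across the plane spanned by the first two principal components, scaled by their standard deviations. The helper name pca_init is hypothetical, and the sketch assumes at least two features and two samples.

```python
import numpy as np

def pca_init(data, rows, cols):
    """Illustrative PCA-based weight initialization (hypothetical helper)."""
    mean = data.mean(axis=0)
    centered = data - mean
    # Rows of vt are the principal directions; s holds the singular values.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Standard deviation of the data along the first two components.
    std = s[:2] / np.sqrt(len(data) - 1)
    # Evenly spaced offsets in [-1, 1] along each grid axis.
    a = np.linspace(-1.0, 1.0, rows)[:, None, None]
    b = np.linspace(-1.0, 1.0, cols)[None, :, None]
    # Each neuron sits at mean + offset along PC1 + offset along PC2.
    return mean + a * std[0] * vt[0] + b * std[1] * vt[1]
```

No random numbers are involved, so the same data always yields the same starting weights, which is what makes repeated runs comparable.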

🎲

Random Initialization

Traditional random init still available with seeding for exact reproducibility across experiments.

Silhouette Score

Measures how well separated the clusters are on a scale from -1 to 1; higher is better, and it is computed automatically after training.
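For reference, the metric can be sketched in a few lines of NumPy. The function below is an illustrative re-implementation with a hypothetical name, not the library's code; scikit-learn's silhouette_score computes the same quantity. It assumes labels is a NumPy array with at least two distinct cluster ids.

```python
import numpy as np

def silhouette(data, labels):
    """Mean silhouette coefficient with Euclidean distance (illustrative sketch)."""
    # Pairwise distance matrix, shape (n_samples, n_samples).
    dist = np.linalg.norm(data[:, None] - data[None, :], axis=-1)
    ids = np.unique(labels)
    scores = np.zeros(len(data))
    for i in range(len(data)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            continue  # singleton clusters score 0 by convention
        # a: mean distance to the other members of the point's own cluster.
        a = dist[i, same].sum() / (n_same - 1)
        # b: mean distance to the nearest other cluster.
        b = min(dist[i, labels == k].mean() for k in ids if k != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores.mean()
```

Values near 1 mean points sit much closer to their own cluster than to any other; values below 0 suggest points assigned to the wrong cluster.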

Davies-Bouldin Index

Ratio of within-cluster scatter to between-cluster separation; lower scores mean tighter, more distinct clusters.
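The ratio described above can be written out directly. This is a minimal NumPy sketch with a hypothetical function name, not the library's implementation; scikit-learn's davies_bouldin_score computes the same index. It assumes labels is a NumPy array with at least two distinct cluster ids.

```python
import numpy as np

def davies_bouldin(data, labels):
    """Davies-Bouldin index with Euclidean distance (illustrative sketch)."""
    ids = np.unique(labels)
    centroids = np.array([data[labels == k].mean(axis=0) for k in ids])
    # Within-cluster scatter: mean distance of members to their centroid.
    scatter = np.array([
        np.linalg.norm(data[labels == k] - c, axis=1).mean()
        for k, c in zip(ids, centroids)
    ])
    # Between-cluster separation: distance between centroids.
    sep = np.linalg.norm(centroids[:, None] - centroids[None, :], axis=-1)
    np.fill_diagonal(sep, np.inf)  # exclude each cluster paired with itself
    # For each cluster take its worst (largest) ratio, then average.
    ratios = (scatter[:, None] + scatter[None, :]) / sep
    return ratios.max(axis=1).mean()
```

For four points at (0, 0), (0, 1), (10, 0), (10, 1) grouped by their x coordinate, each cluster has scatter 0.5 and the centroids are 10 apart, so the index is (0.5 + 0.5) / 10 = 0.1.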

Research-ready

Designed for scientific use: clean API, NumPy-native, and compatible with SciPy analysis pipelines.


Built with

Python, NumPy, SciPy, scikit-learn, Matplotlib

See the code

Full source, examples, and API docs on GitHub.