Self Organising Maps

I was reminded of Self Organising Maps (SOMs) at last week's Neurocomputation lecture. I learnt about SOMs last year while on a craze to teach myself about neural nets. They are fascinating little buggers, I tell ya!

Knowing about SOMs came in pretty handy earlier this year when I designed and programmed a blog analyzer/classifier intended as a contribution to the mvblogs.org project. The "classifier" part used SOMs to do the magic. Sadly, I never got around to finishing an "analyzer" (the part that does the text and language processing) that I was happy with, and soon enough my interest waned and the effort died. I will probably tackle it sometime soon, now that my interest has been rekindled :-P Anyway, onto SOMs...

What are SOMs?
Self Organising Maps, also known as Kohonen networks in honour of their inventor, are a very interesting type of (artificial) neural network. A SOM has an input layer in which every input neuron is connected directly to every neuron in the output layer, and the output neurons are arranged as a grid.
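To make that structure a bit more concrete, here is a minimal sketch in plain Python. It assumes each output neuron simply stores one weight per input neuron, and that "closest" means smallest squared Euclidean distance. The names make_grid and best_matching_unit are my own, not from any library.

```python
import random

def make_grid(rows, cols, dim, seed=0):
    """Build a rows x cols grid of output neurons.

    Each neuron holds a weight vector of length dim, i.e. one weight
    per input neuron, initialised to random values in [0, 1).
    """
    rng = random.Random(seed)
    return [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_matching_unit(grid, sample):
    """Return (row, col) of the neuron whose weights are closest to sample."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(grid):
        for c, weights in enumerate(row):
            d = squared_distance(weights, sample)
            if d < best_dist:
                best, best_dist = (r, c), d
    return best
```

Calling best_matching_unit(make_grid(10, 10, 3), [0.5, 0.5, 0.5]) gives the grid position that an input vector "lands on".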

A SOM, when presented with training data, is able to train itself in such a way that "similar" data is placed close together on the grid. By "similar" I mean that inputs whose attributes have close values end up near each other, so the variation of any number of attributes of the input data is laid out across the output grid. Any type of data that can be broken down or converted to a vector of numbers, so that it can be mathematically manipulated, can be fed to the input of a SOM. Possible input data includes text blocks, books, images, surveys etc. This makes SOMs extremely powerful and useful as a tool for making a simple 2D/3D representation of highly complex, multi-dimensional data.
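As a rough illustration of "converting to a vector of numbers", here is one very simple way to turn a block of text into a vector: count how often each word from a fixed word list shows up. The word list and the function name text_to_vector are made up for this example, and a real analyzer would need far more careful language processing.

```python
from collections import Counter

def text_to_vector(text, vocabulary):
    """Relative frequency of each vocabulary word in the text.

    The result always has len(vocabulary) components, so every text
    maps to a vector of the same size.
    """
    words = text.lower().split()
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty text
    return [counts[word] / total for word in vocabulary]

# Hypothetical word list, chosen only for illustration.
vocabulary = ["neural", "network", "map", "colour"]
vector = text_to_vector("A neural network is a map", vocabulary)
```

Each text then becomes one input vector with as many components as there are words in the list.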

The algorithm for a SOM is quite simple and very elegant. If you are keen to learn more, try the paper "The Self Organising Map" by the creator Teuvo Kohonen himself. Alternatively, this simpler guide may be more accessible and a shorter read :p
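For a taste of how simple it is, here is a sketch of one common variant of the training loop in plain Python. The linearly decaying learning rate, the Gaussian neighbourhood, and the default parameter values are assumptions I picked for illustration, and train_som is my own name. See the paper for the variants Kohonen actually describes.

```python
import math
import random

def train_som(data, rows, cols, iterations=500, lr0=0.5, seed=0):
    """Train a rows x cols SOM on data (a list of equal-length vectors)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Every output neuron starts with random weights in [0, 1).
    grid = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]
    radius0 = max(rows, cols) / 2.0

    def dist_sq(r, c, sample):
        return sum((w - x) ** 2 for w, x in zip(grid[r][c], sample))

    for t in range(iterations):
        frac = t / iterations
        lr = lr0 * (1.0 - frac)                     # learning rate shrinks over time
        radius = max(radius0 * (1.0 - frac), 0.5)   # so does the neighbourhood
        sample = rng.choice(data)

        # Step 1: find the best matching unit (closest weights to the sample).
        br, bc = min(((r, c) for r in range(rows) for c in range(cols)),
                     key=lambda rc: dist_sq(rc[0], rc[1], sample))

        # Step 2: pull the BMU and its grid neighbours towards the sample,
        # with neurons further away on the grid pulled less.
        for r in range(rows):
            for c in range(cols):
                grid_dist_sq = (r - br) ** 2 + (c - bc) ** 2
                influence = math.exp(-grid_dist_sq / (2.0 * radius ** 2))
                weights = grid[r][c]
                for i in range(dim):
                    weights[i] += lr * influence * (sample[i] - weights[i])
    return grid
```

Because neighbours on the grid get dragged along with the winner, nearby neurons end up with similar weights, which is where the "self organising" comes from.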

SOM eye candy
One of the coolest demonstrations of an application of a SOM is color classification. In such a setup, a SOM is fed a set of colors - as vectors with components in RGB, CMYK or whatever representation we choose - and set to the task of "organizing" them. At the end of the run, the SOM has the colors all arranged neatly by (mostly) placing similar colors close to each other.

Here is a simple sample case where I fed a SOM a collection of 80 random colors.


Random 80 colors.

I then set the SOM to churn, and after 500 ticks the output grid has the colors neatly arranged!


Post SOM run...
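If you would rather have a number than trust your eyes, one rough way to check how "arranged" a grid of colors is would be to average the distance between each cell and its neighbours: lower means neighbouring colors are more alike. This is just a sketch of my own (mean_neighbour_distance is not a standard SOM metric name), assuming the grid is a list of rows of color vectors.

```python
import math

def mean_neighbour_distance(grid):
    """Average Euclidean distance between each cell and its right/down neighbour.

    grid is a list of rows, each row a list of color vectors (e.g. RGB).
    Smaller values mean adjacent cells hold more similar colors.
    """
    rows, cols = len(grid), len(grid[0])
    total, pairs = 0.0, 0
    for r in range(rows):
        for c in range(cols):
            # Looking only right and down counts every adjacent pair once.
            for nr, nc in ((r, c + 1), (r + 1, c)):
                if nr < rows and nc < cols:
                    total += math.dist(grid[r][c], grid[nr][nc])
                    pairs += 1
    return total / pairs if pairs else 0.0
```

Comparing the value for the random starting grid against the value after the run should show it dropping if the SOM did its job.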

Interesting stuff eh?