


1. Basic concept of the Self-Organizing Map (SOM)

A self-organizing map (SOM) is a type of artificial neural network (ANN) that uses unsupervised learning to produce a low-dimensional output.
SOMs differ from other ANNs in that they use a neighborhood function to preserve the topological properties of the input space. Like other ANNs, a SOM consists of components called nodes or neurons. Each node has a weight vector of the same dimension as the input data vectors and a position in the map space.
To place a vector from the data set onto the map, we find the node whose weight vector is closest to it, called the winning neuron. Once the closest node is located, the weight vectors in its neighborhood are updated toward the input. After training, we obtain a set of weight vectors that are organized into categories. Given an unseen input, we can then categorize or recognize the character using these weight vectors.
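As a rough illustration of that procedure, here is a minimal MATLAB sketch of a single training step for one flattened input vector; the variable names are illustrative only and differ from the full code in section 4.

x = rand(784, 1);                  % one input vector (stand-in for a flattened 28x28 image)
W = rand(784, 100);                % 100 map units, one weight column per unit
eta = 0.01;                        % learning rate for this sketch
d = zeros(1, 100);
for j = 1:100
    d(j) = norm(x - W(:, j), 2);   % distance from the input to unit j
end
[~, q] = min(d);                   % q is the winning neuron
W(:, q) = W(:, q) + eta*(x - W(:, q));   % move the winner toward the input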

2. Summary of the SOM algorithm


Step 1. Initialize parameters: the weight vectors w_j, the learning rate eta (0 < eta < 1), and the topological neighborhood function N_q(k).
Step 2. Check the stopping condition.
Step 3. For each training vector x, perform steps 4 through 7.
Step 4. Compute the best match of a weight vector with the input: q(x) = argmin_j ||x - w_j||.
Step 5. For all units in the specified neighborhood N_q(k), update the weight vectors according to w_j <- w_j + eta*(x - w_j).
Step 6. Adjust the learning rate parameter eta.
Step 7. Appropriately reduce the topological neighborhood N_q(k).
Step 8. Set k <- k + 1; then go to step 2.

3. Simulation setting
We can simulate the SOM algorithm using a given data set of character images.

We have a data set of 2,500 character images, which is sufficient to obtain reasonable weight vectors. I set the learning rate parameter to 0.0005 initially; it grows to about 1 by the end of training.
Each input image consists of 784 nodes, i.e. a 28*28 matrix.
We start from arbitrary random initial weight vectors.
They are updated during training and become a meaningful image set.
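As a sketch of this setup (the data matrix below is a random placeholder standing in for the real 2500x784 character data set):

data = rand(2500, 784);            % placeholder for the real character data set
x = reshape(data(1, :), 28, 28);   % one image as a 28x28 matrix
w = rand(28, 28, 100);             % 100 random initial weight "images" (a 10x10 map)
imagesc(x); colormap(gray);        % visualize one input character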



4. Source Code
To verify the algorithm I wrote the following MATLAB code.
% main.m
%% initialize %%
% data is assumed to be a 2500x784 matrix whose rows are flattened 28x28 images
sample_No = 2500;        % number of samples
eta_i = 0.0005;          % initial learning rate
eta = eta_i;             % learning rate
w = rand(28, 28, 100);   % initial weight vectors: a 10x10 map of 28x28 units

%% update the synaptic weight vectors
for k = 1:sample_No
    % calculate q(x), the index of the winning neuron --------------
    x = reshape(data(k,:), 28, 28);
    index = evaluate_q(x, w);
    %---------------------------------------------------------------
    [x_point, y_point] = find_grid(index); % grid position of the winning neuron
    for x_index = 1:10
        for y_index = 1:10
            % decide whether unit (x_index, y_index) lies within Nq
            sub = [x_point - x_index, y_point - y_index];
            n = norm(sub, 2);
            if n < 150000*exp(-1 - 4*(k/sample_No*5))
                % update weight vector if within Nq
                grid = (x_index - 1)*10 + y_index;
                w(:,:,grid) = w(:,:,grid) + eta*(x - w(:,:,grid));
            end
        end
    end
    % update learning rate (linear increase toward 1)
    eta = eta_i + 0.0004*k;
end
% evaluate_q.m
function index = evaluate_q(x, w)
% Return the index of the weight vector closest to the input x (the winning neuron).
dmin = Inf;
index = 1;
for i = 1:100
    wj = w(:,:,i);
    n = norm(x - wj, 2);
    if n < dmin
        dmin = n;
        index = i;
    end
end
end
% find_grid.m
function [x, y] = find_grid(index)
x = floor((index - 1)/10) + 1;   % consistent with grid = (x-1)*10 + y in main.m
y = index - (x - 1)*10;
end



5. Finding the winning neuron
To find the best-matching weight vector I used the 2-norm, n = norm(x - wj, 2). It worked well, and it also gives the index i of the winning weight vector.
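As a quick sanity check of evaluate_q (the random weights here are only a stand-in, not trained vectors):

w_test = rand(28, 28, 100);        % stand-in weight vectors
x_test = w_test(:,:,37);           % an input identical to unit 37's weights
idx = evaluate_q(x_test, w_test);  % the distance to unit 37 is zero, so idx should be 37
disp(idx);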
6. Neighborhood function
I used a circular neighborhood rather than a hexagonal one, because it is simple and performs well. At first the neighborhood function covers all of the weight vectors, since its radius is large enough regardless of where the winning neuron is located.
The neighborhood then shrinks exponentially; I update it with the exponential expression
150000*exp(-1-4*(k/sample_No*5))
which reduces the topological neighborhood Nq(k) appropriately: for roughly the first 40% of the iterations the neighborhood covers all of the weight vectors. The graph is below.

For 1000 < k < 2500, the neighborhood function covers only some of the weight vectors.
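The curve can be reproduced with a few lines (a sketch; the threshold expression is copied from main.m):

sample_No = 2500;
k = 1:sample_No;
radius = 150000*exp(-1 - 4*(k/sample_No*5));      % neighborhood threshold at iteration k
max_grid_dist = norm([9 9], 2);                   % largest possible distance on the 10x10 grid
semilogy(k, radius); hold on;
semilogy(k, max_grid_dist*ones(size(k)), '--');   % while the threshold stays above this line, the whole map is updated
xlabel('k'); ylabel('neighborhood threshold'); legend('threshold', 'max grid distance');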

7. Learning rate
The learning rate increases linearly from 0.0005 to about 1. At first the neighborhood function covers a very large range, so the learning rate needs to be small. As the neighborhood range shrinks, the learning rate grows, reaching roughly 1 at the final iteration.
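The schedule is the one line from main.m, plotted here for reference:

sample_No = 2500;
k = 1:sample_No;
eta = 0.0005 + 0.0004*k;           % grows linearly from about 0.0005 to about 1
plot(k, eta); xlabel('k'); ylabel('learning rate');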
  
8. Result

 
After 2,500 iterations, the weight vectors have formed into legible character shapes.
Given an unseen data set, we can classify it using the trained weight vectors.
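A minimal sketch of that classification step, assuming w is the trained map from main.m and unit_labels is a hypothetical 1x100 array assigning a character to each map unit (such a labeling step is not part of the code above):

x_new = rand(28, 28);              % stand-in for an unseen 28x28 character image
unit_labels = repmat('?', 1, 100); % hypothetical per-unit character labels
index = evaluate_q(x_new, w);      % winning neuron for the unseen input (w: trained weights)
predicted_char = unit_labels(index);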
