RADIAL BASIS FUNCTION (RBF) NETWORKS

ADAPTIVE K-MEANS CLUSTERING ALGORITHM

One cluster center is updated every time an input vector is chosen at random from the input data set.

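The cluster center $\mu_k$ nearest to the chosen input vector $x$ has its position updated using $\Delta\mu_k = \eta\,(x - \mu_k)$, where $\eta$ is the learning rate.

Notice how the cluster centre $\mu_k$ is moved closer to $x$, because this update reduces the error vector $(x - \mu_k)$.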
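A minimal sketch of one adaptive update step, assuming the usual rule in which the nearest center moves a fraction of the way (the learning rate) toward the chosen input vector. The function names, the toy data and the starting centers are hypothetical and are not the lecture's data set.

```python
import random

def nearest_center(x, centers):
    """Return the index of the center closest to x (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda k: sum((xi - ci) ** 2 for xi, ci in zip(x, centers[k])))

def adaptive_kmeans_step(x, centers, eta=0.1):
    """Move the center nearest to x a fraction eta of the way toward x (in place)."""
    k = nearest_center(x, centers)
    centers[k] = [ci + eta * (xi - ci) for xi, ci in zip(x, centers[k])]
    return k

# Hypothetical 2-D data and starting centers (assumptions for illustration only).
data = [[-1.0, 0.2], [-0.8, 0.4], [0.9, -0.3], [0.7, -0.1]]
centers = [[-0.5, 0.0], [0.5, 0.0]]

rng = random.Random(0)          # fixed seed so the run is repeatable
for _ in range(200):            # one input vector is drawn at random per update
    adaptive_kmeans_step(rng.choice(data), centers, eta=0.1)
```

Because the loop has no natural stopping criterion, the number of updates (200 here) is an arbitrary choice.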

This algorithm is typically used for clustering the input vectors in real-time as they are acquired sequentially.

Note that the batch K-means clustering algorithm is much better than the adaptive version because it has a stopping criterion and does not require a learning rate.
The results of applying the adaptive clustering algorithm to the example shown in the previous lecture are presented below, using a learning rate of 0.1. The diamond-shaped points are the cluster centers found by this adaptive algorithm.

[Figure: adpative.gif]

RADIAL BASIS FUNCTION (RBF) NETWORKS

Using a clustering procedure (K-means batch or adaptive) creates a set of cluster centers $\mu_k$, each of which can be thought of as the average input vector for the k^th cluster, or more appropriately, as the prototype vector for that cluster.

[Figure: rbf2.gif]

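The output of neuron i in the output layer of an RBF network with M hidden neurons is
$y_i(x) = \sum_{j=0}^{M} w_{ij}\,\phi_j(x) = w_{i0} + \sum_{j=1}^{M} w_{ij}\,\phi_j(x)$
since the bias output $\phi_0 = 1$.

The basis function is
$\phi_j(x) = \exp\left(-\frac{\|x - \mu_j\|^2}{2\sigma_j^2}\right)$
where $x$ is the input vector
$\mu_j$ is the j^th prototype vector
$\sigma_j$ is the width of the Gaussian of that prototype or cluster centre
$\|x - \mu_j\|$ is the Euclidean distance between $x$ and $\mu_j$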
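The definitions above can be sketched in code. This assumes the common Gaussian form exp(-d^2 / (2 sigma^2)), where d is the Euclidean distance to the prototype, and a bias input fixed at 1. The function names `rbf` and `rbf_output` are hypothetical.

```python
import math

def rbf(x, mu, sigma):
    """Gaussian basis function: exp(-||x - mu||^2 / (2 sigma^2)) (assumed form)."""
    d2 = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def rbf_output(x, centers, sigmas, weights, bias):
    """Linear output neuron: bias weight plus the weighted sum of basis outputs."""
    return bias + sum(w * rbf(x, mu, s)
                      for w, mu, s in zip(weights, centers, sigmas))
```

For example, `rbf(mu, mu, sigma)` is 1 for any prototype `mu`, because the distance is zero.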

NOTE

Consider the Gaussian function $f(x) = \exp\left(-\frac{(x - a)^2}{2\sigma^2}\right)$, where a is the mean and $\sigma$ is the width of the function. A plot of this function versus x is shown below for a = 4, and $\sigma$ = 1 (red curve) and 0.5 (blue curve).

[Figure: gauss.gif]

Notice that for x = 4, (x - a) = 0, hence f(x) = 1.
The key feature is that if x is within a width $2\sigma$ of the mean a, then the output of the function has a significant value. Otherwise it is very close to zero!
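To put numbers on this, here is a short worked evaluation, assuming the Gaussian has the form $f(x) = \exp(-(x-a)^2 / (2\sigma^2))$. The values depend only on how many widths x lies from the mean:

```latex
\begin{aligned}
f(a)            &= e^{0}     = 1 \\
f(a \pm \sigma)  &= e^{-1/2} \approx 0.607 \\
f(a \pm 2\sigma) &= e^{-2}   \approx 0.135 \\
f(a \pm 3\sigma) &= e^{-9/2} \approx 0.011
\end{aligned}
```

So at a distance of $2\sigma$ the output has already dropped to about 13.5% of its peak, and by $3\sigma$ it is roughly 1%.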

Hence from the basis function $\phi_j(x)$, we see that the output only has a significant value if the input vector $x$ lies within a radius $2\sigma_j$ of the prototype vector $\mu_j$.

EXAMPLE:

For the normalized input vectors x, the cluster centers after applying the batch K-means algorithm are $\mu_1$ = (-0.834, 0.27) and $\mu_2$ = (0.834, -0.27). The circles filled in blue belong to one cluster, as determined by Euclidean distance calculations.

For the RBF network, we shall use only two neurons in the hidden layer.

The graph below shows the output of the hidden neurons if

KEY:

Red lines are

Blue lines are

Horizontal axis j is the index number of the vector

The graph below shows the output of the hidden neurons if

Notice how the output of a hidden neuron is only significant if the input vector lies within a radius of $2\sigma$ around the cluster center.
Notice that the width $\sigma$ of the Gaussian function critically affects the hidden neuron outputs.
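The effect of the width can be sketched numerically using the two cluster centers quoted in the example. The input vector and the two widths below are hypothetical values chosen for illustration, and the Gaussian form exp(-d^2 / (2 sigma^2)) is assumed.

```python
import math

MU1 = (-0.834, 0.27)   # cluster centers quoted in the example
MU2 = (0.834, -0.27)

def phi(x, mu, sigma):
    """Gaussian hidden-neuron output (assumed form)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, mu))
    return math.exp(-d2 / (2.0 * sigma ** 2))

x = (-0.5, 0.1)              # hypothetical input vector, closer to MU1
for sigma in (0.2, 1.0):     # hypothetical narrow and wide widths
    print(sigma, round(phi(x, MU1, sigma), 4), round(phi(x, MU2, sigma), 4))
```

With the narrow width, only inputs very close to a center produce a noticeable output. With the wide width, both hidden neurons respond to the same input.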

The above graphs illustrate that we need more than two hidden neurons to cover the input vector space! In general, the number of hidden neurons, or equivalently basis function centers, needs to be much greater than the number of clusters in the data.

The basis function widths are set once the clustering procedure is complete. The basis functions should overlap to some extent in order to give a relatively smooth representation of the data.

Typically, the width $\sigma_j$ for a given cluster center $\mu_j$ is set to the average Euclidean distance between the center and the training vectors which belong to that cluster.
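A minimal sketch of this width rule, assuming each training vector belongs to its nearest center. The function name `cluster_widths` is hypothetical, and returning 0.0 for an empty cluster is a placeholder choice rather than part of the method.

```python
import math

def cluster_widths(data, centers):
    """Width of each center = mean Euclidean distance to the vectors assigned to it."""
    distances = [[] for _ in centers]
    for x in data:
        # Assign x to its nearest center and record the distance to that center.
        j = min(range(len(centers)), key=lambda k: math.dist(x, centers[k]))
        distances[j].append(math.dist(x, centers[j]))
    # Placeholder: an empty cluster gets width 0.0 and would need separate handling.
    return [sum(d) / len(d) if d else 0.0 for d in distances]
```

For instance, with centers (1, 0) and (10, 2) and data (0, 0), (2, 0), (10, 0), (10, 4), every vector is at distance 1 from the first center or distance 2 from the second, so the widths come out as 1 and 2.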

Learning the output layer of the RBF network takes place only after the centers $\mu_j$ and widths $\sigma_j$ have been determined. Note that this first phase of training is entirely unsupervised.

In the second phase of training, the weights between the hidden layer and the output layer are determined.

RECALL: MLP NEURAL NETWORK

For an RBF network, the activation function of the output neurons is linear, i.e. $f(a) = a$.

Hence

Therefore
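A sketch of the reasoning, assuming the standard delta rule for an output neuron with activation $f$ applied to the weighted sum $a_i$. The symbols $t_i$ (target output), $y_i$ (actual output) and $\eta$ (learning rate) are notation chosen here for illustration:

```latex
\begin{aligned}
\Delta w_{ij} &= \eta\,\delta_i\,\phi_j(x), \qquad \delta_i = (t_i - y_i)\,f'(a_i) \\
f(a) = a \;&\Rightarrow\; f'(a) = 1 \;\Rightarrow\; \delta_i = t_i - y_i \\
\Delta w_{ij} &= \eta\,(t_i - y_i)\,\phi_j(x)
\end{aligned}
```

Under these assumptions the output-layer update no longer involves a derivative term, which is why the second training phase reduces to the single-layer case.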

BACK TO THE EXAMPLE FROM PREVIOUS LECTURES

In the previous lecture, we considered the use of two hidden neurons in the RBF network as shown below.

[Figure: rbf.gif]

This type of network is called a radial basis function network because the activation functions of the hidden neurons are Gaussian functions, which are radially symmetric.

The word basis is also used in the name for an important reason, which is best illustrated by a plot of the two hidden neuron outputs, $\phi_1$ and $\phi_2$, against each other, as shown below. The blue dots are the input vectors assigned to one cluster. The remaining dots belong to the other cluster.

Notice how plotting $\phi_1$ and $\phi_2$ against each other transforms the input vector space into a space in which the two clusters are easily separated by a straight line!

[Figure: lizaphi.gif]

Since there is only one neuron in the output layer, the scenario is exactly the case of a single layer perceptron, for which we recall that for n = 2 the decision boundary is the straight line $w_0 + w_1 x_1 + w_2 x_2 = 0$.
If we apply the weight update formula, the results are shown below, in which the weight on $\phi_1$ is represented by w1 and the weight on $\phi_2$ is represented by w2. These weights are used to plot the line over the plot of the hidden neuron outputs.

Initial weights were set randomly to w0 = -0.1, w1 = -0.2, w2 = 0.1, and all 14 training vectors were applied in order, three times in total.
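The training procedure can be sketched as follows, starting from the initial weights quoted above. Everything else is an assumption made for illustration: the delta rule with a linear output, targets of +1 and -1, a learning rate of 0.1, and six hypothetical hidden-layer output pairs standing in for the 14 training vectors.

```python
def train_output_layer(phis, targets, w, eta=0.1, epochs=25):
    """Delta-rule training of one linear output neuron.

    phis    : list of (phi1, phi2) hidden-layer outputs
    targets : desired outputs, one per vector
    w       : [w0, w1, w2], where w0 is the bias weight
    """
    w = list(w)                                   # do not modify the caller's list
    for _ in range(epochs):
        for (p1, p2), t in zip(phis, targets):    # vectors applied in order
            y = w[0] + w[1] * p1 + w[2] * p2      # linear output
            err = t - y
            w[0] += eta * err                     # bias input is 1
            w[1] += eta * err * p1
            w[2] += eta * err * p2
    return w

# Hypothetical hidden-layer outputs (not the lecture's 14 vectors).
phis = [(0.9, 0.1), (0.8, 0.2), (0.7, 0.1), (0.1, 0.9), (0.2, 0.8), (0.1, 0.7)]
targets = [1, 1, 1, -1, -1, -1]

w = train_output_layer(phis, targets, w=[-0.1, -0.2, 0.1])
predictions = [1 if w[0] + w[1] * p1 + w[2] * p2 > 0 else -1 for p1, p2 in phis]
```

The separating line in the space of hidden outputs is then the set of points where w[0] + w[1] * phi1 + w[2] * phi2 equals zero.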

The animation below shows what happens if the network is presented with the 14 input training vectors a total of 25 times.

Yes, I know it's cool. It took me a long time to create!

Reading Paper:

Growing RBF structures using self-organizing maps