README
Back to All Sims (also for general info and executable downloads)
Introduction
The previous err_driven_hidden project showed that error-driven learning with a hidden layer is needed to solve difficult problems. This simulation shows the complementary advantages of Hebbian learning when combined with error-driven learning to support generalization.

In this project we have expanded the size of all layers to accommodate a richer set of inputs, specifically those that involve the same combinations of horizontal and vertical lines that we saw in self_org. But in this case we are asking the network to perform a task, and so it also has an output layer. The task, which the network has to learn, is to label which combination of lines is present in the input. The first column of output units corresponds to the vertical lines, and the second column corresponds to the horizontal lines. For example, the event V0_H1, which combines the leftmost vertical line with the second horizontal line from the bottom, has an output layer target consisting of the bottom unit in the left column (for V0) and the second unit from the bottom in the right column (for H1). Click on Lines2 to familiarize yourself with these patterns.
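As a rough illustration of this target encoding, here is a minimal sketch in Python. The function name, the use of a 5-position grid, and the array layout are assumptions for illustration only, not taken from the simulation code.

```python
# Hypothetical sketch of the two-column output target encoding described above.
import numpy as np

N_LINES = 5  # assumed number of vertical / horizontal line positions

def target_pattern(v, h):
    """Left column codes vertical line V<v>, right column codes horizontal
    line H<h>; index 0 is the bottom unit in each column."""
    targ = np.zeros((N_LINES, 2))
    targ[v, 0] = 1.0  # left column: vertical line
    targ[h, 1] = 1.0  # right column: horizontal line
    return targ

# Event V0_H1: leftmost vertical line + second horizontal line from the bottom
print(target_pattern(0, 1))
```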
- Click Init and Step Trial to see one of the patterns presented to the network. You should see the correct target in the output layer at the end of the trial, as the network is shown that correct target pattern. If you then click on Phase -> ActM instead of Act in the Netview, you will see what the network 'guessed' for the labels in the minus phase, before it was told what the correct labels were. This should usually be wrong, since the network has not had the opportunity to learn the labels yet.
- Set Step to Run and click Step, and then switch to the Train Epoch Plot to view the percent errors over epochs of training. You should see that the network has trouble learning, because the default learning rule is Hebbian. Now change the learning rule in the Control Panel to Error Driven, click Init, and Step Run again. You should now see that the errors go down to zero within a few epochs as the network learns to label the different combinations of lines using pure error-driven learning. You can switch to the Netview and step through a few trials looking at Phase -> ActM to convince yourself that it has really learned these. Click Wts -> r.Wt and look at a few of the hidden units to see whether they learn systematic representations (or alternatively switch to the Weights tab to display a grid view of all of the synaptic weights, just like in self_org). A brief sketch of how the two weight-update rules differ follows this step.
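For reference, here is a minimal sketch contrasting the two kinds of weight updates just mentioned. It is illustrative only: the simulation actually uses Leabra's CPCA Hebbian and XCal error-driven forms, and the function names and learning rate here are assumptions.

```python
# Hypothetical sketch of Hebbian vs. error-driven weight changes (illustrative only).

def hebbian_dwt(x, y, w, lrate=0.04):
    """CPCA-style Hebbian update: when the receiver is active (y), move the
    weight toward the sender activity (x). Learns what inputs co-occur,
    independent of any task error."""
    return lrate * y * (x - w)

def error_driven_dwt(x, y_minus, y_plus, lrate=0.04):
    """Delta-rule-style error-driven update: change weights in proportion to
    the difference between outcome (plus phase) and expectation (minus phase),
    so weights only change to the extent the network's guess was wrong."""
    return lrate * (y_plus - y_minus) * x
```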
Question 4.10: Do the hidden units learn to represent the individual lines like they did in self_org? Why or why not? Explain in terms of how the learning rule is designed to adjust weights when minimizing error.
- For this project we have not actually trained the network with all possible combinations of horizontal and vertical lines. We deliberately left out some novel combinations of lines that the network has not seen together before. These can then be used to test whether the network correctly generalizes to new combinations without having to memorize them. Each time a new network is run, the program automatically selects 15% of the line combinations at random and puts them in a Test table.
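A minimal sketch of such a random 15% holdout split is shown below. The simulation does this internally; the function name and event labels here are illustrative assumptions.

```python
# Hypothetical sketch of the random 15% train/test split described above.
import random

def split_patterns(all_events, test_frac=0.15, seed=None):
    """Randomly hold out test_frac of the line combinations for testing."""
    rng = random.Random(seed)
    events = list(all_events)
    rng.shuffle(events)
    n_test = max(1, round(test_frac * len(events)))
    return events[n_test:], events[:n_test]  # (train, test)

# e.g. all 25 V x H combinations
all_events = [f"V{v}_H{h}" for v in range(5) for h in range(5)]
train, test = split_patterns(all_events, seed=1)
```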
- While viewing Act on the Netview, set the run mode to Test instead of Train, and then click Init and Run, which will step through all of the new combinations of lines the network has never seen together before. (You can also step through one at a time if you want.) Look at the output patterns on the network and compare them to the Act / Targ values, which show the target (i.e., what the network should have responded). You can also switch to the Test Trial tab to see all the test trials and what the network guessed (shown in the second-to-last column, as the output activations) compared to what the correct answer would have been (the target) in the last column. To get a broader sense of the performance across multiple networks, you can set the mode back to Train, click Run, and let it run through 10 networks with different initial weights and different permutations of training/test patterns. Switch to viewing the Test Epoch Plot tab, where you will see a graph of the network's percent error on the test data after every 5 epochs of training as each of the 10 networks learns. (Again, you can confirm that the networks are learning the training patterns by looking at the Train Epoch Plot.)
Question 4.11: Report what you see in the output for the test trials and across epochs of learning and runs. On average, does the network generalize its learning by reporting the correct combinations of lines for these novel patterns? Consider why this might be, in terms of the internal representations learned by the hidden layer in the earlier question.
- Now switch the learning rule from ErrorDriven back to Hebbian, click Init, and Step Run again. Although the network can't learn the task, if you click on the Weights tab (or r.Wt on the Netview) you should see that the hidden units develop learned representations of the lines similar to what you saw in self_org. Thus, even though Hebbian learning fails to learn the output labeling task (i.e., it is not able to 'report' which lines it sees in the output), the internal representations that it develops are still sensitive to the independent components.
- Let's see if we can leverage this learning to improve generalization. Switch the learning rule to ErrorHebbIn and click Init. This maintains purely Hebbian learning in the Input to Hidden projection (setting Learn.XCal LLrn=1, MLrn=0, where L = amount of BCM long-term running average, M = error-driven medium-term), but now uses pure error-driven learning in the connections between Hidden and Output (LLrn=0, MLrn=1). Set the mode to Train, click Run, and run a full set of 10 networks. First confirm that the network can learn the training set (by looking at the Train Epoch Plot to see if the errors go down to zero). Then look at the Test Epoch Plot tab, which will again display the percent errors on the novel combinations of lines for each network after every 5 epochs of training. (You can also click the Test Trial tab again to see whether any individual network is generalizing on each trial.) A sketch of how the LLrn / MLrn mix combines the two rules appears below.
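To make the LLrn / MLrn mixing more concrete, here is a rough Python sketch of how an XCal-style update might combine the Hebbian (BCM long-term) and error-driven (medium-term) terms. This is not the actual leabra source; the function signatures, default thresholds, and learning rate are assumptions for illustration.

```python
# Hypothetical sketch of mixing Hebbian (BCM-like) and error-driven learning
# via the LLrn / MLrn weights discussed above.

def xcal(x, th, d_rev=0.1, d_thr=0.0001):
    """Piecewise-linear 'check mark' contrast function (assumed parameter values)."""
    if x < d_thr:
        return 0.0
    if x > th * d_rev:
        return x - th
    return -x * (1.0 - d_rev) / d_rev

def dwt(srs, srm, avg_l, m_lrn, l_lrn, lrate=0.04):
    """Weight change for one synapse.
    srs:   short-term sender*receiver coproduct (plus phase / outcome)
    srm:   medium-term coproduct (minus phase / expectation) -> error-driven term
    avg_l: receiver's long-term running average activity    -> Hebbian/BCM term
    m_lrn, l_lrn: mixing weights (ErrorDriven: MLrn=1, LLrn=0; Hebbian: MLrn=0, LLrn=1)
    """
    err = xcal(srs, srm)     # error-driven: outcome vs. expectation
    bcm = xcal(srs, avg_l)   # Hebbian/BCM: activity vs. long-term average
    return lrate * (m_lrn * err + l_lrn * bcm)
```

In this framing, ErrorHebbIn simply uses one setting of the mix (LLrn=1, MLrn=0) for the Input-to-Hidden projection and the other (LLrn=0, MLrn=1) for the Hidden-to-Output projection.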
Question 4.12: On average, does Hebbian learning in the hidden layer help the network perform better at generalizing to new items? Why or why not? Consider more generally how the combination of learning rules might be useful for the brain.