Artificial Neural Net - Multilayer

Uri_dolphin3 Uri Wilensky (Author)

Tags

ccl 

Tagged by Uri Wilensky almost 11 years ago

computer science 

Tagged by Reuven M. Lerner almost 11 years ago

Model group CCL | Visible to everyone | Changeable by group members (CCL)
Model was written in NetLogo 5.0.4 • Viewed 905 times • Downloaded 54 times • Run 0 times

WHAT IS IT?

This is a model of a very small neural network. It is based on the Perceptron model, but instead of one layer, this network has two layers of "perceptrons". Furthermore, the layers activate each other in a nonlinear way. These two additions mean it can learn operations that a single layer cannot.

The goal of a network is to take input from its input nodes on the far left and classify those inputs appropriately in the output nodes on the far right. It does this by being given a lot of examples and attempting to classify them, with a supervisor telling it whether each classification was right or wrong. Based on this feedback, the neural network updates its weights until it classifies all inputs correctly.

HOW IT WORKS

Initially the weights on the links of the network are random.

The nodes on the left are called the input nodes, the nodes in the middle are called the hidden nodes, and the node on the right is called the output node.

The activation values of the input nodes are the inputs to the network. The activation value of each hidden node is computed by multiplying the activation value of each input node (and of the bias node, whose activation is always 1) by the weight of its link to that hidden node, summing the results, and passing the sum through the sigmoid function. Similarly, the activation value of the output node is computed by multiplying the activation values of the hidden nodes (and the bias node) by their link weights, summing them, and passing the sum through the sigmoid function. The output of the network is 1 if the activation of the output node is greater than 0.5 and 0 otherwise.
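
Written out for this network of two input nodes, two hidden nodes, and one output node, the forward pass described above looks roughly as follows. The symbols are notation introduced here for illustration, not names from the model: x_1 and x_2 are the input activations, h_j the hidden activations, o the output activation, w_ij the weight on the link from input i to hidden node j, v_j the weight on the link from hidden node j to the output node, b_j and b_o the weights on the bias links, and sigma the sigmoid function.

```latex
\begin{align}
h_j &= \sigma\Big( b_j + \sum_{i=1}^{2} w_{ij}\, x_i \Big), \qquad j = 1, 2 \\
o &= \sigma\Big( b_o + \sum_{j=1}^{2} v_j\, h_j \Big) \\
\text{output} &=
  \begin{cases}
    1 & \text{if } o > 0.5 \\
    0 & \text{otherwise}
  \end{cases}
\end{align}
```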

The sigmoid function maps negative values to values between 0 and 0.5, maps 0 to 0.5, and maps positive values to values between 0.5 and 1. Its output increases nonlinearly from 0 toward 1, rising most steeply around an input of 0, where it crosses 0.5.
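
To get a feel for the shape of the curve, a small sketch like the one below prints a few sigmoid values. The procedure names sigmoid-sketch and print-sigmoid-table are made up for this illustration and are not part of the model; the formula is the same one the model's own sigmoid reporter uses.

```netlogo
;; Illustrative sketch only: these procedure names are not part of the model.

;; Same formula as the model's sigmoid reporter.
to-report sigmoid-sketch [x]
  report 1 / (1 + e ^ (- x))
end

;; Prints the sigmoid of -6, -4, ..., 6, one value per line.
to print-sigmoid-table
  let x -6
  while [x <= 6] [
    print (word x " -> " precision (sigmoid-sketch x) 4)
    set x x + 2
  ]
end
```

Running print-sigmoid-table from the Command Center should show values close to 0 for strongly negative inputs, exactly 0.5 for an input of 0, and values close to 1 for strongly positive inputs.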

To train the network, many inputs are presented to it, each paired with the classification the network should produce. The network uses a back-propagation algorithm to pass the error back from the output node and uses this error to update the weight on each link.
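
The update made for each training example can be summarized as below. This is a sketch that mirrors the back-propagate procedure in the model's code; the symbols are notation introduced here, not names from the model. Here t is the target answer (0 or 1), o the output activation, h_j the activation of hidden node j, v_j the weight on the link from hidden node j to the output node, eta the LEARNING-RATE, a_from the activation of the node a link starts at, and delta_to the error term of the node it ends at.

```latex
\begin{align}
\delta_o &= o\,(1 - o)\,(t - o) \\
\delta_j &= h_j\,(1 - h_j)\, v_j\, \delta_o, \qquad j = 1, 2 \\
w &\leftarrow w + \eta\, \delta_{\text{to}}\, a_{\text{from}} \qquad \text{for every link}
\end{align}
```

Both error terms are computed before any weight changes, so the hidden-node errors use the weights as they were during the forward pass. The squared error (t - o)^2 of each example is accumulated and averaged over the epoch to give the reported epoch error.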

HOW TO USE IT

To use it, press SETUP to create the network and initialize the weights to small random numbers.

Press TRAIN ONCE to run one epoch of training. The number of examples presented to the network during this epoch is controlled by the EXAMPLES-PER-EPOCH slider.

Press TRAIN to continually train the network.

In the view, the thicker the link, the greater the magnitude of its weight. If the link is red then it has a positive weight. If the link is blue then it has a negative weight.

If SHOW-WEIGHTS? is on then the links will be labelled with their weights.

To test the network, set INPUT-1 and INPUT-2, then press the TEST button. A dialog box will appear telling you whether or not the network was able to correctly classify the input that you gave it.
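
Since there are only four possible input pairs, it can be convenient to check them all at once. The reporter below is a hypothetical helper, not part of the original model; it is meant to be pasted into the model's Code tab and relies on the model's existing test-success? reporter.

```netlogo
;; Hypothetical helper, not part of the original model.
;; Reports how many of the four possible input pairs the network
;; currently classifies correctly for the selected TARGET-FUNCTION.
;; Note: test-success? overwrites the input node activations.
to-report count-correct
  let correct 0
  if test-success? 0 0 [ set correct correct + 1 ]
  if test-success? 0 1 [ set correct correct + 1 ]
  if test-success? 1 0 [ set correct correct + 1 ]
  if test-success? 1 1 [ set correct correct + 1 ]
  report correct
end
```

Typing show count-correct in the Command Center after some training reports a number from 0 to 4; a result of 4 means the network classifies every input pair correctly.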

LEARNING-RATE controls how much the neural network will learn from any one example.

TARGET-FUNCTION allows you to choose which function the network is trying to solve.

THINGS TO NOTICE

Unlike the Perceptron model, this model is able to learn both OR and XOR. It is able to learn XOR because the hidden layer (the middle nodes) and the nonlinear activation allow the network to draw two lines that divide the inputs into positive and negative regions. A perceptron with a linear activation can only draw a single line. As a result, one of the hidden nodes essentially learns the OR function (if either of the inputs is on, it should be on), and the other hidden node learns an exclusion function (if both of the inputs are on, it should be on, but it is weighted negatively).
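
The two-line idea can be illustrated with hand-picked weights. The sketch below is not part of the model, and the weights (20, -10, -30) are chosen by hand for clarity rather than being the values training will necessarily find. One hidden value behaves like OR, the other like AND, and the output subtracts the second from the first.

```netlogo
;; Illustrative sketch with hand-picked weights; not part of the model.

to-report squash [x]
  report 1 / (1 + e ^ (- x))
end

;; x1 and x2 are expected to be 0 or 1.
to-report hand-wired-xor [x1 x2]
  ;; first hidden value: near 1 if at least one input is on (OR-like line)
  let h1 squash (20 * x1 + 20 * x2 - 10)
  ;; second hidden value: near 1 only if both inputs are on (AND-like line)
  let h2 squash (20 * x1 + 20 * x2 - 30)
  ;; output: h1 counts positively, h2 counts negatively
  let out squash (20 * h1 - 20 * h2 - 10)
  report ifelse-value (out > 0.5) [1] [0]
end
```

With these weights, show hand-wired-xor 1 0 and show hand-wired-xor 0 1 report 1, while show hand-wired-xor 0 0 and show hand-wired-xor 1 1 report 0.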

However, unlike the Perceptron model, the neural network model takes longer to learn any of the functions, including the simple OR function. This is because it has a lot more to learn. The Perceptron model had to learn three weights (the two input links and the bias link). The neural network model has to learn nine weights (4 input-to-hidden weights, 2 hidden-to-output weights, and 3 bias weights).
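
One way to check how the weights are distributed is to count the links by the breed of the node they start from. The procedure below is a hypothetical helper, not part of the original model, intended to be pasted into the Code tab and run after SETUP.

```netlogo
;; Hypothetical helper, not part of the original model. Run after SETUP.
to show-weight-counts
  ;; links from the two input nodes to the two hidden nodes (expected: 4)
  print (word "input -> hidden:  " count links with [[breed] of end1 = input-nodes])
  ;; links from the two hidden nodes to the output node (expected: 2)
  print (word "hidden -> output: " count links with [[breed] of end1 = hidden-nodes])
  ;; links from the bias node to each hidden node and the output node (expected: 3)
  print (word "bias links:       " count links with [[breed] of end1 = bias-nodes])
end
```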

THINGS TO TRY

Manipulate the LEARNING-RATE parameter. Can you speed up or slow down the training?
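
To compare learning rates more systematically, one option is to count how many epochs a freshly initialized network needs before its error falls below a threshold. The procedure below is a hypothetical helper, not part of the original model; its name and its two inputs are assumptions made for this sketch. It relies on the model's setup and train procedures and its epoch-error global.

```netlogo
;; Hypothetical helper, not part of the original model.
;; Re-initializes the network, then trains until the epoch error is at or
;; below error-threshold or max-epochs epochs have been run.
to print-epochs-until-learned [max-epochs error-threshold]
  setup
  train   ;; run one epoch first so that epoch-error has a meaningful value
  while [epoch-error > error-threshold and ticks < max-epochs] [
    train
  ]
  print (word "learning-rate " learning-rate ": stopped after " ticks " epochs")
end
```

For example, print-epochs-until-learned 2000 0.01 could be run several times at different LEARNING-RATE settings. Because the initial weights and the training examples are random, results vary from run to run, and a run that stops at max-epochs may not have reached the threshold.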

Switch back and forth between OR and XOR several times during a run. Why does it take less time for the network to return to 0 error the longer the network runs?

EXTENDING THE MODEL

Add additional functions for the network to learn besides OR and XOR. This may require you to add additional hidden nodes to the network.
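
The model's target-answer reporter builds a string from TARGET-FUNCTION and evaluates it with run-result. As one possible starting point for adding functions, the sketch below spells the choices out explicitly instead. It is not part of the original model: the reporter names are made up, it assumes the chooser values are the strings "or" and "xor", and "and" is a hypothetical new chooser entry.

```netlogo
;; Sketch of an explicit alternative to target-answer; not part of the model.
;; Assumes the TARGET-FUNCTION chooser holds strings such as "or" and "xor";
;; "and" is a hypothetical entry added for illustration.
to-report explicit-target-answer
  let a [activation] of input-node-1 = 1
  let b [activation] of input-node-2 = 1
  if target-function = "or"  [ report bool-to-number (a or b) ]
  if target-function = "xor" [ report bool-to-number (a xor b) ]
  if target-function = "and" [ report bool-to-number (a and b) ]
  error (word "Unknown target function: " target-function)
end

;; Converts true/false to 1/0 so it can be compared with node activations.
to-report bool-to-number [flag]
  report ifelse-value flag [1] [0]
end
```

Spelling out each case makes it easy to add functions that are not single NetLogo operators, such as NAND, at the cost of one extra line per function.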

Back-propagation using gradient descent is considered somewhat unrealistic as a model of real neurons, because in real neural systems there is no known way for the output node to pass its error back. Can you implement another weight-update rule that is more biologically plausible?

NETLOGO FEATURES

This model uses the link primitives. It also makes heavy use of lists.

RELATED MODELS

This is the second in the series of models devoted to understanding artificial neural networks. The first model is Perceptron.

CREDITS AND REFERENCES

The code for this model is inspired by the pseudo-code which can be found in Tom M. Mitchell's "Machine Learning" (1997).

Thanks to Craig Brozefsky for his work in improving this model.

HOW TO CITE

If you mention this model in a publication, we ask that you include these citations for the model itself and for the NetLogo software:

COPYRIGHT AND LICENSE

Copyright 2006 Uri Wilensky.

CC BY-NC-SA 3.0

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/ or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.

Commercial licenses are also available. To inquire about commercial licenses, please contact Uri Wilensky at uri@northwestern.edu.


links-own [weight]

breed [bias-nodes bias-node]
breed [input-nodes input-node]
breed [output-nodes output-node]
breed [hidden-nodes hidden-node]

turtles-own [
  activation     ;; Determines the node's output
  err            ;; Used by backpropagation to feed error backwards
]

globals [
  epoch-error    ;; mean squared error of the output node over the training examples in the epoch
  input-node-1   ;; keep the input and output nodes
  input-node-2   ;; in global variables so we can
  output-node-1  ;; refer to them directly
]

;;;
;;; SETUP PROCEDURES
;;;

to setup
  clear-all
  ask patches [ set pcolor gray ]
  set-default-shape bias-nodes "bias-node"
  set-default-shape input-nodes "circle"
  set-default-shape output-nodes "output-node"
  set-default-shape hidden-nodes "output-node"
  set-default-shape links "small-arrow-shape"
  setup-nodes
  setup-links
  propagate
  reset-ticks
end 

to setup-nodes
  create-bias-nodes 1 [ setxy -4 6 ]
  ask bias-nodes [ set activation 1 ]
  create-input-nodes 1 [
    setxy -6 -2
    set input-node-1 self
  ]
  create-input-nodes 1 [
    setxy -6 2
    set input-node-2 self
  ]
  ask input-nodes [ set activation random 2 ]
  create-hidden-nodes 1 [ setxy 0 -2 ]
  create-hidden-nodes 1 [ setxy 0  2 ]
  ask hidden-nodes [
    set activation random 2
    set size 1.5
  ]
  create-output-nodes 1 [
    setxy 5 0
    set output-node-1 self
    set activation random 2
  ]
end 

to setup-links
  connect-all bias-nodes hidden-nodes
  connect-all bias-nodes output-nodes
  connect-all input-nodes hidden-nodes
  connect-all hidden-nodes output-nodes
end 

to connect-all [nodes1 nodes2]
  ask nodes1 [
    create-links-to nodes2 [
      set weight random-float 0.2 - 0.1
    ]
  ]
end 

to recolor
  ask turtles [
    set color item (step activation) [black white]
  ]
  ask links [
    set thickness 0.05 * abs weight
    ifelse show-weights? [
      set label precision weight 4
    ] [
      set label ""
    ]
    ifelse weight > 0
      [ set color [ 255 0 0 196 ] ]   ; transparent red
      [ set color [ 0 0 255 196 ] ]   ; transparent blue
  ]
end 

;;;
;;; TRAINING PROCEDURES
;;;

to train
  set epoch-error 0
  repeat examples-per-epoch [
    ask input-nodes [ set activation random 2 ]
    propagate
    back-propagate
  ]
  set epoch-error epoch-error / examples-per-epoch
  tick
end 

;;;
;;; FUNCTIONS TO LEARN
;;;

to-report target-answer
  let a [activation] of input-node-1 = 1
  let b [activation] of input-node-2 = 1
  ;; run-result will interpret target-function as the appropriate boolean operator
  report ifelse-value run-result
    (word "a " target-function " b") [1][0]
end 

;;;
;;; PROPAGATION PROCEDURES
;;;

;; carry out one calculation from beginning to end

to propagate
  ask hidden-nodes [ set activation new-activation ]
  ask output-nodes [ set activation new-activation ]
  recolor
end 

;; Determine the activation of a node based on the activation of its input nodes

to-report new-activation  ;; node procedure
  report sigmoid sum [[activation] of end1 * weight] of my-in-links
end 

;; changes weights to correct for errors

to back-propagate
  let example-error 0
  let answer target-answer

  ask output-node-1 [
    set err activation * (1 - activation) * (answer - activation)
    set example-error example-error + ( (answer - activation) ^ 2 )
  ]
  set epoch-error epoch-error + example-error
  
  ;; The hidden layer nodes are given error values adjusted appropriately for their
  ;; link weights
  ask hidden-nodes [
    set err activation * (1 - activation) * sum [weight * [err] of end2] of my-out-links
  ]
  ask links [
    set weight weight + learning-rate * [err] of end2 * [activation] of end1
  ]
end 

;;;
;;; MISC PROCEDURES
;;;

;; computes the sigmoid function of an input value

to-report sigmoid [input]
  report 1 / (1 + e ^ (- input))
end 

;; computes the step function of an input value: 1 if it is greater than 0.5, otherwise 0

to-report step [input]
  report ifelse-value (input > 0.5) [1][0]
end 

;;;
;;; TESTING PROCEDURES
;;;

;; test runs one instance and computes the output

to test
  ;; output the result
  ifelse test-success? input-1 input-2
    [ user-message "Correct." ]
    [ user-message "Incorrect." ]
end 

to-report test-success? [n1 n2]
  ask input-node-1 [ set activation n1 ]
  ask input-node-2 [ set activation n2 ]
  propagate
  report target-answer = step [activation] of one-of output-nodes
end 


; Copyright 2006 Uri Wilensky.
; See Info tab for full copyright and license.

There is only one version of this model, created almost 11 years ago by Uri Wilensky.

This model does not have any ancestors.

This model does not have any descendants.