Thinking as a Hobby

One of the first things I'm going to do with my network model is to test it on the XOR logical function, a simple, standard benchmark for neural networks. But I'm using a spiking network model for the first time, and I was wondering about psychologically plausible ways to represent the input, so I started to look around.

That's when I came across "The implications of null patterns and output unit activation functions on simulation studies of learning: A case study of patterning" by Vanessa Yaremchuk, Leanne R. Willson, Marcia L. Spetch and Michael R.W. Dawson.

First, let me give an intro to the concept of XOR.

Let's say you have two buttons in front of you (A and B). Pressing both at the same time is equivalent to logical AND:

[Figure: plot of the AND function for buttons A and B]

AND is satisfied only when both buttons are pressed, not when just one is pressed and not when neither is. As you can see, you can draw a straight line between the correct response and the incorrect responses. This means the problem is linearly separable, which matters because linearly separable problems are relatively simple, computationally speaking.

The logical OR problem looks like this:

[Figure: plot of the OR function for buttons A and B]

OR is satisfied when you push A, B, or both together, but not when you push neither. Again, it's a linearly separable problem.
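To make "linearly separable" concrete, here is a minimal sketch of a single threshold unit computing AND and OR. The weights and biases are hand-picked for illustration (they are my assumptions, not values from any of the papers discussed here), and many other choices would work just as well.

```python
def threshold_unit(a, b, w_a, w_b, bias):
    """Fire (return 1) when the weighted sum of the inputs exceeds zero."""
    return 1 if w_a * a + w_b * b + bias > 0 else 0


# All four button states: (A pressed?, B pressed?)
PATTERNS = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Hand-picked (w_a, w_b, bias) triples; illustrative assumptions only.
AND_WEIGHTS = (1.0, 1.0, -1.5)
OR_WEIGHTS = (1.0, 1.0, -0.5)

and_outputs = [threshold_unit(a, b, *AND_WEIGHTS) for a, b in PATTERNS]
or_outputs = [threshold_unit(a, b, *OR_WEIGHTS) for a, b in PATTERNS]

for (a, b), y_and, y_or in zip(PATTERNS, and_outputs, or_outputs):
    print(f"A={a} B={b}  AND={y_and}  OR={y_or}")
```

The line `w_a * a + w_b * b + bias = 0` is the straight line in the plots: everything on one side of it gets a response, everything on the other side doesn't. Only the bias differs between the two functions here.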

Here's XOR:

[Figure: plot of the XOR function for buttons A and B]

XOR is satisfied when you push A by itself or B by itself, but not when you push both together or neither. As you can see, this problem is not linearly separable, which means there is no single straight line you can draw to separate the correct responses from the incorrect ones. It's a more difficult problem than AND or OR, so it requires more complicated computational machinery to learn.
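One way to see the difference is to search for a separating line by brute force. The sketch below tries every weight triple on a coarse grid (the grid range and step are arbitrary choices of mine) and counts how many classify all four patterns correctly. A grid search is only an illustration, not a proof.

```python
from itertools import product

PATTERNS = [(0, 0), (0, 1), (1, 0), (1, 1)]
TARGETS = {
    "AND": [0, 0, 0, 1],
    "OR": [0, 1, 1, 1],
    "XOR": [0, 1, 1, 0],
}


def count_separating_lines(targets, grid):
    """Count (w_a, w_b, bias) triples on the grid that classify every pattern."""
    count = 0
    for w_a, w_b, bias in product(grid, repeat=3):
        outputs = [1 if w_a * a + w_b * b + bias > 0 else 0 for a, b in PATTERNS]
        if outputs == targets:
            count += 1
    return count


# Coarse grid from -2.0 to 2.0 in steps of 0.5 (an arbitrary choice).
GRID = [i / 2 for i in range(-4, 5)]

solutions = {name: count_separating_lines(t, GRID) for name, t in TARGETS.items()}
for name, n in solutions.items():
    print(f"{name}: {n} separating lines found on the grid")
```

The search finds lines for AND and OR and none for XOR. The real argument is a short bit of algebra: XOR needs bias <= 0 (no response to neither), w_a + bias > 0 and w_b + bias > 0 (respond to each alone), and w_a + w_b + bias <= 0 (no response to both). Adding the two middle inequalities gives w_a + w_b + 2*bias > 0, and since bias <= 0 that forces w_a + w_b + bias > 0, which contradicts the last condition.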

In 1957, Frank Rosenblatt invented the perceptron, a simple computing device based on the principles of neural computation. A simple perceptron contains only one layer of units analogous to neurons. However, in 1969, Marvin Minsky and Seymour Papert published a book called Perceptrons, in which they demonstrated that simple perceptrons could not solve the XOR problem. With additional layers, creating a multi-layer perceptron, the network can solve XOR. So basically, a simple perceptron can solve OR and AND, but not XOR.
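Here is a minimal sketch of why an extra layer helps. The weights are set by hand rather than learned, and the particular decomposition (one hidden unit computing OR, one computing AND) is just one convenient choice, not the solution a trained network would necessarily find.

```python
def step(x):
    """Threshold activation: fire when the net input exceeds zero."""
    return 1 if x > 0 else 0


def xor_network(a, b):
    """Two hidden threshold units feeding one output unit (hand-set weights)."""
    h_or = step(a + b - 0.5)   # fires if at least one input is on
    h_and = step(a + b - 1.5)  # fires only if both inputs are on
    # Output: respond when OR is on but AND is off.
    return step(h_or - h_and - 0.5)


PATTERNS = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_outputs = [xor_network(a, b) for a, b in PATTERNS]

for (a, b), y in zip(PATTERNS, xor_outputs):
    print(f"A={a} B={b}  XOR={y}")
```

Each hidden unit draws its own straight line, and the output unit combines the two. That's the extra machinery a single-layer perceptron doesn't have.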

So how does this relate to psychology? Well, Yaremchuk et al. cite the following paper:

Delamater, A. R., Sosa, W., & Katz, M. (1999). Elemental and configural processes in patterning discrimination learning. The Quarterly Journal Of Experimental Psychology B, 52, 97–124.

What they did in this paper was to train a multi-layer perceptron on the XOR problem and also train rats on the problem of negative patterning, which involves training the animals to respond to either of two stimuli, but not both together.

For example, let's say you have some rats. You want to get them to respond to one of two stimuli (e.g., a light and an audible tone) by moving to a particular corner of their cage. So to teach them negative patterning, you want them to move to the particular corner of the cage when they see the light by itself or hear the tone by itself. But you don't want them to move there when the light and tone are presented together.

So in the Delamater paper, in one case they trained the rats on all three conditions within the same sessions (let's call this Condition 1). But in another condition, they first trained the rats up on the stimuli alone (light only and tone only). Once the rats had mastered that, then they tried to teach them not to respond to the conjunction of stimuli (Condition 2). The rats learned to solve the problem faster in Condition 2 than in Condition 1.

What's strange is that when they used these same conditions on the multi-layer perceptron, they got the opposite results (the values are the number of epochs, or training iterations, that it took to learn the task):

Rats
Previously reinforced: 538.30
Not previously reinforced: 759.00

Multi-layer Perceptron
Previously reinforced: 2572.50
Not previously reinforced: 2072.90

Why did the neural network perform differently from the rats? Yaremchuk et al. argue that it is because while the network was learning XOR, the rats weren't. Because the rats weren't explicitly trained on the null pattern (neither light nor tone), they were learning a simpler, linearly separable problem:

[Figure: plot of the negative patterning problem without the null pattern]
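To check that the three-pattern version really is linearly separable, here is a small sketch with a single threshold unit. The weights are hand-picked by me for illustration and don't come from Yaremchuk et al. or Delamater et al.

```python
def threshold_unit(light, tone, w_light, w_tone, bias):
    """Respond (return 1) when the weighted sum of the stimuli exceeds zero."""
    return 1 if w_light * light + w_tone * tone + bias > 0 else 0


# The three patterns the rats were trained on: (light, tone) -> respond?
NEGATIVE_PATTERNING = {
    (1, 0): 1,  # light alone: respond
    (0, 1): 1,  # tone alone: respond
    (1, 1): 0,  # light and tone together: don't respond
}

# Hand-picked (w_light, w_tone, bias); an illustrative assumption.
WEIGHTS = (-1.0, -1.0, 1.5)

trained_ok = all(
    threshold_unit(light, tone, *WEIGHTS) == target
    for (light, tone), target in NEGATIVE_PATTERNING.items()
)

# The null pattern was never trained, so nothing constrains the unit's answer.
null_response = threshold_unit(0, 0, *WEIGHTS)

print("Solves the three trained patterns:", trained_ok)
print("Response to the untrained null pattern:", null_response)
```

A single unit handles all three trained patterns, but it also "responds" to the null pattern, which XOR would forbid. Once you require no response to neither stimulus, you're back to the full XOR problem and one line is no longer enough.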

Thus, they argue that animal experiments with negative patterning are not equivalent to the logical XOR problem. They go on to carry out simulations to demonstrate how to better conceive of the problem, but this was the part I was mainly interested in.

It's interesting to think about whether or not the null pattern should be represented as real input, rather than the absence of input. When humans learn the XOR function, we process zero as a token; we don't represent it by actual nothingness. So I think I'm on pretty safe ground for psychological realism in representing the XOR problem as a set of positive inputs for all training patterns, and it's interesting to see how such a simple problem still yields food for thought after all these years.
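As a rough sketch of what "positive inputs for all training patterns" could look like, one option is to give each logical input two channels, an "off" channel and an "on" channel, so that zero is signaled by activity rather than silence. This two-channel scheme and the function names are my own assumptions for illustration; it's only one of several possible encodings.

```python
def encode_bit(bit):
    """Map a logical value to two channels: (off_channel, on_channel)."""
    return (0, 1) if bit else (1, 0)


def encode_pattern(a, b):
    """Encode inputs A and B as four channels: (A_off, A_on, B_off, B_on)."""
    return encode_bit(a) + encode_bit(b)


PATTERNS = [(0, 0), (0, 1), (1, 0), (1, 1)]
encoded = {pattern: encode_pattern(*pattern) for pattern in PATTERNS}

for pattern, channels in encoded.items():
    print(f"A={pattern[0]} B={pattern[1]} -> {channels}")
```

With this encoding every pattern, including the null pattern, drives exactly two input channels, so the network always receives something to process.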

Yaremchuk, V., Willson, L., Spetch, M., & Dawson, M. (2005). The implications of null patterns and output unit activation functions on simulation studies of learning: A case study of patterning. Learning and Motivation, 36(1), 88–103. DOI: 10.1016/j.lmot.2004.10.001




2008-05-20 10:46 AM
Is Nothing Something? XOR, Negative Patterning, Rats, and Neural Networks