
Psycholinguistics: Neural Networks and Chomsky's Rule-Based Language Theory
An exploration of how neural network computer models challenge Chomsky's notion that language is rule-based, examining key research in psycholinguistics.
Psycholinguistics - Presentation Outline
Note: Written in 1996 at Middlebury College for a Psycholinguistics class.
Neural Networks and Chomsky's Notion That Language is Rule-Based
Introduction
How developments in neural network computer models are beginning to question Chomsky's notion that language is rule-based.
- The neural network models show that computers can, by being trained and without possessing given/innate rules, quite successfully form past tenses of verbs (including irregular forms) and distinguish between grammatically correct and incorrect sentences.
So, what is a neural network?
Daniel Story - UC Berkeley
- 1.5 years ago, I was living in Berkeley with my cousin, who was working on a Ph.D. in Astrophysics
- he was doing research into entropy of gas molecules in galaxy formation
- one night, I met him up in the lab; he showed me something he was learning
- on the monitor were a bunch of circles with lines connecting them, and they were flickering and changing connections.
- looked something like this (SHOW Neural Network Model)
- lines were bouncing around, and it seemed like it was doing something; I just didn't know what
- he pointed to some graph on the screen and said, "See, it's learning."
- I stared blankly and said I'd go pick up some Chinese food
- what he was showing me was a very simple neural network with about 6 units that was attempting limited character recognition - and over time it was being trained to become more accurate. It WAS learning.
- without rules, it was teaching itself how to recognize letters of the alphabet. Similar neural network models are being used in the Newton for handwriting recognition, in stock markets to forecast market trends, and in expert systems that try to drive cars and even predict complex interacting systems, like galaxy formation
The appeal of neural networks
- they can learn without relying on rules
So, again, what is a neural network?
A. The Neural Network
- neural networks are models based on research into how our human minds process information, store it, and make decisions based on that information
- the computer model tries to recreate how interconnected neurons communicate with one another to store information and perform cognitive tasks
B. Explanation
- Briefly, a neural network consists of computation units (also called cells) and a set of one-way data connections joining the units. SHOWN IN FIGURE!
- At certain times a unit examines its inputs and computes a signed number called an activation as its output.
- The new activation is then passed along those connections leading to other units.
- Each connection has a signed number called a weight that determines whether an activation traveling along it influences the receiving cell to produce a similar or different activation, according to the sign of the weight.
- when you start looking at neural networks, you hit some complex math and difficult algorithms
- I won't go into the specifics - but they look like this:
- (SHOW Sigma notation)
- but when you look at it, all it's really doing is multiplying each incoming activation by its weight, summing the results, and passing the new activation along to the next unit to do the same thing, until it reaches the output
- so, in this figure, when unit u11 is reevaluated, its activation is determined by the activations of u3, u8, u9, and u10 and the weights w11,0, w11,3, w11,8, w11,9, and w11,10
- the network learns by using algorithms like back-propagation, re-adjusting the weights each time it makes a mistake (a minimal sketch of one unit follows below)
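To make that concrete, here is a minimal Python sketch (my own illustration, not part of the original talk): one unit multiplies each incoming activation by its connection weight, sums them, squashes the result, and nudges its weights after a mistake. The unit names, numbers, and the simple delta-rule update are all assumptions for illustration; full back-propagation extends this idea through hidden layers.

```python
# Minimal sketch of one unit: multiply each incoming activation by its
# connection weight, sum, squash, and pass the result along.

import math

def activation(inputs, weights):
    """Weighted sum of inputs, squashed into (0, 1) by a sigmoid."""
    total = sum(a * w for a, w in zip(inputs, weights))
    return 1.0 / (1.0 + math.exp(-total))

def adjust_weights(inputs, weights, target, rate=0.5):
    """Nudge each weight to shrink the error (a simple delta rule;
    back-propagation extends this through hidden layers)."""
    error = target - activation(inputs, weights)
    return [w + rate * error * a for w, a in zip(weights, inputs)]

# e.g. the activations of u3, u8, u9, and u10 feeding a unit like u11
inputs = [0.9, 0.1, 0.4, 0.7]    # hypothetical incoming activations
weights = [0.2, -0.5, 0.3, 0.8]  # hypothetical connection weights
print(activation(inputs, weights))
weights = adjust_weights(inputs, weights, target=1.0)  # learn from a miss
```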
What does this have to do with linguistics?
Chomsky
- We're all aware of Chomsky's assault on behaviorism in the 1950s
- how he argued that despite their apparent differences, all languages use a similar set of underlying rules so fundamental yet so subtle that children cannot simply acquire them by listening to their parents
- Chomsky says that we are genetically predisposed to learn these grammatical rules
- armed with neural networks, researchers are beginning to probe the assumptions behind Chomsky's theories
- what they are finding is that his theory, which is at the heart of modern linguistics, may be flawed.
B. Neural Networks can accomplish grammatical tasks without having innate rules
- these researchers are not trying to recreate the brain - impossible with its roughly 10^11 neurons
- but even relatively small sets of cell-units show remarkable results
- I looked at 2 studies:
a. David Rumelhart and James McClelland's finding that a neural network can learn to correctly produce past tenses of verbs
b. another by Lawrence, Giles, and Fong demonstrating that a neural network can learn to distinguish grammatically correct sentences
C. Rumelhart/McClelland
- in the mid-1980s, they wondered: how might a network, like a child, learn to produce correct past-tense forms of verbs?
- Chomsky's view assumes the existence of a set of rules we unconsciously apply to present-tense verbs to produce the past tense
- all regular verbs follow the simple rule that adds the "-ed" suffix to the root of the verb
- the 180 or so irregular verbs obey what Chomsky calls "exception" rules - go becomes went, have becomes had
- these exception rules are central to Chomsky's theory - for without them, children would be unable to generate past tenses (the rule-plus-exceptions idea is sketched below)
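For contrast, the rule-plus-exceptions story can be written down in a few lines. This sketch is my own toy illustration of the idea, not Chomsky's actual formalism, and the verb list is invented:

```python
# Sketch of the rule-plus-exceptions view of the past tense
# (a toy illustration of the idea, not Chomsky's actual formalism).

EXCEPTIONS = {"go": "went", "have": "had", "weep": "wept", "cling": "clung"}

def past_tense(verb):
    if verb in EXCEPTIONS:       # stored "exception" rule for irregulars
        return EXCEPTIONS[verb]
    return verb + "ed"           # the general add "-ed" rule

print(past_tense("walk"))  # walked
print(past_tense("go"))    # went
```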
- Rumelhart and McClelland say Chomsky is wrong
- they devised a network that can produce correct past tense forms of both regular and irregular verbs without needing to treat irregular verbs differently
- their network consisted of 460 input and 460 output units
- verb roots are encoded on the input units and, via the connections, the past tense is read off the output units
- the network is trained using a learning algorithm called back-propagation of error
- in training, McClelland and Rumelhart fed 420 verbs into the network, which encoded the past tense, and then compared its output to the correct result
- each time the network makes a mistake, it changes the weights in such a way as to ensure that the next time the input pattern is presented, the output will be closer to being correct
- they argue that this teaching process mimics the feedback children receive as they attempt to learn the past tense
- in one test, their network produced the correct past-tense forms of 90% of a batch of 86 unfamiliar regular verbs and was also able to handle the irregular forms
- after training, the network produced "wept" for "weep" and "clung" for "cling" (a toy version of the training loop is sketched below)
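A toy version of that present-compare-reweight cycle might look like the sketch below. It is my own simplification, assuming a single-layer associator trained with a delta rule on a crude one-hot letter encoding, not the encoding, scale, or learning algorithm of the actual study; the verbs and parameters are invented.

```python
# Toy sketch of the train/compare/re-weight cycle (my simplification:
# a single-layer associator with a delta rule, not the real model).

import random

LETTERS = "abcdefghijklmnopqrstuvwxyz_"
WIDTH = 6                        # pad or trim every word to 6 letters
N = WIDTH * len(LETTERS)         # units per layer

def encode(word):
    """One-hot encode a padded word into a flat 0/1 vector."""
    word = word.ljust(WIDTH, "_")[:WIDTH]
    return [1.0 if ch == l else 0.0 for ch in word for l in LETTERS]

def decode(vec):
    """Read the most active letter out of each slot."""
    word = ""
    for i in range(WIDTH):
        slot = vec[i * len(LETTERS):(i + 1) * len(LETTERS)]
        word += LETTERS[slot.index(max(slot))]
    return word.rstrip("_")

random.seed(0)
weights = [[random.uniform(-0.05, 0.05) for _ in range(N)] for _ in range(N)]

def forward(x):
    """Each output unit sums its weighted inputs."""
    return [sum(w * a for w, a in zip(row, x)) for row in weights]

PAIRS = [("walk", "walked"), ("jump", "jumped"), ("go", "went")]

for _ in range(30):                      # training epochs
    for root, past in PAIRS:
        x, target = encode(root), encode(past)
        y = forward(x)
        for j in range(N):               # nudge weights toward the target
            err = target[j] - y[j]
            for i in range(N):
                weights[j][i] += 0.1 * err * x[i]

print(decode(forward(encode("walk"))))   # ideally prints "walked"
```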
E. Lawrence, et al.
- for the sake of time, I can't go into the details of their experiment
- SHOW TABLE!!
- briefly, they tagged each word in a sentence with its possible part of speech: noun, verb, adjective, etc.
- they fed these tags into their network and, using a nearest-neighbors algorithm, tried to assess the grammatical correctness of the sentence
- the network was trained with a set of correct sentences
- after training, they showed that without any rules, their network determined whether an unfamiliar sentence was grammatically correct 55% of the time
- their finding wasn't entirely conclusive, but it demonstrated that a neural network is capable of performing the cognitive task without rules (the flavor of the approach is sketched below)
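The flavor of a nearest-neighbors judgment can be sketched as follows. Everything here (the tag set, the numeric encoding, the sentences, and the threshold) is invented for illustration and is not the actual setup from Lawrence, Giles, and Fong:

```python
# Illustrative sketch of judging a part-of-speech tag sequence by its
# distance to known-good sequences (invented data, not the real study).

TAGS = ["noun", "verb", "det", "adj", "prep"]

def encode(tags, width=5):
    """Map a tag sequence to a fixed-length numeric vector."""
    padded = (tags + ["pad"] * width)[:width]
    return [TAGS.index(t) + 1 if t in TAGS else 0 for t in padded]

def distance(a, b):
    """Squared Euclidean distance between two encoded sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# training set: tag sequences of sentences known to be grammatical
GOOD = [
    ["det", "noun", "verb", "det", "noun"],  # "the dog chased the cat"
    ["det", "noun", "verb", "prep", "det"],  # "the dog sat on the (mat)"
]

def looks_grammatical(tags, threshold=3):
    """Accept if the nearest known-good sequence is close enough."""
    best = min(distance(encode(tags), encode(g)) for g in GOOD)
    return best <= threshold

print(looks_grammatical(["det", "noun", "verb", "det", "noun"]))  # True
print(looks_grammatical(["verb", "prep", "verb", "adj", "det"]))  # False
```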
What it means for Chomsky's theory and modern linguistics
- perhaps there are no rules
- the problem for Chomsky's theory is that McClelland and Rumelhart's network does not use linguistic rules
- in learning verb forms, its connections are simply weighted according to the correlations it detects between input and output verbs
- interestingly, the computer model even showed human quirks:
a. when it was trained with 10 verbs, 3-4 of them irregular forms, it went on to produce "went" from "go"
b. but when the training batch was increased to 420, it learned that the great majority of verbs still follow the add "-ed" rule, so it would form "goed" instead of "went", even though it had previously performed correctly
c. McClelland and Rumelhart say this is similar to the counter-intuitive errors children make when they've already shown that they know "went" is the past tense of "go", but sometimes say "goed" or "wented"
- the debate between Chomsky and researchers using neural networks is just beginning
- but already, the success of neural networks in producing language without rules is forcing linguists to rethink some of their ideas about how we acquire language