Connectionism

From Wikipedia, the free encyclopedia

Connectionism is an approach in the fields of artificial intelligence, cognitive psychology, cognitive science, neuroscience and philosophy of mind. It models mental or behavioral phenomena as the emergent processes of interconnected networks of simple units. There are many forms of connectionism, but the most common use neural network models.

Basic principles

The central connectionist principle is that mental phenomena can be described by interconnected networks of simple units. The form of the connections and the units can vary from model to model. For example, units in the network could represent neurons and the connections could represent synapses. Another model might make each unit in the network a word, and each connection an indication of semantic similarity.

Spreading activation

Most connectionist models include time, i.e., the network changes over time. A closely related and extremely common aspect of connectionist models is activation. At any time, a unit in the network has an activation, which is a numerical value intended to represent some aspect of the unit. For example, if the units in the model are neurons, the activation could represent the probability that the neuron would generate an action potential spike. If the model is a spreading activation model, then over time a unit's activation spreads to all the other units connected to it. Spreading activation is always a feature of neural network models, and it is very common in connectionist models used by cognitive psychologists.
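The idea of activation spreading over time can be sketched in a few lines of Python. This is only an illustrative toy, not a model taken from the literature: the word units, the connection strengths, and the `decay` and `rate` parameters are all assumptions chosen for the example.

```python
# Hypothetical toy network: units are words, weights are association strengths.
def spread(activation, weights, decay=0.5, rate=0.5):
    """One time step: each unit keeps a decayed share of its own activation
    and receives a share of the activation of units connected to it."""
    new_activation = {}
    for unit in activation:
        incoming = sum(
            weights.get((source, unit), 0.0) * value
            for source, value in activation.items()
        )
        new_activation[unit] = decay * activation[unit] + rate * incoming
    return new_activation


# Directed connections (source, target) -> strength; values are assumptions.
weights = {("doctor", "nurse"): 0.8, ("nurse", "hospital"): 0.6}

# Only "doctor" is active at time 0.
history = [{"doctor": 1.0, "nurse": 0.0, "hospital": 0.0}]
for _ in range(2):
    history.append(spread(history[-1], weights))
```

After one step only "nurse", the direct neighbour of "doctor", has become active; "hospital" receives activation one step later, through "nurse".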

Neural networks

Main article: Neural networks

Neural networks are by far the dominant form of connectionist model today. Much research that uses neural networks is carried out under the more general name "connectionist". These connectionist models adhere to two major principles regarding the mind:

  1. Any given mental state can be described as an N-dimensional vector of numeric activation values over neural units in a network.
  2. Memory is created by modifying the strength of the connections between neural units. The connection strengths, or "weights", are generally represented as an N×N matrix.

Though there is a large variety of neural network models, they very rarely stray from these two basic principles. Most of the variety comes from:

  • Interpretation of units: units can be interpreted as neurons or groups of neurons.
  • Definition of activation: activation can be defined in a variety of ways. For example, in a Boltzmann machine, the activation is interpreted as the probability of generating an action potential spike, and it is determined via a logistic function on the sum of the inputs to a unit.
  • Learning algorithm: different networks modify their connections differently. Generally, any mathematically defined change in connection weights over time is referred to as the "learning algorithm".
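The two principles and the Boltzmann-style definition of activation can be put together in a short sketch. It is a simplification under stated assumptions: there are no bias terms and no temperature parameter, all units are updated at once, and the weight values are made up for illustration.

```python
import math
import random


def logistic(x):
    """Logistic function mapping a summed input to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def firing_probabilities(weights, state):
    """weights[i][j] is the connection strength from unit j to unit i."""
    return [
        logistic(sum(w * s for w, s in zip(row, state)))
        for row in weights
    ]


def sample_state(probabilities, rng):
    """Each unit fires (1) with its own probability, otherwise stays silent (0)."""
    return [1 if rng.random() < p else 0 for p in probabilities]


# Principle 1: the state is an N-dimensional activation vector (here N = 3).
state = [1, 0, 1]

# Principle 2: memory lives in an N x N weight matrix (illustrative values).
weights = [
    [0.0, 2.0, -1.0],
    [2.0, 0.0, 0.5],
    [-1.0, 0.5, 0.0],
]

probs = firing_probabilities(weights, state)
next_state = sample_state(probs, random.Random(0))
```

Unit 1 receives a summed input of 2.5 and so is very likely to fire, while units 0 and 2 each receive -1.0 and are more likely to stay silent.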

Connectionists agree that recurrent neural networks (networks in which the connections can form a directed cycle) are a better model of the brain than feedforward neural networks (networks with no directed cycles). Many recurrent connectionist models incorporate dynamical systems theory as well. Many researchers, such as the connectionist Paul Smolensky, have argued that connectionist models will move towards fully continuous, high-dimensional, non-linear, dynamic systems approaches.

Biological realism

The neural network branch of connectionism suggests that the study of mental activity is really the study of neural systems. This links connectionism to neuroscience, and models involve varying degrees of biological realism. Connectionist work in general need not be biologically realistic, but some neural network researchers try to model the biological aspects of natural neural systems very closely. Many authors also find the clear link between neural activity and cognition to be an appealing aspect of connectionism. However, this link is also a source of criticism, as some critics regard it as reductionism.

Learning

Connectionists generally stress the importance of learning in their models. As a result, they have developed many sophisticated learning procedures for neural networks. Learning always involves modifying the connection weights. The procedures generally use mathematical formulas to determine the change in weights when given sets of data consisting of activation vectors for some subset of the neural units.

By formalizing learning in this way, connectionists have many tools at their disposal. A very common tactic in connectionist learning methods is to incorporate gradient descent over an error surface in a space defined by the weight matrix. All gradient descent learning in connectionist models involves changing each weight in proportion to the negative of the partial derivative of the error with respect to that weight, so that each change moves the network downhill on the error surface. Backpropagation, first made popular in the 1980s, is probably the best-known connectionist gradient descent algorithm today.
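A minimal sketch of gradient descent on an error surface, assuming the simplest possible case: a single linear unit with two weights and a squared-error measure. This is not backpropagation itself, which applies the same idea through several layers. The training data, the learning rate, and the function name are all assumptions made for the example.

```python
def train_linear_unit(samples, learning_rate=0.1, epochs=200):
    """Fit the weights of one linear unit by gradient descent on squared error."""
    weights = [0.0, 0.0]
    for _ in range(epochs):
        for inputs, target in samples:
            output = sum(w * x for w, x in zip(weights, inputs))
            error = output - target
            # With E = 0.5 * error**2, the partial derivative dE/dw_i is
            # error * x_i, so each weight takes a small step against it.
            weights = [
                w - learning_rate * error * x
                for w, x in zip(weights, inputs)
            ]
    return weights


# Illustrative data generated by the weights (2, -1).
samples = [
    ([1.0, 0.0], 2.0),
    ([0.0, 1.0], -1.0),
    ([1.0, 1.0], 1.0),
]
learned = train_linear_unit(samples)
```

Because the data are consistent with a single weight vector, repeated small steps against the gradient bring `learned` very close to the weights that generated the data.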

History

Connectionism can be traced back to ideas more than a century old. However, connectionist ideas were little more than speculation until the mid-to-late 20th century. It was not until the 1980s that connectionism became a popular perspective among scientists.

Parallel distributed processing

Complex parallel distributed processing programs, such as PDP++ shown here, can result in powerful simulations

The prevailing connectionist approach today was originally known as Parallel Distributed Processing (PDP). PDP was a neural network approach that stressed the parallel nature of neural processing, and the distributed nature of neural representations.

PDP provided a general mathematical framework for researchers to operate in. The framework involved eight major aspects:

  • A set of processing units, represented by a set of integers.
  • An activation for each unit, represented by a vector of time-dependent functions.
  • An output function for each unit, represented by a vector of functions on the activations.
  • A pattern of connectivity among units, represented by a matrix of real numbers indicating connection strength.
  • A propagation rule spreading the activations via the connections, represented by a function on the output of the units.
  • An activation rule for combining inputs to a unit to determine its new activation, represented by a function on the current activation and propagation.
  • A learning rule for modifying connections based on experience, represented by a change in the weights based on any number of variables.
  • An environment which provides the system with experience, represented by sets of activation vectors for some subset of the units.

These eight aspects are now the foundation for almost all connectionist models.
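To make the eight aspects concrete, the sketch below maps each one onto a few lines of Python. Every specific choice in it is an assumption made for illustration: the identity output function, the tanh-based activation rule, the Hebbian-style learning rule, the chain of three units, and all names and constants. The PDP framework itself leaves these open.

```python
import math

units = [0, 1, 2]                        # 1. processing units (integers)
activation = [0.0, 0.0, 0.0]             # 2. activation of each unit at the current time


def output(a):                           # 3. output function (identity here)
    return a


weights = [[0.0] * 3 for _ in units]     # 4. connectivity: weights[i][j] runs from j to i
weights[1][0] = 1.0
weights[2][1] = 1.0


def propagate(weights, outputs):         # 5. propagation rule: net input to each unit
    return [sum(w * o for w, o in zip(row, outputs)) for row in weights]


def activate(current, net):              # 6. activation rule: old activation plus net input
    return 0.5 * current + 0.5 * math.tanh(net)


def learn(weights, new_activation, outputs, rate=0.1):   # 7. learning rule (Hebbian-style)
    return [
        [w + rate * a * o for w, o in zip(row, outputs)]
        for row, a in zip(weights, new_activation)
    ]


environment = [{0: 1.0}, {0: 0.0}]       # 8. environment: patterns over a subset of units


def step(activation, weights, pattern):
    """Clamp the pattern onto its units, then apply rules 3, 5, 6 and 7 once."""
    clamped = [pattern.get(i, a) for i, a in enumerate(activation)]
    outputs = [output(a) for a in clamped]
    net = propagate(weights, outputs)
    new_activation = [activate(a, n) for a, n in zip(clamped, net)]
    return new_activation, learn(weights, new_activation, outputs)


for pattern in environment:
    activation, weights = step(activation, weights, pattern)
```

After the two environment patterns have been presented, activation has travelled from unit 0 along the chain to unit 2, and the connection from unit 0 to unit 1 has been strengthened by the learning rule.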

Much of the research that led to the development of PDP was done in the 1970s, but PDP became popular in the 1980s with the release of Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1 (Foundations) and Volume 2 (Psychological and Biological Models), by James L. McClelland, David E. Rumelhart, and the PDP Research Group. Although the books are now considered seminal connectionist works, the authors did not use the term "connectionism" to describe their framework at that point. However, it is now common to fully equate PDP and connectionism.

Earlier work

PDP's direct roots were the perceptron theories of researchers such as Frank Rosenblatt from the 1950s and 1960s. However, perceptron models were made very unpopular by the release in 1969 of a book titled Perceptrons by Marvin Minsky and Seymour Papert. Minsky and Papert elegantly demonstrated the limits on the sorts of functions that perceptrons can calculate, showing that even simple functions like the exclusive disjunction (XOR) could not be handled properly. The PDP books overcame this earlier limitation by showing that multi-level, non-linear neural networks were far more robust and could be used for a vast array of functions.
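The exclusive-disjunction limitation can be checked directly. The sketch below is an illustration, not the argument from Perceptrons and not the trained networks of the PDP books: it searches a coarse grid of weights (an assumption, so the search is a demonstration rather than a proof) for a single threshold unit that computes XOR, and then shows a two-level network with hand-set weights that does.

```python
from itertools import product

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def threshold_unit(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs plus the bias is positive."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def single_unit_solves_xor(w1, w2, bias):
    return all(
        threshold_unit((w1, w2), bias, inputs) == target
        for inputs, target in XOR.items()
    )


# Try every combination of two weights and a bias from -4.0 to 4.0 in steps of 0.5.
grid = [i / 2 for i in range(-8, 9)]
single_layer_solutions = [
    params for params in product(grid, repeat=3)
    if single_unit_solves_xor(*params)
]


def two_level_xor(inputs):
    """Hidden units compute OR and AND; the output fires for OR-but-not-AND."""
    hidden_or = threshold_unit((1, 1), -0.5, inputs)
    hidden_and = threshold_unit((1, 1), -1.5, inputs)
    return threshold_unit((1, -1), -0.5, (hidden_or, hidden_and))


two_level_results = {inputs: two_level_xor(inputs) for inputs in XOR}
```

The grid search finds no single unit that works, which is what one would expect because XOR is not linearly separable, while the two-level network reproduces the XOR table exactly.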

However, many researchers outside of the perceptron theorists were advocating connectionist-style models prior to the 1980s.

In the 1940s and 1950s, researchers such as Warren McCulloch, Walter Pitts, Donald Hebb, and Karl Lashley were advocating connectionist-style theories. McCulloch and Pitts showed how first-order logic could be implemented by neural systems. Their classic paper "A Logical Calculus of the Ideas Immanent in Nervous Activity" (1943) was an important step in this development; they were influenced by the work of Nicolas Rashevsky in the 1930s. Hebb contributed greatly to speculations about neural functioning, and even proposed a learning principle that is still in use today, known as Hebbian learning. Lashley argued for distributed representations as a result of his failure to find anything like a localized engram in years of lesion experiments.
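Hebbian learning is often summarized as strengthening the connection between two units that are active together. The sketch below uses a common textbook formalization, an outer-product weight update over units valued +1 or -1, which is an assumption and not Hebb's own formulation. It also hints at Lashley's point: the stored pattern is spread across all the weights, so a damaged cue can still be restored.

```python
def hebbian_update(weights, pattern, rate=1.0):
    """Strengthen w[i][j] when units i and j agree and weaken it when they differ."""
    size = len(pattern)
    return [
        [
            weights[i][j] + (rate * pattern[i] * pattern[j] if i != j else 0.0)
            for j in range(size)
        ]
        for i in range(size)
    ]


def recall(weights, cue):
    """Each unit takes the sign of its weighted input from the cue."""
    return [
        1 if sum(w * c for w, c in zip(row, cue)) >= 0 else -1
        for row in weights
    ]


pattern = [1, -1, 1, -1]                  # illustrative pattern of +1/-1 activations
weights = [[0.0] * 4 for _ in range(4)]   # start with no connections
weights = hebbian_update(weights, pattern)

cue = [1, -1, 1, 1]                       # the last unit has been corrupted
restored = recall(weights, cue)
```

Here `restored` equals the original pattern even though one unit of the cue was wrong, because the remaining three units jointly outvote the corrupted one.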

Connectionism apart from PDP

Though PDP is the dominant form of connectionism, other theorists' work should be classified as connectionist.

Many connectionist principles can be traced back to early work in psychology, such as that of William James, although psychological theories based on what was then known about the human brain were quite fashionable at the end of the 19th century. As early as 1869, the neurologist John Hughlings Jackson was arguing for multi-level, distributed systems. Following this lead, Herbert Spencer's Principles of Psychology, 3rd edition (1872), and Sigmund Freud's Project for a Scientific Psychology (composed 1895) propounded connectionist or proto-connectionist theories. These tended to be speculative theories, however. By the early 20th century, Edward Thorndike was carrying out experiments on learning that posited a connectionist-type network.

In the 1950s the researcher Friedrich Hayek posited the idea of spontaneous order in the brain arising out of decentralized networks of simple units, but Hayek's work was generally not cited in the PDP literature until recently.

Another form of connectionist model was the relational network framework developed by the linguist Sydney Lamb in the 1960s. Relational networks have only ever been used by linguists, and have never been unified with the PDP approach. As a result, relational networks are used by very few researchers today.

Connectionism vs. computationalism debate

As connectionism became increasingly popular in the late 1980s there was a reaction to it by some researchers, including Fodor, Pinker, and others. These theorists argued that connectionism, as it was being developed at that time, was in danger of obliterating what they saw as the progress being made in the fields of cognitive science and psychology by the classical approach of computationalism. Computationalism is a specific form of cognitivism which argues that mental activity is computational, i.e., that the mind is essentially a Turing machine. Some researchers argued that the trend in connectionism was a reversion towards associationism and the abandonment of the idea of a language of thought, something they felt was mistaken. In contrast, it was those very tendencies that made connectionism attractive for other researchers.

Connectionism and computationalism need not be at odds per se, but the debate as it was phrased in the late 1980s and early 1990s certainly led to opposition between the two approaches. However, throughout the debate some researchers have argued that connectionism and computationalism are fully compatible, though no consensus has been reached. The differences between the two approaches that are usually cited are the following:

  • Computationalists posit symbolic models that do not resemble underlying brain structure at all, whereas connectionists engage in "low level" modeling, trying to ensure that their models resemble neurological structures.
  • Computationalists generally focus on the structure of explicit symbols (mental models) and syntactical rules for their internal manipulation, whereas connectionists focus on learning from environmental stimuli and storing this information in the form of connections between neurons.
  • Computationalists believe that internal mental activity consists of manipulation of explicit symbols, whereas connectionists believe that the manipulation of explicit symbols is a poor model of mental activity.
  • Computationalists often posit domain-specific symbolic sub-systems designed to support learning in specific areas of cognition (e.g. language, intentionality, number), while connectionists posit one or a small set of very general learning mechanisms.

Though these differences do exist, they may not be necessary. For example, it is well known that connectionist models can actually implement symbol manipulation systems of the kind used in computationalist models. Hence the differences might be a matter of the personal choices that some connectionist researchers make rather than anything fundamental to connectionism.

The recent popularity of dynamical systems in philosophy of mind (due to the works of authors such as van Gelder) has added a new perspective on the debate. Some authors now argue that any split between connectionism and computationalism is really just a split between computationalism and dynamical systems, suggesting that the original debate was wholly misguided.

All of these views have led to considerable discussion on the issue amongst researchers, and it is likely that the debates will continue.

References

  • Abdi, H. (1994). A neural network primer. Journal of Biological Systems, 2, 247-281.
  • Abdi, H. (2003). Neural Networks. In M. Lewis-Beck, A. Bryman, T. Futing (Eds.): Encyclopedia for research methods for the social sciences. Thousand Oaks (CA): Sage. pp. 792-795.
  • Abdi, H. (2001). Linear algebra for neural networks. In N.J. Smelser, P.B. Baltes (Eds.): International Encyclopedia of the Social and Behavioral Sciences. Oxford (UK): Elsevier.
  • Abdi, H., Valentin, D., Edelman, B.E. (1999). Neural Networks. Thousand Oaks: Sage.
  • Rumelhart, D.E., J.L. McClelland and the PDP Research Group (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 1: Foundations, Cambridge, MA: MIT Press
  • McClelland, J.L., D.E. Rumelhart and the PDP Research Group (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 2: Psychological and Biological Models, Cambridge, MA: MIT Press
  • Pinker, Steven and Mehler, Jacques (1988). Connections and Symbols, Cambridge MA: MIT Press.
  • Jeffrey L. Elman, Elizabeth A. Bates, Mark H. Johnson, Annette Karmiloff-Smith, Domenico Parisi, Kim Plunkett (1996). Rethinking Innateness: A connectionist perspective on development, Cambridge MA: MIT Press.
  • Marcus, Gary F. (2001). The Algebraic Mind: Integrating Connectionism and Cognitive Science (Learning, Development, and Conceptual Change), Cambridge, MA: MIT Press