Predictive coding

Predictive coding models suggest that the brain is constantly generating and updating hypotheses that predict sensory input at varying levels of abstraction. This framework contrasts with more mainstream views of the brain as integrating exteroceptive information through a predominantly feedforward process, in which feedback connections play a more minor role in cortical processing.

Origins

Theoretical ancestors of predictive coding date back as early as 1860, to Helmholtz’s concept of unconscious inference (Clark, 2013). Unconscious inference refers to the idea that the human brain fills in visual information to make sense of a scene. For example, if one object appears smaller than another in the visual field, the brain uses that information as a likely cue of depth, such that the perceiver ultimately (and involuntarily) experiences depth. The understanding of perception as the interaction between sensory stimuli (bottom-up) and conceptual knowledge (top-down) was further established by McClelland and Rumelhart in 1981. Their seminal paper examined the interaction between the processing of features (lines and contours), which form letters, and of the letters themselves, which in turn form words. While the features suggest the presence of a word, they found that people identified letters faster when the letters were situated in the context of a word than when they were situated in a non-word without semantic context. McClelland and Rumelhart’s parallel processing model describes perception as the meeting of top-down (conceptual) and bottom-up (sensory) elements.

In the late 1990s, the idea of top-down and bottom-up processing was translated into a computational model of vision by Rao and Ballard (1999). Their paper demonstrated that a generative model of a scene (top-down processing) could receive feedback via error signals (how much the visual input varied from the prediction), and that these error signals could in turn be used to update the prediction. The computational model was able to replicate well-established receptive field effects, as well as less understood extra-classical receptive field effects such as end-stopping. Today, the fields of computer science and cognitive science incorporate these same concepts to create the multilayer generative models that underlie machine learning and neural nets (Hinton, 2010).

General framework

Most of the research literature in the field has been about sensory perception, particularly vision, which is more easily conceptualized. However, the predictive coding framework could also be applied to different neural systems. Taking the sensory system as an example, the brain solves the seemingly intractable problem of modelling distal causes of sensory input through a version of Bayesian inference. It does this by modelling predictions of lower-level sensory inputs via backward connections from relatively higher levels in a cortical hierarchy (Clark, 2013). Constrained by the statistical regularities of the outside world (and certain evolutionarily prepared predictions), the brain encodes top-down generative models at various temporal and spatial scales in order to predict and effectively suppress sensory inputs rising up from lower levels. A comparison between predictions (priors) and sensory input (likelihood) yields a difference measure (e.g. prediction error, free energy, or surprise) which, if it is sufficiently large beyond the levels of expected statistical noise, will cause the generative model to update so that it better predicts sensory input in the future.

If, instead, the model accurately predicts driving sensory signals, activity at higher levels cancels out activity at lower levels, and the posterior probability of the model is increased. Thus, predictive coding inverts the conventional view of perception as a mostly bottom-up process, suggesting that it is largely constrained by prior predictions, where signals from the external world only shape perception to the extent that they are propagated up the cortical hierarchy in the form of prediction error.
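The update cycle described above can be illustrated with a minimal Python sketch. It is not drawn from the cited literature: the function names, the learning rate, and the noise threshold are illustrative assumptions, and a single number stands in for an entire generative model.

```python
def update_prediction(prediction, sensory_input, learning_rate=0.1, noise_threshold=0.0):
    """Return (new_prediction, prediction_error) for one observation.

    All parameter names and default values are hypothetical.
    """
    error = sensory_input - prediction  # prediction error
    if abs(error) <= noise_threshold:
        # Error is within expected noise: keep the current prediction.
        return prediction, error
    # Otherwise move the prediction a fraction of the way toward the input.
    return prediction + learning_rate * error, error


def track_input(inputs, prediction=0.0, learning_rate=0.1, noise_threshold=0.0):
    """Apply update_prediction to a sequence of inputs and record every error."""
    errors = []
    for sensory_input in inputs:
        prediction, error = update_prediction(
            prediction, sensory_input, learning_rate, noise_threshold
        )
        errors.append(error)
    return prediction, errors


# A constant input is predicted better and better, so the error shrinks.
final_prediction, errors = track_input([1.0] * 50)
```

In this toy setting the prediction error shrinks on every step, which mirrors the idea that a well-fitting model leaves little residual signal to propagate upward.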

Precision weighting

Expectations about the precision (or inverse variance) of incoming sensory input are crucial for effectively minimizing prediction error: the expected precision of a given prediction error informs confidence in that error, which in turn determines how heavily the error is weighted in updating predictions (Feldman & Friston, 2010). Given that the world we live in is loaded with statistical noise, precision expectations must be represented as part of the brain’s generative models, and they should be able to adapt flexibly to changing contexts. For instance, the expected precision of visual prediction errors likely varies over the course of the day, such that greater conditional confidence is assigned to errors in broad daylight than to errors at nightfall (Hohwy, 2012). It has been proposed that such weighting of prediction errors in proportion to their estimated precision is, in essence, attention (Friston, 2009), and that the process of devoting attention may be neurobiologically accomplished by ascending reticular activating systems (ARAS) optimizing the “gain” of prediction error units.
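One way to make precision weighting concrete is the textbook update for a Gaussian prior combined with a Gaussian observation, sketched below in Python. This is an illustrative assumption rather than the specific formulation of the cited authors; the function name and the numerical values are hypothetical.

```python
def precision_weighted_update(prior_mean, prior_precision, observation, sensory_precision):
    """Combine a Gaussian prior with one Gaussian observation.

    Precision is inverse variance. The prediction error is scaled by a gain
    equal to the share of total precision carried by the sensory input.
    """
    prediction_error = observation - prior_mean
    gain = sensory_precision / (prior_precision + sensory_precision)
    posterior_mean = prior_mean + gain * prediction_error
    posterior_precision = prior_precision + sensory_precision
    return posterior_mean, posterior_precision


# Same prior and same observation, but different expected sensory precision
# (values are made up for illustration).
daylight_mean, _ = precision_weighted_update(0.0, 1.0, 10.0, sensory_precision=4.0)
dusk_mean, _ = precision_weighted_update(0.0, 1.0, 10.0, sensory_precision=0.25)
```

With high expected sensory precision the estimate moves most of the way toward the observation, whereas with low expected precision the same prediction error shifts the estimate only slightly.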

Active inference

The same principle of prediction error minimization has been used to provide an account of behavior in which motor actions are not commands but descending proprioceptive predictions. In this scheme of active inference, classical reflex arcs are coordinated so as to selectively sample sensory input in ways that better fulfill predictions, thereby minimizing proprioceptive prediction errors (Friston, 2009). Indeed, Adams et al. (2013) review evidence suggesting that this view of hierarchical predictive coding in the motor system provides a principled and neurally plausible framework for explaining the agranular organization of the motor cortex. This view suggests that “perceptual and motor systems should not be regarded as separate but instead as a single active inference machine that tries to predict its sensory input in all domains: visual, auditory, somatosensory, interoceptive and, in the case of the motor system, proprioceptive” (Adams, Shipp, & Friston, 2013).
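The contrast with perceptual inference can be shown with a toy Python sketch in which the prediction stays fixed and action changes the sensed state until the proprioceptive prediction error vanishes. The one-dimensional "limb position", the function name, and the reflex gain are illustrative assumptions, not part of the cited accounts.

```python
def act_to_fulfil_prediction(position, predicted_position, reflex_gain=0.5, steps=20):
    """Reduce proprioceptive prediction error by moving, not by revising the prediction.

    position: current (sensed) limb position, a hypothetical scalar state.
    predicted_position: descending proprioceptive prediction, held fixed.
    """
    errors = []
    for _ in range(steps):
        error = predicted_position - position  # proprioceptive prediction error
        errors.append(error)
        position += reflex_gain * error        # reflex-like action cancels part of the error
    return position, errors


final_position, errors = act_to_fulfil_prediction(position=0.0, predicted_position=1.0)
```

Here the error is minimized by changing the sensory input itself, which is the sense in which the prediction acts in place of a motor command.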

Neural theory in predictive coding

Evaluating the empirical evidence that suggests a neurologically plausible basis for predictive coding is a broad and varied task. According to the model, predictive coding occurs at every iterative step in perceptual and cognitive processes; accordingly, manifestations of predictive coding in the brain span levels from genetics and the specific cytoarchitecture of cells to systemic networks of neurons and the whole brain. Because of this range, different methods of investigating the neural mechanisms of predictive coding have been applied, where available. More generally, however, and at least as it relates to humans, there are significant methodological limitations to investigating the potential evidence, and much of the work is based on computational modeling of microcircuits in the brain. Nevertheless, substantial theoretical work has been applied to understanding predictive coding mechanisms in the brain. This section focuses on specific evidence as it relates to the predictive coding phenomenon, rather than on analogues such as homeostasis (which are nonetheless integral to our overall understanding of Bayesian inference but already heavily supported; see Clark, 2012 for a review).

Much of the early work that applied a predictive coding framework to neural mechanisms came from sensory neurons, particularly in the visual cortex (e.g., Rao and Ballard, 1999; Bolz & Gilbert, 1986).

More generally, however, what seems to be required by the theory are at least two types of neurons at every level of the perceptual hierarchy: one set of neurons that encode incoming sensory input (so-called feedforward projections), and one set of neurons that send down predictions (so-called feed-backward projections). It is important to note that these neurons must also carry properties of error detection; which class of neurons has these properties is still up for debate (see Koster-Hale & Saxe, 2013; Seth, 2013). Support for neurons of this sort has been found in superficial and non-superficial pyramidal neurons.

At a more whole-brain level, there is evidence that different cortical layers (also known as laminae) may facilitate the integration of feedforward and feed-backward projections across hierarchies. The cortex, which is classified as granular, agranular, or dysgranular and which houses the subpopulations of neurons mentioned above, is divided into six main layers. The cytoarchitecture within each layer is the same, but it differs across layers. For example, layer 4 of granular cortex contains granule cells, which are excitatory and distribute thalamocortical inputs to the rest of the cortex. According to one model:

“...prediction neurons... in deep layers of agranular cortex drive active inference by sending sensory predictions via projections ...to supragranular layers of dysgranular and granular sensory cortices. Prediction-error neurons ….in the supragranular layers of granular cortex compute the difference between the predicted and received sensory signal, and send prediction-error signals via projections...back to the deep layers of agranular cortical regions. Precision cells … tune the gain on predictions and prediction error dynamically, thereby giving these signals reduced (or, in some cases, greater) weight depending on the relative confidence in the descending predictions or the reliability of incoming sensory signals.” (Barrett & Simmons, 2015)

In sum, the neural evidence is still in its infancy.

Applying predictive coding

Perception

The empirical evidence for predictive coding is most robust for perceptual processing. As early as 1999, Rao and Ballard proposed a hierarchical visual processing model in which higher-order visual cortical areas send down predictions and the feedforward connections carry the residual errors between the predictions and the actual lower-level activities (Rao and Ballard, 1999). According to this model, each level in the hierarchical network (except the lowest level, which represents the image) attempts to predict the responses at the next lower level via feedback connections, and the error signal is used to correct the estimate of the input signal at each level concurrently (Rao and Ballard, 1999). Emberson et al. demonstrated top-down modulation in infants using a cross-modal audiovisual omission paradigm, determining that even infant brains have expectations about future sensory input that are carried downstream from visual cortices, and that they are capable of expectation-based feedback (Emberson et al., 2015). Functional near-infrared spectroscopy (fNIRS) data showed that infant occipital cortex responded to an unexpected visual omission (with no visual input) but not to an expected visual omission. These results support the view that, in a hierarchically organized perception system, higher-order neurons send down predictions to lower-order neurons, which in turn send the prediction error signal back up.
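The hierarchical scheme attributed to Rao and Ballard above can be caricatured with a two-level Python sketch. It is a strong simplification under stated assumptions: each level is a single number, the feedback weights `w1` and `w2` are fixed rather than learned, and the estimates are found by gradient descent on the summed squared errors. The published model operates on image patches with learned weights; every name below is hypothetical.

```python
def settle_hierarchy(image, w1=1.0, w2=1.0, rate=0.1, steps=500):
    """Settle the estimates of a two-level predictive hierarchy for a fixed input.

    Level 1 predicts the image as w1 * r1; level 2 predicts level 1 as w2 * r2.
    Both estimates are corrected concurrently from the two error signals.
    """
    r1 = 0.0  # level-1 estimate
    r2 = 0.0  # level-2 estimate
    for _ in range(steps):
        e0 = image - w1 * r1  # error between the image and the level-1 prediction
        e1 = r1 - w2 * r2     # error between level 1 and the level-2 prediction
        # Concurrent update: gradient descent on 0.5 * e0**2 + 0.5 * e1**2.
        r1, r2 = r1 + rate * (w1 * e0 - e1), r2 + rate * w2 * e1
    return r1, r2, image - w1 * r1, r1 - w2 * r2


r1, r2, e0, e1 = settle_hierarchy(image=2.0)
```

Once the estimates have settled, both residual errors are close to zero, so almost nothing is left to send forward. An input that violated the higher-level prediction would instead leave a large residual.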

Interoception

There have been several competing models for the role of predictive coding in interoception.

In 2013, Anil Seth proposed that our subjective feeling states, otherwise known as emotions, are generated by predictive models that are actively built from causal interoceptive appraisals. In relation to how we attribute the internal states of others to causes, Sasha Ondobaka, James Kilner, and Karl Friston (2015) proposed that the free energy principle requires the brain to produce a continuous series of predictions with the goal of reducing the amount of prediction error that manifests as “free energy”. These errors are then used to model anticipatory information about what the state of the outside world will be and to attribute causes to that world state, including the causes of others’ behavior. This is especially necessary because, to create these attributions, our multimodal sensory systems need interoceptive predictions to organize themselves. Ondobaka and colleagues therefore posit that predictive coding is key to understanding other people’s internal states.

In 2015, Lisa Barrett and W. Kyle Simmons proposed the Embodied Predictive Interoception Coding model, a framework that unifies Bayesian active inference principles with a physiological framework of corticocortical connections. Using this model, they posited that agranular visceromotor cortices are responsible for generating predictions about interoception, thus defining the experience of interoception.

Challenges

As a mechanistic theory, predictive coding has not been mapped out physiologically at the neuronal level. One of the biggest challenges to the theory has been the imprecision about exactly how prediction error minimization works (Kogo & Trengove, 2015). In some studies, an increase in BOLD signal has been interpreted as an error signal, while in others it is taken to indicate changes in the input representation (Kogo & Trengove, 2015). A crucial question that needs to be addressed is what exactly constitutes the error signal and how it is computed at each level of information processing (Bastos et al., 2012). Another challenge concerns predictive coding’s computational tractability. According to Kwisthout and van Rooij, the subcomputation at each level of the predictive coding framework potentially hides a computationally intractable problem, which amounts to “intractable hurdles” that computational modelers have yet to overcome (Kwisthout & van Rooij, 2013). Future research could focus on clarifying the neurophysiological mechanism and computational model of predictive coding.

    This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.