What Is Predictive Processing?

In previous posts (see here and here), we have made reference to “predictive processing” (hereafter PP) or the “predictive coding” approach to cognition and action. Predictive processing, or the “predictive turn,” now comprises a sprawling literature ranging across several disciplines and disciplinary interfields (see Hohwy, 2020). The broad range of ancillary fields in which PP finds influence is, in fact, striking, including clinical psychology, psychiatry, addictionology, plant science, ergonomics, kinesiology, and even, recently, literature (Kukkonen, 2020). Despite this recent rise in widespread influence, PP may still be relatively unfamiliar to sociologists and social scientists more generally (but see Hoey, 2021). To bridge this gap, in this post we review the basic tenets of the PP approach to the explanation of perception, cognition, action, and even consciousness. Given space limitations, this review is necessarily schematic and incomplete, as the literature on PP that has accumulated just in the last seven to ten years (e.g., since Clark’s influential 2013 review) is vast. We aim instead to whet the sociological appetite with what we see as a promising unifying approach to accounting for the roots of such high-level phenomena as motivation, enculturation, and activity at multiple time-scales (Clark, 2015; Fabry, 2015; Miller Tate, 2019).

So what is Predictive Processing? PP is a theoretical framework proposing a set of unifying principles designed to describe brain structure and function at multiple levels (see Foster, 2018). It has mainly been developed as a general theory at the computational and algorithmic levels, although it has been shown to have testable implications for the brain’s expected cytoarchitectonic organization (Shipp et al., 2013). That is, PP is compatible with implementation-level details gleaned from research on cortical neurophysiology. Most important for our purposes is the fact that PP uses the same set of principles to describe both the operation of subpersonal mechanisms involved in perception, action, and cognition and phenomena couched in personal terms, such as the language of expectation, anticipation, and orientation, and even first-person reports on phenomenal experience (Wiese & Metzinger, 2017, p. 2).

Main Substantive Claims of the PP Approach

So what is the primary claim of PP? The core idea is that the brain, its subpersonal components (e.g., brain networks, cortical and subcortical layers), the whole person, and perhaps people in concert are in one primary business, and that business is prediction. More specifically, across all levels of biological organization, prediction is the very “purpose” of the brain; or, more accurately, brain structures have been evolutionarily selected to engage in prediction because doing so facilitates organismic survival and genetic reproduction via the control of action (Brown & Brüne, 2012).

What is prediction in the PP context? In its essence, a prediction is the best guess, yielded by a pre-existing probabilistic generative model (itself the product of previous experience), as to the causes of an incoming signal, which could be produced by the external environment (at the personal level) or by other brain structures (at the subpersonal level). The difference between the model-generated expectation (the “hypothesis”) and what is experienced (the “data”) is the prediction error. The best and most effectively adapted generative model of the environment’s causal structure is the one that produces the smallest prediction error. Thus, the brain can be thought of as a dynamic system that continually adjusts its generative models of the environment, across multiple hierarchical levels and associated time-scales, so that they minimize prediction error in the medium and long run. The goal is always to minimize the difference between the generative model’s expectations and the incoming signal or stimulation. This difference (between predicted and “observed”) is, as in standard linear regression models in the social sciences, referred to as “error.” According to PP, the brain is fundamentally an organ for prediction error minimization (PEM).
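
To make the regression analogy concrete, here is a minimal Python sketch of our own (not a model drawn from the PP literature) in which a one-parameter generative model repeatedly compares its expectation to an incoming signal and nudges that expectation in the direction that shrinks the prediction error; the signal, learning rate, and variable names are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical one-parameter generative model: the agent's "best guess"
# about the mean of an environmental signal it keeps sampling.
mu_hat = 0.0          # current expectation (the "hypothesis")
learning_rate = 0.1   # how strongly each error updates the model

true_mean = 3.0       # hidden environmental cause the model must track

for t in range(200):
    signal = true_mean + rng.normal(scale=0.5)   # incoming "data"
    prediction_error = signal - mu_hat           # observed minus predicted
    mu_hat += learning_rate * prediction_error   # update to shrink the error

print(round(mu_hat, 2))  # roughly 3.0: the model now predicts the signal well
```

Iterated over many samples, the expectation converges on the hidden cause of the signal, which is the sense in which error minimization, accumulated over time, yields a model adapted to the environment’s structure.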

Importantly, the same error minimization principle holds across all levels of the explanatory hierarchy, from the most elementary subpersonal structures to the acting person in a lifeworld, which means that brain networks, subcortical regions, and the body embedded in an environment are all engaged in PEM. Subpersonal structures and people, as active, pattern-maintaining, surprise-reducing systems, work in tandem across levels, with prediction extending to all of them. Accordingly, from the PP perspective, prediction is an activity and a predicate applicable at multiple levels of biological organization, from neurons to neuronal populations, structural and functional brain networks, organisms, and even populations of agents (Fabry & Kukkonen, 2018; K. J. Friston & Stephan, 2007; Pezzulo et al., 2015). Thus, even if “people” do not predict by following a normative statistical theory, they can be described as performing a type of “unconscious inference” (in the manner intimated by Helmholtz) subpersonally. Prediction is therefore not a purely personal predicate but one that operates at multiple hierarchical levels, instantiated and realized at each level by distinct structural components and functional activities.

Perceptual Inference

The PP approach rethinks and reconfigures our understanding of all the major cognitive activities and functions, from perception to consciousness, memory, and action, and everything in between. Take, for instance, the standard (and well-researched) case of vision (see Marr 1982). In the usual account, visual representations emerge gradually at multiple hierarchical levels in the brain, from the simplest to the most complex. The process begins with the direct stimulation of retinal photoreceptors. This step is presumably followed by the generation of “low-level” representations of the visual scene (decomposed into edges, angles, surfaces, and the like), which are then elaborated into higher-level representations (in parallel via the ventral and dorsal pathways, upwards through hierarchically organized layers in the visual cortex), so that the result is the perception of a coherent visual scene (Hohwy, 2013, p. 19).

The PP account stands this picture on its head. Instead of beginning with sensory stimulation, we begin with a chain of probabilistic predictions of that input at multiple nested levels. Rather than a one-way flow beginning with direct sensory stimulation and ending with a coherent visual scene, PP proposes that at each hierarchical level, brain networks encode a generative model of the input arriving from the level directly below them in the hierarchy. The job description of each level of cortical organization is to predict (and thereby minimize the error in) the information coming from the level below. Whatever prediction error is left over is passed on to the next level up, which in turn tries to anticipate the lower level’s characteristic response (its typical pattern of prediction error) to stimuli like those encountered in the past. This ensemble of subpersonal mechanisms, acting in concert, thus ensures “that the causal structure of the world is recapitulated within the brain” (Hohwy 2013: 62). As Pezzulo et al. (2015) note, “In predictive coding, perception is regarded as an inference process…whose aim is to minimize prediction errors or the difference between empirical priors (which play the role of perceptual hypotheses) and current sensations.”

As such, at all levels, the brain actively attempts to anticipate what is coming and thus aims to cancel out (not richly reconstruct) stimulation coming from the environment (in the neural layers immediately coding for environmental contingencies). Rather than being static “pictures” of the world, generative models in PP are composed of dynamically structured sets of “best guesses” that attempt to encode the probabilistic structure of the environment. They are “generative” because they are used to produce the best “estimates” of the likelihood of current experiences given our best predictive models. This process happens at multiple levels, from the neural to the organismic, phenomenological (seeing, feeling, touching, smelling, hearing), and up to the cultural and social.

Generative models, which preserve the causal/probabilistic structure of the world, preemptively send top-down signals attempting to match the incoming bottom-up ones at each level, so that the latter do not have to be represented anew. At all subpersonal levels, what we have is thus an orchestrated attunement of rich generative models of the environment aimed not at spectatorial “representation” but at active cancelation, reconciliation, and attunement between the predictions generated by the internalized model and what is received from the level below. This is called “hierarchical predictive coding” and has been shown to provide an empirically plausible model of the workings of the visual cortex (Rao & Ballard, 1999).
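
The logic of hierarchical predictive coding can be illustrated with a toy two-level sketch in Python, loosely in the spirit of (but far simpler than) the Rao and Ballard scheme; the dimensions, random weights, and learning rate below are arbitrary assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy two-level hierarchy: each level tries to predict the activity of the
# level below it, and only the residual (prediction error) is passed upward.
# The weights are fixed here; only the latent estimates are adjusted.
W1 = rng.normal(size=(8, 4)) / np.sqrt(8)  # level 1 predicts the 8-d input from r1
W2 = rng.normal(size=(4, 2)) / np.sqrt(4)  # level 2 predicts r1 from r2

x = rng.normal(size=8)   # incoming sensory signal
r1 = np.zeros(4)         # level-1 latent estimate
r2 = np.zeros(2)         # level-2 latent estimate
lr = 0.1

for _ in range(1000):
    e0 = x - W1 @ r1     # bottom-level error: input vs. top-down guess
    e1 = r1 - W2 @ r2    # level-1 error: its state vs. level 2's guess
    # Each level adjusts its estimate to explain away the error below it
    # while staying close to what the level above expects of it.
    r1 += lr * (W1.T @ e0 - e1)
    r2 += lr * (W2.T @ e1)

print(round(float(np.mean(e0 ** 2)), 4))  # residual error left unexplained
```

Only the residual errors travel up the hierarchy; the top-down predictions do the work of “explaining away” the signal at the level below, which is the sense in which the incoming flow is canceled rather than richly reconstructed.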

This implies that during action, interaction, communication, and enculturation, the environment’s probabilistic structure is encoded by persons in the form of (also probabilistic) generative models. Persons then use, and fine-tune, those models to cancel the error produced as the difference between the model’s predictions and the information provided by dynamically evolving experiential manifolds. Predictive processing thus breaks with passive conceptions of the link between perception, cognition, and action, in which we only have a one-directional arrow from environmental stimuli to cognitive representation, and then to action control.

The “self-tickling” problem is an apt demonstration of these principles. Why is it that others can tickle us, and this is immediately perceived, but when we try to tickle ourselves, we usually fail? While other accounts have trouble with this issue, the PP account handles it elegantly. When other people tickle us, the prediction problem, given our incomplete generative model of their action, is substantial. The tickling sensations are the endpoint of a large number of prediction errors across multiple layers of neuronal organization. Hence, we perceive what we cannot predict, and are sometimes amused or annoyed at the tickling.

In the case of self-tickling, on the other hand, prediction is simply par for the course. In subpersonal terms: the generation of an action (such as moving the hands during tickling) also generates an “efference copy” of the motor command that is then used to predict the sensory consequences of our motor activity (via top-down predictive coding) and thereby facilitate action control. In personal terms: when trying to tickle ourselves, we know what is coming and, therefore, do not perceive it in the same way. The task of minimizing the prediction error generated by incoming stimulation via hierarchical predictive coding is (subdoxastically) easier than when others try to tickle us. A single explanation, continuous across both descriptions, then becomes clear: when we can predict our actions, we cancel out the sensory effect of the self-generated activity. In other words, we do not perceive what we can predict. More accurately, what is easier to predict is perceived more faintly, a phenomenon known as “somatosensory attenuation” (Kilteni et al., 2020). In the extreme case, as we will see later, when environmental contingencies are easily predicted, attention is withdrawn from them, resulting in various “inattentional blindness” phenomena, or the failed registration of perceptual stimulation (Simons & Chabris, 1999).
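
A deliberately simplified sketch of the attenuation logic, with made-up numbers: if what reaches perception is treated as whatever remains after subtracting the prediction derived from the efference copy, a well-predicted self-generated touch is largely canceled, while an unpredicted touch from someone else comes through at full strength.

```python
import numpy as np

# Incoming tactile signal (arbitrary units, hypothetical values).
actual_touch = np.array([1.0, 0.9, 1.1, 1.0])

# Good prediction derived from the efference copy of one's own movement,
# versus no prediction at all when someone else does the tickling.
self_prediction = np.array([0.95, 0.9, 1.05, 1.0])
other_prediction = np.zeros(4)

perceived_self = actual_touch - self_prediction    # mostly canceled out
perceived_other = actual_touch - other_prediction  # passes through intact

print(np.abs(perceived_self).mean())   # ~0.02: attenuated, barely felt
print(np.abs(perceived_other).mean())  # ~1.0: large error, vividly felt
```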

The Principle of Active Inference

At the heart of the PP account’s potential contribution to reframing action theory in the social sciences is what PP theorists call active inference. The basic idea is that, when faced with prediction error across multiple hierarchical levels, we (or subpersonal structures in us) have two ways to minimize the error. We can either reconfigure the probabilistic structure of the stored generative model (at the cost of discarding a profitable history of experience in the face of what could be merely temporary environmental disturbances) or engage in activities aimed at selectively sampling incoming sensory information that conforms to the hierarchically organized generative model.

This gets at the difference between what Vance (2017) refers to as “revision-PEM” and “action-PEM.” We prefer these terms, rather than something like “passive versus active” inference, because, according to PP, there is simply no such thing as passivity. Action (and prediction) is constant. The only difference is whether we cope with the world by canceling errors via active updating of the generative model or via self-fulfilling, selective sampling of the world through action. We also add the caveat that revision-PEM and action-PEM represent “ideal-typical” ends of an action continuum, not antithetical “types” of activity, as has been the classificatory penchant of standard action theory. Even the most mundane act of perception, cognition, or action leads to some updating of the overall generative world model (at some level of the hierarchy) and is, at the same time, an act geared toward self-fulfilling expectations of future experience derived from the same distributed model, by way of the requisite selective experience-sampling activity.
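
The contrast between the two routes can be made concrete in a toy Python example (the world function, expectation, and step sizes are all invented for illustration): an agent whose model expects one value but samples another can either revise the expectation (revision-PEM) or move to a position where the world delivers what the model already expects (action-PEM).

```python
# Hypothetical one-dimensional world: what the agent senses depends on
# where it stands.
def field(position):
    return 2.0 * position

mu = 4.0         # the model's expectation of the incoming signal
position = 1.0   # where the agent currently samples the world

error = field(position) - mu          # 2.0 - 4.0 = -2.0

# Route 1 (revision-PEM): change the model to fit the world.
mu_revised = mu + 0.5 * error         # expectation drifts toward 2.0

# Route 2 (action-PEM / active inference): change where we sample so the
# world delivers what the model expects; since field'(position) = 2.0,
# one gradient-style step lands exactly where the error vanishes.
position_new = position - 0.25 * error * 2.0
error_after_action = field(position_new) - mu

print(mu_revised)          # 3.0: the model gave ground to the world
print(error_after_action)  # 0.0: the world was resampled to fit the model
```

In practice, as the continuum framing above suggests, real agents presumably do some of both at once rather than choosing one route exclusively.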

Let us consider action-PEM, which also goes by the name of active inference. When faced with a situation in which our perceptual grip on the scene fails to align with what we expect, we can adjust, via action and locomotion, so that we begin to sample the scene in a way that accords with the generative model. Take, for instance, the following personal explanation:

Consider the example of walking along the wall at a museum and having your attention caught by an enormous picture just to your right. At first one is simply overwhelmed by its size, perhaps confused about how to respond. Quickly, though, one feels oneself pushed to step away from the wall in order to get the painting into view…one responds immediately to a felt demand of the situation; one is moved to retreat from the picture as if it were repelling one. As one steps back, and the painting comes into better view, the tension that was created by being too close to it is reduced. In such a situation the painting itself leads you to stand at an appropriate distance for seeing it (Dreyfus & Kelly, 2007, p. 52).

The “tension” produced by the enormous painting is the phenomenological signature of “error”: the (subdoxastic) difference between the predictions issued by the generative model internalized by the museum patron and the incoming perceptual signal. It makes a difference, in this sense, whether the generative model is the product of an accumulated history of exposure (e.g., “experience”) to artistic products, or whether this is the museum patron’s first time (Bourdieu, 1968). By adjusting our position, by moving our body, we bring the (perceived) world closer to the predictions of the generative model, thus “canceling out” prediction error. This is phenomenologically registered as a reduction in tension, and as the “cognitive feeling” that this is “the right spot” from which to view the painting, but it does not follow as a consequence of inferential belief. Via active inference, we self-fulfill a perceptual reality that facilitates prediction from hard-won generative models acquired over a long accumulated history or, in the case of a first-timer, from a nascent model that confronts a whole barrage of error and may simply find that mimicking others (standing approximately where they are standing) will do for now.

The reference to the Mertonian self-fulfilling prophecy familiar to sociologists should be taken quite literally here (K. Friston, 2009, p. 295). In the PP account, organisms self-fulfill a perceptual (and cognitive, emotive, and affective) experiential reality via direct action in the world, across all vertical levels of explanation from the subpersonal, to the personal, to the interpersonal. However, the alignment with the self-fulfilling prophecy should not mislead us into thinking that the “generative models” of PP are on a par with the “models” of traditional (idealist) cultural theory (Strand & Lizardo, 2015). As Bubic et al. note,

…predictions…drive our perception, cognition and behavior in a sense that we do not only passively match expected to incoming events and objectively evaluate the accuracy of our expectations, but actively try to fulfill those predictions by preferentially sampling corresponding features in the environment (2010).

PP and the Structure of Attention

Fundamentally, this means that not all discrepancies (prediction errors) between model-generated expectations and recent experience are created equal. At both personal and subpersonal levels, prediction errors that are inconsistent and less structured (i.e., where the variance, or second moment, of the error signal is large) are discounted, and are thus less likely either to generate updates to the generative model or to motivate us to engage in active inference to cancel them out. Instead, it is repetitive, consistent error (error signals with low variance in the statistical sense) that is most likely to attract attention and motivate us to act to cancel it out.

PP theorists refer to this as the principle of precision-weighted prediction error minimization (Clark 2015, Chapter 2). More “precise” (lower variance) errors are the ones most likely to jump to the top of the list to be dealt with. They are also more likely to be allowed to change the underlying probabilistic structure of the internalized generative model if the error cannot be canceled via active resampling of the world (active inference). Attention, on this account, is therefore a type of active inference (reasoning, action) induced by world offerings made “attention-worthy” because they generate highly weighted, precise, and difficult-to-predict errors (K. J. Friston & Stephan, 2007). This extends so far that we can reflexively “notice ourselves doing [these types of active inference],” which appear to us as repeated actions or frequent thoughts, though this “noticing” has no impact on prediction error minimization (Metzinger 2013).
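
Precision weighting can be illustrated with a toy comparison of two error streams in Python (the distributions and the simple weighting rule are assumptions made for the example, not a claim about neural implementation): an erratic, high-variance stream is discounted, while a small but highly consistent one dominates.

```python
import numpy as np

rng = np.random.default_rng(3)

# Two streams of prediction errors: one erratic, one small but reliable.
noisy_errors = rng.normal(loc=0.0, scale=2.0, size=1000)       # high variance
consistent_errors = rng.normal(loc=0.5, scale=0.1, size=1000)  # low variance

def precision_weighted_salience(errors):
    """Weight the average error by its precision (inverse variance)."""
    precision = 1.0 / errors.var()
    return precision * abs(errors.mean())

print(precision_weighted_salience(noisy_errors))       # ~0: discounted, ignored
print(precision_weighted_salience(consistent_errors))  # large: grabs attention
```

On this toy weighting rule, the consistent stream is the one that ends up driving model revision or active inference, even though its raw errors are much smaller in magnitude.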

Thus, to put the principle of active inference in a manner that transposes directly into what objective probability means for social explanation: we attend only to that which we cannot consistently predict. Because prediction is in the service of non-perception (of canceling prediction error), and action is likewise in the service of assimilating experience to extant models, attention is necessarily a scarce resource to be mobilized only in the face of recalcitrant, precise, and insistent error. Hence, only when faced with reliable, stubborn evidence that extant generative models are not up to the predictive task does a “percept” become explicit and information-bearing, even if the generative model never fully adapts to the probability that it will appear again (and again) in the active inferential loop. Note that, in this last respect, the action-theoretic notion of “hysteresis” (Strand & Lizardo, 2017) can easily be conceptualized in PP terms as recalcitrance against updating internalized generative models of the environment, produced by consistent past experience, in the face of disconfirmation (high precision-weighted prediction error) by novel environmental circumstances that cannot be accommodated by the probabilistic structure of the current model. The prediction is that agents will use active inference to attempt to subsume the novel experiential set under the previous model (to cancel out the error), but this will show up at the personal and institutional level as a form of “hysteresis” in which old practices seem to misfire (failed active inference).

Concluding Remarks

The cognitive neurosciences and the sciences of action are undergoing what has been referred to as a “predictive turn” (Clark, 2015), one that is increasingly integrated with a “pragmatic turn” (Engel et al., 2015) and with previous turns toward embodiment and enaction (Clark, 2008; De Jaegher & Rohde, 2010). In this necessarily brief post, we have introduced predictive processing as one unifying perspective, with clear implications for the way social scientists conceive of the linkages between perception, cognition, action, enculturation, habituation, and a host of other processes. We have barely scratched the surface, but we hope that the theoretical and substantive implications are clear. Future posts will extend this discussion to other topics and themes.

References

Bourdieu, P. (1968). Outline of a sociological theory of art perception. International Social Science Journal, 20(4), 589–612.

Brown, E. C., & Brüne, M. (2012). Evolution of social predictive brains? Frontiers in Psychology, 3, 414.

Bubic, A., von Cramon, D. Y., & Schubotz, R. I. (2010). Prediction, cognition and the brain. Frontiers in Human Neuroscience, 4, 25.

Clark, A. (2008). Supersizing the Mind: Embodiment, Action, and Cognitive Extension. Oxford University Press.

Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. The Behavioral and Brain Sciences, 36(3), 181–204.

Clark, A. (2015). Surfing Uncertainty: Prediction, Action, and the Embodied Mind. Oxford University Press.

De Jaegher, H., & Rohde, M. (2010). Enaction: Toward a New Paradigm for Cognitive Science. MIT Press.

Dreyfus, H., & Kelly, S. D. (2007). Heterophenomenology: Heavy-handed sleight-of-hand. Phenomenology and the Cognitive Sciences, 6(1), 45–55.

Engel, A. K., Friston, K. J., & Kragic, D. (2015). The Pragmatic Turn: Toward Action-Oriented Views in Cognitive Science. MIT Press.

Fabry, R. E. (2015). Enriching the notion of enculturation: Cognitive integration, predictive processing, and the case of reading acquisition. Open MIND. Frankfurt am Main: MIND Group.

Fabry, R. E., & Kukkonen, K. (2018). Reconsidering the Mind-Wandering Reader: Predictive Processing, Probability Designs, and Enculturation. Frontiers in Psychology, 9, 2648.

Foster, J. G. (2018). Culture and computation: Steps to a Probably Approximately Correct theory of culture. Poetics, 68, 144–154.

Friston, K. (2009). The free-energy principle: a rough guide to the brain? Trends in Cognitive Sciences, 13(7), 293–301.

Friston, K. J., & Stephan, K. E. (2007). Free-energy and the brain. Synthese, 159(3), 417–458.

Hoey, J. (2021). Citizens, Madmen and Children: Equality, Uncertainty, Freedom and the Definition of State. https://doi.org/10.31235/osf.io/f463y

Hohwy, J. (2013). The Predictive Mind. Oxford University Press.

Hohwy, J. (2020). New directions in predictive processing. Mind & Language, 35(2), 209–223.

Kilteni, K., Engeler, P., & Ehrsson, H. H. (2020). Efference Copy Is Necessary for the Attenuation of Self-Generated Touch. iScience, 23(2), 100843.

Kukkonen, K. (2020). Probability Designs. Oxford University Press.

Miller, M., Kiverstein, J., & Rietveld, E. (2020). Embodying addiction: A predictive processing account. Brain and Cognition, 138, 105495.

Miller Tate, A. J. (2019). A predictive processing theory of motivation. Synthese. https://doi.org/10.1007/s11229-019-02354-y

Pezzulo, G., Rigoli, F., & Friston, K. (2015). Active Inference, homeostatic regulation and adaptive behavioural control. Progress in Neurobiology, 134, 17–35.

Rao, R. P., & Ballard, D. H. (1999). Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2(1), 79–87.

Shipp, S., Adams, R. A., & Friston, K. J. (2013). Reflections on agranular architecture: predictive coding in the motor cortex. Trends in Neurosciences, 36(12), 706–716.

Simons, D. J., & Chabris, C. F. (1999). Gorillas in our midst: sustained inattentional blindness for dynamic events. Perception, 28(9), 1059–1074.

Strand, M., & Lizardo, O. (2015). Beyond world images: Belief as embodied action in the world. Sociological Theory, 33(1), 44–70.

Strand, M., & Lizardo, O. (2017). The hysteresis effect: theorizing mismatch in action. Journal for the Theory of Social Behaviour, 47(2), 164–194.

Vance, J. (2017). Action prevents error: Predictive processing without active inference. Johannes Gutenberg-Universität Mainz, Frankfurt am Main.

Wiese, W., & Metzinger, T. (2017). Vanilla PP for Philosophers: A Primer on Predictive Processing. In T. Metzinger & W. Wiese (Eds.), Philosophy and Predictive Processing. MIND Group.
