Limits of innateness: Are we born to see faces?

Sociologists tend to be skeptical of claims that individuals are consistent across situations, as a recent exchange on Twitter exemplifies. This exchange was partially spurred by revelations that the famous Stanford Prison Experiment (which supposedly showed that people will quickly engage in behaviors commensurate with their assigned roles, even if it means being cruel to others) was even more problematic than previously thought.


The question of individual “durability” is sometimes framed as “nature vs nurture,” and this is certainly a part of the matter. In sociology, however, this skepticism of “durability” often goes much further than innateness, and sometimes leads sociologists to suggest individuals are inchoate blobs until situations come along to construct us (or interlocutors may resort to obfuscation by touting the truism that humans are always in a situation). If pushed on the topic, however, even the staunchest situationalist would likely concede that humans are born with some qualities, and that the real question is: what are the limits of such innateness? What kinds of qualities of people can be innate? To what extent are these innate qualities human universals? And, if we are “born with it,” can “it” change, and if so, how and to what extent? In Stephen Turner’s new Cognitive Science and the Social, he puts the matter succinctly:

“…children quickly acquire the ability to speak grammatically. This seems to imply that they already had this ability in some form, such as a universal set of rules of language stored in the brain. If one begins with this problem, one wants a model of the brain as “language ready.” But why stop there? Why think that only grammatical rules are innate? One can expand this notion to the idea of the “culture-ready” brain, one that is poised and equipped to acquire a culture” (2018:44–45).

As I’ve previously discussed, the search for either universal rules or a specialized module for language has, thus far, failed. Nevertheless, most humans must be “language-ready” in the minimal sense of having the ability to acquire the ability to speak and understand speech. But answering the question of where innateness ends and enculturation begins is not easy, even for those without the disciplinary inclination toward strongly situationalist arguments.

Are we born to see faces?

How we identify faces is a good place to explore this difficulty: Do we learn to identify faces or are we born to see faces? And, if we are born to see faces, is this ability refined through use, and to what extent? Enter: the fusiform face area (FFA). Just like language, the FFA is often used as evidence for the more general arguments of functional localization and domain specificity. The argument goes: facial recognition is produced not by generic cognitive processes involved in vision (or other generic processes), but rather by an inborn special-purpose module.

One reason why faces are an even better candidate for grappling with the question of innateness than is language is that the human fetus is exposed to language while in the womb. Human fetuses gain some sense of prosody, tonality, and as a result, a basic sense of grammar in the course of development in utero. There is no comparable exposure to faces, however. Another reason is, as the Gestalt psychologists argued, faces have an irreducible structure such that they are perceived as complete wholes even when viewing only a part — “the whole is something else than the sum of its parts, because summing is a meaningless procedure, whereas the whole-part relationship is meaningful” (Koffka 1935:176).

Facial recognition encompasses two related functions: distinguishing faces from non-face objects and distinguishing among faces. The key debate within this area of cognitive neuroscience is whether there is a module that is specialized for one or both of these processes (Kanwisher, McDermott, and Chun 1997; Kanwisher and Yovel 2006), as opposed to a distributed and generic cognitive process (Haxby et al. 2001). This debate goes back to the observation that humans struggle to recognize and remember faces that are upside down, which seemed to be the case for faces more so than for any non-face object (Diamond and Carey 1986), suggesting something about faces made them unique.

The proposal that facial recognition is the result of a specialized module, however, begins with a relatively recent paper by Kanwisher et al. (1997). Using functional magnetic resonance imaging (which I’ve discussed in detail in previous posts), the researchers showed 15 subjects various common objects as well as faces. In 12 of those subjects, they found that a specific area of the brain was more active when the subjects saw faces than when they saw non-face objects. On its face, this seems like reasonable evidence that humans are born with a module necessary for identifying faces.

However, when one squares this claim with the underlying logic of fMRI, the claim that the FFA is a functionally specialized module for facial recognition weakens considerably: (a) fMRI measures relative activation, not an on/off process, and (b) its voxel and temporal resolution is far too coarse to conclude that a region is homogeneously activated. These areas are not entirely inactive when viewing non-face objects. Indeed, relative to baseline activation, subsequent research found the FFA is significantly more active when viewing various objects (Grill-Spector, Sayres, and Ress 2006). Specifically, the level of specificity of the stimulus (e.g. faces tend to be individuals whereas chairs tend to be generic) and the participants’ level of expertise with the stimulus (e.g. car and bird enthusiasts) predicted greater relative activation (Gauthier et al. 2000; Rhodes et al. 2004).
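To make the “relative, not on/off” point concrete, here is a minimal sketch of how a contrast between conditions works. The signal values are invented purely for illustration (they are not data from any of the studies cited here), and the function name `percent_signal_change` is my own label, not part of any neuroimaging package.

```python
from statistics import mean

# Hypothetical signal values (arbitrary units) for one region of interest.
# These numbers are made up for illustration only; they are not taken
# from any study cited in this post.
baseline = [100.0, 101.0, 99.0, 100.0]   # resting / fixation
objects = [103.0, 104.0, 102.0, 103.0]   # viewing non-face objects
faces = [106.0, 107.0, 105.0, 106.0]     # viewing faces


def percent_signal_change(condition, rest):
    """Mean signal in a condition relative to the resting baseline, in percent."""
    rest_mean = mean(rest)
    return 100.0 * (mean(condition) - rest_mean) / rest_mean


faces_vs_rest = percent_signal_change(faces, baseline)
objects_vs_rest = percent_signal_change(objects, baseline)

# The "face-selective" contrast is a difference between two activations,
# both of which can be above baseline.
faces_vs_objects = faces_vs_rest - objects_vs_rest

print(f"faces vs. rest:    {faces_vs_rest:.1f}%")
print(f"objects vs. rest:  {objects_vs_rest:.1f}%")
print(f"faces vs. objects: {faces_vs_objects:.1f}%")
```

With these toy numbers the region is “more active for faces” while still being clearly above baseline for objects, which is exactly the situation in which a positive contrast tells us nothing about exclusivity.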

Finally, even if we are born to distinguish faces from non-faces, the ability to distinguish among faces is considerably trained by early socialization, and such socialization introduces a lot of variation among people. For example, one of the earliest attempts to measure facial recognition concluded, “that women are perhaps superior to men in the test; that salespeople are superior to students and farm people; that fraternity people are perhaps superior to non-fraternity people…” (Howells 1938:127).

Subsequent research in this vein found individuals are better at distinguishing among members of their racial/ethnic ingroups than among members of their outgroups. In an early study of black and white students from a predominantly black university and a predominantly white university, researchers found participants more easily discriminated among faces of their own race. They also found “white faces were found more discriminable” overall, which they suggest may be because “the distribution of social experience is such that both black persons and white persons will have had more exposure to white faces than black faces in public media…” (Malpass and Kravitz 1969:332). Summarizing more recent work, Kubota et al. (2012) state “participants process outgroup members primarily at the category level (race group) at the expense of encoding individuating information because of differences in category expertise or motivated ingroup attention.”

Why should sociologists care?

To summarize, the claim that facial recognition emerges from an innate functionally-specialized cognitive module is weakened in four ways: the FFA responds to more generic features faces share with other objects; the FFA is implicated in a distributed neural network rather than solely a discrete module; the FFA is used for non-facial recognition functions; and finally, facial recognition is trained by our (social) experience. Why should sociologists care? I think there are three reasons. First, innateness is not deterministic or specific but rather constraining and generic. Second, these constraints ripple throughout our social experience, forming the contours of cultural tropes, but are not immutable. Third, limited innateness does not mean individuals are not durable across situations, even (near) universally so.

A dispositional and distributed theory of cognition and action accounts for object recognition by its use: “information about salient properties of an object—such as what it looks like, how it moves, and how it is used—is stored in sensory and motor systems active when that information was acquired” (Martin 2007:25). This is commensurate with the broad approach many of the posts on this blog have been working with. Perhaps, however, there is a special class of objects for which this is not exactly the case. In other words, the admittedly weak innateness of distinguishing unfamiliar faces from non-face objects is, perhaps, the evidence we are “born with” some forms of nondeclarative knowledge (Lizardo 2017).

Such nondeclarative knowledge, however, may be re-purposed for cultural ends. Following the logic of neural exaptation, discussed in a previous post, humans can be born with predispositions, especially related to very generic cognitive processes, which are further trained, refined, and recycled for novel uses. These novel uses are nevertheless constrained in a way that yields testable predictions. A fascinating example related to facial perception is anthropomorphization. If rudimentary facial recognition is innate (and therefore, probably evolutionarily old), this inherently social-cognitive process is being reused for non-social purposes (i.e. non-social in the restricted sense of interpersonal interaction). This facial recognition network, together with other neuronal networks, is used to identify people and predict their behavior, and this may be adapted to non-human animate and inanimate objects, like natural forces, as well as anonymous social structures, like financial markets.

What this means, following the logic of neural reuse and conceptual metaphor theory, is that the target domain (e.g. derivative markets, earthquakes) is “contaminated” by predispositions which originally dealt with the source domain (here, interpersonal interaction). For instance, attempting to imagine the intentions of thousands of unknown traders as if inferring the intentions of an interlocutor may lead traders to “ride” financial bubbles (De Martino et al. 2013). What is and is not innate is therefore a messy question to answer, even for those without a disciplinary distrust of innateness claims. Although cognitive neuroscientists are making headway, it remains an empirical question which objects are recognized innately and the extent to which that object recognition is robust to enculturation and neural recycling.

More importantly, the question of individual durability across situations should not be reduced solely to “nature vs nurture.” That is, we must also ask: once these processes are so trained in an individual (during “primary socialization”), how easily can they be re-trained, if at all? In John Levi Martin’s Thinking Through Theory (2014:249), the third of his “Newest Rules of Sociological Method” is pessimistic in this regard: “Most of what people think of as cultural change is actually changes in the compositions of populations.” That is, even if we were to bar the possibility of innateness in any strong sense, once individuals reach a certain age they are likely to be fairly consistent across situations, with little chance of altering in fundamental ways.


De Martino, Benedetto, John P. O’Doherty, Debajyoti Ray, Peter Bossaerts, and Colin Camerer. 2013. “In the Mind of the Market: Theory of Mind Biases Value Computation during Financial Bubbles.” Neuron 79(6):1222–31.

Diamond, Rhea and Susan Carey. 1986. “Why Faces Are and Are Not Special: An Effect of Expertise.” Journal of Experimental Psychology. General 115(2):107.

Gauthier, I., P. Skudlarski, J. C. Gore, and A. W. Anderson. 2000. “Expertise for Cars and Birds Recruits Brain Areas Involved in Face Recognition.” Nature Neuroscience 3(2):191–97.

Grill-Spector, Kalanit, Rory Sayres, and David Ress. 2006. “High-Resolution Imaging Reveals Highly Selective Nonface Clusters in the Fusiform Face Area.” Nature Neuroscience 9(9):1177–85.

Haxby, J. V., M. I. Gobbini, M. L. Furey, A. Ishai, J. L. Schouten, and P. Pietrini. 2001. “Distributed and Overlapping Representations of Faces and Objects in Ventral Temporal Cortex.” Science 293(5539):2425–30.

Howells, Thomas H. 1938. “A Study of Ability to Recognize Faces.” Journal of Abnormal and Social Psychology 33(1):124.

Kanwisher, Nancy and Galit Yovel. 2006. “The Fusiform Face Area: A Cortical Region Specialized for the Perception of Faces.” Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 361(1476):2109–28.

Kanwisher, N., J. McDermott, and M. M. Chun. 1997. “The Fusiform Face Area: A Module in Human Extrastriate Cortex Specialized for Face Perception.” The Journal of Neuroscience: The Official Journal of the Society for Neuroscience 17(11):4302–11.

Koffka, Kurt. 1935. Principles of Gestalt Psychology. New York: Harcourt, Brace.

Kubota, Jennifer T., Mahzarin R. Banaji, and Elizabeth A. Phelps. 2012. “The Neuroscience of Race.” Nature Neuroscience 15(7):940–48.

Lizardo, Omar. 2017. “Improving Cultural Analysis: Considering Personal Culture in Its Declarative and Nondeclarative Modes.” American Sociological Review 0003122416675175.

Malpass, R. S. and J. Kravitz. 1969. “Recognition for Faces of Own and Other Race.” Journal of Personality and Social Psychology 13(4):330–34.

Martin, Alex. 2007. “The Representation of Object Concepts in the Brain.” Annual Review of Psychology 58(1):25–45.

Martin, John Levi. 2014. Thinking Through Theory. W. W. Norton, Incorporated.

Rhodes, Gillian, Graham Byatt, Patricia T. Michie, and Aina Puce. 2004. “Is the Fusiform Face Area Specialized for Faces, Individuation, or Expert Individuation?” Journal of Cognitive Neuroscience 16(2):189–203.

Turner, Stephen P. 2018. Cognitive Science and the Social: A Primer. Routledge.