Categories, Part III: Expert Categories and the Scholastic Fallacy

There’s a story, probably a myth, about Pythagoras killing one of the members of his math cult because this member discovered irrational numbers (Choike 1980). (He also either despised or revered beans.)

“Oh no, fava beans.” ~Pythagoras (Wikimedia Commons)

The Greeks spent a lot of time arguing about arche, or the primary “stuff.” Empedocles argued that it was the four elements. Anaximenes thought it was just air. Thales thought it was water. Pythagoras and his followers figured it was numbers (Klein 1992, page 64):

They saw the true grounds of the things in this world in their countableness, inasmuch as the condition of being a “world” is primarily determined by the presence of an “ordered arrangement” — [which] rests on the fact that the things ordered are delimited with respect to one another and so become countable.

For the Pythagoreans the clean, crisp integers were sacred because they conveyed a harmony — an orderedness — and there is an undeniable allure to this precision. (Indeed, such an allure that Pythagoras and his followers were driven to do some very strange things.)

Looking at even simple arithmetic, it does seem obvious that classical categories do in fact exist: there is a set of integers, a set of odd numbers, a set of even numbers, and so on. If we continue to follow this line of thought to pure mathematics in general, there is an almost mystical quality to the “objects” of this discipline.

When thinking about mathematical objects like geometric forms, however, there is a fundamental difference between squares or circles or triangles as understood in our daily life (i.e. as having graded similarities to certain exemplar shapes we likely learned about in grade school) and the kind of perfectly precise shapes in theoretical geometry. That is, as far as we know, a perfect circle does not exist in nature (even though an electron’s spin and neutron stars are pretty damn close), nor has humankind been able to manufacture a perfect shape.

And this is the main point: precision is weird. If “crispness” is really only found in mathematics (and pure mathematics at that), then we should be skeptical of the analytical traditions’ use of discrete units as an analogy for knowledge in general.

But, sometimes, thinking with classical categories is useful.

Property Spaces

While we can be skeptical of the Chomskyan program presuming syntactical units must necessarily be classical categories, this does not mean we can never proceed as if phenomena could be divided into crisp sets.

Theorists commonly make something like “n by n” tables, typologies, or, more technically, property spaces. For the classic statement see Lazarsfeld (1937) and Barton (1955); the idea is elaborated in Ragin (2000, pages 76-85), Becker (2008, pages 173-215), and, most extensively, chapters 4, 5, and 6 of Karlsson and Bergman (2016). In this procedure, the analyst outlines a few dimensions that account for the most variation in their empirical observations. This is essentially “dimension reduction,” as we take the inherent heterogeneity (and particularity) of social experience and simplify it into the patterns that are the most explanatory (if only ideal-typical).

For example, Alejandro Portes and Julia Sensenbrenner (1993) tell us that there are four sources of social capital (each deriving conveniently from the work of Durkheim, Simmel, Weber, and Marx and Engels, respectively). These four sources are then grouped into those that come from “consummatory” (or principled) motivations and those that come from “instrumental” motivations. Thus the “motivation” is the single dimension that divides our Social Capital property space into a Set A and a Set B: either resources are exchanged because of the actor’s own self-interest, or not. More often, however, these basic property spaces based on simple categorical distinctions are the starting point for more complex (or “fitted”) property spaces.
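To make the idea of a property space concrete, here is a minimal Python sketch. It treats a property space as the Cartesian product of a few categorical dimensions, so that each cell is one combination of values. The “motivation” dimension comes from the example above; the second dimension (`scope`), its values, and the function name `property_space` are hypothetical, added only to illustrate how a simple space becomes a more complex one.

```python
from itertools import product


def property_space(dimensions):
    """Return every cell of a property space.

    `dimensions` maps a dimension name to its list of possible values.
    Each cell is a dict holding exactly one value per dimension.
    """
    names = list(dimensions)
    return [dict(zip(names, combo)) for combo in product(*dimensions.values())]


# One dimension, two values: the simple Set A / Set B split.
basic = property_space({"motivation": ["consummatory", "instrumental"]})

# Adding a (hypothetical) second dimension doubles the number of cells.
fitted = property_space({
    "motivation": ["consummatory", "instrumental"],
    "scope": ["individual", "collective"],  # illustrative assumption only
})

for cell in fitted:
    print(cell)
```

Each added dimension multiplies the number of cells, which is one way to see why elaborated typologies grow complicated so quickly.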

Consider Aliza Luft’s excellent “Toward a Dynamic Theory of Action at the Micro Level of Genocide: Killing, Desistance, and Saving in 1994 Rwanda.” Luft begins with a critique of prior categorical thinking: “Research on genocide tends to pregroup actors—as perpetrators, victims, or bystanders—and to study each as a coherent collectivity (often identified by their ethnic category)” (Luft 2015, page 148). Previously, analysts explained participation in genocide in one of four ways: members of the perpetrating group were (1) obeying an authority, (2) responding to intergroup antagonism, (3) succumbing to intragroup norms or peer pressure, or (4) dehumanizing the outgroup. While all are useful theories, she explains, they are complicated by the empirical presence of behavioral variation. That is, not everyone associated with a perpetrating group engages in violence at the same time or consistently throughout a conflict (and some may even save members of the victimized group).

[Figure: spanning tree of Luft’s expanded property space]

What she does to meet this challenge is to add dimensions to a binary property space which previously consisted of a group committing murder and a group being murdered. Focusing on the former, she notes that (1) not everyone in that group actually participates, (2) some of those who did (or did not) participate eventually cease (or begin) participating, and (3) some of those who did not participate not only desisted but also actively saved members of the outgroup. Taking this together, we arrive at a property space that can be represented by the spanning tree shown above. Luft then outlines four mechanisms that explain “behavioral boundary crossing.”

In this case, previous expert categories led to an insufficient explanation for the perpetration of genocide, and elaboration proved necessary. Attempting to create classical categories, with rules for inclusion and exclusion and the presumption of mutual exclusivity in which all members are equally representative, is likely a necessary step in the theorizing process. Much of the work of developing theory, however, is not just showing that these categories are insufficient (because, of course, they are), but rather pointing out where this slippage is leading to problems in our explanations, and how they can be mended, as Luft does.

The Scholastic Fallacy

Treating data or theory as if they can be cleanly divided into crisp sets is in keeping with the saying “all models are wrong, but some models are useful.” But taking these distinctions for granted can also lead analysts to commit the “scholastic fallacy.”

This is when the researcher “project[s] his theoretical thinking into the heads of acting agents…” (Bourdieu 2000, page 51). This, according to Bourdieu, was a key folly of structuralism: “[Levi-Strauss] built formal systems that, though they account for practices, in no way provide the raison d’etre of practices” (Bourdieu 2000, page 384). This seems especially obvious for categories, as discussed in my previous two posts. It is one thing to say people can be divided into X group and Y group for Z reasons, and it is another to say people do divide other people into X group and Y group for Z reasons (see Martin 2001, or more generally Martin 2011).

Categorizing for the “acting agent” is not a matter of first learning rules and then applying them to demarcate the world into mutually exclusive clusters. It is, for the most part, a matter of simply “knowing it when I see it,” a skill of identifying and grouping that we have built up through the accrued experience of redundant patterns encountered in mundane practices. Generally, rules, if they are used, are produced in post hoc justifications of our intuitive judgments about group membership. It is here, however, that expert discourse is likely to play the largest role in lay categorizing: as a means to justify what we already believe to be the case.

This is not to say “non-experts” cannot or do not engage in this kind of theoretical thinking about categories. But, as Bourdieu again points out, most people do not have the “leisure (or the desire) to withdraw from [the world]” so as to think about it in this way (Bourdieu 2000, page 51). More importantly, relying on expert categories for most of the tasks in our everyday lives would not be very useful because categorizing is foremost about reducing the cognitive demands of engaging with an always particular and continuously evolving reality.


Barton, Allen H. 1955. “The Concept of Property-Space in Social Research.” The Language of Social Research 40–53.

Becker, Howard S. 2008. Tricks of the Trade: How to Think about Your Research While You’re Doing It. University of Chicago Press.

Bourdieu, P. 2000. Pascalian Meditations. Stanford University Press.

Choike, James R. 1980. “The Pentagram and the Discovery of an Irrational Number.” The Two-Year College Mathematics Journal 11(5):312–16.

Karlsson, Jan Ch and Ann Bergman. 2016. Methods for Social Theory: Analytical Tools for Theorizing and Writing. Routledge.

Klein, Jacob. 1992. Greek Mathematical Thought and the Origin of Algebra. Courier Corporation.

Lazarsfeld, Paul F. 1937. “Some Remarks on the Typological Procedures in Social Research.” Zeitschrift Für Sozialforschung 6(1):119–39.

Luft, Aliza. 2015. “Toward a Dynamic Theory of Action at the Micro Level of Genocide: Killing, Desistance, and Saving in 1994 Rwanda.” Sociological Theory 33(2):148–72.

Martin, John Levi. 2001. “On the Limits of Sociological Theory.” Philosophy of the Social Sciences 31(2):187–223.

Martin, John Levi. 2011. The Explanation of Social Action. Oxford University Press, USA.

Portes, A. and J. Sensenbrenner. 1993. “Embeddedness and Immigration: Notes on the Social Determinants of Economic Action.” The American Journal of Sociology.

Ragin, Charles C. 2000. Fuzzy-Set Social Science. University of Chicago Press.

Categories, Part II: Prototypes, Fuzzy Sets, and Other Non-Classical Theories

A few years ago The Economist published “Lil Jon, Grammaticaliser.” “Lil Jon’s track ‘What You Gonna Do’ got me thinking,” the author tells us, “of all things, the progressive grammaticalisation of the word shit.” In the track, Lil Jon repeats “What they gon’ do? Shit,” and in this lyric shit doesn’t mean “shit”; it means “nothing.”

As the author goes on to explain, things that are either trivial, devalued or demeaning are commonly used to mean “nothing”: I haven’t eaten a bite, I don’t give a rat’s ass, I won’t hurt a fly, he doesn’t know shit. More examples are given in Hoeksema’s “On the Grammaticalization of Negative Polarity Items.” This is difficult to account for in Chomsky’s (Extended or Revised Extended) Standard Theory because the meaning of terms makes them candidates for specific kinds of syntactic functions (Traugott and Heine 1991:8):

What we find in language after language is that for any given grammatical domain, there is only a restrictive set of… sources. For example, case markers, including prepositions and postpositions, typically derive from terms for body parts or verbs of motion; tense and aspect markers typically derive from specific spatial configurations; modals from terms from possession, or desire; middles from reflexives, etc.

Grammaticalization involves the extension of a term until its meaning is “bleached” and becomes more generic and encompassing (Sweetser 1988). For example, the modal word “will,” as in “I will finish that review,” comes from the Old English term willan, meaning to “want” or “wish,” and, of course, it still carries that connotation: “I willed it into being.” This relates to a second difficulty for Chomskyan theory: grammaticalization is a graded process. It’s not always easy to decide whether a particular lexical item should be categorized as one or another syntactical unit, and therefore we cannot know precisely which rules apply when.

Logical Weakness of the Classical Theory

It may be that the classical theory doesn’t work well for linguistics, but that might not be reason to abandon it elsewhere. In fact, there is a certain sensibleness to the approach: categories are about splitting the world up, so why shouldn’t everything fall into mutually exclusive containers? To summarize the various weaknesses as described by Taylor (2003):

  1. Provided we know (innately or otherwise) what features grant membership in a category, we must still verify that a token has all the features granting it membership, rendering categories pointless.
  2. Perhaps we could allow an authority to assure us a token has all the features, but then we are no longer relying on the classical conditions to categorize.
  3. Features might also be kinds of categories, e.g., if cars must have wheels, what defines inclusion in the category “wheels,” which leads to infinite regress (unless, of course, we can find genuine primitives).
  4. Finally, it seems that a lot of features are defined circularly by reference to their category, e.g., cars have doors, but what kind of doors other than the doors cars tend to have?


The rejection of this classical theory is foreshadowed by, among others, Wittgenstein. The younger Wittgenstein was interested in philosophy and mathematics, and after being encouraged by Frege, he more or less forced Bertrand Russell to take him on as a student in 1911. His first major work, the Tractatus Logico-Philosophicus, was published in 1921 and went on to inspire the founding of the Vienna Circle of Logical Empiricism. The Circle did not include Wittgenstein, even though he was living in Vienna at the time; he seemed to hate everyone. (At the same time, it bears noting, Roman Jakobson was a couple hundred miles away founding the Prague Circle of Linguistics.)

After several years that are worth reading about, the received story goes, Wittgenstein does an about-face on his own argument in the Tractatus in the course of trying to find the “atoms” of formal logic. In his later writings, beginning in the late 1920s and continuing until his death in 1951, we get, among other things, the notion of defining words not by a list of necessary and sufficient conditions but by looking at how words are used. In the best-known example, after reviewing a few different ways the word “game” is used, he states: “we can go through many, many other groups of games in the same way, can see how similarities crop up and disappear…I can think of no better expression to characterize these similarities than ‘family resemblances’” (Wittgenstein [1953] 2009 para. 66-67).


Beyond Family Resemblances

From The Atlas of the Munsell Color System, by Albert H. Munsell

Prototype Theory and Basic Level Categories

One pillar of the classical theory is that, if membership is granted based on having certain attributes, then it follows that no member should be a better or worse example of that category. A second pillar is that category criteria should be independent of who or what is doing the categorizing. Eleanor Rosch’s early work toppled both pillars.

Rosch graduated from Reed College, completing her senior thesis on Wittgenstein (who, she says, “cured her of philosophy”), specifically on his discussion of pain and “private language.” She went on to complete graduate work in psychology at the famed Harvard Department of Social Relations, under the direction of Roger Brown (who was an expert in the psychology of language). She conducted research in New Guinea on Dani color and form categories, as well as child-rearing practices (Rosch Heider 1971), and in late 1971 she joined the psychology department at UC, Berkeley.

In a 1973 publication, “Natural Categories,” Rosch critiqued existing studies of category formation because they relied on categories that subjects had already formed. For example, “American college sophomores have long since learned the concepts ‘red’ and ‘square.’” To meet this challenge, she studied the Dani, who had only two color terms, which divided color on the basis of brightness rather than hue. Rosch hypothesized (Rosch 1973:330):

…there are colors and forms which are more perceptually salient than other stimuli in their domains…salient colors are those areas of the color space previously found to be most exemplary of basic color names in many different languages… and that salient forms are the “good forms” of Gestalt psychology (circle, square, etc.). Such colors and forms more readily attract attention than other stimuli… are more easily remembered than less salient stimuli…

She ultimately found “the salience and memorability of certain areas of the color space…can influence the formation of linguistic categories” (the classic citation for cross-cultural color categorization being Berlin and Kay 1991; see also Gibson et al. 2017). As categories form around salient prototypes, potential members of a category are judged on a graded basis.
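The contrast between the classical pillar and graded judgment can be sketched in a few lines of Python. This is only an illustration under loud assumptions, not Rosch’s own procedure: the feature names, the three-number color coordinates, and the scoring rule `1 / (1 + distance)` are all hypothetical stand-ins chosen for simplicity.

```python
import math


def classical_member(features, required):
    """Classical check: a token is in the category if and only if it has
    every required feature. The answer is all-or-nothing, so no member
    can be a better or worse example than another."""
    return required <= features  # set inclusion


def typicality(item, prototype):
    """Graded judgment: score a token by its closeness to a salient
    prototype. The scoring rule is an illustrative assumption."""
    return 1.0 / (1.0 + math.dist(item, prototype))


# Hypothetical color coordinates (e.g., RGB-like triples), for illustration.
focal_red = (255.0, 0.0, 0.0)
brick = (200.0, 60.0, 50.0)
pink = (255.0, 180.0, 190.0)

print(typicality(focal_red, focal_red))  # the prototype itself scores highest
print(typicality(brick, focal_red) > typicality(pink, focal_red))
```

Under the classical check every member is equally “in”; under the graded score, tokens can be ranked as better or worse examples, which is the pattern described above.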

In addition to building categories around salient exemplars, Rosch also found that, in line with ecological psychology, such salience relates to the usefulness for, and capacities of, the observer. For example, there tends to be the most cross-cultural agreement as to how any given token is categorized at the “basic level.” That is, although different groups of people may differ in terms of what the prototypical “dog” is (is it a golden retriever or a bulldog?), when people see a dog, any dog, they will probably categorize it at the basic level of “dog,” as opposed to generically as animal or mammal or specifically as a golden retriever-bulldog mix. And it is at this basic level where there is the most interpersonal (and cross-cultural) similarity.

Berkeley and the West Coast Cognitive Revolution

In a previous post, I discussed all the interesting things happening in anthropology and artificial intelligence at UC, San Diego and Stanford during the 1970s and 1980s, and we can add UC, Berkeley to this list of strongholds for West Coast Cognitive Revolutionaries.

Lakoff left MIT for Berkeley in 1972, and shortly thereafter he was confronted with kinds of utterances neither generative semantics nor generative grammar could account for, e.g., “John invited you’ll never guess how many people to the party,” in which a clause splits another clause, sometimes called “center embedding.” Faced with this, Lakoff got an NSF grant to invite people from linguistics, psychology, logic, and artificial intelligence for a summer seminar in 1975, which ballooned into roughly 190 attendees (de Mendoza Ibáñez 1997). Among the lectures were Rosch on basic-level categories and how category prototypes can be represented in motor systems (the seedling of the embodied mind), Charles Fillmore’s discussion of “Frame Semantics,” which inspired the cognitive anthropologists, and Leonard Talmy (a recent Berkeley PhD) on how physical embodiment creates universal “cognitive topologies” which map onto words like “in” and “out.”

So, Lakoff recalls, “in the face of all this evidence, in the summer of 1975, I realized that both transformational grammar and formal logic were hopelessly inadequate and I stopped doing Generative Semantics” (de Mendoza Ibáñez 1997). It is also in 1975 that he published “Hedges: A Study in Meaning Criteria and the Logic of Fuzzy Concepts,” incorporating ideas from Rosch, as well as from another Berkeley professor, Lotfi Zadeh. In this paper Lakoff argued: “For me, some of the most interesting questions are raised by the study of words whose meaning implicitly involves fuzziness- words whose job is to make things fuzzier or less fuzzy. I will refer to such words as ‘hedges’.” In addition to referring to Rosch’s then-unpublished paper “On the Internal Structure of Perceptual and Semantic Categories,” Lakoff acknowledges “Professor Zadeh has been kind enough to discuss this paper with me often and at great length and many of the ideas in it have come from those discussions.”

Zadeh was born in Baku, Azerbaijan, then studied at the University of Tehran before completing his master’s at MIT and his doctorate in electrical engineering at Columbia University in 1949. He eventually landed at UC, Berkeley in 1959, where he slowly began to develop “fuzzy” methods. In 1965 he published the paradigm-shifting piece “Fuzzy Sets,” which he began writing during the summer of ’64 while working at the RAND Corporation and which also exists as the report “Abstraction and Pattern Classification.” In essence, Zadeh realized that many objects in the world do not have boundaries clear enough to allow discrete classification, but rather allow for graded membership (he used the examples of “tall man” and “very tall man”). He then demonstrated that classical “crisp” set theory is simply a special case of “fuzzy” set theory.
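A small Python sketch may help show what graded membership looks like and in what sense crisp sets are a special case. The min, max, and one-minus operators are the standard ones from “Fuzzy Sets,” and squaring as a reading of “very” follows Zadeh’s later treatment of hedges. Everything else is an assumption made for illustration: the function names, the linear shape of the `tall` membership function, and the height cutoffs in centimeters.

```python
def tall(height_cm, low=170.0, high=190.0):
    """Graded membership in 'tall': 0 at or below `low`, 1 at or above
    `high`, and rising linearly in between. The cutoffs and the linear
    shape are hypothetical choices, not taken from Zadeh."""
    if height_cm <= low:
        return 0.0
    if height_cm >= high:
        return 1.0
    return (height_cm - low) / (high - low)


def very(membership):
    """One common formalization of the hedge 'very': square the grade."""
    return membership ** 2


def fuzzy_union(a, b):
    return max(a, b)


def fuzzy_intersection(a, b):
    return min(a, b)


def fuzzy_complement(a):
    return 1.0 - a


def reduces_to_classical():
    """When grades are restricted to 0 and 1, the fuzzy operators agree
    with ordinary set union, intersection, and complement."""
    for a in (0.0, 1.0):
        if fuzzy_complement(a) != float(not a):
            return False
        for b in (0.0, 1.0):
            if fuzzy_union(a, b) != float(bool(a) or bool(b)):
                return False
            if fuzzy_intersection(a, b) != float(bool(a) and bool(b)):
                return False
    return True


print(tall(180.0))        # partly "tall"
print(very(tall(180.0)))  # less "very tall" than "tall"
print(reduces_to_classical())
```

With these assumed cutoffs, someone 180 cm tall belongs to “tall” to degree 0.5 but to “very tall” only to degree 0.25, so the hedge narrows the category without drawing a sharp line.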

Zadeh would quickly expand the notion of fuzzy methods into a plethora of subfields, including information systems and computer science, but also linguistics beginning in the 1970s, an early example being “A Fuzzy-Set-Theoretic Interpretation of Linguistic Hedges.” However, whether fuzzy logic explains the normal process of human categorization (i.e., whether humans actually follow the procedures of fuzzy logic when categorizing) continues to be debated. Rosch (e.g., Rosch 1999), in particular, is skeptical, precisely because the process of categorizing is not about applying decontextualized “rules.” Rather, as Mike argued in his recent post, we can think of categorizing as more like finding than seeking.



Berlin, Brent and Paul Kay. 1991. Basic Color Terms: Their Universality and Evolution. University of California Press.

Gibson, Edward, Richard Futrell, Julian Jara-Ettinger, Kyle Mahowald, Leon Bergen, Sivalogeswaran Ratnasingam, Mitchell Gibson, Steven T. Piantadosi, and Bevil R. Conway. 2017. “Color Naming across Languages Reflects Color Use.” Proceedings of the National Academy of Sciences of the United States of America 114(40):10785–90.

de Mendoza Ibáñez, Francisco José Ruiz. 1997. “An Interview with George Lakoff.” Cuadernos de Filología Inglesa 6(2):33–52.

Rosch, E. 1999. “Reclaiming Concepts.” Journal of Consciousness Studies 6(11-12):61–77.

Rosch, Eleanor H. 1973. “Natural Categories.” Cognitive Psychology 4(3):328–50.

Rosch Heider, Eleanor. 1971. “Style and Accuracy of Verbal Communications within and between Social Classes.” Journal of Personality and Social Psychology 18(1):33.

Sweetser, Eve E. 1988. “Grammaticalization and Semantic Bleaching.” Pp. 389–405 in Annual Meeting of the Berkeley Linguistics Society. Vol. 14.

Taylor, John R. 2003. Linguistic Categorization. OUP Oxford.

Traugott, Elizabeth Closs and Bernd Heine. 1991. Approaches to Grammaticalization: Volume II. Types of Grammatical Markers. John Benjamins Publishing.

Wittgenstein, Ludwig. [1953] 2009. Philosophical Investigations. Blackwell.

When is Consciousness Learned?

Continuing with the theme of innateness and durability from my last post, consider the question: are humans born with consciousness? In a ground-breaking (and highly contested) work, the psychologist Julian Jaynes argued that if only humans have consciousness, it must have emerged at some point in our human history. In …

Limits of innateness: Are we born to see faces?

Sociologists tend to be skeptical of claims individuals are consistent across situations, as a recent exchange on Twitter exemplifies. This exchange was partially spurred by revelations that the famous Stanford Prison Experiment (which supposedly showed people will quickly engage in behaviors commensurate with their assigned roles even if it means …

Where Did Sewell Get “Schema”?

Although there are precedents to using the term “schema” in an analytical manner in sociology (e.g., Goffman’s Frame Analysis and Cicourel’s Cognitive Sociology), it is undoubtedly William Sewell Jr’s “A Theory of Structure: Duality, Agency, and Transformation” published in the American Journal of Sociology in 1992 that really launched the career of …

Exaption: Alternatives to the Modular Brain, Part II

Scientists discovered the part of the brain responsible for… In my last post, I discuss one alternative to the modular theory of the mind/brain relationship: connectionism. Such a model is antithetical to modularity in that there are only distributed networks of neurons in the brain, not special-purpose processors. One strength …

Connectionism: Alternatives to the Modular Brain, Part I

In my previous post, I introduced the task of cognitive neuroscience, which is (largely) to locate processes we associate with the mind in the structures of the brain and nervous system (Tressoldi et al. 2012). I also discussed the classical and commonsensical approach which conceptualizes the brain and mind relationship …

Is The Brain a Modular Computer?

As discussed in the inaugural post, cognitive science encompasses numerous sub-disciplines, one of which is neuroscience. Broadly defined, neuroscience is the study of the nervous system or how behavioral (e.g. walking), biological (e.g. digesting), or cognitive processes (e.g. believing) are realized in the (physical) nervous system of biological organisms. Cognitive …