Categories, Part II: Prototypes, Fuzzy Sets, and Other Non-Classical Theories

A few years ago The Economist published “Lil Jon, Grammaticaliser.” “Lil Jon’s track ‘What You Gonna Do’ got me thinking,” the author tells us, “of all things, the progressive grammaticalisation of the word shit.” In the track, Lil Jon repeats “What they gon’ do? Shit,” and in this lyric shit doesn’t mean “shit”; it means “nothing.”

As the author goes on to explain, things that are either trivial, devalued or demeaning are commonly used to mean “nothing”: I haven’t eaten a bite, I don’t give a rat’s ass, I won’t hurt a fly, he doesn’t know shit. More examples are given in Hoeksema’s “On the Grammaticalization of Negative Polarity Items.” This is difficult to account for in Chomsky’s (Extended or Revised Extended) Standard Theory because the meaning of terms makes them candidates for specific kinds of syntactic functions (Traugott and Heine 1991:8):

What we find in language after language is that for any given grammatical domain, there is only a restrictive set of… sources. For example, case markers, including prepositions and postpositions, typically derive from terms for body parts or verbs of motion; tense and aspect markers typically derive from specific spatial configurations; modals from terms from possession, or desire; middles from reflexives, etc.

Grammaticalization involves the extension of a term until its meaning is “bleached” and becomes more generic and encompassing (Sweetser 1988). For example, the modal word “will,” as in “I will finish that review,” comes from the Old English term willan, meaning to “want” or “wish,” and, of course, it still carries that connotation: “I willed it into being.” This relates to a second difficulty for Chomskian Theory: grammaticalization is a graded process. It’s not always easy to decide whether a particular lexical item should be categorized as one syntactic unit or another, and therefore we cannot know precisely which rules apply when.

Logical Weakness of the Classical Theory

It may be that the classical theory doesn’t work well for linguistics, but that might not be reason to abandon it elsewhere. In fact, there is a certain sensibleness to the approach: categories are about splitting the world up, so why shouldn’t everything fall into mutually exclusive containers? To summarize the various weaknesses as described by Taylor (2003):

  1. Provided we know (innately or otherwise) what features grant membership in a category, we must still verify that a token has all the features granting it membership, rendering categories pointless.
  2. Perhaps we could allow an authority to assure us a token has all the features, but then we are no longer relying on the classical conditions to categorize.
  3. Features might also be kinds of categories, e.g., if cars must have wheels, what defines inclusion in the category “wheels,” which leads to infinite regress (unless, of course, we can find genuine primitives).
  4. Finally, it seems that a lot of features are defined circularly by reference to their category, e.g., cars have doors, but what kind of doors other than the doors cars tend to have?


The rejection of this classical theory is foreshadowed by, among others, Wittgenstein. The younger Wittgenstein was interested in philosophy and mathematics, and after being encouraged by Frege, he more or less forced Bertrand Russell to take him on as a student in 1911. His first major work, the Tractatus Logico-Philosophicus, was published in 1921 and went on to inspire the founding of the Vienna Circle of Logical Empiricism. The Circle did not include Wittgenstein, even though he was living in Vienna at the time; he seemed to hate everyone. (At the same time, it bears noting, Roman Jakobson was a couple hundred miles away founding the Prague Circle of Linguistics.)

After several years worth reading about, the received story goes, Wittgenstein does an about-face on his own argument in the Tractatus in the course of trying to find the “atoms” of formal logic. In his later writings, beginning in the late 1920s and continuing until his death in 1951, we get, among other things, the notion of defining words not by a list of necessary and sufficient conditions but by looking at how words are used. The most well-known example is the word “game”: after reviewing a few different ways it is used, he states “we can go through many, many other groups of games in the same way, can see how similarities crop up and disappear…I can think of no better expression to characterize these similarities than ‘family resemblances’” (Wittgenstein [1953] 2009 para. 66-67).


Beyond Family Resemblances

From The Atlas of the Munsell Color System, by Albert H. Munsell

Prototype Theory and Basic Level Categories

One pillar of the classical theory is that, if membership is granted based on having certain attributes, then it follows that no member should be a better or worse example of that category. A second pillar is that category criteria should be independent of who or what is doing the categorizing. Eleanor Rosch’s early work toppled both pillars.

Rosch graduated from Reed College, completing her senior thesis on Wittgenstein (who she says “cured her of philosophy”) — specifically his discussion of pain and “private language.” She went on to complete graduate work in psychology at the famed Harvard Department of Social Relations, under the direction of Roger Brown (who was an expert in the psychology of language). She conducted research in New Guinea on Dani color and form categories, as well as child rearing practices (Rosch Heider 1971), and in late 1971, she joined the psychology department at UC, Berkeley.

In a 1973 publication, “Natural Categories,” Rosch critiqued existing studies of category formation because they relied on categories that subjects had already formed. For example, “American college sophomores have long since learned the concepts ‘red’ and ‘square.’” To meet this challenge, she studied the Dani, who had only two color terms, which divided color on the basis of brightness rather than hue. Rosch hypothesized (Rosch 1973:330):

…there are colors and forms which are more perceptually salient than other stimuli in their domains…salient colors are those areas of the color space previously found to be most exemplary of basic color names in many different languages… and that salient forms are the “good forms” of Gestalt psychology (circle, square, etc.). Such colors and forms more readily attract attention than other stimuli… are more easily remembered than less salient stimuli…

She ultimately found “the salience and memorability of certain areas of the color space…can influence the formation of linguistic categories” (the classic citation for cross-cultural color categorization being Berlin and Kay 1991; see also Gibson et al. 2017). As categories form around salient prototypes, potential members of a category are judged on a graded basis.
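To make “judged on a graded basis” concrete, here is a toy sketch in Python. It is not Rosch’s model or procedure: the function name `typicality`, the exponential decay, the `scale` parameter, and the use of raw RGB distance (which is not a perceptual color metric) are all assumptions chosen for illustration. The only point is that membership becomes a matter of degree once it is measured as closeness to a prototype.

```python
import math
from typing import Dict, Tuple

Color = Tuple[int, int, int]  # an RGB triple; a crude stand-in for a color chip


def typicality(item: Color, prototype: Color, scale: float = 100.0) -> float:
    """Hypothetical graded membership: 1.0 at the prototype, decaying with distance."""
    return math.exp(-math.dist(item, prototype) / scale)


# Assumed prototype for "red" and a few illustrative chips (values are made up).
focal_red: Color = (255, 0, 0)
chips: Dict[str, Color] = {
    "focal red": (255, 0, 0),
    "brick": (178, 34, 34),
    "salmon": (250, 128, 114),
}

# Rank chips from best to worst example of the category.
ranked = sorted(chips, key=lambda name: typicality(chips[name], focal_red), reverse=True)
```

Under these assumptions the chips are ordered from best to worst example of “red,” rather than being sorted into members and non-members.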

In addition to building categories around salient exemplars, Rosch also found, in line with ecological psychology, that such salience relates to the usefulness for, and capacities of, the observer. For example, there tends to be the most cross-cultural agreement as to how any given token is categorized at the “basic level.” That is, although different groups of people may differ in terms of what the prototypical “dog” is (is it a golden retriever or a bulldog?), when people see a dog, any dog, they will probably categorize it at the basic level of “dog,” as opposed to generically as animal or mammal or specifically as a golden retriever-bulldog mix. And it is at this basic level where there is the most interpersonal (and cross-cultural) similarity.

Berkeley and the West Coast Cognitive Revolution

In a previous post, I discussed all the interesting things happening in anthropology and artificial intelligence at UC, San Diego and Stanford during the 1970s and 1980s, and we can add UC, Berkeley to this list of strongholds for West Coast Cognitive Revolutionaries.

Lakoff left MIT for Berkeley in 1972, and shortly thereafter he was confronted with kinds of utterances neither generative semantics nor generative grammar could account for, e.g., “John invited you’ll never guess how many people to the party,” in which a clause splits another clause, sometimes called “center embedding.” Faced with this, Lakoff got an NSF grant to invite people from linguistics, psychology, logic, and artificial intelligence for a summer seminar in 1975, which ballooned into roughly 190 attendees (de Mendoza Ibáñez 1997). Among the lectures were Rosch on basic-level categories and how category prototypes can be represented in motor-systems (the seedling of the embodied mind), Charles Fillmore on “Frame Semantics,” which inspired the cognitive anthropologists, and Leonard Talmy (a recent Berkeley PhD) on how physical embodiment creates universal “cognitive topologies” which map onto words, like “in” and “out.”

So, Lakoff recalls, “in the face of all this evidence, in the summer of 1975, I realized that both transformational grammar and formal logic were hopelessly inadequate and I stopped doing Generative Semantics” (de Mendoza Ibáñez 1997). It is also in 1975 that he published “Hedges: A Study in Meaning Criteria and the Logic of Fuzzy Concepts,” incorporating ideas from Rosch, as well as from another Berkeley professor, Lotfi Zadeh. In this paper Lakoff argued: “For me, some of the most interesting questions are raised by the study of words whose meaning implicitly involves fuzziness- words whose job is to make things fuzzier or less fuzzy. I will refer to such words as ‘hedges’.” In addition to referring to Rosch’s then-unpublished paper “On the Internal Structure of Perceptual and Semantic Categories,” Lakoff acknowledges “Professor Zadeh has been kind enough to discuss this paper with me often and at great length and many of the ideas in it have come from those discussions.”

Zadeh was born in Baku, Azerbaijan, and studied at the University of Tehran before completing his master’s at MIT and his doctorate in electrical engineering at Columbia University in 1949. He eventually landed at UC, Berkeley in 1959, where he slowly began to develop “fuzzy” methods. In 1965 he published the paradigm-shifting piece, “Fuzzy Sets,” which he began writing during the summer of ‘64 while working at Rand Corporation, and which also exists as the report “Abstraction and Pattern Classification.” In essence, Zadeh realized that many objects in the world did not have boundaries clear enough to allow discrete classification, but rather allowed for graded membership (he used the example of “tall man” and “very tall man”). He then demonstrated that classical “crisp” set theory was simply a special case of “fuzzy” set theory.
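A minimal sketch of these ideas in Python may help. The union, intersection, and complement below use the max, min, and one-minus operators from “Fuzzy Sets.” Everything else is an assumption made for illustration: the function names, the people and their heights, and especially the 165 cm and 190 cm thresholds in `tall`. Treating “very” as squaring the membership grade follows the convention in Zadeh’s later work on hedges; it is one proposal, not a settled fact about how people use the word.

```python
from typing import Dict, Iterable, Set

FuzzySet = Dict[str, float]  # maps each element to a membership grade in [0, 1]


def tall(height_cm: float, low: float = 165.0, high: float = 190.0) -> float:
    """Hypothetical grade for 'tall man': 0 below `low`, 1 above `high`, linear between."""
    if height_cm <= low:
        return 0.0
    if height_cm >= high:
        return 1.0
    return (height_cm - low) / (high - low)


def very(grade: float) -> float:
    """'very' as concentration: squaring lowers every grade strictly between 0 and 1."""
    return grade ** 2


def fuzzy_union(a: FuzzySet, b: FuzzySet) -> FuzzySet:
    return {x: max(a.get(x, 0.0), b.get(x, 0.0)) for x in a.keys() | b.keys()}


def fuzzy_intersection(a: FuzzySet, b: FuzzySet) -> FuzzySet:
    return {x: min(a.get(x, 0.0), b.get(x, 0.0)) for x in a.keys() | b.keys()}


def fuzzy_complement(a: FuzzySet, universe: Iterable[str]) -> FuzzySet:
    return {x: 1.0 - a.get(x, 0.0) for x in universe}


def members(a: FuzzySet) -> Set[str]:
    """Elements with a nonzero grade; for a crisp set these are simply its members."""
    return {x for x, grade in a.items() if grade > 0.0}


# Made-up heights in centimeters.
heights = {"Al": 160.0, "Bo": 177.5, "Cy": 195.0}
tall_men = {name: tall(h) for name, h in heights.items()}
very_tall_men = {name: very(g) for name, g in tall_men.items()}

# Crisp sets are fuzzy sets whose grades are only ever 0 or 1.
crisp_a = {"Al": 1.0, "Bo": 0.0, "Cy": 1.0}
crisp_b = {"Al": 0.0, "Bo": 0.0, "Cy": 1.0}
```

With these assumed thresholds, Bo is “tall” to degree 0.5 and “very tall” to degree 0.25. When grades are restricted to 0 and 1, as in `crisp_a` and `crisp_b`, the same three operators reproduce ordinary set union, intersection, and complement, which is the sense in which crisp sets are a special case.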

Zadeh would quickly expand the notion of fuzzy methods into a plethora of subfields, including information systems and computer science, but also linguistics beginning in the 1970s, an early example being “A Fuzzy-Set-Theoretic Interpretation of Linguistic Hedges.” However, whether fuzzy logic explains the normal process of human categorization (i.e., whether humans are actually following the procedures of fuzzy logic in the task of categorizing) continues to be a debated topic. Rosch (e.g., Rosch 1999), in particular, is skeptical, precisely because the process of categorizing is not about applying decontextualized “rules.” Rather, as Mike argued in his recent post, we can think of categorizing as more like finding than seeking.



