prettier syntax

Jun 4, 2022

(This is not my usual fare for this particular outlet; I’m assuming some familiarity with the mechanics of generative syntax, and use a lot of jargon that’s particular to minimalist syntax notation. Please feel free to skip it, and check out my more accessible posts elsewhere on the blog!)

Let’s talk about parsimony, which I find soothing to think about.

Parsimony means, like, lovely elegant simplicity. I think. That’s what I use the word to mean. The specific context in which I learned the word parsimony was in trying to navigate whether “this one is more elegant” is an allowable move in arguing for one formalization over another, where both formalizations account equally for the data we have on hand. Parsimony can be a legal move in syntax chess if you are playing one of the variations of the game where elegant simplicity is a built-in desideratum of the theoretical framework you’re trying to work on.

Doing syntactic theory is a little like trying to build a car as you’re driving it, with several other mechanics, who were trained in very different ways than you were, but all of you have your own personal ideas of what a good car is and does, and also, who’s driving right now? You can’t stop, the car has to keep going. And wanting the car to run well sometimes means we want it to go fast, or sometimes means we want it to be fuel efficient, or sometimes means we want a quiet and smooth ride. Lots of different goals all in this together. Let’s say, for the sake of this metaphor, that parsimony is approximately analogous to wanting a car that’s quiet and a smooth ride. As a parsimony-liker, I will happily be the first to admit that a smooth ride should not override fuel efficiency or safety or low-emissions goals. But it would be nice, and I think parsimonious building does make the car objectively better.

In syntax terms: it is absolutely not acceptable to me (a parsimony-liker) to sacrifice descriptive adequacy in favor of simpler-looking trees, or fewer variations (or restrictions) on an operation (like Merge). But theoretical syntax is attempting to do two things at once, circularly: we’re trying to model a formal system (make a math language that generates structures, only in the ways we want it to — we want to generate grammatical structures and not generate ungrammatical structures), but also approximate a formal model that could be a theory of cognition. Sometimes the “formal model drives real smooth” goals get into conflict with the “cognitive theory is sufficiently granular or empirically testable” goals.

What got me thinking about this (I mean, I’m always thinking about this) was that I casually used the phrase “zero copula” while talking to my friend about Russian and AAE. “Zero copula” or null copula or copula-drop are all terms that assume (1) below is a default, and (2) is a variant form:

(1) He is friendly

(2) He friendly

In (1), there is an overt copula (“is,” an inflected form of “be”). (1) is grammatical for most users of standardized varieties of English, as I understand it. (2) is grammatical for most users of AAE. Why do we call it copula drop or null copula instead of copula insertion or overt copula? Part (maybe a very large part) of the answer is simply that a lot of the securely employed American linguists in academic positions for a very long time have been white dudes. On some level a lot of these white dudes have thought of AAE (unconsciously, if I’m being charitable) as a variant form of “core” English. They’ll (rightly) espouse that it’s linguistically valid, morally neutral, part of the natural and organic language variation that we linguists like to look at, but I really do think they see (1) as unmarked and (2) as marked.

Even if they don’t think that, our academic jargon terms certainly imply that (1) is unmarked and (2) is marked. If our academic jargon terms imply that, or if we (white linguists) unconsciously think that, to what extent are we (unconsciously or not) baking that assumption into our theory? Recall, again, that we have two goals: modeling a formal system that works according to its own internal logic (call this Formal Goal) and converging on a formal system that is an accurate model of what’s actually happening in human brains (call this Cognition Goal).

For Formal Goal: is it more parsimonious to assume that copulas are things that are there, and sometimes get spelled out as silent? Or to assume that pronounced copulas are the spelled-out pronunciation of a thing that is a confluence of other elements, that silence is the default but some varieties are obligated to overtly mark it?

For Cognition Goal: is it a more realistic model of cognition to assume that zero-copula languages have a “be” there, and then delete it? Or that overt-copula languages have a predicate structure there, and then insert a “be” on top of it?

I don’t actually have an answer for this right now; I’m not actually asking for your answer, either, because I’m posing these questions as a way of demonstrating what I believe the power is in using parsimony as one of the several factors by which we can make decisions about what to build onto the car as we’re driving it.

It’s also important for me to acknowledge that parsimony or elegance is a subjective measure. One syntactician might argue that Mechanism X is unparsimonious, or not very Minimal, while another might argue that Specifically Making The Point Of Ruling Out Mechanism X is unparsimonious or unminimal. I am thinking now of (sideways) multidominance, which I like to operationalize as Parallel Merge (Citko 2005); is it more Minimal to restrict the possible items that Merge can act on to only stuff that hasn’t been Merged already, or is it more Minimal to let Merge be pretty much unrestricted in the syntax proper? (I personally come down on the latter side, if you’re curious.)
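For readers who think better in code, here is a toy sketch of that trade-off. It assumes Merge is modeled as bare set formation and that "hasn’t been Merged already" means "is not contained in any other object in the workspace." The function names (`merge`, `restricted_merge`, `mothers`) and the `alpha`/`beta`/`gamma` items are hypothetical labels for illustration, not anyone’s official formalization.

```python
def merge(a, b):
    """Unrestricted Merge, modeled as plain set formation: {a, b}."""
    return frozenset({a, b})

def contains(whole, part):
    """True if `part` sits anywhere inside `whole` (proper containment)."""
    if isinstance(whole, str):  # lexical items have no subparts
        return False
    return any(d == part or contains(d, part) for d in whole)

def is_root(item, workspace):
    """An item is a root if no other object in the workspace contains it."""
    return not any(contains(other, item) for other in workspace)

def restricted_merge(a, b, workspace):
    """Merge plus an extra stipulation: both inputs must be unmerged roots."""
    if not (is_root(a, workspace) and is_root(b, workspace)):
        raise ValueError("restricted Merge only combines unmerged objects")
    return merge(a, b)

def mothers(item, objects):
    """Every object in `objects` that immediately contains `item`."""
    return [o for o in objects if not isinstance(o, str) and item in o]

# Step 1: ordinary Merge of alpha and beta.
ab = merge("alpha", "beta")
workspace = [ab, "gamma"]

# Step 2: unrestricted Merge lets gamma combine with beta,
# even though beta is already inside {alpha, beta}.
gb = merge("gamma", "beta")

# beta now has two mothers: this is the multidominant configuration.
shared = mothers("beta", [ab, gb])
```

Calling `restricted_merge("gamma", "beta", workspace)` raises a `ValueError` instead, because `beta` is no longer a root. Notice where the extra lines of code went: the unrestricted version is one line, and the restriction is something you have to add on top. Whether that counts as more or less Minimal is exactly the judgment call.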

Another way of thinking about this is that there is going to be complexity, and we have to put the complexity somewhere. This comes up every time I teach Syntax 1, and have to coax a fresh set of students into deciding whether they want to organically converge towards X Bar theory. Do they want to put the complexity in the rules but have simpler trees, or do they want to put the complexity in the trees but have a very minimal set of rules? You have to make a judgment call. So far, all of my classes have opted to put the complexity in the trees so as to buy simpler rules; I don’t know if that is influenced by their belief that we (as a class) are obligated to converge on the same theory as the textbook (or consensus theory from the 80s and 90s), or whether they genuinely feel in their hearts that it is better to have taller trees generated by fewer rules.

(Keep in mind, by the way, especially any of you non-syntacticians reading this, if you’ve gotten this far: the trees are not “real,” they are just a picture of a set of hypotheses generated by a theory. They are a formal convenience because a tree is easier to read than a huge list of pair-lists; there is not a “correct” or “incorrect” syntax tree, a syntax tree is a proposed analysis of how the pieces might be fit together in a formal system. Taller trees just means the diagram of the thing got bigger, not necessarily that the thing it’s a diagram of got bigger. Think of it like the “exploded diagram” of an internal combustion engine: we moved the complexity in the drawing to make it easier to see more of the stuff of the thing we’re trying to represent.)

a black and white schematic drawing of an exploded view of an engine — Source: Wikimedia Commons
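To make the "a tree is just an easier-to-read list of pairs" point concrete, here is a minimal sketch. It assumes a made-up toy tree with abstract, unique node labels (`XP`, `YP`, `X'`, `ZP` are placeholders, not an analysis of any real sentence) and shows that the nested drawing and a flat list of (mother, daughter) pairs carry the same information.

```python
# A toy tree as nested (label, [children]) tuples; leaves are plain strings.
# Labels are assumed to be unique so the round trip below is well defined.
tree = ("XP", [("YP", ["y"]), ("X'", ["x", ("ZP", ["z"])])])

def to_pairs(node):
    """Flatten a nested tree into a list of (mother, daughter) pairs."""
    if isinstance(node, str):  # a leaf has no daughters
        return []
    label, children = node
    pairs = []
    for child in children:
        child_label = child if isinstance(child, str) else child[0]
        pairs.append((label, child_label))
        pairs.extend(to_pairs(child))
    return pairs

def from_pairs(pairs, root):
    """Rebuild the nested tree from the flat pair list, starting at `root`."""
    children = [d for m, d in pairs if m == root]
    if not children:  # nothing is listed under it, so it is a leaf
        return root
    return (root, [from_pairs(pairs, c) for c in children])

pairs = to_pairs(tree)
rebuilt = from_pairs(pairs, "XP")
```

Going from `tree` to `pairs` and back gives you the same structure. Nothing was added or lost; only the picture changed.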

Anyways, back to parsimony, I guess. I think sometimes syntacticians get some flak for talking about an analysis being “pretty.” A “pretty” (or elegant, or parsimonious) analysis is, we assume, one which first meets certain minimum criteria (an engine that runs, doors that don’t fall off, a theory that can account for any data we currently have or can get our hands on) and then, in addition to that, has other desirable features: a good suspension system for a smooth ride, tuned finely for fuel efficiency, minimal wind-resistance that keeps the drive both quiet and more efficient, a formal analysis that hypothesizes the absolute minimum number of operations needed in the actual human brain to just barely get the job done. Arguments in favor of minimality (in the minimalist syntax framework) are an attempt to try and keep things light, efficient, and save us from having to propose that copula verbs are inherently “cognitively real” or some shit; we don’t want to have to do that! Parsimony as a guiding principle is that attempt, with (again) all of us trying from slightly different angles to make a formal system where we don’t have to claim anything about cognition that we can’t empirically verify.



Written by Kirby Conrod