PL Notation is a Barrier to Entry

Once upon a time in grad school, as we chatted and opened our lunches before the weekly Penn PL Club seminar, someone was sharing a photograph of a forearm with a tattoo of the Y combinator. Benjamin Pierce looked at it for a moment, thought, and said, “That’s not the font I would have chosen”.

The programming languages literature is intensely notated, marked up, typeset. We care a lot about symbols. Good notations enable clear thinking. With the right notation, it’s easier to do the work in the first place; with the right notation, inessential parts fall away and you can clearly see what the author is trying to convey. But there is a dark side to a heavy reliance on notation. A bad choice of notation slows exploration for the author and hinders understanding in the reader. Even simple arithmetic expressions like 8÷2(2+2) stump people of all stripes. When Mike Hicks wrote about increasing the impact of PL research, he cited notation as a barrier to understanding PL results.

PL research will have a broader audience with simpler notation. I suggest (a) standardizing the most common notations, (b) using single-word notations when possible, and (c) devising clear notations with audience accessibility in mind. Making simple changes to notation that don’t compromise on content will let PL researchers reach a broader, more diverse audience.

Notation in PL

Examples of PL notation abound: Backus-Naur Form quickly specifies syntax; a list of inference rules concisely defines a relation as its least fixed point; a turnstile and a colon instantly mean “typing”; Strachey brackets make meanings. The best notations are concise; they are visually suggestive or work by analogy to some better-known notation. In a good notation, the symbols themselves suggest the relationship: say, the ⟶ of small-step semantics (suggesting “left goes to right”), or the <: of subtyping (analogy to familiar preorders, where the : recalls its use in types). But what’s ‘good’ about <: is contextual: to a reader who’s seen typing judgments but not subtyping, the connection may be easy enough; but to a Java programmer, it may not be immediately clear that <: has the same meaning as extends!

Even good notations are barriers to entry for newcomers, who must refer to textbooks or blogposts or videos. But at its worst, PL notation obfuscates: a page of symbols is an opaque barrier to entry. Everyone has their favorite examples; I won’t point any fingers here. Combinators like S, K, I, and Y are a good, old example. The I combinator at least recollects the identity function. And when people first see the Y combinator, “why” is a common response. (Then again, so is, “What the hell?”) These days, only Y sees real action. If you’re going to use Y in your paper, why not name it fix, to suggest the fixpoint it will actually calculate, or rec to suggest recursion?
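To make the contrast concrete, here is a minimal sketch in Python. The names fact_step and factorial are made up for illustration, and because Python evaluates arguments eagerly, the combinator below is the eta-expanded, call-by-value variant of Y (sometimes called Z), not the textbook Y itself. Both definitions compute the same thing; only one of them tells the reader what it is for.

```python
# The call-by-value variant of the Y combinator, under its traditional terse name.
Y = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))


def fix(f):
    """Return a fixed point of f: a function g that behaves like f(g)."""
    def g(arg):
        # Unfold one step of the recursion on demand.
        return f(g)(arg)
    return g


# One "step" of factorial, with the recursive call abstracted out as `self`.
# (fact_step is a hypothetical example, not taken from the text.)
fact_step = lambda self: lambda n: 1 if n == 0 else n * self(n - 1)

factorial = fix(fact_step)
print(factorial(5), Y(fact_step)(5))
```

A reader who has never met a fixed-point combinator can still guess what fix(fact_step) is meant to produce; Y(fact_step) offers no such hint.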

Staying in Budget

Any artifact for human consumption—whether it’s a type theory or a language or an API—has a complexity budget. A paper that exceeds its complexity budget is hard for readers to understand. Each notation or convention you define increases the complexity the reader must contend with. Whatever novelty is in your work will take up the lion’s share of your complexity budget… and as the paper’s complexity goes up, its readership and impact will go down.

Before you introduce a new notation, ask yourself the following questions:

1. Can you use or adapt an existing, well known notation?
2. Is there a simple, single-word name you could use instead?
3. Is the notation ‘good’? That is:
   1. Are its parts in a natural order?
   2. Does your notation allude to some preexisting idea? If not… why not? Will a reader know how to pronounce it, or will they just say the LaTeX commands out loud to themselves (or, as Don Knuth suggests, “blah” or some other grunting noise)?
   3. Who will understand the allusion that you’re making? Subfield experts? PL researchers? Researchers in some other discipline? PL practitioners? General software developers?

If you can say ‘yes’ to (1), great: why invent when you can reuse? “Standard” knowledge is a slippery concept, but as researchers we have much to gain by communicating our work to the broadest audience possible. How much of your work can you make accessible to a first-year graduate student? What about a second-year undergraduate? A high schooler? In his talk “It’s Time for a New Old Language” (slides; paper), Guy Steele identifies 28 different notations for substitution that have been used at POPL… but e[v/x] (meaning the expression e with v substituted for any free occurrence of x) was far and away the most common. Let’s… just use that.
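For readers who have not met e[v/x] before, a small sketch may help pin down what it means. The following Python is an illustration only: the term representation (Var, Lam, App) and the function name subst are assumptions introduced here, and the sketch assumes v is closed (has no free variables), so it does not need to rename bound variables to avoid capture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


def subst(e, v, x):
    """Compute e[v/x]: e with v substituted for each free occurrence of x.

    Assumes v is closed, so no variable capture can occur.
    """
    if isinstance(e, Var):
        return v if e.name == x else e
    if isinstance(e, Lam):
        if e.param == x:
            # x is rebound here, so occurrences in the body are not free.
            return e
        return Lam(e.param, subst(e.body, v, x))
    if isinstance(e, App):
        return App(subst(e.fn, v, x), subst(e.arg, v, x))
    raise TypeError(f"not a term: {e!r}")


identity = Lam("z", Var("z"))
# The first x is free and gets replaced; the second is bound by its own lambda.
term = App(Var("x"), Lam("x", Var("x")))
result = subst(term, identity, "x")
```

Note the argument order: subst(e, v, x) reads left to right in the same order as e[v/x], which is one small way a named function can echo the common notation.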

If there isn’t a standard notation, that is, if you can’t say ‘yes’ to (1), try to say ‘yes’ to (2). We should, as a default, use words and not notation. Use short, suggestive English names (or intentionally French, or whatever). Named functions are mildly verbose compared to other notations, and some might complain that they look less PL-ish than mathematical notation. I agree, and that’s part of the idea! Our work’s aesthetics are important, without a doubt, but PL already has a reputation for abstruse notation. There are few papers that wouldn’t be improved by changing one or two notations into words.

If you say ‘no’ to (2), try to say ‘yes’ to (3), with as broad a (3.3) as you can manage. If you can broaden your audience without compromising on your content… why not?

Case Study: Casts

As an illustration of the benefits of a reader-conscious use of notation, consider that the literature on gradual typing and contracts has at least three different basic notations for casts: <S=>T> e and <T<=S> e and e : S=>T all represent an expression e of type S being cast to type T; all of them have been baroquely decorated with various superscripts and metadata above the arrow. On the one hand, let a thousand flowers bloom. On the other hand, this variation has certainly slowed me down, and I can only imagine it slows down newcomers to the area.

Could we use something “standard”? There is a “standard” idea of casts going back to C’s notation from the 70s, where (T)e casts e to T. Such a notation ends up being unhelpful in the higher-order setting, where the operational semantics needs the source type to properly implement contravariance. Granted, these are technical minutiae: no one has seriously proposed these casts as part of a surface language.
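Why the source type matters can be seen in a small sketch. The Python below is an illustration under stated assumptions, not any particular paper’s semantics: Fun, CastError, and cast are names invented here, base types are modeled as Python classes checked with isinstance, and object stands in for the dynamic type. It follows the usual function-wrapping rule, in which a cast between function types casts the argument in the opposite direction (from the target’s domain to the source’s domain) before calling the original function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fun:
    """A function type dom -> cod (hypothetical representation)."""
    dom: object
    cod: object


class CastError(Exception):
    pass


def cast(src, tgt, e):
    """Cast e from type src to type tgt, checking lazily at function types."""
    if isinstance(src, Fun) and isinstance(tgt, Fun):
        def wrapped(x):
            # Contravariant: the argument goes from tgt.dom back to src.dom.
            # Without src we would not know what the original function expects.
            arg = cast(tgt.dom, src.dom, x)
            # Covariant: the result goes from src.cod to tgt.cod.
            return cast(src.cod, tgt.cod, e(arg))
        return wrapped
    if isinstance(tgt, Fun):
        raise CastError("this sketch does not cast base types to function types")
    if not isinstance(e, tgt):
        raise CastError(f"{e!r} is not a {tgt.__name__}")
    return e


succ = lambda n: n + 1
# View an int -> int function at the dynamic type object -> object.
loose = cast(Fun(int, int), Fun(object, object), succ)
```

Here loose(3) succeeds, while loose("three") raises CastError at the argument check. That check is only possible because the cast remembers the source domain int; a C-style (T)e, which records the target type alone, has nothing to check the argument against.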