Code only says what it does


My name is Marc Brooker. I've been writing code, reading code, and living vicariously through computers for as long as I can remember. I like to build things that work. I also dabble in brewing, cooking and skiing.

I'm currently an engineer at Amazon Web Services (AWS) in Seattle, where I lead engineering on AWS Lambda and our other serverless products. Before that, I worked on EC2 and EBS. All opinions are my own. I'm @MarcJBrooker on Twitter.

Only loosely related to what it should do.

Code says what it does. That's important for the computer, because code is the way that we ask the computer to do something. It's OK for humans, as long as we never have to modify or debug the code. As soon as we do, we have a problem. Fundamentally, debugging is an exercise in changing what a program does to match what it should do. It requires us to know what a program should do, which isn't captured in the code. Sometimes that's easy: What it does is crash, what it should do is not crash. Outside those trivial cases, discovering intent is harder.

Debugging when "should do" is subtle, such as when building distributed systems protocols, is especially difficult. In our Millions of Tiny Databases paper, we say:

Our code reviews, simworld tests, and design meetings frequently referred back to the TLA+ models of our protocols to resolve ambiguities in Java code or written communication.

The problem is that the implementation (in Physalia's case the Java code) is both an imperfect implementation of the protocol, and an overly-specific implementation of the protocol. It's overly-specific because it needs to be fully specified. Computers demand that, and no less, while the protocol itself has some leeway and wiggle room. It's also overly-specific because it has to address things like low-level performance concerns that the specification can't be bothered with.

Are those values in an ArrayList because order is actually important, or because O(1) random seeks are important, or some other reason? Was it just the easiest thing to write? What happens when I change it?
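A minimal sketch of that ambiguity, using invented names (ReplicaSelection, healthyReplicas, and preferred are hypothetical and not taken from Physalia or any real system). The code states that the result is an ArrayList, but nothing in it says whether the ordering is part of the contract:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical example: none of these names come from a real system.
class ReplicaSelection {
    // Returns the replicas not listed as unhealthy, in input order.
    // The code says "ArrayList", but not why. Is insertion order part of
    // the contract, or was a list simply the easiest thing to write?
    static List<String> healthyReplicas(List<String> replicas, Set<String> unhealthy) {
        List<String> result = new ArrayList<>();
        for (String replica : replicas) {
            if (!unhealthy.contains(replica)) {
                result.add(replica);
            }
        }
        return result;
    }

    // This caller quietly depends on the order: it treats the first healthy
    // replica as the preferred one. (It also assumes the list is non-empty.)
    static String preferred(List<String> healthy) {
        return healthy.get(0);
    }

    public static void main(String[] args) {
        List<String> healthy = healthyReplicas(List.of("a", "b", "c"), Set.of("a"));
        System.out.println(healthy);            // prints [b, c]
        System.out.println(preferred(healthy)); // prints b
    }
}
```

Swapping the ArrayList for a HashSet would still compile in healthyReplicas once the return type changed, and whether that change is safe depends entirely on whether callers like preferred exist. The code alone cannot say whether they should.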

Business logic code, while lacking the cachet of distributed protocols, has even more of these kinds of problems. Code both over-specifies the business logic, and specifies it inaccurately. I was prompted to write this by a tweet from @mcclure111 where she hits the nail on the head:

Since most software doesn't have a formal spec, most software "is what it does", there's an incredible pressure to respect authorial intent when editing someone else's code. You don't know which quirks are load-bearing.

This is a major problem with code: You don't know which quirks are load-bearing. You may remember, or be able to guess, or be able to puzzle it out from first principles, or not care, but all of those things are slow and error-prone. What can we do about it?
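To make the problem concrete, here is a hypothetical piece of business logic (InvoiceTotals, its methods, and the discount rule are invented for this sketch). The integer division truncates, and the code gives no hint whether that truncation is an agreed rule or an accident:

```java
// Hypothetical example: the class and the discount rule are illustrative only.
class InvoiceTotals {
    // Integer division truncates toward zero, so a 15% discount on 999 cents
    // is 149 cents, not the 150 you would get by rounding 149.85.
    // Is that truncation load-bearing? The code does not say.
    static long discountCents(long subtotalCents, int discountPercent) {
        return subtotalCents * discountPercent / 100;
    }

    // Total after applying the (truncated) discount.
    static long totalCents(long subtotalCents, int discountPercent) {
        return subtotalCents - discountCents(subtotalCents, discountPercent);
    }

    public static void main(String[] args) {
        System.out.println(discountCents(999, 15)); // prints 149
        System.out.println(totalCents(999, 15));    // prints 850
    }
}
```

Someone "fixing" the rounding would change the total from 850 to 849 cents. Whether that is a bug fix or a regression depends on an intent that lives somewhere other than the code.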

Design Documentation

Documentation is uncool. Most software engineers seem to come out of school thinking that documentation is below them (tech writer work), or some weird thing their SE professor talked about that is as archaic as Fortran. Part of this is understandable. My own software engineering courses emphasized painstakingly documenting the implementation in UML. No other mention of documentation was made. Re-writing software in UML helps basically nobody. I finished my degree thinking that documentation was unnecessary busywork. Even the Agile Manifesto agreed with me[1]:

Working software over comprehensive documentation

What I discovered later was that design documentation, which encodes the intent and the decisions made while developing a system, helps teams be successful in the short term, and people be successful in the long term. Freed from fitting everything in my head, emboldened by the confidence that I could rediscover forgotten facts later, I could move faster. The same applies to teams.

One thing I see successful teams doing is documenting not only the what and why behind their designs, but also how they decided. When it comes time to make changes to the system, either for debugging or in response to changing requirements, these documents are invaluable. It's hard to decide whether it's safe to change something when you don't know why it's like that in the first place. The record of how you decided is important because you are a flawed human, and understanding how you came to a decision is useful when that decision seems strange or surprising.