Recently I was reading an old blog post I came across about productivity variations among software developers, where he says that “researchers have found 10-fold differences in productivity and quality between different programmers with the same levels of experience and also between different teams working within the same industries“. My immediate instinct as I was reading it was that it makes perfect sense, and it’s supported by my own experience, and from what I hear from others in the software industry.
But at the end of the post, he talks about a study which compares productivity of the teams for Excel and Lotus, and makes between their productivity in lines of code:
Excel’s team produced about 13,000 lines of code per staff year. Lotus’s team produced 1,500 lines of code per staff year.
It’s interesting the huge difference in lines of code per staff-year, but is that really a measure of productivity? It’s not clear from the post whether a high LOC-per-year is considered high or low productivity, but it seems to mean high productivity in this context. If this is so, then I am probably the least productive person in my team. I make an effort to write as few lines of code as possible.
In fact for the last few months my average LOC per day is probably negative, if you count deletion of lines as negative lines produced1. That sounds terrible, but it’s because I not only try to write as few lines as possible, but I also simplify other people’s verbose code as I go through it. It’s not uncommon for me to rewrite existing code using 5 or 10 times fewer lines (typically simpler lines).
Let me clarify though. Writing fewer lines of code is not my goal, it’s a side effect. What I try to do is write simpler code. Code that’s easier to reason about. Code that’s more pleasant to maintain. Code that has fewer bugs just because it’s simpler. What generally results from this is fewer lines of code. I do consider fewer lines of code to be a goal, but only in service of readability.
Another reason why fewer lines of code can be a good sign, is when you consider how much leverage each line of code has. I don’t know if leverage is the right word here, so let me explain. What I mean is simply a small piece of code that has a big effect, because of some amplification mechanism. For example, a bad coder might repeat almost the same line of code 20 times, while a good coder will instead write a for-loop which amplifies his single line of code to cover 20 cases2. A good programmer will use abstractions like this (and polymorphism, generics, frameworks, code generators, DSLs, etc) to amplify small pieces of code to great effects. A good programmer is not one who can write 1000 lines of code, but one who knows how to write 1 line of code in just the right way that it can be reused 1000 times. If he continues to write another 999 lines of this kind of code, he will have produced 1000 times more than the man who wrote the 1000 lines which had only one use34.
This sort of leverage is not linear, because good software engineers can leverage the leverage. They can abstract the abstractions. Any time you find yourself repeating something, you write code to do it for you. The more abstractly you can recognize these patterns, the more generally the solutions will apply and the less code you’ll need to write to get things done (although you have to be careful to not make code more complicated in all the abstraction).
I do understand that getting a quantitative measurement of programming productivity can be incredibly difficult. If it was easy, I could just give potential employers my objective productivity “score” in various areas of expertise, and they could simply hire me according to how this fits their objectives. It would also make it easier to pay software engineers what they’re individually worth. But I think “number of lines of code” is terrible variable to proxy for productivity. Especially when you look at the extremes of the spectrum, where people are really productive or where people get very little done.
This doesn’t change the fact that, at least for me, a better solution is one with fewer lines of code. Every line of code you don’t write is valuable, and often this value outweighs the cost of planning how to not write that line.
If you didn’t count deletion as negative, then the best way to be productive is simply to delete and rewrite the same function every day for the rest of your life ↩
You might think this is a bad example because nobody would write the non-for-loop code, but it’s exactly the situation I found just the other day ↩
It’s difficult to qualify the word “use”, since this doesn’t need to refer to the programmer himself reusing the code, but could also just refer to the number of times it’s executed, like in the example of the for-loop ↩
You might think 1000-times reusability is ridiculous, but there are number of places where I’ve achieved this level of “amplification”. They seem to normally manifest themselves as frameworks ↩