The pain of big code reviews
Some ranting on code reviews in general:
Bugzilla's bug 185090 is nearing completion, and I certainly hope I've finished my part by finally granting review to the 70 kbyte patch. To the no less than 13th iteration of the patch. For those of you not familiar with the Mozilla family review process, a short recap: Every change to the codebase must be reviewed by at least one designated code reviewer. The review is usually conducted by looking at the cvs diff ("patch"), pointing out issues and discarding the patch until its clean.
The patch I finally r+'d was far from the biggest in Bugzilla (or even my) history. Nor was 13 iterations an extreme amount if we compare to the general group of patches for major enhancements. Yet still, doing line-by-line review for 70 k of code several times in a row is hell on earth. It's always easy to come up with lots of comments for the first patch, but after several iterations most people (including me on most days) simply cannot focus enough to effectively review the code as a whole time after time again. You just become numb.
Reviewing interdiffs ("patches of patches") doesn't work very well for larger changes except when the interdiff is trivial. It's extremely easy to miss issues that way, and reviewing code readability in context is hard if not impossible. So in the end, the only way to effectively review is by looking at the patch, testing, and looking more at the patch.
Of course, there's no perfect solution. But having done quite a many fairly big (100+k) code reviews both for Bugzilla and my former employer, I'm pretty certain there are very few features that require such amount of new logic in one patch. So once again, the key is small iterative changes . Apart from find/replace changes such as renaming identifiers, almost every change can be split into a few easily reviewable patches (10-20 k is pretty nice).
But if so, why don't people split their patches more eagerly? For many development cultures, I think it's a matter of false beliefs about time. Getting a review from your collegue (at work) or someone from the dev group (for Bugzilla/Mozilla/etc. development) can take days, weeks at worst. "Well, in that case it's most effective to have them review as much as possible in a single run, right?" No.
Long review queues are, to a large extent, caused by the fact that a thorough review of a 100k patch takes at least a couple of hours. Since you can't allocate that big a time slice easily, you tend to slip on reviewing. But it's usually far easier to find time for half a dozen 30 minute sessions than to allocate a single 2-hour slot – and it's very much easier to go through half a dozen 20k patches than a single 100k one. Also, let me remind you that the amount of iterations required for a positive review rises very sharply in relation to the patch size.
To play some number games, assume that a reviewer will be able to go through five 20k patches or one 100k patch in a week. Assume that a feature can be implemented either in a single 100k giant or iteratively in seven 20k patches. And finally, assume that it takes 6 iterations for the 100k patch to be ready, while a 20k patch can be checked in after getting three rounds of review. Well, getting 6 reviews for the 100k patch takes 6 weeks. Getting 3 reviews for seven 20k patches each takes just a bit over four weeks (3×7/5)!
The example above is pretty conservative. In practice, I'd pick _ten_ 20k patches anytime instead of a single 100k lump. Also, the 3-6 balance in the iterations required is unrealistic: the difference is usually more drastic. But I guess this shows the point. Next time somebody asks for review on a big patch, my first aim is to find a reason I can deny review based on a "You can do this in smaller steps" type argument.
Ps. At this point, it's fair to admit I didn't come up with a decent way to split the patch that started all this ranting. But for many megabugs the answer is pretty apparent, and those are the ones that should get hacked into mincemeat.
August 4, 2004
В· Jouni Heikniemi В· No Comments
Posted in: Misc. programming
Leave a Reply