On Sat, Sep 1, 2018 at 1:03 AM Jeff King <peff@xxxxxxxx> wrote:
>
> On Sat, Sep 01, 2018 at 03:48:13AM -0400, Jeff King wrote:
>
> > Commit 6a1e32d532 (pack-objects: reuse on-disk deltas for
> > thin "have" objects, 2018-08-21) taught pack-objects a new
> > optimization trick. Since this wasn't meant to change
> > user-visible behavior, but only produce smaller packs more
> > quickly, testing focused on t/perf/p5311.
> >
> > However, since people don't run perf tests very often, we
> > should make sure that the feature is exercised in the
> > regular test suite. This patch does so.
>
> This, by the way, is the crux of how such an obvious and severe bug made
> it to 'next'.
>
> The original series was tested quite extensively via t/perf and in
> production at GitHub. When I re-rolled v2, the only change was the
> addition of the assertion, so I didn't bother re-doing the perf tests,
> since they're slow and there wouldn't be a measurable impact.
>
> I did run the normal test suite (as I'm sure Junio did, too) as a
> double-check for correctness, but as we noticed, the code wasn't
> actually exercised there.
>
> Nor had I yet backported the revised series to the version we run at
> GitHub, so it hadn't been run there, either.
>
> And all of that coupled with the fact that it only triggers with
> bitmaps, so day-to-day use of the buggy Git (like Junio trying to push
> out the result ;) ) wouldn't show it.
>
> Anyway. Not that exciting, and kind of obviously dumb in retrospect. But
> I think it was worth analyzing to see what went wrong. If there's an
> immediate lesson, it is probably: add tests even for changes that aren't
> really user-visible to make sure the code is exercised.

Yeah, maybe we need to ask for more tests in the 'real' test suite,
and not just in some special corner (such as t/perf/ or any of the
environment variable proposals nearby).

I wonder if we can make use of git.git in the test suite for
similar things. After reading the thread about
"index corruption with git commit -p" [1], it struck me that
we only have toy examples in the test suite. Toy examples show
that a new feature barely works; they don't show that it works
at scale.

[1] https://public-inbox.org/git/20180901214157.hxlqmbz3fds7hsdl@ltop.local/

> There may be a larger lesson about tracking code coverage, but I don't
> know that most general code coverage tools would have helped (any
> overall percentage number would be too large to move). A tool that
> looked at the diff and said "of the N lines you added/touched, this
> percent is exercised in the test suite" might have been useful.

From some offline discussion, maybe we want to adopt a philosophy of

  Each patch needs to add a test that fails when the patch is
  not applied, but succeeds when it is applied. This shows
  that at least _some_ code in the patch is exercised.

(and automatically/strongly enforce this going forwards; however,
enforcing such a strict thing is hard, and I am not sure how we'd
do it.)

Stefan
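
P.S. To make the "of the N lines you added, this percent is exercised"
idea a bit more concrete, here is a rough sketch in Python. It is not an
existing tool, and every name in it (added_lines, diff_coverage,
SAMPLE_DIFF) is made up for illustration. It assumes the input is
"git diff" output and that per-file sets of executed line numbers are
already available from some coverage run (for C code that would
presumably come from gcov, which the sketch does not parse).

```python
import re

# Hunk header, e.g. "@@ -10,3 +10,5 @@"; we only need the post-image start line.
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def added_lines(diff_text):
    """Map each post-image path to the set of line numbers the diff adds.

    Assumes "git diff" style input, where every file section starts
    with a "diff --git" line.
    """
    added = {}
    path = None
    lineno = 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            # New file section: forget the previous file and hunk.
            path, in_hunk = None, False
            continue
        if not in_hunk and line.startswith("+++ "):
            # "+++ b/path" names the post-image; "/dev/null" means deletion.
            target = line[4:].strip()
            path = target[2:] if target.startswith("b/") else None
            continue
        match = HUNK_RE.match(line)
        if match:
            lineno = int(match.group(1))
            in_hunk = True
            continue
        if not in_hunk or path is None:
            continue
        if line.startswith("+"):
            added.setdefault(path, set()).add(lineno)
            lineno += 1
        elif line.startswith("-") or line.startswith("\\"):
            # Removed lines and "\ No newline" markers do not exist
            # in the post-image, so the counter does not advance.
            continue
        else:
            lineno += 1  # context line
    return added


def diff_coverage(diff_text, covered):
    """Return (hit, total) over added lines.

    covered maps a path to the set of line numbers the tests executed.
    """
    hit = total = 0
    for path, lines in added_lines(diff_text).items():
        total += len(lines)
        hit += len(lines & covered.get(path, set()))
    return hit, total


SAMPLE_DIFF = """\
diff --git a/foo.c b/foo.c
--- a/foo.c
+++ b/foo.c
@@ -10,3 +10,5 @@ int foo(void)
     int x = 0;
+    if (x)
+        return -1;
     x++;
-    return x;
+    return x + 1;
"""

# Hypothetical coverage data: line 12 (the early return) never ran.
hit, total = diff_coverage(SAMPLE_DIFF, {"foo.c": {10, 11, 13, 14}})
print(f"{hit}/{total} added lines exercised")  # prints "2/3 added lines exercised"
```

A real tool would also have to skip added lines that are not executable
(comments, blank lines, declarations), otherwise the percentage would
look worse than it is.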