On Tue, Sep 04, 2018 at 12:05:58PM -0700, Stefan Beller wrote:

> Yeah, maybe we need to ask for more tests in the 'real' test suite, and not
> just in some special corner (such as t/perf/ or any of the environment
> variable proposals nearby).
>
> I wonder if we can make use of git.git in the test suite for similar things,
> e.g. after reading the thread about "index corruption with git commit -p" [1],
> I thought that we only have these toy examples in the test suite. Toy examples
> show that the new feature barely works, and doesn't show it is working at scale.

I think the toy examples do both. Often they drill down directly to a
useful but rare corner case that _wouldn't_ be hit during normal
operation. And being toys, they are a lot quicker to set up than trying
to work with a 50MB repository.

Take the "commit -p" one for example. It's not really about the
repository shape but about a particular set of actions. If you don't
test those actions, you won't reproduce the bug.

> > There may be a larger lesson about tracking code coverage, but I don't
> > know that most general code coverage tools would have helped (any
> > overall percentage number would be too large to move). A tool that
> > looked at the diff and said "of the N lines you added/touched, this
> > percent is exercised in the test suite" might have been useful.
>
> From some offline discussion, maybe we want to adapt a philosophy of
>
>     Each patch needs to add a test, that fails when the patch
>     is not applied, but succeeds when it is applied. This shows
>     that _some_ code in the patch is exercised at least.
>
> (and automatically/strongly enforce this going forwards; however
> enforcing such a strict thing is hard, not sure how we'd do it.)

I wouldn't want a hard-and-fast rule like that. If you're fixing a bug,
sure, I think it's good to make sure it's exercised. And if you're
adding a feature, you almost always add some basic tests (and almost
certainly leave some corner without code coverage).

But if you're writing an optimization, there's often no before/after
test. Presumably it worked before, and hopefully it still works after,
and it just got faster. You're generally relying on existing regression
tests (from when that code was introduced) to save you from bugs. You
might need to _write_ those tests if nobody did before. But it's hard
to know without digging if there are decent tests or not.

That's why I think code coverage of the lines in your diff is probably
the best measure.

-Peff
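
P.S. To make the "coverage of the lines in your diff" idea a bit more
concrete, here is a rough sketch (hypothetical, not an existing git
tool). It intersects the lines added by "git diff -U0" with gcov-style
*.gcov files from a coverage build; the revision range, the assumption
that the *.gcov files sit in the current directory, and the details of
the gcov line format handling are all just illustrative guesses:

#!/usr/bin/env python3
# Hypothetical helper: report what fraction of the lines touched by a
# commit are exercised, by intersecting "git diff -U0" with gcov output.
import re
import subprocess
import sys
from pathlib import Path

def touched_lines(rev):
    """Map path -> set of line numbers added or modified by the diff."""
    out = subprocess.run(["git", "diff", "-U0", rev],
                         capture_output=True, text=True, check=True).stdout
    lines, path = {}, None
    for line in out.splitlines():
        if line.startswith("+++ b/"):
            path = line[6:]
        elif line.startswith("@@") and path:
            # Hunk header looks like: @@ -a,b +start,count @@
            m = re.search(r"\+(\d+)(?:,(\d+))?", line)
            start, count = int(m.group(1)), int(m.group(2) or 1)
            lines.setdefault(path, set()).update(range(start, start + count))
    return lines

def covered_lines(path):
    """Executed line numbers according to <basename>.gcov, if present."""
    gcov = Path(Path(path).name + ".gcov")
    covered = set()
    if not gcov.exists():
        return covered
    for line in gcov.read_text().splitlines():
        parts = line.split(":", 2)          # "  count:  lineno:  source"
        if len(parts) < 3:
            continue
        count, lineno = parts[0].strip(), parts[1].strip()
        if lineno.isdigit() and count not in ("-", "#####", "====="):
            covered.add(int(lineno))
    return covered

def main():
    rev = sys.argv[1] if len(sys.argv) > 1 else "HEAD~1..HEAD"
    total = hit = 0
    for path, lines in touched_lines(rev).items():
        total += len(lines)
        hit += len(lines & covered_lines(path))
    if total:
        print("%d/%d touched lines exercised (%.0f%%)" %
              (hit, total, 100.0 * hit / total))

if __name__ == "__main__":
    main()

You would run something like this from wherever the *.gcov files were
generated after a coverage build (e.g. compiling with --coverage and
running the tests), optionally passing a revision range.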