On Thu, Sep 02, 2021 at 02:25:33PM +0200, Ævar Arnfjörð Bjarmason wrote: > > Hmm. This still seems more complicated than we need. If we just want a > > flag in each script, then test-lib.sh can use that flag to tweak > > LSAN_OPTIONS. See the patch below. > > On the "pragma" include v.s. env var + export: I figured this would be > easier to read as I thought the export was required (I don't think it is > in most cases, but e.g. for t0000*.sh I think it is, but that's from > memory...). I admit that half of my complaint with the pragma is the weird filename with an "=" in it. :) But I do think just assigning the variable is the most readable thing. If t0000 needs to export for whatever reason, it can do so (preferably with a comment explaining why). > > That has two drawbacks: > > > > - it doesn't have any way to switch the flag per-test. But IMHO it is > > a mistake to go in that direction. This is all temporary scaffolding > > while we have leaks, and the script-level of granularity is fine. > > We have a lot of tests that do simple checking of the tool itself, and > later in the script might be stressing trace2, or common sources of > leaks like "git log" in combination with the tool (e.g. the commit-graph > tests). > > So being able to tweak this inside the script is useful, but that can of > course also be done with this proposed TEST_LSAN_OK + prereq. Getting rid of the "let's tell the tests that we were built with LSAN" was part of the simplicity I was going for (and obviously does preclude a prerequisite). I had hoped we wouldn't need to do per-test stuff, because this was all a temporary state. But maybe that's naive. > > If we do care about not running them, then I think it makes more > > sense to extend the run/skip mechanisms and build on that. > > The patch I have here is already nicely integrated with the skip > mechanism. I.e. we use skip_all which shows a summary in any TAP > consumer, and we can skip individual tests with prerequisites. I meant here that we'd be driving the selection externally from the tests using the skip/run mechanisms (something along the lines of what I sketched out before). But I admit that there isn't really a big difference between the two approaches. Since you've coded this one up already, let's go in that direction (i.e., this series). > I was interested in doing some summaries of existing leaks > eventually. It seems even with LSAN_OPTIONS=detect_leaks=0 compiling > with SANITIZE=leak make things a bit slower, but not by much (but actual > leak checking is much slower). > > But I'd prefer to leave any "write out leak logs and summarize" step for > some later change. OK, I can live with that (especially given how apparently difficult it is to convince LSAN to do it). > > Sort of a meta-question, but what's the plan for folks who add a new > > test to say t0000, and it reveals a leak in code they didn't touch? > > Then CI will fail on this job. We'd have those same failures now > (e.g. the mentioned current delta between master..seen), we just don't > see them. Having visibility on them seems like an improvement. > > > They'll get a CI failure (as will Junio if he picks up the patch), so > > somebody is going to have to deal with it. Do they fix it? Do they unset > > the "this script is OK" flag? Do they mark the individual test as > > non-lsan-ok? > > I'd think they'd fix it, or make marking the regression as OK part of > their re-roll, just like failures on master..seen now. > > If you're getting at that we should start out this job as an FYI job > that doesn't impact the CI run's overall status if it fails I think that > would be OK as a start. I think that would be OK, but I'm not quite sure of the best way to do it. Why don't we start it as a regular required job, and then we can see how often it is causing a headache. If once every few months somebody fixes a leak, I'd be happy. If new developers are getting tangled up constantly in unrelated leaks, then that's something we'd need to revisit. > > I do like the idea of finding real regressions. But while the state of > > leak-checking is still so immature, I'm worried about this adding extra > > friction for developers. Especially if they get some spooky action at a > > distance caused by a leak in far-away code. > > Yeah, ultimately this series is an implicit endorsement of us caring > more than we do now. > > I think this friction point is going to be mitigated a lot by the > ability I've added to not just skip entire test scripts, but allow > prereq skipping of some tests, early bailing out etc. I half-agree with your final paragraph. The biggest friction point I think will be for new folks when CI starts failing, and they don't understand why (or where the problem is, or how to debug it, etc). But like I said, let's see what happens. > > Anyway, here's LSAN_OPTIONS thing I was thinking of. > > Thanks, that & your follow-up is very interesting. Can I assume this has > your SOB? I'd like to add that redirect to fd 4 change to this series. Yes, go for it. -Peff