On Mon, Jul 25 2022, Elijah Newren wrote: > On Mon, Jul 25, 2022 at 4:06 AM Ævar Arnfjörð Bjarmason > <avarab@xxxxxxxxx> wrote: >> > [...] >> >> I'm re-rolling ab/leak-check, and came up with the below (at the very >> end) to "fix" a report in builtin/merge.c, reading your commit message >> your fix seems obviously better. >> >> Mine's early WIP, and I e.g. didn't notice that I forgot to unlock the >> &lock file, which is correct. >> >> I *could* say "that's not my problem", i.e. we didn't unlock it before >> (we rely on atexit). The truth is I just missed it, but having said that >> it *is* true that we could do without it, or do it as a separate chaneg. >> >> I'm just posting my version below to help move yours forward, i.e. to >> show that someone else has carefully at least this part. > > "has carefully ... at least this part" ? > > I think you have a missing verb there. "reviewed", sorry. >> But it is worth noting from staring at the two that your version is >> mixing several different behavior changes into one, which *could* be >> split up (but whether you think that's worth it I leave to you). >> >> Maybe I'm the only one initially confused by it, and that's probably >> just from being mentally biased towards my own "solution". Those are (at >> least): >> >> 1. Before we didn't explicitly unlock() before exit(), but had atexit() >> do it, that could be a one-line first commit. This change is >> obviously good. > > That'd be fine. (Though at this point, I'd rather not mess with the > series more.) That's completely fine, to be clear(er) what I'm walking through here is my review process & potential edge cases I discovered, but I think your patch as-is does it all correctly. >> 2. A commit like mine could come next, i.e. we bug-for-bug do what we >> do do now, but just run the "post-builtin" logic when we return from >> cmd_merge(). >> >> Doing it as an in-between would be some churn, as we'll need to get >> rid of "early_exit" again, but would allow us to incrementally move >> forward to... > > So, add a step that makes it glaringly obvious that the code is not > only buggy but totally at odds with itself? > > builtin/merge.c was designed to allow pluggable backends and to > automatically pick the "best" one if more than one is specified. We > had a bug in one line of code that defeated the design, by making it > not bother consulting beyond the first failed backend in some cases. > That's the bug I'm trying to address. Your patch would make the > inconsistency with the design both bigger and more obvious; I don't > see how it's a useful step to take. > > Now, the existing design might be questionable. In fact, I'm not sure > I like it. But I think we should either change the design, or fix > things in a way that improves towards the existing design. Yes, I don't think it's useful, sorry about the confusion. I was walking through what the observable changes from the outside are here, and how they *might* be split up. Not as a practical suggestion, but as a way to understand your commit... >> 3. ...then we'd say "but it actually makes sense not to early abort", >> i.e. you want to change this so that we'll run the logic between >> try_merge_strategy() exiting with 128 now and the return from >> cmd_merge(). >> >> This bit is my main sticking point in reviewing your change, >> i.e. your "a testcase for this is somewhat difficult" somewhat >> addresses this, but (and maybe I'm wrong) it seems to me that >> >> Editing that code the post-image looks like this, with my >> commentary & most of the code removed, i.e. just focusing on the >> branches we do and don't potentially have tests for: >> >> /* Before this we fall through from ret == 128 (or ret == 2...) */ >> if (automerge_was_ok) { // not tested? >> if (!best_strategy) { >> // we test this... >> if (use_strategies_nr > 1) >> // And this: _("No merge strategy handled the merge.\n")); >> else >> // And this: _("Merge with strategy %s failed.\n"), >> } else if (best_strategy == wt_strategy) >> // but not this? >> else >> // Or this, where we e.g. say "Rewinding the tree to pristene..."? >> >> if (squash) { >> // this? >> } else >> // this? (probably, yes) >> write_merge_state(remoteheads); >> >> if (merge_was_ok) >> // this? (probably, yes, we just don't grep it?) >> else >> // this? maybe yes because it's covered by the >> // "failed" above too? >> ret = suggest_conflicts(); >> >> done: >> if (!automerge_was_ok) { >> // this? ditto the first "not tested?" >> } >> >> I.e. are you confident that we want to continue now in these various >> cases, where we have squash, !automerge_was_ok etc. I think it would >> be really useful to comment on (perhaps by amending the above >> pseudocode) what test cases we're not testing / test already etc. > > To be honest, I'm confused by what looks like a make-work project. > Perhaps if I understood your frame of reference better, maybe they > wouldn't bother me, but there's a few things here that seem a little > funny to me. > > You've highlighted that you are worried about the case where ret is 2 > (or 128) at the point of all these branches in question. However, > three of those branches can almost trivially be deduced to never be > taken unless ret is 0. One of the other codepaths, for freeing > memory, is correct regardless of the value of ret -- the memory is > conditionally freed earlier and the "if"-check exists only to avoid a > double free (and checking the recent commit message where those lines > were added would explain this, though I'm not sure why it'd even need > explaining separately for e.g. ret == 2 compared to any other value). > Three of the other code paths involve nothing more than print > statements. Now, there are many codepaths you highlighted, and > perhaps there are some where it's not trivial to determine whether > they are okay in combination with a different return value. And it > may also be easy to miss some of the "almost trivial" cases. I'd > understand better if you asked about the tougher ones or only some of > the easier ones, but it feels like you didn't try to check any of them > and instead wanted me to just spend time commenting on every single > code branch? > > I hope that doesn't come across harshly. I'm just struggling to > understand where the detailed request is coming from. > > However, perhaps I can obviate the whole set of requests by just > pointing out that I don't think any of them are relevant. The premise > for the audit request seems to be that you are worrying that the > change from die() to "return 2" (or 128) in try_merge_strategy() will > result in the calling code getting into a state it has never > experienced before which might uncover latent bugs. We can trivially > point out that it's not a new state, though: such return values were > already possible from try_merge_strategy() via the final line of code > in the function -- namely, the "return try_merge_command(...)" code > path. And a return value of 2 from try_merge_command() and > try_merge_strategy() isn't merely theoretical either -- you can easily > trigger it multiple ways; the easiest is perhaps by passing `-s > octopus` when doing a non-octopus merge. (You can also make the > testcase more complex by combining that with as many other additional > merge strategies as you want, e.g. "git merge -s octopus -s resolve -s > recursive -s mySpecialStrategy $BRANCH". You can also move the > octopus to the end if you want to test what happens if it is tried > last. Lots of possibilities exist). Thanks, that all makes sense & addresses any questions I had, which were just if we were sure that this behavior was expected. >> 4. Having done all that (or maybe this can't be split up / needs to >> come earlier) you say that we'd like to not generically call this >> exit state 128, but have it under the "exit(2)" umbrella. > > I don't see how reading your set of steps 1-4 logically restores the > design of builtin/merge.c, though. An example: what if the user ran > "git merge -s resolve -s recursive $BRANCH", and `resolve` handled the > merge but had some conflicts, while `recursive` just failed and > returned a value of clean < 0? In such a case, builtin/merge.c is > supposed to restore the tree to pristine after attempting the > recursive backend, and then redo using the best_strategy (which would > be `resolve` in this example case). Your steps 1-4 never address such > a thing. Your steps might incidentally address such a case as a side > effect of their implementation, but that's not at all clear. Since > the whole point is fixing the code to match the existing design, it > seems really odd to split things into a set of steps that obscures > that fix. You know more about the wider context here, I was just poking at the narrow code change & seeing what the side-effects were of the changes being made. >> Again, all just food for thought, and a way to step-by-step go through >> how I came about reviewing this in detail, I hope it and the below >> version I came up with before seeing yours helps. >> >> P.s.: The last paragraph in my commit message does not point to some >> hidden edge case in the code behavior here, it's just that clang/gcc are >> funny about exit() and die() control flow when combined with >> -fsanitize=address and higher optimization levels. >> >> -- >8 -- >> Subject: [PATCH] merge: return, don't use exit() >> >> Change some of the builtin/merge.c code added in f241ff0d0a9 (prepare >> the builtins for a libified merge_recursive(), 2016-07-26) to ferry up >> an "early return" state, rather than having try_merge_strategy() call >> exit() itself. >> >> This is a follow-up to dda31145d79 (Merge branch >> 'ab/usage-die-message' into gc/branch-recurse-submodules-fix, >> 2022-03-31). >> >> The only behavior change here is that we'll now properly catch other >> issues on our way out, see e.g. [1] and the interaction with /dev/full >> for an example. >> >> The immediate reason to do this change is because it's one of the >> cases where clang and gcc's SANITIZE=leak behavior differs. Under >> clang we don't detect that "t/t6415-merge-dir-to-symlink.sh" triggers >> a leak, but gcc spots it. >> >> 1. https://lore.kernel.org/git/87im2n3gje.fsf@xxxxxxxxxxxxxxxxxxx/ >> >> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> >> --- >> builtin/merge.c | 27 +++++++++++++++++++++------ >> 1 file changed, 21 insertions(+), 6 deletions(-) >> >> diff --git a/builtin/merge.c b/builtin/merge.c >> index 23170f2d2a6..a8d5d04f622 100644 >> --- a/builtin/merge.c >> +++ b/builtin/merge.c >> @@ -709,10 +709,12 @@ static void write_tree_trivial(struct object_id *oid) >> >> static int try_merge_strategy(const char *strategy, struct commit_list *common, >> struct commit_list *remoteheads, >> - struct commit *head) >> + struct commit *head, int *early_exit) >> { >> const char *head_arg = "HEAD"; >> >> + *early_exit = 0; >> + >> if (refresh_and_write_cache(REFRESH_QUIET, SKIP_IF_UNCHANGED, 0) < 0) >> return error(_("Unable to write index.")); >> >> @@ -754,8 +756,10 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common, >> else >> clean = merge_recursive(&o, head, remoteheads->item, >> reversed, &result); >> - if (clean < 0) >> - exit(128); >> + if (clean < 0) { >> + *early_exit = 1; >> + return 128; >> + } >> if (write_locked_index(&the_index, &lock, >> COMMIT_LOCK | SKIP_IF_UNCHANGED)) >> die(_("unable to write %s"), get_index_file()); >> @@ -1665,6 +1669,8 @@ int cmd_merge(int argc, const char **argv, const char *prefix) >> >> for (i = 0; !merge_was_ok && i < use_strategies_nr; i++) { >> int ret, cnt; >> + int early_exit; >> + >> if (i) { >> printf(_("Rewinding the tree to pristine...\n")); >> restore_state(&head_commit->object.oid, &stash); >> @@ -1680,7 +1686,10 @@ int cmd_merge(int argc, const char **argv, const char *prefix) >> >> ret = try_merge_strategy(use_strategies[i]->name, >> common, remoteheads, >> - head_commit); >> + head_commit, &early_exit); >> + if (early_exit) >> + goto done; >> + >> /* >> * The backend exits with 1 when conflicts are >> * left to be resolved, with 2 when it does not >> @@ -1732,12 +1741,18 @@ int cmd_merge(int argc, const char **argv, const char *prefix) >> } else if (best_strategy == wt_strategy) >> ; /* We already have its result in the working tree. */ >> else { >> + int new_ret, early_exit; >> + >> printf(_("Rewinding the tree to pristine...\n")); >> restore_state(&head_commit->object.oid, &stash); >> printf(_("Using the %s strategy to prepare resolving by hand.\n"), >> best_strategy); >> - try_merge_strategy(best_strategy, common, remoteheads, >> - head_commit); >> + new_ret = try_merge_strategy(best_strategy, common, remoteheads, >> + head_commit, &early_exit); >> + if (early_exit) { >> + ret = new_ret; >> + goto done; >> + } > > Incidentally, this is essentially dead code being added in this last > hunk. This final `else` block can only be triggered when > best_strategy has been set, and best_strategy will only be set after a > merge strategy works (possibly with conflicts, but was at least > appropriate for the problem). Anyway, by the point of this `else` > block, we've run multiple strategies already, at least one worked, and > the "best" one was not the last one tried. So, at this point, we > simply rewind the tree and rerun the known-working merge strategy. (I > guess you could say that maybe the merge strategy might not return the > same result despite being given the same inputs, so theoretically > early_exit could come back true and this hunk isn't quite dead. Just > mostly dead, I guess.) The idea here was to leave the "is it really dead?" question aside, and just move the code towards running the post-builtin code we run when we don't call an early exit().