On Mon, Jul 25, 2022 at 4:06 AM Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote: > [...] > > I'm re-rolling ab/leak-check, and came up with the below (at the very > end) to "fix" a report in builtin/merge.c, reading your commit message > your fix seems obviously better. > > Mine's early WIP, and I e.g. didn't notice that I forgot to unlock the > &lock file, which is correct. > > I *could* say "that's not my problem", i.e. we didn't unlock it before > (we rely on atexit). The truth is I just missed it, but having said that > it *is* true that we could do without it, or do it as a separate chaneg. > > I'm just posting my version below to help move yours forward, i.e. to > show that someone else has carefully at least this part. "has carefully ... at least this part" ? I think you have a missing verb there. > But it is worth noting from staring at the two that your version is > mixing several different behavior changes into one, which *could* be > split up (but whether you think that's worth it I leave to you). > > Maybe I'm the only one initially confused by it, and that's probably > just from being mentally biased towards my own "solution". Those are (at > least): > > 1. Before we didn't explicitly unlock() before exit(), but had atexit() > do it, that could be a one-line first commit. This change is > obviously good. That'd be fine. (Though at this point, I'd rather not mess with the series more.) > 2. A commit like mine could come next, i.e. we bug-for-bug do what we > do do now, but just run the "post-builtin" logic when we return from > cmd_merge(). > > Doing it as an in-between would be some churn, as we'll need to get > rid of "early_exit" again, but would allow us to incrementally move > forward to... So, add a step that makes it glaringly obvious that the code is not only buggy but totally at odds with itself? builtin/merge.c was designed to allow pluggable backends and to automatically pick the "best" one if more than one is specified. We had a bug in one line of code that defeated the design, by making it not bother consulting beyond the first failed backend in some cases. That's the bug I'm trying to address. Your patch would make the inconsistency with the design both bigger and more obvious; I don't see how it's a useful step to take. Now, the existing design might be questionable. In fact, I'm not sure I like it. But I think we should either change the design, or fix things in a way that improves towards the existing design. > 3. ...then we'd say "but it actually makes sense not to early abort", > i.e. you want to change this so that we'll run the logic between > try_merge_strategy() exiting with 128 now and the return from > cmd_merge(). > > This bit is my main sticking point in reviewing your change, > i.e. your "a testcase for this is somewhat difficult" somewhat > addresses this, but (and maybe I'm wrong) it seems to me that > > Editing that code the post-image looks like this, with my > commentary & most of the code removed, i.e. just focusing on the > branches we do and don't potentially have tests for: > > /* Before this we fall through from ret == 128 (or ret == 2...) */ > if (automerge_was_ok) { // not tested? > if (!best_strategy) { > // we test this... > if (use_strategies_nr > 1) > // And this: _("No merge strategy handled the merge.\n")); > else > // And this: _("Merge with strategy %s failed.\n"), > } else if (best_strategy == wt_strategy) > // but not this? > else > // Or this, where we e.g. say "Rewinding the tree to pristene..."? > > if (squash) { > // this? > } else > // this? (probably, yes) > write_merge_state(remoteheads); > > if (merge_was_ok) > // this? (probably, yes, we just don't grep it?) > else > // this? maybe yes because it's covered by the > // "failed" above too? > ret = suggest_conflicts(); > > done: > if (!automerge_was_ok) { > // this? ditto the first "not tested?" > } > > I.e. are you confident that we want to continue now in these various > cases, where we have squash, !automerge_was_ok etc. I think it would > be really useful to comment on (perhaps by amending the above > pseudocode) what test cases we're not testing / test already etc. To be honest, I'm confused by what looks like a make-work project. Perhaps if I understood your frame of reference better, maybe they wouldn't bother me, but there's a few things here that seem a little funny to me. You've highlighted that you are worried about the case where ret is 2 (or 128) at the point of all these branches in question. However, three of those branches can almost trivially be deduced to never be taken unless ret is 0. One of the other codepaths, for freeing memory, is correct regardless of the value of ret -- the memory is conditionally freed earlier and the "if"-check exists only to avoid a double free (and checking the recent commit message where those lines were added would explain this, though I'm not sure why it'd even need explaining separately for e.g. ret == 2 compared to any other value). Three of the other code paths involve nothing more than print statements. Now, there are many codepaths you highlighted, and perhaps there are some where it's not trivial to determine whether they are okay in combination with a different return value. And it may also be easy to miss some of the "almost trivial" cases. I'd understand better if you asked about the tougher ones or only some of the easier ones, but it feels like you didn't try to check any of them and instead wanted me to just spend time commenting on every single code branch? I hope that doesn't come across harshly. I'm just struggling to understand where the detailed request is coming from. However, perhaps I can obviate the whole set of requests by just pointing out that I don't think any of them are relevant. The premise for the audit request seems to be that you are worrying that the change from die() to "return 2" (or 128) in try_merge_strategy() will result in the calling code getting into a state it has never experienced before which might uncover latent bugs. We can trivially point out that it's not a new state, though: such return values were already possible from try_merge_strategy() via the final line of code in the function -- namely, the "return try_merge_command(...)" code path. And a return value of 2 from try_merge_command() and try_merge_strategy() isn't merely theoretical either -- you can easily trigger it multiple ways; the easiest is perhaps by passing `-s octopus` when doing a non-octopus merge. (You can also make the testcase more complex by combining that with as many other additional merge strategies as you want, e.g. "git merge -s octopus -s resolve -s recursive -s mySpecialStrategy $BRANCH". You can also move the octopus to the end if you want to test what happens if it is tried last. Lots of possibilities exist). > 4. Having done all that (or maybe this can't be split up / needs to > come earlier) you say that we'd like to not generically call this > exit state 128, but have it under the "exit(2)" umbrella. I don't see how reading your set of steps 1-4 logically restores the design of builtin/merge.c, though. An example: what if the user ran "git merge -s resolve -s recursive $BRANCH", and `resolve` handled the merge but had some conflicts, while `recursive` just failed and returned a value of clean < 0? In such a case, builtin/merge.c is supposed to restore the tree to pristine after attempting the recursive backend, and then redo using the best_strategy (which would be `resolve` in this example case). Your steps 1-4 never address such a thing. Your steps might incidentally address such a case as a side effect of their implementation, but that's not at all clear. Since the whole point is fixing the code to match the existing design, it seems really odd to split things into a set of steps that obscures that fix. > Again, all just food for thought, and a way to step-by-step go through > how I came about reviewing this in detail, I hope it and the below > version I came up with before seeing yours helps. > > P.s.: The last paragraph in my commit message does not point to some > hidden edge case in the code behavior here, it's just that clang/gcc are > funny about exit() and die() control flow when combined with > -fsanitize=address and higher optimization levels. > > -- >8 -- > Subject: [PATCH] merge: return, don't use exit() > > Change some of the builtin/merge.c code added in f241ff0d0a9 (prepare > the builtins for a libified merge_recursive(), 2016-07-26) to ferry up > an "early return" state, rather than having try_merge_strategy() call > exit() itself. > > This is a follow-up to dda31145d79 (Merge branch > 'ab/usage-die-message' into gc/branch-recurse-submodules-fix, > 2022-03-31). > > The only behavior change here is that we'll now properly catch other > issues on our way out, see e.g. [1] and the interaction with /dev/full > for an example. > > The immediate reason to do this change is because it's one of the > cases where clang and gcc's SANITIZE=leak behavior differs. Under > clang we don't detect that "t/t6415-merge-dir-to-symlink.sh" triggers > a leak, but gcc spots it. > > 1. https://lore.kernel.org/git/87im2n3gje.fsf@xxxxxxxxxxxxxxxxxxx/ > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> > --- > builtin/merge.c | 27 +++++++++++++++++++++------ > 1 file changed, 21 insertions(+), 6 deletions(-) > > diff --git a/builtin/merge.c b/builtin/merge.c > index 23170f2d2a6..a8d5d04f622 100644 > --- a/builtin/merge.c > +++ b/builtin/merge.c > @@ -709,10 +709,12 @@ static void write_tree_trivial(struct object_id *oid) > > static int try_merge_strategy(const char *strategy, struct commit_list *common, > struct commit_list *remoteheads, > - struct commit *head) > + struct commit *head, int *early_exit) > { > const char *head_arg = "HEAD"; > > + *early_exit = 0; > + > if (refresh_and_write_cache(REFRESH_QUIET, SKIP_IF_UNCHANGED, 0) < 0) > return error(_("Unable to write index.")); > > @@ -754,8 +756,10 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common, > else > clean = merge_recursive(&o, head, remoteheads->item, > reversed, &result); > - if (clean < 0) > - exit(128); > + if (clean < 0) { > + *early_exit = 1; > + return 128; > + } > if (write_locked_index(&the_index, &lock, > COMMIT_LOCK | SKIP_IF_UNCHANGED)) > die(_("unable to write %s"), get_index_file()); > @@ -1665,6 +1669,8 @@ int cmd_merge(int argc, const char **argv, const char *prefix) > > for (i = 0; !merge_was_ok && i < use_strategies_nr; i++) { > int ret, cnt; > + int early_exit; > + > if (i) { > printf(_("Rewinding the tree to pristine...\n")); > restore_state(&head_commit->object.oid, &stash); > @@ -1680,7 +1686,10 @@ int cmd_merge(int argc, const char **argv, const char *prefix) > > ret = try_merge_strategy(use_strategies[i]->name, > common, remoteheads, > - head_commit); > + head_commit, &early_exit); > + if (early_exit) > + goto done; > + > /* > * The backend exits with 1 when conflicts are > * left to be resolved, with 2 when it does not > @@ -1732,12 +1741,18 @@ int cmd_merge(int argc, const char **argv, const char *prefix) > } else if (best_strategy == wt_strategy) > ; /* We already have its result in the working tree. */ > else { > + int new_ret, early_exit; > + > printf(_("Rewinding the tree to pristine...\n")); > restore_state(&head_commit->object.oid, &stash); > printf(_("Using the %s strategy to prepare resolving by hand.\n"), > best_strategy); > - try_merge_strategy(best_strategy, common, remoteheads, > - head_commit); > + new_ret = try_merge_strategy(best_strategy, common, remoteheads, > + head_commit, &early_exit); > + if (early_exit) { > + ret = new_ret; > + goto done; > + } Incidentally, this is essentially dead code being added in this last hunk. This final `else` block can only be triggered when best_strategy has been set, and best_strategy will only be set after a merge strategy works (possibly with conflicts, but was at least appropriate for the problem). Anyway, by the point of this `else` block, we've run multiple strategies already, at least one worked, and the "best" one was not the last one tried. So, at this point, we simply rewind the tree and rerun the known-working merge strategy. (I guess you could say that maybe the merge strategy might not return the same result despite being given the same inputs, so theoretically early_exit could come back true and this hunk isn't quite dead. Just mostly dead, I guess.) > } > > if (squash) { > -- > 2.36.1