On Mon, Oct 4, 2021 at 8:42 AM Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote:
>
> On Mon, Oct 04 2021, Elijah Newren wrote:
>
> > On Sun, Oct 3, 2021 at 5:46 PM Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote:
> >>
> >> Change the clear_unpack_trees_porcelain() to be like a *_release()
> >> function, not a *_reset() (in strbuf.c terms). Let's move the only API
> >> user that relied on the latter to doing its own
> >> unpack_trees_options_init(). See the commit that introduced
> >> unpack_trees_options_init() for details on the control flow involved
> >> here.
> >>
> >> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>
> >> ---
> >>  merge-recursive.c | 1 +
> >>  unpack-trees.c    | 1 -
> >>  2 files changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/merge-recursive.c b/merge-recursive.c
> >> index d24a4903f1d..a77f66b006c 100644
> >> --- a/merge-recursive.c
> >> +++ b/merge-recursive.c
> >> @@ -442,6 +442,7 @@ static void unpack_trees_finish(struct merge_options *opt)
> >>  {
> >>         discard_index(&opt->priv->orig_index);
> >>         clear_unpack_trees_porcelain(&opt->priv->unpack_opts);
> >> +       unpack_trees_options_init(&opt->priv->unpack_opts);
> >
> > This is wrong. It suggests that unpack_opts is used after
> > unpack_trees_finish() (other than an outer merge first calling
> > unpack_trees_start() again), which can only serve to greatly confuse
> > future readers. Drop this hunk.
>
> Sure, but (and also re:
> https://lore.kernel.org/git/CABPp-BEA2myh2Np_YpFWnE+jqmT5vz7ohigZ0=2tL-wizgYQmg@xxxxxxxxxxxxxx/)
> if you'd like not initialize things in merge_start() just for good
> measure wouldn't the diff-at-the-end on top of your 5bf7e5779ec
> (merge-recursive: split internal fields into a separate struct,
> 2019-08-17) also make sense?

Sorry, I can't parse this sentence. Could you retry?

> I.e.
> the reason I entered this particular rabbit hole was in looking at
> existing members of "struct merge_options_internal" & past commits and
> seeing how we did its initialization. That canary on top passes all our
> tests, and per my reading we also don't use "df_conflict_file_set" until
> as late as the things we setup in unpack_trees_start(). Should those be
> moved to do the post-merge_start() setup at the same time?

It appears df_conflict_file_set has some theoretical memory leaks
(though in practice unlikely and quite small in the few cases that could
be constructed to trigger it). Initializing it nearer to use and
free'ing when done (in merge_trees_internal()) would make more sense,
yes.

But, merge-recursive.c right now is supposed to be the stable fallback
in case someone runs into an issue with merge-ort. I'd rather keep it
stable in preparation for deleting it, not churning its code
unnecessarily.

> >>  }
> >>
> >>  static int save_files_dirs(const struct object_id *oid,
> >> diff --git a/unpack-trees.c b/unpack-trees.c
> >> index 94767d3f96f..e7365322e82 100644
> >> --- a/unpack-trees.c
> >> +++ b/unpack-trees.c
> >> @@ -197,7 +197,6 @@ void clear_unpack_trees_porcelain(struct unpack_trees_options *opts)
> >>  {
> >>         strvec_clear(&opts->msgs_to_free);
> >>         dir_clear(&opts->dir);
> >> -       memset(opts->msgs, 0, sizeof(opts->msgs));
> >
> > This seems like a very dangerous change. You want to leave opts->msgs
> > pointing at freed memory?
>
> Yes, as argued in
> http://lore.kernel.org/git/87bl45niqs.fsf@xxxxxxxxxxxxxxxxxxx; In this
> series we can see that nothing re-uses it, so it's as safe as our
> strbuf_release(), or a plain free().

strbuf_release() sets sb->buf to strbuf_slopbuf, and sets sb->len =
sb->alloc = 0. The strbuf can thus be reused after calling
strbuf_release().
strvec_clear() also calls strvec_init() afterwards to set the vector to
be usable though 0-sized.
hashmap_clear() also clears out existing data, but makes it ready for
reuse (as per 6da1a25814).
strmap_clear(), strintmap_clear(), and strset_clear() also set up the
data structure for reuse.

There's a longstanding presumption that something named `*_clear()`
leaves its argument usable afterwards. Rename it to end with `_free` if
you want it to be analogous to free(), where any later use would be a
use-after-free error.

> Maybe I'm misunderstanding what you're getting at, and I could
> understand a "let's just reset it for good measure" POV. But I can't
> square your view that we shouldn't do setup in merge_start() for good
> measure in case some new future code accidentally uses the data earlier
> (which I'm fine with), but then also not finding it OK to skip the
> memset() here ...

No existing caller needs to make use of the fact that it's a `_clear`
function rather than a `_free` function, but if you want to take
advantage of that to do less work, you should both call it out in your
commit message and rename the function. You didn't do either. In fact,
your existing commit message mentions strbuf_release(), which
reinforces the `_clear` presumption of reusability and thus makes me
flag the change as dangerous.
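
To make the naming distinction concrete, here is a minimal sketch of the
two flavors. Everything in it (struct msg_table, MSG_SLOTS, and the
msg_table_*() helpers) is hypothetical and only illustrates the
convention; it is not code from unpack-trees.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a struct that owns heap-allocated messages. */
#define MSG_SLOTS 4

struct msg_table {
	char *msgs[MSG_SLOTS]; /* owned strings, NULL when unset */
};

#define MSG_TABLE_INIT { { NULL } }

/* Store a private copy of `text` in slot `i`, freeing any old value. */
void msg_table_set(struct msg_table *t, size_t i, const char *text)
{
	size_t len;
	char *copy;

	assert(t && text && i < MSG_SLOTS);
	len = strlen(text) + 1;
	copy = malloc(len);
	if (!copy)
		abort();
	memcpy(copy, text, len);
	free(t->msgs[i]); /* free(NULL) is a no-op */
	t->msgs[i] = copy;
}

/*
 * "_clear" flavor: release the memory AND reset every slot, so the
 * struct can be reused right away without re-initialization.
 */
void msg_table_clear(struct msg_table *t)
{
	size_t i;

	for (i = 0; i < MSG_SLOTS; i++) {
		free(t->msgs[i]);
		t->msgs[i] = NULL;
	}
}

/*
 * "_free" flavor: release the memory only. The slots are left dangling,
 * so the caller must not touch the struct again until it has been
 * re-initialized.
 */
void msg_table_free(struct msg_table *t)
{
	size_t i;

	for (i = 0; i < MSG_SLOTS; i++)
		free(t->msgs[i]);
}
```

With these names, calling msg_table_set() after msg_table_clear() is
fine, while calling it after msg_table_free() would pass a dangling
pointer to free(). That is the difference the function name is expected
to signal to callers.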