Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes: > Fix a memory leak in code added in 5d7eeee2ac6 (git-show: grok blobs, > trees and tags, too, 2006-12-14). As we iterate over a "<revision>..." > command-line and encounter ad OBJ_COMMIT we want to use our "struct "... encounter an OBJ_COMMIT, we want to ..."? > rev_info", but with a "pending" array of one element: the one commit > we're showing in the loop. > > To do this 5d7eeee2ac6 saved away a pointer to rev.pending.objects and > rev.pending.nr for its iteration. We'd then clobber those (and alloc) > when we needed to show an OBJ_COMMIT. > > We'd therefore leak the "rev.pending" we started out with, and only > free the new "rev.pending" in the "OBJ_COMMIT" case arm as > prepare_revision_walk() would draw it down. > > Let's fix this memory leak. Now when we encounter an OBJ_COMMIT we > save away the "rev.pending" before clearing it. We then add a single > commit to it, which our indirect invocation of prepare_revision_walk() > will remove. After that we restore the "rev.pending". > > Our "rev.pending" will then get free'd by the release_revisions() > added in f6bfea0ad01 (revisions API users: use release_revisions() in > builtin/log.c, 2022-04-13) Hmph, this gives me a strange sense of deja-vu that I saw a better solution in a separate patch from you, strange because I do not see anything at the tip of 'seen' like what I thought you did elsewhere. We do need to reuse "rev_info" we got from outside the loop so that we will have to consistently apply diffopt and other things we got in there from the configuration and the command line. But after we decide to go to "each object is shown individually" mode, the contents of the pending array (rather, what we got from the command line in cmdline member) is only relevant to enumerate which individual objects are shown in the loop, and should not even be visible to the code that handles each individual object, e.g. even we pass &rev to this code when we see a blob: switch (o->type) { case OBJ_BLOB: ret = show_blob_object(&o->oid, &rev, name); break; there is no point in exposing the rev.pending to show_blob_object() at al. The same is true for codepaths used to show a tag or a tree. When showing a commit, we do not even want the codepath that shows a single-commit range to touch and destroy the original rev.pending; we instead want to give a single-element pending array. So instead of keeping rev.pending when we enter the loop and saving it away and restoring it, it feels a lot cleaner to invent/use an interface to "clone" the settings in an existing rev_info by: * Grab rev.pending into a separate "struct object_array" that is a local variable in cmd_show() and clear rev.pending, immediately after we decide to go to "show individually" mode. * Iterate over the objects in that local variable. For each object to be shown, prepare a rev_info, clone the setting from the original rev_info, put that single object to the pending member of the clone, use that cloned rev_info, and release the resources held by the cloned rev_info once we are done. > diff --git a/builtin/log.c b/builtin/log.c > index 88a5e98875a..b4b1d974617 100644 > --- a/builtin/log.c > +++ b/builtin/log.c > @@ -743,11 +743,17 @@ int cmd_show(int argc, const char **argv, const char *prefix) > rev.shown_one = 1; > break; > case OBJ_COMMIT: > + { > + struct object_array old; > + > + memcpy(&old, &rev.pending, sizeof(old)); > rev.pending.nr = rev.pending.alloc = 0; > rev.pending.objects = NULL; > add_object_array(o, name, &rev.pending); > ret = cmd_log_walk_no_free(&rev); > + memcpy(&rev.pending, &old, sizeof(rev.pending)); > break; > + } The reason why I find this approach a bit disturbing is that we pretend to know for certain that pending is the only thing we need to protect across multiple calls to the log_walk(), but I suspect that such an overconfidence will come back and bite us. We are not re-initializing other states reachable from the rev_info (e.g. the diff_options struct has various members that records what happened in an invocation, like needed_rename_limit, found_follow, and found_changes, that would want to be reinitialized if we are starting a new and totally independent traversal after we are done one traversal). But I do not mind at all to leave such a fundamental clean-up to a totally separate topic, and keep this patch only about "we are clobbering and leaking rev.pending, let's plug the leak". In fact, I would prefer it that way. So take all of the above as material for possible NEEDSWORK comment, food for further thought.