On Fri, Sep 20, 2019 at 11:27:49AM -0400, Derrick Stolee wrote: > On 9/19/2019 5:47 PM, SZEDER Gábor wrote: > > At the beginning of the recursive name_rev() function it creates a new > > 'struct rev_name' instance for each previously unvisited commit or, if > > this visit results in better name for an already visited commit, then > > updates the 'struct rev_name' instance attached to to the commit, or > > returns early. > > > > Restructure this so it's caller creates or updates the 'struct > > rev_name' instance associated with the commit to be passed as > > parameter, i.e. both name_ref() before calling name_rev() and > > name_rev() itself as it iterates over the parent commits. > > > > This makes eliminating the recursion a bit easier to follow, and it > > will be moved back to name_rev() after the recursion is eliminated. > > > > This change also plugs the memory leak that was temporarily unplugged > > in the earlier "name-rev: pull out deref handling from the recursion" > > patch in this series. > [snip] > > > > @@ -276,11 +277,17 @@ static int name_ref(const char *path, const struct object_id *oid, int flags, vo > > path = name_ref_abbrev(path, can_abbreviate_output); > > if (commit->date >= cutoff) { > > const char *tip_name; > > + char *to_free = NULL; > > if (deref) > > - tip_name = xstrfmt("%s^0", path); > > + tip_name = to_free = xstrfmt("%s^0", path); > > else > > tip_name = xstrdup(path); > > So this xstrdup(path) is not a leak? Well... yes, everything is leaked, eventually ;) First of all, name_ref() is a callback function invoked by for_each_ref(), and 'path' is one of its parameters. This means that we must duplicate this string, because we might need to access it even after iterating over refs is over (when displaying the name of the commits). This copy then becomes the initial name_rev() invocation's 'tip_name' parameter, and things get a bit harder to reason about because of merges, the recursion, and being in the middle of the refactorings... So for argument's sake look at current master's 'builtin/name-rev.c', and assume that it has to deal with a linear history with a single branch, and no tags. When name_ref() is invoked with a branch, then deref = 0, so in name_rev() the condition: if (deref) { tip_name = to_free = xstrfmt("%s^0", tip_name); won't be fulfilled, and 'tip_name' remains unchanged. Then comes the initialization of the 'struct rev_name' associated with the tip commit in the commit slab, including the name->tip_name = tip_name; assignment, IOW we now have a pointer to the original 'tip_name' in the commit slab. Then name_rev() looks at the tip commit's parents, or rather, because of the linear history, at its only parent. This means that neither of the other two xstrfmt() will be invoked, and name_rev() will be recursively invoked with the original 'tip_name' pointer. This will then initialize another 'struct rev_name' instance, including yet another pointer to the original 'tip_name'. This will go on until the root commit is reached, and in the end every commit will have an associated 'struct rev_name' instance with a pointer to the original 'tip_name' string. At this point we're done with the recursion, and since the repo has only a single branch, we're done with for_each_ref() as well, and finally print names of the commits. Then it's time to clean up and free() memory, but: - we could only release the commit slab and free() all 'struct rev_name' instances within, but we can't free() the 'name->tip_name' pointers, because, as shown above, we might have a bunch of pointers pointing to the same string. - we are at the end of cmd_name_rev() and are about to exit the program, so the OS will release any and all resources anyway. Yeah, it's far from ready to be turned into a library call... Now, in a real repository we'll have multiple branches, tags, and merges, so there will be cases when an existing 'struct rev_name' instance is updated, and it's 'name->tip_name' is overwritten by a better name. At that point the old value is potentially leaked, but we can't really do anything about it, because we don't know whether any other 'struct rev_name' instances point to it. Eliminating the recursion doesn't change anything in this respect.