Re: [PATCH 3/3] dir: fix problematic API to avoid memory leaks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Aug 16, 2020 at 06:59:11AM +0000, Elijah Newren via GitGitGadget wrote:

> Digging further, I found that despite the pretty clear documentation
> near the top of dir.h that folks were supposed to call clear_directory()
> when the user no longer needed the dir_struct, there were four callers
> that didn't bother doing that at all.  However, two of them clearly
> thought about leaks since they had an UNLEAK(dir) directive, which to me
> suggests that the method to free the data was too unclear.  I suspect
> the non-obviousness of the API and its holes led folks to avoid it,
> which then snowballed into further problems with the entries[],
> ignored[], parent_hashmap, and recursive_hashmap problems.

The UNLEAK() ones are sort-of my fault, and are a combination of:

  - The commit adding them says "in some cases (e.g., dir_struct) we
    don't even have a function which knows how to free all of the struct
    members". I'm not sure if why I didn't fix clear_directory() as you
    did. I may not have known about it, or I may have been lazy. Or more
    likely I was simply holding the UNLEAK() hammer, so it looked like a
    nail. ;)

  - My focus was on making leak-checker output cleaner. And it wasn't
    clear for cases where we're about to exit the program whether we
    should be actually freeing (which has small but non-zero cost) or
    just annotating (which is zero-cost, but clearly has confused some
    people about how UNLEAK() was meant to be used). I think I'm leaning
    these days towards "free if it is easy to do so".

So this definitely seems like a good direction to me.

> Rename clear_directory() to dir_free() to be more in line with other
> data structures in git, and introduce a dir_init() to handle the
> suggested memsetting of dir_struct to all zeroes.  I hope that a name
> like "dir_free()" is more clear, and that the presence of dir_init()
> will provide a hint to those looking at the code that there may be a
> corresponding dir_free() that they need to call.

I think having a pair of matched calls is good. I _thought_ we had
established a pattern that we should prefer "clear" to "free" for cases
where the struct itself its not freed. But grepping around, I see there
are a few exceptions (hashmap_free() is the big one, and then
oidmap_free() which is built around it seems to have inherited it).

The rest seem to follow that pattern, though: attr_check_free,
cache_tree_free, and submodule_cache_free all actually free the pointer
passed in. And "git grep '_clear(' *.h" shows lots of
clear-but-don't-free examples.

Would dir_clear() make the pairing more obvious, but keep the clear
name?

(I also wouldn't be opposed to changing hashmap and oidmap to use the
name "clear", but that's obviously a separate patch).

>  builtin/add.c          |  4 ++--
>  builtin/check-ignore.c |  4 ++--
>  builtin/clean.c        |  8 ++++----
>  builtin/grep.c         |  3 ++-
>  builtin/ls-files.c     |  4 ++--
>  builtin/stash.c        |  4 ++--
>  dir.c                  |  7 ++++++-
>  dir.h                  | 19 ++++++++++---------
>  merge.c                |  3 ++-
>  wt-status.c            |  4 ++--
>  10 files changed, 34 insertions(+), 26 deletions(-)

That patch itself looks good except for two minor points:

>  /* Frees memory within dir which was allocated.  Does not free dir itself. */
> -void clear_directory(struct dir_struct *dir)
> +void dir_free(struct dir_struct *dir)
>  {
>  	int i, j;
>  	struct exclude_list_group *group;

As I mentioned in my previous email, I think it would be nice if this
called dir_init() at the end, so that the struct is always in a
consistent state.

> diff --git a/dir.h b/dir.h
> index 7d76d0644f..7c55c1a460 100644
> --- a/dir.h
> +++ b/dir.h
> [...]
> - * - Use `dir.entries[]`.
> + * - Use `dir.entries[]` and `dir.ignored[]`.
>   *
>   * - Call `clear_directory()` when the contained elements are no longer in use.
>   *

This last line should become dir_free() / dir_clear().

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux