Re: [PATCH 1/3] fsck: free tree buffers after walking unreachable objects

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jeff King <peff@xxxxxxxx> writes:

> After calling fsck_walk(), a tree object struct may be left in the
> parsed state, with the full tree contents available via tree->buffer.
> It's the responsibility of the caller to free these when it's done with
> the object to avoid having many trees allocated at once.
>
> In a regular "git fsck", we hit fsck_walk() only from fsck_obj(), which
> does call free_tree_buffer(). Likewise for "--connectivity-only", we see
> most objects via traverse_one_object(), which makes a similar call.
>
> The exception is in mark_unreachable_referents(). When using both
> "--connectivity-only" and "--dangling" (the latter of which is the
> default), we walk all of the unreachable objects, and there we forget to
> free. Most cases would not notice this, because they don't have a lot of
> unreachable objects, but you can make a pathological case like this:
>
>   git clone --bare /path/to/linux.git repo.git
>   cd repo.git
>   rm packed-refs ;# now everything is unreachable!
>   git fsck --connectivity-only
>
> That ends up with peak heap usage ~18GB, which is (not coincidentally)
> close to the size of all uncompressed trees in the repository. After
> this patch, the peak heap is only ~2GB.
>
> A few things to note:
>
>   - it might seem like fsck_walk(), if it is parsing the trees, should
>     be responsible for freeing them. But the situation is quite tricky.
>     In the non-connectivity mode, after we call fsck_walk() we then
>     proceed with fsck_object() which actually does the type-specific
>     sanity checks on the object contents. We do pass our own separate
>     buffer to fsck_object(), but there's a catch: our earlier call to
>     parse_object_buffer() may have attached that buffer to the object
>     struct! So by freeing it, we leave the rest of the code with a
>     dangling pointer.
>
>     Likewise, the call to fsck_walk() in index-pack is subtle. It
>     attaches a buffer to the tree object that must not be freed! And
>     so rather than calling free_tree_buffer(), it actually detaches it
>     by setting tree->buffer to NULL.
>
>     These cases would _probably_ be fixable by having fsck_walk() free
>     the tree buffer only when it was the one who allocated it via
>     parse_tree(). But that would still leave the callers responsible for
>     freeing other cases, so they wouldn't be simplified. While the
>     current semantics for fsck_walk() make it easy to accidentally leak
>     in new callers, at least they are simple to explain, and it's not a
>     function that's likely to get a lot of new call-sites.
>
>     And in any case, it's probably sensible to fix the leak first with
>     this simple patch, and try any more complicated refactoring
>     separately.
>
>   - a careful reader may notice that fsck_obj() also frees commit
>     buffers, but neither the call in traverse_one_object() nor the one
>     touched in this patch does so. And indeed, this is another problem
>     for --connectivity-only (and accounts for most of the 2GB heap after
>     this patch), but it's one we'll fix in a separate commit.
>
> Signed-off-by: Jeff King <peff@xxxxxxxx>
> ---
>  builtin/fsck.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/builtin/fsck.c b/builtin/fsck.c
> index f7916f06ed..34e575a170 100644
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -228,6 +228,8 @@ static void mark_unreachable_referents(const struct object_id *oid)
>  
>  	options.walk = mark_used;
>  	fsck_walk(obj, NULL, &options);
> +	if (obj->type == OBJ_TREE)
> +		free_tree_buffer((struct tree *)obj);
>  }

Unlike codepaths like mark_object(), which uses the REACHABLE bit to
avoid the walker coming into an already marked objects, we have no
protection that says "this tree is already marked as USED, so lets
not go into its contents" (it would be a disaster if we free tree
buffer here and then later end up calling the function on the same
tree), but it is OK because this is an unreachable object nobody
points at and we will never come back?



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux