On Thu, Sep 22, 2022 at 11:40:05AM -0700, Junio C Hamano wrote: > > diff --git a/builtin/fsck.c b/builtin/fsck.c > > index f7916f06ed..34e575a170 100644 > > --- a/builtin/fsck.c > > +++ b/builtin/fsck.c > > @@ -228,6 +228,8 @@ static void mark_unreachable_referents(const struct object_id *oid) > > > > options.walk = mark_used; > > fsck_walk(obj, NULL, &options); > > + if (obj->type == OBJ_TREE) > > + free_tree_buffer((struct tree *)obj); > > } > > Unlike codepaths like mark_object(), which uses the REACHABLE bit to > avoid the walker coming into an already marked objects, we have no > protection that says "this tree is already marked as USED, so lets > not go into its contents" (it would be a disaster if we free tree > buffer here and then later end up calling the function on the same > tree), but it is OK because this is an unreachable object nobody > points at and we will never come back? I do think it is true that this is the final time we'd look at these objects. But I don't think it would be a disaster if somebody did. The free_tree_buffer() function clears the "parsed" flag on the struct. And anybody wishing to look at the tree contents would need to call parse_tree(), which would then re-load the contents. In general, that's _possibly_ less efficient if we visit the same tree twice, but it would be balanced against not holding all of the tree data in RAM at once. And as I said, that doesn't happen for this use case anyway. As a side note, IMHO having tree->buffer at all is a mistake, because it leads to exactly this kind of confusion about when the buffer should be discarded. We'd be better off having all callers parse directly into a local buffer, and then clean up when they're done. It effectively ends up the same, except then it's obvious when a tree is "leaked" because the local buffer goes out of scope, rather than hanging around in the global struct and just wasting memory. But that's obviously a much bigger change. -Peff