On Thu, Sep 22, 2022 at 12:27:33PM -0700, Junio C Hamano wrote: > > As a side note, IMHO having tree->buffer at all is a mistake, because it > > leads to exactly this kind of confusion about when the buffer should be > > discarded. We'd be better off having all callers parse directly into a > > local buffer, and then clean up when they're done. > > Yeah, tree-walk.c users woud use tree_desc structure anyway, and > instead of having a moving pointer that points into a separate thing > (i.e. tree->buffer), it could have its own copy of the "whole buffer" > that can be used to free when it is done iterating over entries. > > > .... But that's obviously a much bigger change. > > Yup. I took a (very) brief stab at this, out of curiosity. The sticking point becomes obvious very quickly: how do we get the buffer to the caller? If you are calling parse_tree(), we can add new out-parameters to provide the buffer. But something like parse_object() is just returning an object struct, and we have to stuff anything we want to communicate to the caller inside the polymorphic struct which contains it. We could split the concept of "parse" away from "get the buffer" entirely, but then we have a potential slowdown. The "parse" functions really want to open the object contents and check the hash (and removing that in the general case would probably break part of fsck, at least). So we'd end up inflating the object contents twice, which would probably have a measurable impact. I don't plan on digging any further on it for now, so this is just a note for future people who do. :) -Peff