Am 18.01.23 um 18:19 schrieb Jeff King: > On Tue, Jan 17, 2023 at 08:58:24PM +0100, René Scharfe wrote: > >> Am 17.01.23 um 18:11 schrieb Ævar Arnfjörð Bjarmason: >>> Fix a memory leak that's been with us ever since c879daa2372 (Make >>> hash-object more robust against malformed objects, 2011-02-05). With >>> "HASH_FORMAT_CHECK" (used by "hash-object" and "replace") we'll parse >>> tags into a throwaway variable on the stack, but weren't freeing the >>> "item->tag" we might malloc() when doing so. >>> >>> The clearing that release_tag_memory() does for us is redundant here, >>> but let's use it as-is anyway. It only has one other existing caller, >>> which does need the tag to be cleared. >> >> Calling it is better than getting our hands dirty with tag internals >> here. >> >> There's similar leak in check_commit() in the same file, but plugging it >> would require exporting unparse_commit(). Or perhaps using the heavy >> hammer that is release_commit_memory()? Anyway, it doesn't seem simple >> to me, so that would be a patch for a separate series. > > I think both of these cases are a bit sketchy, because they create an > object struct on the stack, and not via alloc_*_node(), which may > violate assumptions elsewhere. In the case of the tag, it's probably OK. > For the commit, though, the "index" field is going to be 0, which may > confuse release_commit_memory(). It calls free_commit_buffer(), which is > going to use that index to try to free from the buffer slab. > > So I'd be wary of calling that. I'm also slightly skeptical of the > existing code that calls parse_commit_buffer(), but I think it works. We > do not attach the buffer to the commit object there (good), and we pass > "0" for check_graph, so the graph code (which IIRC also uses the index > for some slab lookups) isn't run. > > I think this code would be better off either: > > 1. Just allocating an object struct in the usual way. I understand the > desire not to spend extra memory, but parse_commit_buffer() is > going to allocate structs under the hood for the tree and parent > commits anyway. > > 2. The point of this code is to find malformed input to hash-object. > We're probably better off feeding the buffer to fsck_commit(), etc. > It does more thorough checks, and these days it does not need an > object struct at all. I like the second one, as long as it won't check too much. c879daa237 (Make hash-object more robust against malformed objects, 2011-02-05) added the checks that are now in object-file.c and intended to only validate the internal structure of objects, not relations between. It gave the example to allow adding a commit before its tree, which should be allowed. And IIUC fsck_object() fits that bill. > Either of which would naturally fix the leak for tags. I'm not sure > there actually is a leak for commits, as commit structs don't store any > strings themselves. parse_commit_buffer() allocates the list of parents. Hmm, and it looks them up. Doesn't this violate the goal to allow dangling references? > I don't think any of that needs to hold up Ævar's current series, > though. Fixing the leak this way in the meantime is OK, and then if we > switch to one of the other techniques, the problem just goes away. Right. René