When parsing a commit, the default behavior is to stuff the original
buffer into a commit_slab (which takes ownership of it). But for a tool
like fsck, this isn't useful. While we may look at the buffer further as
part of fsck_commit(), we'll always do so through a separate pointer;
attaching the buffer to the slab doesn't help.

Worse, it means we have to remember to free the commit buffer in all
call paths. We do so in fsck_obj(), which covers a regular "git fsck".
But with "--connectivity-only", we forget to do so in both
traverse_one_object(), which covers reachable objects, and
mark_unreachable_referents(), which covers unreachable ones. As a
result, that mode ends up storing an uncompressed copy of every commit
on the heap at once.

We could teach the code paths for --connectivity-only to also free
commit buffers. But there's an even easier fix: we can just turn off the
save_commit_buffer flag, and then we won't attach them to the commits in
the first place.

This reduces the peak heap of running "git fsck --connectivity-only" in
a clone of linux.git from ~2GB to ~1GB. According to massif, the
remaining memory goes where you'd expect: the object structs themselves,
the obj_hash containing them, and the delta base cache.

Note that we'll leave the call to free commit buffers in fsck_obj() for
now; it's not quite redundant because of a related bug that we'll fix in
a subsequent commit.

Signed-off-by: Jeff King <peff@xxxxxxxx>
---
 builtin/fsck.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 34e575a170..b45de003d4 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -855,6 +855,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
 	errors_found = 0;
 	read_replace_refs = 0;
+	save_commit_buffer = 0;
 
 	argc = parse_options(argc, argv, prefix, fsck_opts, fsck_usage, 0);
 
-- 
2.38.0.rc1.583.ga560cd8328
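
For readers who want to see the ownership issue in isolation, here is a
rough toy model of it. This is not git's code: only the flag name
save_commit_buffer comes from the patch, and every toy_* name, the
saved_buffer field, and TOY_BUF_SIZE are made up for illustration (git
keeps attached buffers in a commit_slab rather than in the commit
struct). It assumes a walker that parses commits and never frees the
attached buffers, as the --connectivity-only paths do.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_BUF_SIZE 32 /* hypothetical size of one commit buffer */

/* Toy stand-in for git's global flag of the same name. */
int save_commit_buffer = 1;

/* Hypothetical commit; real git stores the buffer in a commit_slab. */
struct toy_commit {
	char *saved_buffer; /* owned by the commit when non-NULL */
};

/* Bytes of commit buffers currently attached to commits. */
size_t toy_attached_bytes;

/*
 * Parse a commit from a heap buffer. Returns 1 if the commit took
 * ownership of buf, 0 if the caller still owns it.
 */
int toy_parse_commit(struct toy_commit *c, char *buf)
{
	assert(c && buf);
	if (save_commit_buffer) {
		c->saved_buffer = buf;
		toy_attached_bytes += TOY_BUF_SIZE;
		return 1;
	}
	c->saved_buffer = NULL;
	return 0;
}

/* Release an attached buffer, if there is one. */
void toy_free_commit_buffer(struct toy_commit *c)
{
	if (!c->saved_buffer)
		return;
	toy_attached_bytes -= TOY_BUF_SIZE;
	free(c->saved_buffer);
	c->saved_buffer = NULL;
}

/*
 * Walk n commits without freeing attached buffers during the walk.
 * Returns how many buffer bytes were still attached at the end of the
 * walk, then cleans up so the model itself does not leak.
 */
size_t toy_walk(size_t n)
{
	static const char text[] = "tree 0123\nauthor A\n\nmsg\n";
	struct toy_commit *commits = calloc(n, sizeof(*commits));
	size_t i, attached;

	assert(commits);
	toy_attached_bytes = 0;
	for (i = 0; i < n; i++) {
		char *buf = malloc(TOY_BUF_SIZE);
		assert(buf);
		memcpy(buf, text, sizeof(text));
		if (!toy_parse_commit(&commits[i], buf))
			free(buf); /* not attached: release immediately */
	}
	attached = toy_attached_bytes;

	for (i = 0; i < n; i++)
		toy_free_commit_buffer(&commits[i]);
	free(commits);
	return attached;
}
```

With the flag left on, toy_walk(n) reports n * TOY_BUF_SIZE bytes still
attached when the walk finishes; with save_commit_buffer set to 0 it
reports zero, because nothing was attached in the first place and no
call path has to remember to free anything.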