Here is a small-ish update to my series which makes t5319 leak-free (i.e., it passes even when Git is compiled with SANITIZE=leak). It is based on ab/only-single-progress-at-once (so I dropped the cherry-picked patch towards the end from Ævar). Most of the fixes are straightforward from review, but the biggest change is in 7/10, which converts get_midx_filename() to write its result to a strbuf per Junio's helpful suggestion. Thinking ahead, there are a few things I noted while reading through replies for future improvements to the MIDX and pack bitmap code. From my ~/notes, they are: - load_multi_pack_index() should not die when missing the packfile names chunk, instead should pretend that no MIDX exists (maybe warn?) - double-close when reaching the cleanup_fail codepath after xmmap returns - static `verify_midx_error` is ugly - load_bitmap() does not clean up after itself very well (e.g., bitmap_git->trees and others) I'm pretty happy with the state of this topic, and I'll plan on taking on the above in separate series in the future. Taylor Blau (9): midx.c: clean up chunkfile after reading the MIDX midx.c: don't leak MIDX from verify_midx_file t/helper/test-read-midx.c: free MIDX within read_midx_file() builtin/pack-objects.c: don't leak memory via arguments builtin/repack.c: avoid leaking child arguments builtin/multi-pack-index.c: don't leak concatenated options midx.c: write MIDX filenames to strbuf pack-bitmap.c: don't leak type-level bitmaps pack-bitmap.c: more aggressively free in free_bitmap_index() builtin/multi-pack-index.c | 4 +++ builtin/pack-objects.c | 11 ++++--- builtin/repack.c | 23 +++++++++---- midx.c | 66 ++++++++++++++++++++++---------------- midx.h | 4 +-- pack-bitmap.c | 29 ++++++++++++++--- pack-revindex.c | 8 ++--- t/helper/test-read-midx.c | 3 +- 8 files changed, 97 insertions(+), 51 deletions(-) Range-diff against v1: 1: 30f6f23daf = 1: dcc5998072 midx.c: clean up chunkfile after reading the MIDX 2: b0c79904ab ! 2: 258a9e2e57 midx.c: don't leak MIDX from verify_midx_file @@ Metadata ## Commit message ## midx.c: don't leak MIDX from verify_midx_file - The function midx.c:verify_midx_file() allocate a MIDX struct by calling - load_multi_pack_index(). But when cleaning up, it calls free() without - freeing any resources associated with the MIDX. + The function midx.c:verify_midx_file() allocates a MIDX struct by + calling load_multi_pack_index(). But when cleaning up, it calls free() + without freeing any resources associated with the MIDX. Call the more appropriate close_midx() which does free those resources, which causes t5319.3 to pass when Git is compiled with SANITIZE=leak. 3: 5157edb41e = 3: 84859d5b53 t/helper/test-read-midx.c: free MIDX within read_midx_file() 4: dd3b9a949e = 4: aedb1713b4 builtin/pack-objects.c: don't leak memory via arguments 5: a68c77c006 ! 5: bcd12ecab8 builtin/repack.c: avoid leaking child arguments @@ Commit message In none of these cases do we bother to call child_process_clear(), which frees the memory associated with each child's arguments and environment. - In order to do so, tweak each function that spawns a child process to - have a `cleanup` label that we always visit before returning from each - function. Then, make sure that we call child_process_clear() as a part - of that label. + Make sure that we call child_process_clear() in any functions which + initialize a struct child_process before returning along a path which + did not call finish_command(). + + In cmd_repack(), take a slightly different approach to use a cleanup + label to clear the child_process, unless finish_command() was called. + This allows us to free other memory allocated during the lifetime of + that function. But it avoids calling child_process_clear() twice (the + other call coming from inside of finish_command()) to avoid assuming the + function's implementation is idempotent. Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx> ## builtin/repack.c ## @@ builtin/repack.c: static void repack_promisor_objects(const struct pack_objects_args *args, - struct child_process cmd = CHILD_PROCESS_INIT; - FILE *out; - struct strbuf line = STRBUF_INIT; -+ int ret = 0; + for_each_packed_object(write_oid, &cmd, + FOR_EACH_OBJECT_PROMISOR_ONLY); - prepare_pack_objects(&cmd, args); - cmd.in = -1; -@@ builtin/repack.c: static void repack_promisor_objects(const struct pack_objects_args *args, - - if (cmd.in == -1) +- if (cmd.in == -1) ++ if (cmd.in == -1) { /* No packed objects; cmd was never started */ -- return; -+ goto cleanup; ++ child_process_clear(&cmd); + return; ++ } close(cmd.in); -@@ builtin/repack.c: static void repack_promisor_objects(const struct pack_objects_args *args, - free(promisor_name); - } - fclose(out); -- if (finish_command(&cmd)) -+ ret = finish_command(&cmd); -+ -+cleanup: -+ child_process_clear(&cmd); -+ -+ if (ret) - die(_("could not finish pack-objects to repack promisor objects")); - } - -@@ builtin/repack.c: static int write_midx_included_packs(struct string_list *include, - struct string_list_item *item; - struct packed_git *largest = get_largest_active_pack(geometry); - FILE *in; -- int ret; -+ int ret = 0; - - if (!include->nr) -- return 0; -+ goto cleanup; - - cmd.in = -1; - cmd.git_cmd = 1; @@ builtin/repack.c: static int write_midx_included_packs(struct string_list *include, + strvec_pushf(&cmd.args, "--refs-snapshot=%s", refs_snapshot); ret = start_command(&cmd); - if (ret) -- return ret; -+ goto cleanup; +- if (ret) ++ if (ret) { ++ child_process_clear(&cmd); + return ret; ++ } in = xfdopen(cmd.in, "w"); for_each_string_list_item(item, include) - fprintf(in, "%s\n", item->string); - fclose(in); - -- return finish_command(&cmd); -+ ret = finish_command(&cmd); -+ -+cleanup: -+ child_process_clear(&cmd); -+ -+ return ret; - } - - int cmd_repack(int argc, const char **argv, const char *prefix) @@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix) struct pack_geometry *geometry = NULL; struct strbuf line = STRBUF_INIT; @@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix + int i, ext, ret = 0; FILE *out; int show_progress = isatty(2); ++ int cmd_cleared = 0; + /* variables to be filled by option parsing */ + int pack_everything = 0; @@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix) ret = start_command(&cmd); @@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix if (geometry) { FILE *in = xfdopen(cmd.in, "w"); @@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix) + } fclose(out); ret = finish_command(&cmd); ++ cmd_cleared = 1; if (ret) - return ret; + goto cleanup; @@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix string_list_clear(&existing_kept_packs, 0); clear_pack_geometry(geometry); strbuf_release(&line); -+ child_process_clear(&cmd); ++ if (!cmd_cleared) ++ child_process_clear(&cmd); - return 0; + return ret; 6: ffded80c7d = 6: 0d252cd323 builtin/multi-pack-index.c: don't leak concatenated options 7: f3897c3afc < -: ---------- pack-bitmap.c: avoid leaking via midx_bitmap_filename() -: ---------- > 7: 0f293ab638 midx.c: write MIDX filenames to strbuf 8: 29920e7735 = 8: 77a4454632 pack-bitmap.c: don't leak type-level bitmaps 9: e65ac7deb5 ! 9: c1e7e6cc92 pack-bitmap.c: more aggressively free in free_bitmap_index() @@ Commit message The function free_bitmap_index() is somewhat lax in what it frees. There are two notable examples: - - While it does call kh_destroy_oid_map on the "bitmaps" map (which - maps) commit OIDs to their corresponding bitmaps, the bitmaps + - While it does call kh_destroy_oid_map on the "bitmaps" map, which + maps commit OIDs to their corresponding bitmaps, the bitmaps themselves are not freed. Note here that we recycle already-freed ewah_bitmaps into a pool, but these are handled correctly by ewah_pool_free(). 10: cb30aa67c0 < -: ---------- pack-bitmap-write.c: don't return without stop_progress() 11: f1bb8b73ff < -: ---------- t5319: UNLEAK() the remaining leaks -- 2.33.0.96.g73915697e6