On Mon, Nov 11, 2024 at 11:21:44AM -0500, Simon Marchi wrote: > When running a dir-diff command that produces no diff, variables > `wt_modified` and `tmp_modified` are used while uninitialized, causing: > > $ /home/smarchi/src/git/git-difftool --dir-diff master > free(): invalid pointer > [1] 334004 IOT instruction (core dumped) /home/smarchi/src/git/git-difftool --dir-diff master > $ valgrind --track-origins=yes /home/smarchi/src/git/git-difftool --dir-diff master > ... > Invalid free() / delete / delete[] / realloc() > at 0x48478EF: free (vg_replace_malloc.c:989) > by 0x422CAC: hashmap_clear_ (hashmap.c:208) > by 0x283830: run_dir_diff (difftool.c:667) > by 0x284103: cmd_difftool (difftool.c:801) > by 0x238E0F: run_builtin (git.c:484) > by 0x2392B9: handle_builtin (git.c:750) > by 0x2399BC: cmd_main (git.c:921) > by 0x356FEF: main (common-main.c:64) > Address 0x1ffefff180 is on thread 1's stack > in frame #2, created by run_dir_diff (difftool.c:358) > ... > > If taking any `goto finish` path before these variables are initialized, > `hashmap_clear_and_free()` operates on uninitialized data, sometimes > causing a crash. > > Fix it by zero-initializing these variables, making > `hashmap_clear_and_free()` a no-op in that case. The fix makes sense. I wondered if this had been broken for a long time, and if so, how we managed not to notice it. But it looks like it is a recent problem, via 7f795a1715 (builtin/difftool: plug several trivial memory leaks, 2024-09-26). > diff --git a/builtin/difftool.c b/builtin/difftool.c > index ca1b0890659b..b902f5d2ae17 100644 > --- a/builtin/difftool.c > +++ b/builtin/difftool.c > @@ -376,7 +376,8 @@ static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix, > struct checkout lstate, rstate; > int err = 0; > struct child_process cmd = CHILD_PROCESS_INIT; > - struct hashmap wt_modified, tmp_modified; > + struct hashmap wt_modified = {0}; > + struct hashmap tmp_modified = {0}; > int indices_loaded = 0; That commit likewise frees some other local variables, but they are all properly initialized. So touching these two are sufficient. I'm not sure if zero-initialization is being a little too familiar with the hashmap internals, though. The other variables use HASHMAP_INIT(). Should we do the same here, like this: diff --git a/builtin/difftool.c b/builtin/difftool.c index 1a68ab6699..86995390c7 100644 --- a/builtin/difftool.c +++ b/builtin/difftool.c @@ -374,7 +374,8 @@ static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix, struct checkout lstate, rstate; int err = 0; struct child_process cmd = CHILD_PROCESS_INIT; - struct hashmap wt_modified, tmp_modified; + struct hashmap wt_modified = HASHMAP_INIT(path_entry_cmp, NULL); + struct hashmap tmp_modified = HASHMAP_INIT(path_entry_cmp, NULL); int indices_loaded = 0; workdir = get_git_work_tree(); @@ -594,14 +595,7 @@ static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix, * should be copied back to the working tree. * Do not copy back files when symlinks are used and the * external tool did not replace the original link with a file. - * - * These hashes are loaded lazily since they aren't needed - * in the common case of --symlinks and the difftool updating - * files through the symlink. */ - hashmap_init(&wt_modified, path_entry_cmp, NULL, wtindex.cache_nr); - hashmap_init(&tmp_modified, path_entry_cmp, NULL, wtindex.cache_nr); - for (i = 0; i < wtindex.cache_nr; i++) { struct hashmap_entry dummy; const char *name = wtindex.cache[i]->name; That loses the initial table growth that the original had, but I think letting it grow in the usual way is fine here. -Peff