Hi, this is the second version of my patch series that fixes various memory leaks in Git. Changes compared to v1: - t4153 and t7006 aren't marked as passing anymore. I thought they pass because most of these tests were skipped because of a missing TTY prerequisite both on my local machine, but also in our CI. - Add another patch to install the Perl IO:Pty module on Alpine and Ubuntu. This fulfills the TTY prerequisite and thus surfaces the memory leaks in both of the above tests. - Add another unit test for strvec that exercise replacing a string in the strvec with a copy of itself. - A bunch of commit message improvements. Thanks! Patrick Patrick Steinhardt (21): ci: add missing dependency for TTY prereq t: mark a bunch of tests as leak-free transport-helper: fix leaking helper name strbuf: fix leak when `appendwholeline()` fails with EOF checkout: clarify memory ownership in `unique_tracking_name()` http: refactor code to clarify memory ownership config: clarify memory ownership in `git_config_pathname()` diff: refactor code to clarify memory ownership of prefixes convert: refactor code to clarify ownership of check_roundtrip_encoding builtin/log: stop using globals for log config builtin/log: stop using globals for format config config: clarify memory ownership in `git_config_string()` config: plug various memory leaks builtin/credential: clear credential before exit commit-reach: fix memory leak in `ahead_behind()` submodule: fix leaking memory for submodule entries strvec: add functions to replace and remove strings builtin/mv: refactor `add_slash()` to always return allocated strings builtin/mv duplicate string list memory builtin/mv: refactor to use `struct strvec` builtin/mv: fix leaks for submodule gitfile paths Makefile | 1 + alias.c | 6 +- attr.c | 2 +- attr.h | 2 +- builtin/blame.c | 2 +- builtin/checkout.c | 14 +- builtin/commit.c | 4 +- builtin/config.c | 2 +- builtin/credential.c | 2 + builtin/log.c | 708 ++++++++++-------- builtin/merge.c | 4 +- builtin/mv.c | 222 +++--- builtin/rebase.c | 2 +- builtin/receive-pack.c | 6 +- builtin/repack.c | 8 +- builtin/worktree.c | 20 +- checkout.c | 4 +- checkout.h | 6 +- ci/install-dependencies.sh | 4 +- commit-reach.c | 4 + config.c | 52 +- config.h | 10 +- convert.c | 30 +- convert.h | 2 +- delta-islands.c | 2 +- diff.c | 20 +- environment.c | 16 +- environment.h | 14 +- fetch-pack.c | 4 +- fsck.c | 4 +- fsmonitor-settings.c | 5 +- gpg-interface.c | 6 +- http.c | 50 +- imap-send.c | 12 +- mailmap.c | 4 +- mailmap.h | 4 +- merge-ll.c | 6 +- pager.c | 2 +- pretty.c | 14 +- promisor-remote.h | 2 +- remote.c | 20 +- remote.h | 8 +- sequencer.c | 2 +- setup.c | 6 +- strbuf.c | 4 +- strvec.c | 20 + strvec.h | 13 + submodule-config.c | 2 + t/t0300-credentials.sh | 2 + t/t0411-clone-from-partial.sh | 1 + t/t0610-reftable-basics.sh | 1 + t/t0611-reftable-httpd.sh | 1 + t/t1013-read-tree-submodule.sh | 1 + t/t1306-xdg-files.sh | 1 + t/t1350-config-hooks-path.sh | 1 + t/t1400-update-ref.sh | 2 + t/t2013-checkout-submodule.sh | 1 + t/t2024-checkout-dwim.sh | 1 + t/t2060-switch.sh | 1 + t/t2405-worktree-submodule.sh | 1 + t/t3007-ls-files-recurse-submodules.sh | 1 + t/t3203-branch-output.sh | 2 + t/t3415-rebase-autosquash.sh | 1 + t/t3426-rebase-submodule.sh | 1 + t/t3512-cherry-pick-submodule.sh | 1 + t/t3513-revert-submodule.sh | 1 + t/t3600-rm.sh | 1 + t/t3906-stash-submodule.sh | 1 + t/t4001-diff-rename.sh | 4 +- t/t4041-diff-submodule-option.sh | 1 + t/t4043-diff-rename-binary.sh | 1 + t/t4059-diff-submodule-not-initialized.sh | 1 + t/t4060-diff-submodule-option-diff-format.sh | 1 + t/t4120-apply-popt.sh | 1 + t/t4137-apply-submodule.sh | 1 + t/t4210-log-i18n.sh | 2 + t/t5563-simple-http-auth.sh | 1 + t/t5564-http-proxy.sh | 1 + t/t5581-http-curl-verbose.sh | 1 + t/t6006-rev-list-format.sh | 1 + t/t6041-bisect-submodule.sh | 1 + t/t6400-merge-df.sh | 1 + t/t6412-merge-large-rename.sh | 1 + t/t6426-merge-skip-unneeded-updates.sh | 1 + t/t6429-merge-sequence-rename-caching.sh | 1 + t/t6438-submodule-directory-file-conflicts.sh | 1 + t/t7001-mv.sh | 2 + t/t7005-editor.sh | 1 + t/t7102-reset.sh | 1 + t/t7112-reset-submodule.sh | 1 + t/t7417-submodule-path-url.sh | 1 + t/t7421-submodule-summary-add.sh | 1 + t/t7423-submodule-symlinks.sh | 1 + t/t9129-git-svn-i18n-commitencoding.sh | 1 - t/t9139-git-svn-non-utf8-commitencoding.sh | 1 - t/t9200-git-cvsexportcommit.sh | 1 + t/t9401-git-cvsserver-crlf.sh | 1 + t/t9600-cvsimport.sh | 1 + t/t9601-cvsimport-vendor-branch.sh | 1 + t/t9602-cvsimport-branches-tags.sh | 1 + t/t9603-cvsimport-patchsets.sh | 2 + t/t9604-cvsimport-timestamps.sh | 2 + t/unit-tests/t-strvec.c | 269 +++++++ t/unit-tests/test-lib.c | 13 + t/unit-tests/test-lib.h | 13 + transport-helper.c | 6 +- transport.c | 1 + upload-pack.c | 2 +- userdiff.h | 12 +- 109 files changed, 1151 insertions(+), 586 deletions(-) create mode 100644 t/unit-tests/t-strvec.c Range-diff against v1: -: ---------- > 1: 857b8b14ce ci: add missing dependency for TTY prereq 1: 0e9fa9ca73 ! 2: ceade7dbba t: mark a bunch of tests as leak-free @@ Commit message - t2405: Passes since 6741e917de (repository: avoid leaking `fsmonitor` data, 2024-04-12). - - t4153: Passes since 71c7916053 (apply: plug a leak in apply_data, - 2024-04-23). - - - t7006: Passes since at least Git v2.40. I did not care to go back - any further than that. - - t7423: Introduced via b20c10fd9b (t7423: add tests for symlinked submodule directories, 2024-01-28), passes since e8d0608944 (submodule: require the submodule path to contain directories only, @@ t/t2405-worktree-submodule.sh: test_description='Combination of submodules and m base_path=$(pwd -P) - ## t/t4153-am-resume-override-opts.sh ## -@@ - - test_description='git-am command-line options override saved options' - -+TEST_PASSES_SANITIZE_LEAK=true - . ./test-lib.sh - . "$TEST_DIRECTORY"/lib-terminal.sh - - - ## t/t7006-pager.sh ## -@@ - - test_description='Test automatic use of a pager.' - -+TEST_PASSES_SANITIZE_LEAK=true - . ./test-lib.sh - . "$TEST_DIRECTORY"/lib-pager.sh - . "$TEST_DIRECTORY"/lib-terminal.sh - ## t/t7423-submodule-symlinks.sh ## @@ 2: 05fbadbae2 ! 3: a96b5ac359 transport-helper: fix leaking helper name @@ Commit message ownership of the name, nor do we free it. The associated memory thus leaks. - Fix this memory leak and mark now-passing tests as leak free. + Fix this memory leak by freeing the string at the calling side in + `transport_get()`. `transport_helper_init()` now creates its own copy of + the string and thus can free it as required. + + An alterantive way to fix this would be to transfer ownership of the + string passed into `transport_helper_init()`, which would avoid the call + to xstrdup(1). But it does make for a more surprising calling convention + as we do not typically transfer ownership of strings like this. + + Mark now-passing tests as leak free. Signed-off-by: Patrick Steinhardt <ps@xxxxxx> 3: d76079797f = 4: 9dd8709d1b strbuf: fix leak when `appendwholeline()` fails with EOF 4: ffd9eb795f = 5: 6d4e9ce706 checkout: clarify memory ownership in `unique_tracking_name()` 5: d4bf3c1f95 = 6: 141cae2de1 http: refactor code to clarify memory ownership 6: 1702762772 = 7: ff5e761e55 config: clarify memory ownership in `git_config_pathname()` 7: 18dce492df ! 8: afe69c7303 diff: refactor code to clarify memory ownership of prefixes @@ Metadata ## Commit message ## diff: refactor code to clarify memory ownership of prefixes - The source end destination prefixes are tracked in a `const char *` + The source and destination prefixes are tracked in a `const char *` array, but may at times contain allocated strings. The result is that those strings may be leaking because we never free them. 8: 667eb3f8ff = 9: eb7fce55b0 convert: refactor code to clarify ownership of check_roundtrip_encoding 9: 11eed8cea7 = 10: ee2fcf388c builtin/log: stop using globals for log config 10: d8cd9a05f8 = 11: 3490ad3a02 builtin/log: stop using globals for format config 11: a857637e61 = 12: 6cfc28c7e2 config: clarify memory ownership in `git_config_string()` 12: b2f8878b55 = 13: 70e8e26513 config: plug various memory leaks 13: 23e2cf98b7 = 14: f1a1c43e76 builtin/credential: clear credential before exit 14: a11ce6a0ed = 15: 64b92156f8 commit-reach: fix memory leak in `ahead_behind()` 15: 24362604b2 = 16: cd8a992f08 submodule: fix leaking memory for submodule entries 16: c43c93db3b ! 17: 128e2eaf7a strvec: add functions to replace and remove strings @@ t/unit-tests/t-strvec.c (new) + strvec_clear(&vec); +} + ++static void t_replace_with_substring(void) ++{ ++ struct strvec vec = STRVEC_INIT; ++ strvec_pushl(&vec, "foo", NULL); ++ strvec_replace(&vec, 0, vec.v[0] + 1); ++ check_strvec(&vec, "oo", NULL); ++ strvec_clear(&vec); ++} ++ +static void t_remove_at_head(void) +{ + struct strvec vec = STRVEC_INIT; @@ t/unit-tests/t-strvec.c (new) + TEST(t_replace_at_head(), "replace at head"); + TEST(t_replace_in_between(), "replace in between"); + TEST(t_replace_at_tail(), "replace at tail"); ++ TEST(t_replace_with_substring(), "replace with substring"); + TEST(t_remove_at_head(), "remove at head"); + TEST(t_remove_in_between(), "remove in between"); + TEST(t_remove_at_tail(), "remove at tail"); 17: 97470398ad = 18: 1310b24fc2 builtin/mv: refactor `add_slash()` to always return allocated strings 18: 7a2e5e82cc = 19: d4fef9825a builtin/mv duplicate string list memory 19: b546ca4d62 = 20: 797cdb286a builtin/mv: refactor to use `struct strvec` 20: bba735388d = 21: 095469193c builtin/mv: fix leaks for submodule gitfile paths -- 2.45.1.216.g4365c6fcf9.dirty
Attachment:
signature.asc
Description: PGP signature