Hi,

I recently upgraded git to version 2.27.0-1~ppa0~ubuntu18.04.1 and noticed that git-fast-import now uses so much memory that it gets killed. I'm fetching from a Mercurial repo using the importer from https://github.com/mnauw/git-remote-hg.git, which uses git-fast-import to fetch commits from Mercurial.

Here is the output of a git fetch, showing that it used 14 GB of RAM (on a 16 GB machine):

# time git fetch
error: git-fast-import died of signal 9
fatal: error while running fast-import
Command exited with non-zero status 128
2.02user 3.82system 0:08.00elapsed 73%CPU (0avgtext+0avgdata 14744800maxresident)k
104920inputs+0outputs (414major+3688606minor)pagefaults 0swaps

strace shows that git-fast-import reads marks from a file, then allocates some memory, reads more marks, allocates more memory, and so on:

11191 06:19:08.180572 read(7<.../.git/hg/origin/marks-git>, "79798 8ea080f15ab22807608aae4696dd23edefd8febe\n:220396 919079de10d43caf3fcde56bb1a17994b47a6214\n:75683 928813193a1535dc1274ed9da2f54f5de2caf2f4\n:155297 9108211d7ba318076fb53b2bd3d291102b376dbf\n:162042 9458fe329e9be30ad2b61e75197595889d80144b\n:305834 93485ce7991b4330a1114136b5d8e08d8bd1505b\n:223654 9750bdef7d22a885d2522bdd9e0a0e882979098e\n...", 4096) = 4096 <0.000027>
11191 06:19:08.182162 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be38ef000 <0.000024>
11191 06:19:08.183403 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be36ee000 <0.000127>
11191 06:19:08.184775 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be34ed000 <0.000059>
11191 06:19:08.186036 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be32ec000 <0.000121>
11191 06:19:08.187412 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be30eb000 <0.000110>
11191 06:19:08.188743 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be2eea000 <0.000022>
11191 06:19:08.189929 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be2ce9000 <0.000039>
11191 06:19:08.191150 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be2ae8000 <0.000019>
11191 06:19:08.192329 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be28e7000 <0.000023>
11191 06:19:08.193536 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be26e6000 <0.000038>
11191 06:19:08.194523 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be24e5000 <0.000019>
11191 06:19:08.195474 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be22e4000 <0.000212>
11191 06:19:08.196677 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be20e3000 <0.000027>
11191 06:19:08.197729 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be1ee2000 <0.000128>
11191 06:19:08.198883 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be1ce1000 <0.000043>
11191 06:19:08.199881 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be1ae0000 <0.000124>
11191 06:19:08.200959 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be18df000 <0.000020>
11191 06:19:08.201943 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be16de000 <0.000021>

The following shows that memory allocation seems to be linear with respect to the number of marks, but with a very high constant factor:

# cut -d' ' -f 3 /tmp/gitfetch.strace | cut -d '(' -f 1 | uniq -c
[ ... cut (this is not the start of the allocations) ... ]
      1 read
     47 mmap
      1 read
     79 mmap
      1 read
     36 mmap
[ ... removed some other syscalls ... ]
     73 mmap
      1 read
    141 mmap
      1 read
    173 mmap
      1 read
    204 mmap
      1 read
    235 mmap
      1 read
    267 mmap
      1 read
    297 mmap
      1 read
    329 mmap
      1 read
    361 mmap
      1 read
    392 mmap
      1 read
    424 mmap
      1 read
    454 mmap
      1 read
    493 mmap

My marks file contains 91k entries; git fetch reads only 1400 of them before it is killed.

I bisected the problem. Below is my bisect log:

git bisect start
# good: [af6b65d45ef179ed52087e80cb089f6b2349f4ec] Git 2.26.2
git bisect good af6b65d45ef179ed52087e80cb089f6b2349f4ec
# bad: [b3d7a52fac39193503a0b6728771d1bf6a161464] Git 2.27
git bisect bad b3d7a52fac39193503a0b6728771d1bf6a161464
# bad: [af986863c1ae2e306d5627f4e42cc6d2cf2a057f] Merge branch 'dd/ci-musl-libc'
git bisect bad af986863c1ae2e306d5627f4e42cc6d2cf2a057f
# bad: [7a8bb6db7cc04add05484c4fc907e34f76b12fb9] Merge branch 'jm/gitweb-fastcgi-utf8'
git bisect bad 7a8bb6db7cc04add05484c4fc907e34f76b12fb9
# bad: [4e4baee3f44da26a5eaab27c76d597b04fef5259] Merge branch 'bc/filter-process'
git bisect bad 4e4baee3f44da26a5eaab27c76d597b04fef5259
# good: [883e23820ed21b4ae65463f2a87152285bf77937] Merge branch 'en/oidset-uninclude-hashmap'
git bisect good 883e23820ed21b4ae65463f2a87152285bf77937
# bad: [1bdca816412910e1206c15ef47f2a8a6b369b831] fast-import: add options for rewriting submodules
git bisect bad 1bdca816412910e1206c15ef47f2a8a6b369b831
# good: [bf154a878281b6a971ece0fb6d917938298be60d] t/helper: make repository tests hash independent
git bisect good bf154a878281b6a971ece0fb6d917938298be60d
# good: [e02a7141f83326f7098800fed764061ecf1f0eff] worktree: allow repository version 1
git bisect good e02a7141f83326f7098800fed764061ecf1f0eff
# bad: [abe0cc536414f2b9cfa37f208b36df5126e6356a] fast-import: add helper function for inserting mark object entries
git bisect bad abe0cc536414f2b9cfa37f208b36df5126e6356a
# bad: [ddddf8d7e254f4af6297d0ed62ea6a5d7eabdb64] fast-import: permit reading multiple marks files
git bisect bad ddddf8d7e254f4af6297d0ed62ea6a5d7eabdb64
# good: [42d4e1d1128fa1cb56032ac58f65ea3dd1296a9a] commit: use expected signature header for SHA-256
git bisect good 42d4e1d1128fa1cb56032ac58f65ea3dd1296a9a
# first bad commit: [ddddf8d7e254f4af6297d0ed62ea6a5d7eabdb64] fast-import: permit reading multiple marks files

According to the bisect, the first bad commit is:

commit ddddf8d7e254f4af6297d0ed62ea6a5d7eabdb64 (refs/bisect/bad)
Author: brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx>
Date:   Sat Feb 22 20:17:45 2020 +0000

    fast-import: permit reading multiple marks files

    In the future, we'll want to read marks files for submodules as well.
    Refactor the existing code to make it possible to read multiple marks
    files, each into their own marks set.

    Signed-off-by: brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx>
    Signed-off-by: Junio C Hamano <gitster@xxxxxxxxx>

While bisecting, it was easier for me to keep using git from the Ubuntu package and replace only the git-fast-import binary with the one I was testing. I hope this doesn't invalidate the bisect results. The behavior was consistent: each build either produced the issue above or worked perfectly fine.

Can you help me fix this issue? I hope the information I gathered is enough to find the cause of this behavior. I'd be happy to provide more information or test patches if needed. Unfortunately, the source code I was fetching is proprietary, so I cannot post it.

Best Regards,
Tibor Billes
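
P.S. In case the cut/uniq pipeline above is awkward to rerun, here is a minimal Python sketch that produces the same kind of summary: the length of each run of consecutive identical syscalls. It assumes strace lines in the layout shown in this mail (pid, timestamp, then the syscall name followed by an opening parenthesis). The function name syscall_runs and the abbreviated sample lines are only illustrative and are not part of git or strace.

```python
import re
from itertools import groupby

# Assumed line layout, as in the trace above: "<pid> <timestamp> <syscall>(...".
SYSCALL_RE = re.compile(r"^\d+\s+\S+\s+(\w+)\(")


def syscall_runs(lines):
    """Return (count, syscall) for each run of consecutive identical syscalls.

    This mirrors what `uniq -c` reports for the syscall-name column.
    Lines that do not match the assumed layout are skipped.
    """
    names = []
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:
            names.append(match.group(1))
    return [(sum(1 for _ in group), name) for name, group in groupby(names)]


# Hypothetical sample, abbreviated from the trace in this mail.
SAMPLE = [
    '11191 06:19:08.180572 read(7, "...", 4096) = 4096 <0.000027>',
    "11191 06:19:08.182162 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, "
    "MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be38ef000 <0.000024>",
    "11191 06:19:08.183403 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, "
    "MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5be36ee000 <0.000127>",
]

if __name__ == "__main__":
    for count, name in syscall_runs(SAMPLE):
        print(f"{count:7d} {name}")
```

To run it on a real trace, pass the open file instead of SAMPLE, for example syscall_runs(open("/tmp/gitfetch.strace")).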