On 26.10.2020 16:50, Elijah Newren wrote:
> On Mon, Oct 26, 2020 at 8:11 AM Ephrim Khong <dr.khong@xxxxxxxxx> wrote:
>>
>> I am trying to find the root cause for what I believe might be a strange
>> bug in git merge. I have a feature branch A which branched off master
>> not too long ago, and want to bring it up to date with master:
>>
>>     git checkout A
>>     git merge master
>>
>> which yields
>>
>>     error: add_cacheinfo failed to refresh for path 'c/d/e.sh'; merge
>>     aborting.
>
> "add_cacheinfo failed to refresh"? Wow, that's a new one. Some years
> back we had a "add_cacheinfo failed for path" corresponding to the
> other error site within that function, but we fixed that one up long
> ago. I've never seen anything hit the refresh failure.

Thank you for looking into this and sorry for the delay. I ran into this
again and did some more testing.

The merge works if I copy the complete repository to a different
filesystem (in this case a local SSD) with cp -a. It was originally on a
network share.

An strace on the affected file seems to show that git creates and writes
the file, attempting to set the executable bit.
But the subsequent lstat reports that no executable bit is set (which
matches the file's actual mode when I look at it after the merge errors
out):

> lstat("tools/ci/nightly/run_benchmarks.sh", 0x7ffc2f148c20) = -1 ENOENT (No such file or directory)
> openat(AT_FDCWD, "tools/ci/nightly/run_benchmarks.sh", O_WRONLY|O_CREAT|O_EXCL, 0777) = 4
> write(4, "#!/bin/bash\n#\n# Build and run hb"..., 1973) = 1973
> fstat(4, {st_mode=S_IFREG|0755, st_size=1973, ...}) = 0
> close(4) = 0
>
> lstat("tools/ci/nightly/run_benchmarks.sh", {st_mode=S_IFREG|0755, st_size=1973, ...}) = 0 #<-- first lstat, mode is OK
> unlink("tools/ci/nightly/run_benchmarks.sh") = 0
> openat(AT_FDCWD, "tools/ci/nightly/run_benchmarks.sh", O_WRONLY|O_CREAT|O_TRUNC, 0777) = 4
> write(4, "#!/bin/bash\n#\n# Build and run hb"..., 1973) = 1973
> close(4) = 0
> lstat("tools/ci/nightly/run_benchmarks.sh", {st_mode=S_IFREG|0640, st_size=1973, ...}) = 0 #<-- second lstat, mode is wrong
>
> write(2, "error: add_cacheinfo failed to r"..., 120error: add_cacheinfo failed to refresh for path 'tools/ci/nightly/run_benchmarks.sh' (2); merge aborting.

Note how the last lstat() reports 0640, even though openat() requested
0777 and the first openat() with the same mode ended up as 0755.
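
The create / unlink / re-create sequence from the strace can be replayed
outside git with a short script, which might help tell whether the
filesystem alone is enough to trigger the mode change. This is only a
sketch under assumptions: replay_sequence() and the probe.sh file name
are made up for illustration, it is Python rather than git's C, and
Python's os.open() adds O_CLOEXEC, so it is not an exact copy of what
git does.

```python
import os
import stat
import sys
import tempfile


def replay_sequence(directory):
    """Replay the create / unlink / re-create steps seen in the strace.

    Returns the permission bits reported by the first and second lstat().
    Both names (replay_sequence, probe.sh) are hypothetical.
    """
    data = b"#!/bin/bash\n"
    # Work in a scratch subdirectory so nothing existing is touched.
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        path = os.path.join(scratch, "probe.sh")

        # First write: exclusive create, mode 0777 filtered by the umask.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o777)
        os.write(fd, data)
        os.close(fd)
        first = stat.S_IMODE(os.lstat(path).st_mode)

        # Second write: unlink, then create again with O_TRUNC.
        os.unlink(path)
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o777)
        os.write(fd, data)
        os.close(fd)
        second = stat.S_IMODE(os.lstat(path).st_mode)

    return first, second


if __name__ == "__main__":
    # Pass a directory on the filesystem to test; defaults to the temp dir.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    first, second = replay_sequence(target)
    print(f"first lstat: {first:04o}, second lstat: {second:04o}")
    if first != second:
        print("mode changed between the two creates")
```

Running it once with a directory on the network share and once with one
on the local disk would show whether the two lstat() results differ only
on the share.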
The same thing when merging on the SSD shows

> lstat("tools/ci/nightly/run_benchmarks.sh", 0x7ffd491745c0) = -1 ENOENT (No such file or directory)
> openat(AT_FDCWD, "tools/ci/nightly/run_benchmarks.sh", O_WRONLY|O_CREAT|O_EXCL, 0777) = 4
> write(4, "#!/bin/bash\n#\n# Build and run hb"..., 1973) = 1973
> fstat(4, {st_mode=S_IFREG|0755, st_size=1973, ...}) = 0
> close(4) = 0
>
> lstat("tools/ci/nightly/run_benchmarks.sh", {st_mode=S_IFREG|0755, st_size=1973, ...}) = 0
> unlink("tools/ci/nightly/run_benchmarks.sh") = 0
> openat(AT_FDCWD, "tools/ci/nightly/run_benchmarks.sh", O_WRONLY|O_CREAT|O_TRUNC, 0777) = 4
> write(4, "#!/bin/bash\n#\n# Build and run hb"..., 1973) = 1973
> close(4) = 0
> lstat("tools/ci/nightly/run_benchmarks.sh", {st_mode=S_IFREG|0755, st_size=1973, ...}) = 0
>
> openat(AT_FDCWD, "tools/ci/nightly/run_benchmarks.sh", O_RDONLY|O_CLOEXEC) = 4
> read(4, "#!/bin/bash\n#\n# Build and run hb"..., 1973) = 1973
> close(4) = 0

Here, lstat() reports 0755 and the merge continues.

This could very well be an issue with the storage system backing my
network share, but maybe I overlooked something in that strace that git
is doing wrong or could do better. It does, for example, write the file
twice with identical content for some reason.

I have patched my git for now: in read-cache.c, refresh_cache_ent() now
calls chmod() and re-runs lstat() if the mode is incorrect. That works
around the problem. It's not perfect, since chmod() ignores the umask,
but at least it allows the merge to go through.

> Any chance this repository is available for others to access to try to
> reproduce the problem?

Unfortunately it's not public. I have tried to cut out the offending
part, but have not yet been able to reproduce it with a smaller test
case.

Thanks
- Eph