On 11.03.2021 00:01, Junio C Hamano wrote: > Ephrim Khong <dr.khong@xxxxxxxxx> writes: > >> When re-writing the content of a file during a merge, the file >> is first unlinked and then re-created. Since it is a new file >> at the time of the open call, O_EXCL should be passed to clarify >> that the file is expected to be new. > > But the use of O_TRUNC has > spread over time to other parts of the codebase, perhaps cargo > culted. If you look at hits from > > $ git grep -e O_TRUNC -e O_EXCL > > you see the combination of CREAT/WRONLY/TRUNC used all over the > place [*], especially in newer parts of the code. > > So it becomes very curious why this one location needs to be so > special and you are not patching other uses of O_TRUNC. The main difference is that this is the only place where the file mode matters, since we want the executable bit set for some files in the working directory. All other locations create the files with mode 0600 or 0666 and are happy as long as the user has rw rights, which appears to be the case even when running into my nfs-issue. > I do not think we mind fixing the use of O_TRUNC with "remove and > then create with O_EXCL", but we'd probably want to > > * understand why only this place matters, or perhaps other uses of > O_TRUNC needs the same fix to work "correctly" with your NFS > mounts, in which case we'd need all of them addressed in the same > series of patches, and I'd say that is up to you. Personally I'd rather err on the side of caution and leave the other code as is, especially since it is not really required for the file mode issue as described above. But I'd happily patch those places as well if you find that to be a better idea. Maybe refactoring this (fstat - unlink - open) into a common function would make sense then. > * understand why your NFS mount is broken and give a better > explanation as to why we need to have a workaround in our code. I'll work on this, but unfortunately have no idea of how to properly debug it. Since it is a company server without administrative rights and the backend is some IBM storage system, the options are limited and processes are slow. What I did find out so far is that it is not a race condition with unlink. A simple openat() without O_EXCL already produces the wrong file mode. (I fully understand that this is not a bug on git's side, and I found no documentation indicating that O_EXCL would be recommended or have any effect in this way. Hopefully, others that run into similar issues would benefit from this as well, there are a few reports online of people running into "failed to refresh" errors.) As a side-note, the strace on the affected file also shows that git writes that file twice during the merge, with the same content. There might be some potential to further optimize merges to avoid such double-writes. A small example to reproduce, note how "b" is opened twice during the merge: git init echo "foo" > a git add a git commit -m "Initial commit" git mv a b git commit -m "File moved" git checkout -b other_branch HEAD~ touch c && git add c && git commit -m "Some other commit" strace -otrace git merge master -m "merge message" grep '"b"' trace Thanks - Eph