Re: chmod failure on GVFS mounted CIFS share

"Konrad Bucheli (PSI)" <konrad.bucheli@xxxxxx> · Fri, 10 Jan 2025 13:05:03 +0100

Hi

I investigated some more git init failures on our NetApp backed CIFS 
storage, this time not with FUSE but a real kernel mount. I found two 
different situations where a seemingly working "no-op chmod" makes git 
init fail.

a) Everyone Full Control permissions on the backend NTFS

$ touch test.txt
$ ll
total 1
-------rwx 1 buchel_k root 0 10. Jan 10:28 test.txt
$ chmod 007 test.txt
$ ll
total 1
-------rwx 1 buchel_k root 0 10. Jan 10:28 test.txt
$ echo foo > test.txt
-bash: test.txt: Permission denied
$ rm test.txt
rm: cannot remove 'test.txt': Permission denied
$ chmod 600 test.txt
$ rm test.txt
$

Investigation in the backend filesystem showed that this created a "User 
Deny" rule which is stronger that the "Everyone Full Control" rule.

b) only User Write permissions on backend NTFS

This second issue is even more wired:

$ touch test.txt
$ ll
total 1
-rwx------ 1 buchel_k root 0 10. Jan 11:00 test.txt
$ # open the file and keep it open
$ exec {fd}>test.txt
$ chmod 700 test.txt
$ # this chmod took multiple seconds
$ ll
total 0
-rwx------ 1 buchel_k root 0 10. Jan 11:01 test.txt
$ # write to file using the already open filehandle
$ echo test >&$fd
-bash: echo: write error: Permission denied
$ # close filehandle
$ exec {fd}>&-
$ # repeat without chmod
$ exec {fd}>test.txt
$ echo test >&$fd
$ exec {fd}>&-
$ cat test.txt
test
$

So if the chmod to a file happens after is has been already opened, then 
subsequent write fail. Looks like a bug to me. After reopening, all is fine.

For the records:
mount options as listed by mount:
(rw,relatime,vers=3.1.1,sec=krb5,cruid=0,cache=strict,multiuser,uid=0,noforceuid,gid=0,noforcegid,addr=<REDACTED>,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mapposix,cifsacl,noperm,reparse=nfs,rsize=1048576,wsize=1048576,bsize=1048576,retrans=1,echo_interval=60,actimeo=1,closetimeo=1)
Kernel: 5.14.0-503.19.1.el9_5.x86_64 (RHEL 9.5)
Backend: NetApp FAS8700 with Release 9.13.1P6

My conclusion is that a no-op chmod is fine on a local, proper UNIX 
filesystem, but in other setups it can have unexpected side effects or 
even hit bugs. So stat before chmod will make it work in more 
circumstances and cases. And as git is nowadays a ubiquitous tool, my 
users will use git wherever they believe it would be useful. I would 
like to give them a seamless experience as possible.

My experience might be limited, but I have not seen in my career any 
attempt or use case where permission bits inside .git where adjusted 
after. And only then there would be minimal price to pay with the extra 
stat, all other save a chmod. And this price is way smaller than having 
git init/clone fail completely.

What do you think?

Kind regards

Konrad

On 20.12.24 16:50, Konrad Bucheli (PSI) wrote:

On 20.12.24 16:44, Junio C Hamano wrote:
"Konrad Bucheli (PSI)" <konrad.bucheli@xxxxxx> writes:

I have another idea: there is no need for a chmod if both the config
file and the lock file already have he same mode. Which is the case if
the filesystem has no proper chmod support.

And for majority of people who have working chmod(), would it mean
one extra and unnecessary system call?

I do not have stats, but I guess the chmod call would be needed very 
rarely as most of the time both files have default permissions. I do not 
know which call is more expensive.

Instead, how about doing the chmod() first, like we have always done,
but after seeing it fail, check with lstat() to see if the modes are
already in the desired state and refrain from complaining if that is
the case?  That way, we'll incur extra overhead only in the error
code path, which is the usual pattern we would prefer to do things.

So instead of removing this part, ...

-        if (chmod(get_lock_file_path(&lock), st.st_mode & 07777) < 0) {
-            error_errno(_("chmod on %s failed"), 
get_lock_file_path(&lock));

... you'd do an extra lstat() on the lock file (so you can move the
st_lock inside this block, narrowing its scope) before calling
error_errno(), and only after finding out that st_lock cannot
somehow be obtained or the resulting mode is different, you call
error_errno() and arrange an error to be returned, all inside the
if(){} block of the original.

Wouldn't it work even better, I wonder?

If you think that is the way to go, I will adapt the patch.

Thanks.

+        if (stat(get_lock_file_path(&lock), &st_lock) == -1) {
+            error_errno(_("stat on %s failed"), 
get_lock_file_path(&lock));
              ret = CONFIG_NO_WRITE;
              goto out_free;
          }
+        if ((st.st_mode & 07777) != (st_lock.st_mode & 07777)) {
+            if (chmod(get_lock_file_path(&lock), st.st_mode & 07777) 
< 0) {
+                error_errno(_("chmod on %s failed"), 
get_lock_file_path(&lock));
+                ret = CONFIG_NO_WRITE;
+                goto out_free;
+            }
+        }
+
          if (store.seen_nr == 0) {
              if (!store.seen_alloc) {
                  /* Did not see key nor section */