Umask ignored when mounting NFSv4.2 share of an exported ZFS (with acltype=off) (was: Re: Bug#962254: NFS(v4) broken at 4.19.118-2)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Elliott,

[I'm adding linux-nfs upstream hopefully J. Bruce Fields or others can
help clarifying]

On Thu, Jun 11, 2020 at 03:37:11PM -0700, Elliott Mitchell wrote:
> Bit more experimentation on this issue.
> 
> I tried a very small C program meant to create files with fewer
> permissions bits set.  This succeeded which strengthens the theory of
> the umask getting ignored.
> 
> I haven't seen anything hinting whether this is more a client or server
> issue.
> 
> I can speculate perhaps somewhere between 4.9 and 4.15 the NFS client
> code stepped closer to proper the "proper" 4.2 protocol.  If a
> corresponding NFS server was slow at getting merged, what we're seeing
> could happen.
> 
> Alternatively someone was trying to get a Linux NFS v4.2 client to work
> better with a different NFS v4.2 server, so they fixed Linux's NFS v4.2
> client.  Yet they failed to test with Linux's v4.2 server.
> 
> 
> This though is speculation.  All I can say is sometime between kernels
> 4.9 and 4.15, NFS v4.2 got broken.  There are hints this is related to
> handling of umask.
 
I was initially confused because of the mentioning of only appearing
with the update to 4.19.118-2 but this is now cleared up, so it shows
up when changing from 4.9.x from stretch to 4.19.x.

Now I'm quite unsure if this should and is to be considered a Linux
kernel issue. What follows is just what I found with respect of the
mentioned behaviour. There is a specific aspect of the NFSv4.2
implementation:

In upstream, with [nfsv4.2-umask-support], [47057abde515] NFSv4.2
support was added. The repsective RFC describing it is [RFC8275].

[nfsv4.2-umask-support]: <https://lore.kernel.org/linux-nfs/1477686228-12158-1-git-send-email-bfields@xxxxxxxxxx/>
[47057abde515]: <https://git.kernel.org/linus/47057abde515155a4fee53038e7772d6b387e0aa>
[RFC8275]: <https://tools.ietf.org/html/rfc8275>

Since, they allow the umask to be ignored in the presence of
inheritable NFSv4 ACLs.

Now what is or will be confusing is that the behaviour is reproducible
with ZFS default of acltype=off (aclinherit=restricted, sharenfs=off).

Reproducing the issue is easy as follows (all done on Debian unstable
to verify the behaviours can be triggered there as well with more
current 5.6.14-2, zfs-linux on 0.8.4-1):

# zpool create zfs_test /dev/vdb

and exporting /zfs_test in /etc/exports as

/zfs_test 192.168.122.1/24(rw,sync,no_subtree_check,no_root_squash)

The properties of zfs_test would be:

# zfs get acltype,aclinherit,sharenfs zfs_test
NAME      PROPERTY    VALUE          SOURCE
zfs_test  acltype     off            local
zfs_test  aclinherit  restricted     local
zfs_test  sharenfs    off            default

And reproducing then with

# mount -t nfs 192.168.122.150:/zfs_test /mnt
# mkdir /mnt/foo && ls -ld /mnt/foo && rmdir /mnt/foo
drwxrwxrwx 2 root root 2 Jun 13 14:25 /mnt/fo
# umount /mnt

The comment from J. Bruce Fields, in
https://bugzilla.redhat.com/show_bug.cgi?id=1667761#c1 can help debug
it further:

> To start debugging this, I'd recommend looking running wireshark to
> sniff traffic while running your reproducer (mount, mkdir) and
> compare to what's expected from the umask RFC.  Somewhere there
> should be a getattr from the client for the supported_attrs
> attribute, and the reply from the server will probably indicate
> support for the new mode_umask attribute.  If you find the CREATE
> operation that creates the new directory, you should see the client
> set the mode_umask attribute, with the mode part set to the open
> mode and the umask to the process umask.  If those values look
> right, then the problem is likely on the server side.

In fact in sniffing the traffic, there, the gettattr from the client and the
server does indicate support for the new mode_umask. Then later in the CREATE
operation, the client sets the mode_umask attribute, with mode part set to
'0777' and umask to '022'. The mode replied is then as well '0777'.

If further needed to debug we should try to distill a sniff with
wireshark providing the repsective pcap.

https://bugzilla.redhat.com/show_bug.cgi?id=1667761

did not further contain specific information on followups.

https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/1779736

indicated this was specifically observed on ZFS on Linux only. Seth
Arnold's answer seem to be inline with that that the issue is more on
the ZFS on Linux side and the issue keeps biting people a bit
unexpectedly. Why does this break with ACL off settings?

But there was at least one other (but again without further
detail/followups) that it was observed on an export from OpenWRT, but
no specific details here:

https://bugs.openwrt.org/index.php?do=details&task_id=2581

Both Debian bugs itself were as well with underlying ZFS filesystem exported:
https://bugs.debian.org/934160
https://bugs.debian.org/962254

Any hint on were to pin-point the issue? Both on Linux anf ZFS on
Linux side or only on one of the components?

Regards,
Salvatore



[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux