Hi all,
We have been using glusterfs for some time to share a volume that holds
the user home directories for our Unix labs. The glusterfs volume is a
distributed one, made of 4 bricks on ZFS datasets (on Ubuntu).
Our architecture has 3 layers: the glusterfs servers, an intermediate
server that re-exports the volume via NFS, and the final NFS clients in
the labs [1]. Thus, the final client reaches the glusterfs volume through
an intermediate nfs-kernel-server, which re-exports (over NFSv3 or NFSv4)
the glusterfs volume that it has previously mounted with FUSE.
We have tested different versions of gluster (3.7.4 and 3.8.4), and we
have always seen a strange problem when a file with read-only
permissions is opened and written. The main consequence is that our
students cannot use some tools in their home directories, such as
git [2], that rely on being able to write to a file they own even
though it has no write permission.
In our latest development environment, using NFSv4, we get "permission
denied" errors, but only when the operation goes through NFS, not when
we run it directly on the FUSE-mounted glusterfs volume on the
intermediate server.
The issue is easy to reproduce with this command line on the client:
u056@l056:~$ rm -f kk.txt 444.txt; echo "prueba" > 444.txt; chmod 444
444.txt; cp -p 444.txt kk.txt; ls -ld 444.txt kk.txt;
cp: cannot create regular file ‘kk.txt’: Permission denied
ls: cannot access kk.txt: No such file or directory
-r--r--r-- 1 u056 admincdc 7 oct 20 2016 444.txt
u056@l056:~$ rm -f kk.txt 444.txt; echo "prueba" > 444.txt; chmod 444
444.txt; cp -p 444.txt kk.txt; ls -ld 444.txt kk.txt;
cp: cannot create regular file ‘kk.txt’: Permission denied
-r--r--r-- 1 u056 admincdc 7 oct 20 2016 444.txt
---------- 1 u056 admincdc 0 sep 18 1970 kk.txt
u056@l056:~$ rm -f kk.txt 444.txt; echo "prueba" > 444.txt; chmod 444
444.txt; cp -p 444.txt kk.txt; ls -ld 444.txt kk.txt;
cp: cannot create regular file ‘kk.txt’: Permission denied
ls: cannot access kk.txt: No such file or directory
-r--r--r-- 1 u056 admincdc 7 oct 20 2016 444.txt
u056@l056:~$
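The failing step in the transcript above appears to be the creation of a
new file that is read-only from the start and is written through the
descriptor returned by that same open. A minimal sketch that isolates
this step, assuming a POSIX shell and GNU coreutils; the function name
repro_readonly_create is hypothetical, and the sketch has not been run
on the affected setup:

```shell
# Sketch: create a file whose mode is read-only from the very first open
# and write to it through that same descriptor.
# repro_readonly_create is a hypothetical helper name.
repro_readonly_create() (
    dir=${1:-.}
    target="$dir/kk.txt"
    rm -f "$target"
    # With umask 0333 the shell creates the file with mode 0444
    # (0666 & ~0333), so the redirection below both creates a
    # read-only file and writes into it in a single open.
    umask 0333
    if echo "prueba" > "$target"; then
        echo "OK: wrote to a newly created read-only file in $dir"
    else
        echo "FAIL: could not create or write $target" >&2
        exit 1
    fi
)
```

Running it once in a directory on the NFS mount (for example
repro_readonly_create /home/u056) and once directly on the FUSE mount
should show whether only the NFS path fails. On a local filesystem it is
expected to succeed, since permissions are checked when the file is
opened, not on each write.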
Package versions and mount options on the client are:
u056@l056:~$ dpkg -l | grep -e nfs
ii  libnfsidmap2:i386  0.25-5              i386  NFS idmapping library
ii  nfs-common         1:1.2.8-6ubuntu1.2  i386  NFS support files common to client and server
u056@l056:~$ mount | grep cuentas09
cuentas09:/home-3/u056 on /home/u056 type nfs
(rw,noatime,intr,fsc,nolock,vers=4,rsize=262140,wsize=262140,addr=138.4.30.80,clientaddr=138.4.31.56)
u056@l056:~$
On the intermediate server, two logs show the "permission denied"
errors. The first is /var/log/glusterfs/home-3.log on the intermediate
server (note that this log uses UTC time):
[2016-10-20 16:05:42.822034] E [MSGID: 114031]
[client-rpc-fops.c:444:client3_3_open_cbk] 36-home-lab-3-client-0:
remote operation failed. Path:
<gfid:dbcb6e65-7b34-473f-8175-735f24f55003>/kk.txt
(e9519bb2-9813-48fd-966b-c012ba333413) [Permission denied]
[2016-10-20 16:05:42.822112] W [fuse-bridge.c:989:fuse_fd_cbk]
0-glusterfs-fuse: 9824: OPEN()
<gfid:dbcb6e65-7b34-473f-8175-735f24f55003>/kk.txt => -1 (Permission denied)
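The log identifies the parent directory and the file only by GFID. To
inspect them on a brick, a GFID can be mapped to its entry under the
brick's .glusterfs directory, which (as far as I know) is laid out as
.glusterfs/<first two hex digits>/<next two>/<gfid>. A small sketch; the
helper name gfid_backend_path and the brick root are placeholders, not
values from our setup:

```shell
# Sketch: map a GFID from the client log to its backend entry on a brick.
# gfid_backend_path is a hypothetical helper; the default brick root
# below is a placeholder and must be replaced with a real brick path.
gfid_backend_path() {
    gfid=$1
    brick=${2:-/path/to/brick}
    # The first two and the next two hex digits name the two directory levels.
    first=$(printf '%s' "$gfid" | cut -c1-2)
    second=$(printf '%s' "$gfid" | cut -c3-4)
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "$first" "$second" "$gfid"
}
```

For example, running ls -l on the path printed for
e9519bb2-9813-48fd-966b-c012ba333413 on each brick server would show the
mode and owner that the brick itself has stored for kk.txt.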
The second is /var/log/kern.log on the intermediate server (CEST time):
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169421] ------------[ cut
here ]------------
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169471] WARNING: CPU: 0
PID: 12218 at /build/linux-BwgxJb/linux-4.4.0/fs/nfsd/nfs4proc.c:464
nfsd4_open+0x515/0x780 [nfsd]()
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169475] nfsd4_process_open2
failed to open newly-created file! status=13
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169477] Modules linked in:
rpcsec_gss_krb5 nfsv4 nfs fscache vmw_vsock_vmci_transport vsock ppdev
vmw_balloon coretemp joydev input_leds serio_raw irda 8250_fintek
parport_pc shpchp parport i2c_piix4 crc_ccitt vmw_vmci mac_hid ib_iser
nfsd rdma_cm iw_cm ib_cm auth_rpcgss nfs_acl lockd grace ib_sa sunrpc
ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi
scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov
async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1
raid0 multipath linear vmwgfx ttm psmouse drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops mptspi mptscsih mptbase drm e1000
scsi_transport_spi pata_acpi floppy fjes
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169640] CPU: 0 PID: 12218
Comm: nfsd Tainted: G W 4.4.0-43-generic #63-Ubuntu
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169643] Hardware name:
VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform,
BIOS 6.00 09/22/2009
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169646] 0000000000000286
00000000d893ddf8 ffff8800b9ebbc80 ffffffff813f1f93
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169650] ffff8800b9ebbcc8
ffffffffc047b668 ffff8800b9ebbcb8 ffffffff81081212
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169653] ffff8800b8424240
ffff8800b8425068 000000000d000000 ffff8800b8290000
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169657] Call Trace:
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169691]
[<ffffffff813f1f93>] dump_stack+0x63/0x90
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169708]
[<ffffffff81081212>] warn_slowpath_common+0x82/0xc0
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169713]
[<ffffffff810812ac>] warn_slowpath_fmt+0x5c/0x80
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169730]
[<ffffffffc04665db>] ? nfs4_free_ol_stateid+0x3b/0x40 [nfsd]
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169743]
[<ffffffffc0459d05>] nfsd4_open+0x515/0x780 [nfsd]
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169756]
[<ffffffffc045a2fa>] nfsd4_proc_compound+0x38a/0x660 [nfsd]
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169766]
[<ffffffffc0446e78>] nfsd_dispatch+0xb8/0x200 [nfsd]
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169824]
[<ffffffffc039eeac>] svc_process_common+0x40c/0x650 [sunrpc]
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169844]
[<ffffffffc03a0273>] svc_process+0x103/0x1c0 [sunrpc]
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169854]
[<ffffffffc04468cf>] nfsd+0xef/0x160 [nfsd]
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169863]
[<ffffffffc04467e0>] ? nfsd_destroy+0x60/0x60 [nfsd]
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169872]
[<ffffffff810a0928>] kthread+0xd8/0xf0
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169876]
[<ffffffff810a0850>] ? kthread_create_on_node+0x1e0/0x1e0
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169896]
[<ffffffff81831c4f>] ret_from_fork+0x3f/0x70
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169900]
[<ffffffff810a0850>] ? kthread_create_on_node+0x1e0/0x1e0
Oct 20 18:05:42 cuentas09-lab kernel: [90094.169903] ---[ end trace
34db7650fa22d1e0 ]---
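Because the glusterfs log uses UTC and kern.log uses CEST, it helps to
convert one timestamp to the other when matching entries. A sketch,
assuming GNU date; the helper name utc_to_cest is hypothetical, and the
POSIX TZ rule is used so that no timezone database is needed:

```shell
# Sketch: print a UTC timestamp from the glusterfs log in kern.log style.
# utc_to_cest is a hypothetical helper; it relies on GNU date's -d option.
utc_to_cest() {
    # CET is UTC+1, CEST is UTC+2; summer time runs from the last Sunday
    # of March to the last Sunday of October.
    TZ='CET-1CEST,M3.5.0,M10.5.0/3' LC_ALL=C \
        date -d "$1 UTC" '+%b %d %H:%M:%S'
}

utc_to_cest '2016-10-20 16:05:42'   # prints: Oct 20 18:05:42
```

The converted value matches the "Oct 20 18:05:42" prefix of the kernel
warning, so both logs describe the same OPEN attempt.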
This intermediate NFS re-exporting server mounts a volume from the
glusterfs servers. Its mount, software versions, and the relevant
/etc/fstab and /etc/exports entries are:
root@cuentas09-lab:/var/log/glusterfs# mount | grep gluster
recipiente10hd:/home-lab-3.tcp on /home-3 type fuse.glusterfs
(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
root@cuentas09-lab:/var/log/glusterfs#
root@cuentas09-lab:~# dpkg -l | grep -e nfs -e gluster
ii  glusterfs-client    3.8.5-ubuntu1~xenial1  amd64  clustered file-system (client package)
ii  glusterfs-common    3.8.5-ubuntu1~xenial1  amd64  GlusterFS common libraries and translator modules
ii  libnfsidmap2:amd64  0.25-5                 amd64  NFS idmapping library
ii  nfs-common          1:1.2.8-9ubuntu12      amd64  NFS support files common to client and server
ii  nfs-kernel-server   1:1.2.8-9ubuntu12      amd64  support for NFS kernel server
root@cuentas09-lab:~#
root@cuentas09-lab:~# grep home-3 /etc/fstab /etc/exports
/etc/fstab:recipiente10hd:/home-lab-3 /home-3 glusterfs
defaults,_netdev,transport=tcp 0 0
/etc/exports:/home-3
138.4.30.0/23(rw,fsid=3,insecure,no_subtree_check,async,no_root_squash)
127.0.0.1/32(rw,fsid=3,insecure,no_subtree_check,async,no_root_squash)
root@cuentas09-lab:~#
The problem seen on the client can be reproduced on the intermediate
server itself when it accesses the exported volume through an NFS
mount, but not when it works directly on the FUSE-mounted glusterfs
volume, as shown here:
root@cuentas09-lab:~# cd /home-3/u056
root@cuentas09-lab:/home-3/u056#
root@cuentas09-lab:/home-3/u056# su u056
l056@cuentas09-lab:/home-3/u056$
l056@cuentas09-lab:/home-3/u056$ rm -f kk.txt 444.txt; echo "prueba" >
444.txt; chmod 444 444.txt; cp -p 444.txt kk.txt; ls -ld 444.txt kk.txt;
-r--r--r-- 1 l056 admincdc 7 oct 20 18:19 444.txt
-r--r--r-- 1 l056 admincdc 7 oct 20 18:19 kk.txt
l056@cuentas09-lab:/home-3/u056$
On the glusterfs server side the configuration is fairly standard,
except that I disabled the performance options related to open-behind,
write-behind and flush-behind after reading recent posts about them:
root@recipiente10:~# dpkg -l | grep gluster
ii  glusterfs-client  3.8.4-ubuntu1~xenial1  amd64  clustered file-system (client package)
ii  glusterfs-common  3.8.4-ubuntu1~xenial1  amd64  GlusterFS common libraries and translator modules
ii  glusterfs-server  3.8.4-ubuntu1~xenial1  amd64  clustered file-system (server package)
root@recipiente10:~# gluster volume set home-lab-3
performance.write-behind off; gluster volume set home-lab-3
performance.open-behind off; gluster volume set home-lab-3
performance.flush-behind off; gluster volume set home-lab-3
performance.read-after-open no; gluster volume set home-lab-3
performance.lazy-open off; gluster volume set home-lab-3
performance.nfs.write-behind off;
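The same six settings can be applied from a loop, which makes it easier
to re-enable them one by one while testing. This is only a sketch: the
function name set_behind_options and the DRY_RUN switch are hypothetical
conveniences, while the option names and values are the ones used above.
By default it only prints the commands:

```shell
# Sketch: apply (or just print) the option changes listed above.
# set_behind_options and DRY_RUN are hypothetical; set DRY_RUN=0 to
# actually call the gluster CLI instead of printing the commands.
VOLUME=home-lab-3

set_behind_options() {
    for kv in \
        performance.write-behind=off \
        performance.open-behind=off \
        performance.flush-behind=off \
        performance.read-after-open=no \
        performance.lazy-open=off \
        performance.nfs.write-behind=off
    do
        opt=${kv%%=*}   # text before the first "="
        val=${kv#*=}    # text after the first "="
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "gluster volume set $VOLUME $opt $val"
        else
            gluster volume set "$VOLUME" "$opt" "$val" || return 1
        fi
    done
}
```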
root@recipiente10:~# gluster volume get home-lab-3 all | grep performance
performance.cache-max-file-size 128KB
performance.cache-min-file-size 0
performance.cache-refresh-timeout 1
performance.cache-priority
performance.cache-size 256MB
performance.io-thread-count 16
performance.high-prio-threads 16
performance.normal-prio-threads 16
performance.low-prio-threads 16
performance.least-prio-threads 1
performance.enable-least-priority on
performance.least-rate-limit 0
performance.cache-size 256MB
performance.flush-behind off
performance.nfs.flush-behind on
performance.write-behind-window-size 1MB
performance.resync-failed-syncs-after-fsync off
performance.nfs.write-behind-window-size 1MB
performance.strict-o-direct off
performance.nfs.strict-o-direct off
performance.strict-write-ordering off
performance.nfs.strict-write-ordering off
performance.lazy-open off
performance.read-after-open no
performance.read-ahead-page-count 4
performance.md-cache-timeout 1
performance.cache-swift-metadata true
performance.write-behind off
performance.read-ahead on
performance.readdir-ahead on
performance.io-cache on
performance.quick-read on
performance.open-behind off
performance.stat-prefetch on
performance.client-io-threads on
performance.nfs.write-behind off
performance.nfs.read-ahead off
performance.nfs.io-cache off
performance.nfs.quick-read off
performance.nfs.stat-prefetch off
performance.nfs.io-threads off
performance.force-readdirp true
root@recipiente10:~#
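To see at a glance which performance options are still enabled in a
listing like the one above, the output can be filtered. A sketch,
assuming the two-column "option value" layout shown here; the function
name enabled_perf_options is hypothetical:

```shell
# Sketch: read "option value" lines on stdin and print the performance.*
# options whose value is "on" or "true".
# enabled_perf_options is a hypothetical helper name.
enabled_perf_options() {
    awk '$1 ~ /^performance\./ && ($2 == "on" || $2 == "true") { print $1 }'
}
```

It would be used as
gluster volume get home-lab-3 all | enabled_perf_options
on one of the glusterfs servers. Lines where the option name and the
value run together without a space are not matched by this filter.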
Regarding the ZFS part of the glusterfs servers, they have a fairly
standard configuration (see [1] for details). I have checked whether
acltype=noacl or acltype=posixacl has anything to do with the problem,
but I found no difference in behaviour.
Curiously, we have built a virtual environment with glusterfs over ZFS
over VFS that does not suffer from the problem, so we suspect that it
has to do with some low-level detail.
We have also tested nfs-ganesha 2.3.3 on CentOS 7 as the intermediate
server re-exporting the glusterfs volume via NFS, and the problem does
not show up. However, this NFS server seems too unstable for production
(at least the versions we tried), especially under heavy workloads.
Any help would be greatly appreciated.
Regards,
Omar
[1] http://www.rediris.es/jt/jt2015/ponencias/?id=jt2015-jt-ses_4b_seg_red_camp_2-a17b2c1.pdf
[2] https://git-scm.com/
--
----------------------------------------------------------------
Centro de Cálculo Depto. Ingeniería Sistemas Telemáticos
E-mail: omar@xxxxxxxxxx Universidad Politécnica de Madrid
Fax:(+34) 913367333 E.T.S. Ing. Telecomunicación
Tel:(+34) 915495700-Ext.3005 28040 Madrid (Spain)
Tel:(+34) 915495762-Ext.3005
Tel:(+34) 913367366-Ext.3005
----------------------------------------------------------------