On Mon, May 30, 2016 at 10:29 PM, David <dclistslinux@xxxxxxxxx> wrote:
> Hi All
>
> I'm having an issue with slow writes over NFS (v3) when cephfs is mounted
> with the kernel driver. Writing a single 4K file from the NFS client is
> taking 3 - 4 seconds, however a 4K write (with sync) into the same folder on
> the server is fast as you would expect. When mounted with ceph-fuse, I don't
> get this issue on the NFS client.
>
> Test environment is a small cluster with a single MON and single MDS, all
> running 10.2.1, CephFS metadata is an ssd pool, data is on spinners. The NFS
> server is CentOS 7, I've tested with the current shipped kernel (3.10),
> ELrepo 4.4 and ELrepo 4.6.
>
> More info:
>
> With the kernel driver, I mount the filesystem with "-o name=admin,secret"
>
> I've exported a folder with the following options:
>
> *(rw,root_squash,sync,wdelay,no_subtree_check,fsid=1244,sec=1)
>
> I then mount the folder on a CentOS 6 client with the following options (all
> default):
>
> rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.3.231,mountvers=3,mountport=597,mountproto=udp,local_lock=none
>
> A small 4k write is taking 3 - 4 secs:
>
> # time dd if=/dev/zero of=testfile bs=4k count=1
> 1+0 records in
> 1+0 records out
> 4096 bytes (4.1 kB) copied, 3.59678 s, 1.1 kB/s
>
> real 0m3.624s
> user 0m0.000s
> sys 0m0.001s
>
> But a sync write on the server directly into the same folder is fast (this is
> with the kernel driver):
>
> # time dd if=/dev/zero of=testfile2 bs=4k count=1 conv=fdatasync
> 1+0 records in
> 1+0 records out
> 4096 bytes (4.1 kB) copied, 0.0121925 s, 336 kB/s

Your NFS export has the sync option. 'dd if=/dev/zero of=testfile bs=4k
count=1' on the NFS client is equivalent to 'dd if=/dev/zero of=testfile
bs=4k count=1 conv=fsync' on cephfs. The sync metadata operation takes 3 to 4
seconds because the MDS flushes its journal every 5 seconds.
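
To check the equivalence described above, the two dd variants can be timed
side by side on the NFS server. This is only a rough sketch: TARGET_DIR and
the elapsed_ms helper are made-up names for illustration, it assumes GNU dd
and GNU date, and the timings depend entirely on where TARGET_DIR points (for
this thread, a directory on the kernel-mounted cephfs).

```shell
#!/bin/sh
# Sketch: compare a buffered 4K write with one that dd fsyncs before exiting.
# TARGET_DIR is a hypothetical variable; it defaults to the current directory.
dir="${TARGET_DIR:-.}"

# Run a command quietly and print how many milliseconds it took.
# Relies on GNU date for the %N (nanoseconds) format.
elapsed_ms() {
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# Plain write: data may still sit in the page cache when dd returns.
buffered=$(elapsed_ms dd if=/dev/zero of="$dir/testfile_buffered" bs=4k count=1)

# conv=fsync: dd flushes file data and metadata before it exits.
synced=$(elapsed_ms dd if=/dev/zero of="$dir/testfile_fsync" bs=4k count=1 conv=fsync)

echo "buffered: ${buffered} ms"
echo "fsync:    ${synced} ms"
```

If the explanation holds, the fsync run on the kernel mount should show a
delay similar to the one seen from the NFS client, while the buffered run
should stay fast.
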
Adding the async option to the NFS export can avoid this delay.

>
> real 0m0.015s
> user 0m0.000s
> sys 0m0.002s
>
> If I mount cephfs with Fuse instead of the kernel, the NFS client write is
> fast:
>
> dd if=/dev/zero of=fuse01 bs=4k count=1
> 1+0 records in
> 1+0 records out
> 4096 bytes (4.1 kB) copied, 0.026078 s, 157 kB/s
>

In this case, ceph-fuse sends an extra request (a getattr request on the
directory) to the MDS. That request causes the MDS to flush its journal.
Whether the client sends the extra request depends on which capabilities it
holds, and those in turn depend on how many clients are accessing the
directory. In my tests, NFS on ceph-fuse is not always fast.

Yan, Zheng

> Does anyone know what's going on here?
>
> Thanks
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com