Re: CephFS: slow writes over NFS when fs is mounted with kernel driver but fast with Fuse

David <dclistslinux@xxxxxxxxx> · Fri, 3 Jun 2016 15:26:22 +0100

Zheng, thanks for looking into this. It makes sense, although strangely I've set up a new NFS server (different hardware, same OS, kernel etc.) and I'm unable to recreate the issue. I'm no longer getting the delay, and the NFS export is still using sync. I'm now comparing the servers to see what's different on the original server. Apologies if I've wasted your time on this!
Jan, I did some more testing with Fuse on the original server and I was seeing the same issue. And yes, I was testing from the NFS client. As above, I think there was something weird with that original server. Noted on sync vs async; I plan on sticking with sync.
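
For anyone wanting to compare two servers the same way, a rough sketch of the test is below. It writes one 4K file with and without conv=fsync and prints the summary line from dd, which includes the elapsed time. The function name time_write and the TARGET variable are hypothetical names used only for this example; TARGET is assumed to point at a directory on the CephFS mount and defaults to /tmp so that the script runs anywhere.

```shell
#!/bin/sh
# Hypothetical helper: write a single 4K file into DIR and print dd's
# summary line. Pass "fsync" as the second argument to force the data
# and metadata to stable storage before dd exits.
time_write() {
    dir=$1
    mode=${2:-}
    f="$dir/ddtest.$$"
    if [ "$mode" = fsync ]; then
        dd if=/dev/zero of="$f" bs=4k count=1 conv=fsync 2>&1 | tail -n 1
    else
        dd if=/dev/zero of="$f" bs=4k count=1 2>&1 | tail -n 1
    fi
    rm -f "$f"
}

# TARGET is an assumption: set it to a directory on the CephFS mount.
TARGET=${TARGET:-/tmp}

echo "buffered write:"
time_write "$TARGET"
echo "write with fsync:"
time_write "$TARGET" fsync
```

Running it on both servers against the same CephFS directory should show whether the multi-second delay follows the fsync case on one machine only.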

On Fri, Jun 3, 2016 at 5:03 AM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
On Mon, May 30, 2016 at 10:29 PM, David <dclistslinux@xxxxxxxxx> wrote:

> Hi All

>

> I'm having an issue with slow writes over NFS (v3) when cephfs is mounted

> with the kernel driver. Writing a single 4K file from the NFS client is

> taking 3 - 4 seconds, however a 4K write (with sync) into the same folder on

> the server is fast as you would expect. When mounted with ceph-fuse, I don't

> get this issue on the NFS client.

>

> Test environment is a small cluster with a single MON and single MDS, all

> running 10.2.1, CephFS metadata is an ssd pool, data is on spinners. The NFS

> server is CentOS 7, I've tested with the current shipped kernel (3.10),

> ELrepo 4.4 and ELrepo 4.6.

>

> More info:

>

> With the kernel driver, I mount the filesystem with "-o name=admin,secret"

>

> I've exported a folder with the following options:

>

> *(rw,root_squash,sync,wdelay,no_subtree_check,fsid=1244,sec=1)

>

> I then mount the folder on a CentOS 6 client with the following options (all

> default):

>

> rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.3.231,mountvers=3,mountport=597,mountproto=udp,local_lock=none

>

> A small 4k write is taking 3 - 4 secs:

>

>  # time dd if=/dev/zero of=testfile bs=4k count=1

> 1+0 records in

> 1+0 records out

> 4096 bytes (4.1 kB) copied, 3.59678 s, 1.1 kB/s

>

> real    0m3.624s

> user    0m0.000s

> sys     0m0.001s

>

> But a sync write on the server directly into the same folder is fast (this is

> with the kernel driver):

>

> # time dd if=/dev/zero of=testfile2 bs=4k count=1 conv=fdatasync

> 1+0 records in

> 1+0 records out

> 4096 bytes (4.1 kB) copied, 0.0121925 s, 336 kB/s

Your NFS export has the sync option. 'dd if=/dev/zero of=testfile bs=4k
count=1' on the NFS client is equivalent to 'dd if=/dev/zero of=testfile
bs=4k count=1 conv=fsync' on cephfs. The reason that a sync metadata
operation takes 3~4 seconds is that the MDS flushes its journal every
5 seconds. Adding the async option to the NFS export can avoid this delay.
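
As a rough illustration of that last suggestion, the export line from the original post with only sync swapped for async would look something like the fragment below. The export path /mnt/cephfs/share is hypothetical, since the original post does not show it. Note that with async the NFS server may reply before data reaches stable storage, so an unclean server restart can lose writes.

```shell
# /etc/exports entry (the path /mnt/cephfs/share is a placeholder);
# all options are as in the original post except sync -> async:
#   /mnt/cephfs/share *(rw,root_squash,async,wdelay,no_subtree_check,fsid=1244,sec=1)

# After editing /etc/exports, re-export and list the active options (as root):
exportfs -ra
exportfs -v
```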

>

> real    0m0.015s

> user    0m0.000s

> sys     0m0.002s

>

> If I mount cephfs with Fuse instead of the kernel, the NFS client write is

> fast:

>

> dd if=/dev/zero of=fuse01 bs=4k count=1

> 1+0 records in

> 1+0 records out

> 4096 bytes (4.1 kB) copied, 0.026078 s, 157 kB/s

>

In this case, ceph-fuse sends an extra request (a getattr request on the
directory) to the MDS. The request causes the MDS to flush its journal.
Whether or not the client sends the extra request depends on what
capabilities it holds. What capabilities the client holds, in turn, depends
on how many clients are accessing the directory. In my test, NFS on
ceph-fuse is not always fast.

Yan, Zheng

> Does anyone know what's going on here?

>

> Thanks

>

>

> _______________________________________________

> ceph-users mailing list

> ceph-users@xxxxxxxxxxxxxx

> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

>
