Zheng, thanks for looking into this. Your explanation makes sense, although strangely I've set up a new NFS server (different hardware, same OS, kernel, etc.) and I'm unable to recreate the issue. I'm no longer getting the delay, and the NFS export is still using sync. I'm now comparing the servers to see what's different on the original one. Apologies if I've wasted your time on this!
Jan, I did some more testing with FUSE on the original server and saw the same issue. And yes, I was testing from the NFS client. As above, I think there was something odd about that original server. Noted on sync vs async; I plan on sticking with sync.
On Fri, Jun 3, 2016 at 5:03 AM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
On Mon, May 30, 2016 at 10:29 PM, David <dclistslinux@xxxxxxxxx> wrote:
> Hi All
>
> I'm having an issue with slow writes over NFS (v3) when cephfs is mounted
> with the kernel driver. Writing a single 4K file from the NFS client is
> taking 3 - 4 seconds, however a 4K write (with sync) into the same folder on
> the server is fast as you would expect. When mounted with ceph-fuse, I don't
> get this issue on the NFS client.
>
> Test environment is a small cluster with a single MON and single MDS, all
> running 10.2.1, CephFS metadata is an ssd pool, data is on spinners. The NFS
> server is CentOS 7, I've tested with the current shipped kernel (3.10),
> ELrepo 4.4 and ELrepo 4.6.
>
> More info:
>
> With the kernel driver, I mount the filesystem with "-o name=admin,secret"
>
> I've exported a folder with the following options:
>
> *(rw,root_squash,sync,wdelay,no_subtree_check,fsid=1244,sec=1)
>
> I then mount the folder on a CentOS 6 client with the following options (all
> default):
>
> rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.3.231,mountvers=3,mountport=597,mountproto=udp,local_lock=none
>
> A small 4k write is taking 3 - 4 secs:
>
> # time dd if=/dev/zero of=testfile bs=4k count=1
> 1+0 records in
> 1+0 records out
> 4096 bytes (4.1 kB) copied, 3.59678 s, 1.1 kB/s
>
> real 0m3.624s
> user 0m0.000s
> sys 0m0.001s
>
> But a sync write on the server directly into the same folder is fast (this is
> with the kernel driver):
>
> # time dd if=/dev/zero of=testfile2 bs=4k count=1 conv=fdatasync
> 1+0 records in
> 1+0 records out
> 4096 bytes (4.1 kB) copied, 0.0121925 s, 336 kB/s
>
> real 0m0.015s
> user 0m0.000s
> sys 0m0.002s
>
Your NFS export has the sync option. 'dd if=/dev/zero of=testfile bs=4k
count=1' on the NFS client is equivalent to 'dd if=/dev/zero of=testfile
bs=4k count=1 conv=fsync' on cephfs. The sync metadata operation takes
3~4 seconds because the MDS flushes its journal every 5 seconds. Adding
the async option to the NFS export avoids this delay.
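One way to check Zheng's explanation without involving NFS at all is to run the buffered and the fsync'd write side by side on the server. This is only a sketch: TARGET_DIR is a placeholder that does not come from the thread, and it defaults to a temporary directory so the script runs anywhere. To test the real case, point it at a directory on the kernel-mounted cephfs.

```shell
#!/bin/sh
# TARGET_DIR is a placeholder (an assumption, not a path from the thread).
# Point it at a directory on the kernel-mounted cephfs; by default a
# throwaway temp directory is used so the script is runnable anywhere.
TARGET_DIR="${TARGET_DIR:-$(mktemp -d)}"

# Buffered 4k write: dd returns once the data is in the page cache.
dd if=/dev/zero of="$TARGET_DIR/buffered" bs=4k count=1 2>&1 | tail -n 1

# 4k write followed by fsync: according to the explanation in this thread,
# this is what a write from the NFS client amounts to when the export
# uses "sync".
dd if=/dev/zero of="$TARGET_DIR/fsynced" bs=4k count=1 conv=fsync 2>&1 | tail -n 1
```

If the second dd shows a multi-second elapsed time on cephfs while the first does not, that would be consistent with the fsync waiting for the MDS journal flush.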
> If I mount cephfs with Fuse instead of the kernel, the NFS client write is
> fast:
>
> dd if=/dev/zero of=fuse01 bs=4k count=1
> 1+0 records in
> 1+0 records out
> 4096 bytes (4.1 kB) copied, 0.026078 s, 157 kB/s
>
In this case, ceph-fuse sends an extra request (a getattr request on the
directory) to the MDS. That request causes the MDS to flush its journal.
Whether or not the client sends the extra request depends on what
capabilities it has. Which capabilities the client has, in turn, depends
on how many clients are accessing the directory. In my test, NFS on
ceph-fuse is not always fast.
Yan, Zheng
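Since the ceph-fuse result is described as "not always fast", a single dd run may be misleading. A rough sketch of repeating the small write a few times from the NFS client follows. TARGET_DIR and RUNS are placeholders that do not come from the thread; TARGET_DIR defaults to a temporary directory so the loop can be tried anywhere.

```shell
#!/bin/sh
# TARGET_DIR and RUNS are placeholders (assumptions, not values from the
# thread). Point TARGET_DIR at the NFS mount on the client to test for real.
TARGET_DIR="${TARGET_DIR:-$(mktemp -d)}"
RUNS="${RUNS:-5}"

i=1
while [ "$i" -le "$RUNS" ]; do
    # dd reports bytes copied, elapsed time and rate on its last stderr line.
    dd if=/dev/zero of="$TARGET_DIR/repeat_$i" bs=4k count=1 2>&1 | tail -n 1
    i=$((i + 1))
done
```

Each iteration writes a new file, so every run includes the file creation as well as the 4k write. A spread of elapsed times across runs would match the observation that the behaviour depends on the capabilities the client currently holds.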
> Does anyone know what's going on here?
>
> Thanks
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>