Hi Jeff,

when exporting the file system with the sync option I would expect, and accept, maybe a factor of 2 slowdown, but not a factor of 10. I'm wondering whether IO operations are being split somewhere. The rsize/wsize values on the NFS mount are at the maximum of 1MB, so copying a large file should be fast despite the sync option. I see an unreasonably high io-wait on the NFS server while nfsd and the ceph kworker threads do nothing, which I take as a sign that something in the IO hand-over is not working as expected.

Do you have an idea how I can check that IO operations up to 1MB are passed as single operations from nfsd to the ceph fs kernel client? Or any other ideas what might cause the slow IO and high io-wait?

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Jeff Layton <jlayton@xxxxxxxxxx>
Sent: 09 September 2021 12:30:30
To: Frank Schilder; ceph-users
Cc: Patrick Donnelly
Subject: Re: ceph fs re-export with or without NFS async option

On Wed, 2021-09-08 at 16:39 +0000, Frank Schilder wrote:
> Hi all,
>
> I have a question about a ceph fs re-export via nfsd. For NFS v4 mounts the exports option sync is now the default instead of async. I just found that using async gives more than a factor 10 performance improvement. I couldn't find any advice in the ceph community documentation on how dangerous this really is, or whether there is an alternative way to re-export ceph fs with the same level of performance.
>
> The ceph fs is a kernel client mount. I don't want to switch to a fuse mount, as its performance is worse than with the NFS sync export option.
>

It's dangerous. Exporting with 'async' basically makes the nfs server pretend that all writes have been written to stable storage when they haven't. COMMIT calls over the wire basically become no-ops. If the nfs server crashes then you may see data corruption due to lost writes.

I've not played much with re-exporting kcephfs via nfsd, but you could also consider using the nfs-ganesha userland server. I doubt that the performance will be any better, though. With a network filesystem like cephfs on the backend, you can't really get around the extra hops that reads and writes have to take on the network. The only real way to fix that would be to support pNFS, but there is not currently a consensus on how to do that.
--
Jeff Layton <jlayton@xxxxxxxxxx>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
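
For reference, this is roughly what the two export modes discussed in the thread look like in /etc/exports; the export path and client network below are placeholders, not the poster's actual setup:

    # /etc/exports on the server re-exporting the cephfs kernel mount
    # sync  (the default): data is committed to stable storage before the
    #                      server replies -- safe, but slower
    # async: the server may reply before data is stable -- fast, but a
    #        server crash can silently lose acknowledged writes
    /mnt/cephfs   10.0.0.0/24(rw,sync,no_subtree_check)
    # /mnt/cephfs 10.0.0.0/24(rw,async,no_subtree_check)

After editing the file, the exports can be refreshed with "exportfs -ra".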
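
One way to check whether nfsd hands 1MB requests to the cephfs client unsplit is to trace the write entry point of the cephfs kernel client on the NFS server. This is only a sketch: it assumes bpftrace on a kernel with BTF so the struct cast resolves, and it assumes ceph_write_iter is the cephfs write entry point on the running kernel (the symbol name can be checked in /proc/kallsyms first):

    # on the NFS server, while copying a large file over the NFS mount:
    # histogram of the write sizes handed to the cephfs kernel client
    bpftrace -e 'kprobe:ceph_write_iter { @bytes = hist(((struct iov_iter *)arg1)->count); }'

    # on the NFS client, confirm the negotiated rsize/wsize
    # (look for rsize=1048576,wsize=1048576 in the mount options)
    nfsstat -m

If the histogram clusters around 1MB, the requests reach the ceph client unsplit and the latency is added after the hand-over; if it clusters at much smaller sizes, the splitting happens before the requests reach the ceph client.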