Hi Jeff,

when exporting the file system with the sync option I would expect, and accept, maybe a factor of 2 slowdown, but not a factor of 10. I'm wondering whether IO operations are being split somewhere. The rsize/wsize values on the NFS mount are at the maximum of 1MB, so copying a large file should be fast despite the sync option. I see an unreasonably high io-wait on the NFS server while nfsd and the ceph kworker threads do nothing, which I take as a sign that something in the IO hand-over is not working as expected.

Do you have an idea how I can check that IO operations up to 1MB are passed as single operations from nfsd to the ceph fs kernel client? Or any other ideas what might cause the slow IO and high io-wait?

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Jeff Layton <jlayton@xxxxxxxxxx>
Sent: 09 September 2021 12:30:30
To: Frank Schilder; ceph-users
Cc: Patrick Donnelly
Subject: Re: ceph fs re-export with or without NFS async option

On Wed, 2021-09-08 at 16:39 +0000, Frank Schilder wrote:
> Hi all,
>
> I have a question about a ceph fs re-export via nfsd. For NFS v4 mounts the exports option sync is now the default instead of async. I just found that using async gives more than a factor 10 performance improvement. I couldn't find any advice in the ceph community documentation on how dangerous this really is, or whether there is an alternative way to re-export ceph fs with the same level of performance.
>
> The ceph fs is a kernel client mount. I don't want to switch to a fuse mount, as its performance is worse than with the NFS sync export option.
>

It's dangerous. Exporting with 'async' basically makes the nfs server pretend that all writes have been written to stable storage when they haven't. COMMIT calls over the wire basically become no-ops. If the nfs server crashes then you may see data corruption due to lost writes.

I've not played much with re-exporting kcephfs via nfsd, but you could also consider using the nfs-ganesha userland server. I doubt that the performance will be any better, though. With a network filesystem like cephfs on the backend, you can't really get around the extra hops that reads and writes have to take on the network. The only real way to fix that would be to support pNFS, but there is not currently a consensus on how to do that.
--
Jeff Layton <jlayton@xxxxxxxxxx>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
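
For reference, this is roughly what the two export modes discussed in the thread look like in /etc/exports; the export path and client network below are placeholders, not the poster's actual setup:

    # /etc/exports on the server re-exporting the cephfs kernel mount
    # sync  (the default): data is committed to stable storage before the
    #                      server replies -- safe, but slower
    # async: the server may reply before data is stable -- fast, but a
    #        server crash can silently lose acknowledged writes
    /mnt/cephfs   10.0.0.0/24(rw,sync,no_subtree_check)
    # /mnt/cephfs 10.0.0.0/24(rw,async,no_subtree_check)

After editing the file, the exports can be refreshed with "exportfs -ra".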
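
One way to check whether nfsd hands 1MB requests to the cephfs client unsplit is to trace the write entry point of the cephfs kernel client on the NFS server. This is only a sketch: it assumes bpftrace on a kernel with BTF so the struct cast resolves, and it assumes ceph_write_iter is the cephfs write entry point on the running kernel (the symbol name can be checked in /proc/kallsyms first):

    # on the NFS server, while copying a large file over the NFS mount:
    # histogram of the write sizes handed to the cephfs kernel client
    bpftrace -e 'kprobe:ceph_write_iter { @bytes = hist(((struct iov_iter *)arg1)->count); }'

    # on the NFS client, confirm the negotiated rsize/wsize
    # (look for rsize=1048576,wsize=1048576 in the mount options)
    nfsstat -m

If the histogram clusters around 1MB, the requests reach the ceph client unsplit and the latency is added after the hand-over; if it clusters at much smaller sizes, the splitting happens before the requests reach the ceph client.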