Ouch... yeah, the rotten performance is sad but not really surprising. We add a lot of extra hops and data copies by going through ganesha. Ganesha also uses the userland client libs, and those are organized around the BCCL (Big Ceph Client Lock). I think the only way we'll get decent performance over the long haul is to get ganesha out of the data path. A flexfiles pNFS layout is something of a natural fit on top of cephfs, and I imagine that would get us a lot closer to the cephfs read/write numbers.

-- Jeff

On Thu, 2017-11-09 at 13:21 +0000, Supriti Singh wrote:
> The email was not delivered to ceph-devel@xxxxxxxxxxxxxxx. So, re-sending it.
>
> A few more things regarding the hardware and clients used in our benchmarking setup:
> - The cephfs benchmarks were done using the kernel cephfs client.
> - NFS-Ganesha was mounted using NFS version 4.
> - A single nfs-ganesha server was used.
>
> Ceph and client setup:
> - Each client node has 16 cores and 16 GB RAM.
> - The MDS and the Ganesha server are running on the same node.
> - Network interconnect between client and ceph nodes is 40 Gbit/s.
> - Ceph on 8 nodes (each node has 24 cores/128 GB RAM):
>   - 5 OSD nodes
>   - 3 MON/MDS nodes
>   - 6 OSD daemons per node - BlueStore - SSD/NVMe journal
>
> ------
> Supriti Singh
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nürnberg)
>
> >>> Supriti Singh 11/09/17 12:15 PM >>>
>
> Hi Sage,
>
> As Lars mentioned, at SUSE we use ganesha 2.5.2/luminous. We did a preliminary performance comparison of the cephfs client and the nfs-ganesha client. I have attached the results. The results are aggregate bandwidth over 10 clients.
>
> 1. Test setup:
> We use fio to read/write to a single 5 GB file per thread for 300 seconds. A single job (represented on the x-axis) is of type {number_of_worker_threads}rw_{block_size}_{op}, where:
> - number_of_worker_threads: 1, 4, 8, 16
> - block size: 4K, 64K, 1M, 4M, 8M
> - op: rw
>
> 2. NFS-Ganesha configuration:
> Parameters set (other than default):
> 1. Graceless = True
> 2. MaxRPCSendBufferSize/MaxRPCRecvBufferSize set to the maximum value.
>
> 3. Observations:
> - For a single thread (on each client) and 4K block size, the bandwidth is around 45% of cephfs.
> - As the number of threads increases, the performance drops. This could be related to the nfs-ganesha parameter "Dispatch_Max_Reqs_Xprt", which defaults to 512. Note that this parameter is relevant only for v2.5.
> - We ran with the nfs-ganesha mdcache both enabled and disabled, but there were no significant improvements with caching. Not sure, but it could be related to this issue: https://github.com/nfs-ganesha/nfs-ganesha/issues/223
>
> The results are still preliminary, and I guess that with proper tuning of nfs-ganesha parameters they could be better.
>
> Thanks,
> Supriti
>
> ------
> Supriti Singh
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nürnberg)
>
> >>> Lars Marowsky-Bree <lmb@xxxxxxxx> 11/09/17 11:07 AM >>>
>
> On 2017-11-08T21:41:41, Sage Weil <sweil@xxxxxxxxxx> wrote:
>
> > Who is running nfs-ganesha's FSAL to export CephFS? What has your
> > experience been?
> >
> > (We are working on building proper testing and support for this into
> > Mimic, but the ganesha FSAL has been around for years.)
>
> We use it currently, and it works, but let's not discuss the performance
> ;-)
>
> How else do you want to build this into Mimic?
>
> Regards,
> Lars
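For anyone trying to reproduce a setup like the one described above, here is a rough sketch of a ganesha.conf for the CephFS FSAL. It is illustrative only: the export block is generic FSAL_CEPH boilerplate rather than the actual configuration used in the benchmark, and the RPC buffer sizes are left as placeholders since the post only says they were raised to the maximum.

    NFSV4 {
        # As in the test setup above.
        Graceless = true;
    }

    NFS_CORE_PARAM {
        # The post only says these were set to the maximum value; the actual
        # numbers used in the benchmark are not given, so none are filled in here.
        # MaxRPCSendBufferSize = ...;
        # MaxRPCRecvBufferSize = ...;

        # Defaults to 512 in ganesha 2.5 and was mentioned above as a possible
        # culprit for the drop at higher thread counts.
        # Dispatch_Max_Reqs_Xprt = 512;
    }

    EXPORT {
        Export_ID = 1;
        Path = "/";
        Pseudo = "/cephfs";
        Access_Type = RW;
        Protocols = 4;
        Transports = TCP;

        FSAL {
            Name = CEPH;
        }
    }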
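The two client mounts being compared would then look roughly like the following; the monitor address, cephx user, secret file and pseudo path are placeholders, not details from the post:

    # Kernel CephFS client
    mount -t ceph mon1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret

    # NFSv4 mount of the single nfs-ganesha server
    mount -t nfs -o vers=4 ganesha-host:/cephfs /mnt/nfs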
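And a fio job file along the lines of the test description, shown here for the 4-thread/64K mixed read/write case ("4rw_64k_rw"); the ioengine, direct I/O and thread settings are assumptions, since the post doesn't include the actual job file:

    # Sketch only: 4 worker threads, 64K blocks, mixed read/write.
    [global]
    # Point at /mnt/nfs or /mnt/cephfs depending on which client is being tested.
    directory=/mnt/nfs
    # Single 5 GB file per thread, run for 300 seconds.
    size=5g
    runtime=300
    time_based
    rw=rw
    bs=64k
    # Assumptions, not stated in the post:
    ioengine=libaio
    direct=1
    thread
    group_reporting

    [worker]
    numjobs=4

Each of the 10 clients would run one such job against its own mount, with numjobs and bs swept over the values listed above and the aggregate bandwidth summed across clients.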
--
Jeff Layton <jlayton@xxxxxxxxxx>