+100 to this idea.

On Fri, Nov 17, 2017 at 8:27 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> FWIW, it might be interesting at some point to hack together a libcephfs
> backend driver for fio. It already has one for librbd, so I imagine it
> wouldn't be too hard to do, and it would probably give us a better raw
> comparison between the kernel client and libcephfs.
>
> On Thu, 2017-11-16 at 11:29 -0500, Matt Benjamin wrote:
>> Hi Rafael,
>>
>> Thanks for taking the time to report your results.
>>
>> The similarity to ceph-fuse performance is to be expected, because both
>> ceph-fuse and the nfs-ganesha FSAL driver use libcephfs, as Jeff Layton
>> noted. It's worth noting that nfs-ganesha does not appear to be adding
>> I/O or metadata operation latency.
>>
>> The interesting questions, pushing further on Jeff's point, are I think:
>>
>> 1. the libcephfs vs. kernel cephfs performance delta, and in particular
>> 2. the portion of that delta NOT accounted for by the direct OSD data
>>    path available to the kernel-mode ceph client. The latter can
>>    eventually be made available to nfs-ganesha via pNFS, as Jeff hinted,
>>    but the former is potentially available for performance improvement.
>>
>> The topic of the big client lock is an old one. I experimented with
>> removing it in 2014, in the api-concurrent branch of
>> git@xxxxxxxxxx:linuxbox2/linuxbox-ceph.git. I'm not confident that just
>> removing the client lock bottleneck will bring visible improvements,
>> especially until MDS concurrency improvements are in place, but it may
>> be worth revisiting.
>>
>> Matt
>>
>> On Thu, Nov 16, 2017 at 3:17 AM, Rafael Lopez <rafael.lopez@xxxxxxxxxx> wrote:
>> > We are running RHCS 2.3 (jewel) with ganesha 2.4.2 and the cephfs FSAL,
>> > compiled from the srpm. We are experimenting with CTDB for controlling
>> > ganesha HA, since we run samba on the same servers.
>> >
>> > We haven't done much functionality/stress testing, but at face value
>> > basic file operations seem to work well.
>> >
>> > In terms of performance, the last time I tested ganesha it seemed
>> > comparable to ceph-fuse (RHCS 2.x/jewel; I think the luminous ceph-fuse
>> > is better), though I haven't done rigorous metadata tests or
>> > multiple-client tests. Also, our ganesha servers are quite small
>> > (e.g. 4 GB RAM, 1 core), since we are thus far only serving cephfs
>> > natively. Here are some fio results.
>> >
>> > The jobs, in order, are:
>> > 1. async write 1M
>> > 2. sync write 1M
>> > 3. async write 4k
>> > 4. sync write 4k
>> > 5. seq read 1M
>> > 6. rand read 4k
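
A job file along these lines matches that list. This is a reconstruction from the
descriptions rather than the actual file used for the runs below, and the directory,
size, iodepth, and the psync-plus-fsync reading of "sync" are all guesses:

# Reconstructed from the job list above -- not the actual job file from this thread.
# Directory, size, iodepth and the meaning of "sync" vs "async" are assumptions.
[global]
# run against the mounted filesystem under test
directory=/mnt/test
size=20g
runtime=300
time_based
# run the jobs one after another rather than in parallel
stonewall
group_reporting

[async-write-1M]
ioengine=libaio
iodepth=16
direct=1
rw=write
bs=1M

[sync-write-1M]
ioengine=psync
fsync=1
rw=write
bs=1M

[async-write-4k]
ioengine=libaio
iodepth=16
direct=1
rw=write
bs=4k

[sync-write-4k]
ioengine=psync
fsync=1
rw=write
bs=4k

[seq-read-1M]
ioengine=libaio
iodepth=16
direct=1
rw=read
bs=1M

[rand-read-4k]
ioengine=libaio
iodepth=16
direct=1
rw=randread
bs=4k
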
>> >
>> > Ceph cluster is RHCS 2.3 (10.2.7)
>> >
>> > CEPH-FUSE (10.2.x)
>> > WRITE: io=143652MB, aggrb=490328KB/s, minb=490328KB/s, maxb=490328KB/s, mint=300002msec, maxt=300002msec
>> > WRITE: io=14341MB, aggrb=48947KB/s, minb=48947KB/s, maxb=48947KB/s, mint=300018msec, maxt=300018msec
>> > WRITE: io=9808.2MB, aggrb=33478KB/s, minb=33478KB/s, maxb=33478KB/s, mint=300001msec, maxt=300001msec
>> > WRITE: io=424476KB, aggrb=1414KB/s, minb=1414KB/s, maxb=1414KB/s, mint=300003msec, maxt=300003msec
>> > READ: io=158069MB, aggrb=539527KB/s, minb=539527KB/s, maxb=539527KB/s, mint=300008msec, maxt=300008msec
>> > READ: io=1881.2MB, aggrb=6420KB/s, minb=6420KB/s, maxb=6420KB/s, mint=300001msec, maxt=300001msec
>> >
>> > ganesha (nfs3)
>> > WRITE: io=157891MB, aggrb=538923KB/s, minb=538923KB/s, maxb=538923KB/s, mint=300006msec, maxt=300006msec
>> > WRITE: io=38700MB, aggrb=132093KB/s, minb=132093KB/s, maxb=132093KB/s, mint=300006msec, maxt=300006msec
>> > WRITE: io=3072.0MB, aggrb=10148KB/s, minb=10148KB/s, maxb=10148KB/s, mint=309957msec, maxt=309957msec
>> > WRITE: io=397516KB, aggrb=1325KB/s, minb=1325KB/s, maxb=1325KB/s, mint=300001msec, maxt=300001msec
>> > READ: io=82521MB, aggrb=281669KB/s, minb=281669KB/s, maxb=281669KB/s, mint=300002msec, maxt=300002msec
>> > READ: io=1322.2MB, aggrb=4513KB/s, minb=4513KB/s, maxb=4513KB/s, mint=300001msec, maxt=300001msec
>> >
>> > cephfs kernel client
>> > WRITE: io=471041MB, aggrb=1568.8MB/s, minb=1568.8MB/s, maxb=1568.8MB/s, mint=300394msec, maxt=300394msec
>> > WRITE: io=50005MB, aggrb=170680KB/s, minb=170680KB/s, maxb=170680KB/s, mint=300006msec, maxt=300006msec
>> > WRITE: io=169092MB, aggrb=577166KB/s, minb=577166KB/s, maxb=577166KB/s, mint=300000msec, maxt=300000msec
>> > WRITE: io=530548KB, aggrb=1768KB/s, minb=1768KB/s, maxb=1768KB/s, mint=300003msec, maxt=300003msec
>> > READ: io=121501MB, aggrb=414720KB/s, minb=414720KB/s, maxb=414720KB/s, mint=300002msec, maxt=300002msec
>> > READ: io=3264.6MB, aggrb=11142KB/s, minb=11142KB/s, maxb=11142KB/s, mint=300001msec, maxt=300001msec
>> >
>> > Happy to share the fio job file if anyone wants it.
>> >
>> > On 9 November 2017 at 08:41, Sage Weil <sweil@xxxxxxxxxx> wrote:
>> > >
>> > > Who is running nfs-ganesha's FSAL to export CephFS? What has your
>> > > experience been?
>> > >
>> > > (We are working on building proper testing and support for this into
>> > > Mimic, but the ganesha FSAL has been around for years.)
>> > >
>> > > Thanks!
>> > > sage
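
On the question of how CephFS gets exported through ganesha: the FSAL side of the
configuration is small. A minimal export block for the CephFS FSAL looks roughly like
this; Export_Id, Path and Pseudo are placeholder values, not the exact config from the
setup described above:

EXPORT
{
        # placeholder values; adjust Export_Id, Path and Pseudo for your setup
        Export_Id = 1;
        Path = "/";
        Pseudo = "/cephfs";
        Access_Type = RW;
        Squash = No_Root_Squash;
        Protocols = 3, 4;
        Transports = TCP;
        SecType = sys;

        FSAL {
                # selects the libcephfs-based FSAL
                Name = CEPH;
        }
}

The ganesha host also needs a ceph.conf and a client keyring with access to the
filesystem, just like any other libcephfs client.
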
>> >
>> > --
>> > Rafael Lopez
>> > Research Devops Engineer
>> > Monash University eResearch Centre
>> >
>> > T: +61 3 9905 9118
>> > M: +61 (0)427682670
>> > E: rafael.lopez@xxxxxxxxxx
>>
>
> --
> Jeff Layton <jlayton@xxxxxxxxxx>
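
Coming back to Jeff's suggestion of a libcephfs backend for fio: the call sequence such
an engine would wrap is small. Below is a minimal, untested sketch against the public
libcephfs API; the scratch path is made up, error handling is abbreviated, and the
usual link flag would be -lcephfs.

/*
 * Minimal libcephfs open/write/read sequence -- roughly what a libcephfs
 * fio backend would wrap in its setup and I/O hooks.  Untested sketch;
 * the scratch path is made up and error handling is abbreviated.
 */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <cephfs/libcephfs.h>

int main(void)
{
    struct ceph_mount_info *cmount;
    char buf[4096];
    int fd, ret;

    memset(buf, 0xab, sizeof(buf));

    if (ceph_create(&cmount, NULL) < 0)      /* NULL: default client id */
        return 1;
    ceph_conf_read_file(cmount, NULL);       /* NULL: default ceph.conf search */
    if (ceph_mount(cmount, "/") < 0) {       /* attach at the filesystem root */
        ceph_release(cmount);
        return 1;
    }

    /* one 4k write and read-back against a scratch file */
    fd = ceph_open(cmount, "/fio-libcephfs-scratch", O_CREAT | O_RDWR, 0644);
    if (fd >= 0) {
        ret = ceph_write(cmount, fd, buf, sizeof(buf), 0);
        printf("wrote %d bytes\n", ret);
        ret = ceph_read(cmount, fd, buf, sizeof(buf), 0);
        printf("read %d bytes\n", ret);
        ceph_close(cmount, fd);
    }

    ceph_unmount(cmount);
    ceph_release(cmount);
    return 0;
}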