Hi Manuel,

My own experience is that the CephFS kernel client is significantly faster than FUSE, although the FUSE client is generally more reliable. If you need the extra speed of the kernel client on CentOS, it may be worth using the ml (mainline) kernel, as this gives you much more up-to-date CephFS support.

If I understand your benchmarking correctly, the two machines you are using to compare cephfs-fuse and cephfs-kernel are very different, and so are your test parameters (a 100 GiB file written by two jobs on the kernel client versus a 5 GiB file written by a single job on the FUSE client). To get an accurate comparison, why not mount both the FUSE and kernel clients on the same machine and then re-run identical fio tests against each mount? See the P.S. below for a rough sketch of both suggestions.

best,

Jake
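
P.S. The sketches below are only illustrative; the ELRepo repository, monitor address, client name, secret file and mount points are placeholders, so substitute whatever your cluster actually uses.

To move a CentOS 7 box onto the mainline kernel (assuming the elrepo-kernel repository is already set up):

  # install the mainline kernel, make it the default boot entry, reboot
  sudo yum --enablerepo=elrepo-kernel install kernel-ml
  sudo grub2-set-default 0
  sudo reboot

To compare the two clients on the same machine with identical fio parameters:

  # mount the kernel client and the FUSE client side by side
  sudo mkdir -p /mnt/cephfs-kernel /mnt/cephfs-fuse
  sudo mount -t ceph <mon-ip>:6789:/ /mnt/cephfs-kernel -o name=admin,secretfile=/etc/ceph/admin.secret
  sudo ceph-fuse -n client.admin /mnt/cephfs-fuse

  # run the same fio job against each mount, changing only --filename
  sudo fio --name=xx --filename=/mnt/cephfs-kernel/test.file --filesize=5G --iodepth=1 --rw=write --bs=4M --numjobs=1 --group_reporting
  sudo fio --name=xx --filename=/mnt/cephfs-fuse/test.file --filesize=5G --iodepth=1 --rw=write --bs=4M --numjobs=1 --group_reporting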

On 09/07/18 09:18, Manuel Sopena Ballesteros wrote:
> Dear ceph community,
>
> I just installed ceph luminous in a small NVMe cluster for testing and I
> tested 2 clients:
>
> Client 1:
> VM running CentOS 7
> Ceph client: kernel
> # cpus: 4
> RAM: 16GB
>
> Fio test:
>
> # sudo fio --name=xx --filename=/mnt/mycephfs/test.file3 --filesize=100G --iodepth=1 --rw=write --bs=4M --numjobs=2 --group_reporting
> xx: (g=0): rw=write, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=psync, iodepth=1
> ...
> fio-3.1
> Starting 2 processes
> xx: Laying out IO file (1 file / 102400MiB)
> Jobs: 1 (f=1): [_(1),W(1)][100.0%][r=0KiB/s,w=2325MiB/s][r=0,w=581 IOPS][eta 00m:00s]
> xx: (groupid=0, jobs=2): err= 0: pid=24290: Mon Jul 9 17:54:57 2018
>   write: IOPS=550, BW=2203MiB/s (2310MB/s)(200GiB/92946msec)
>     clat (usec): min=946, max=464990, avg=3519.59, stdev=7031.97
>      lat (usec): min=1010, max=465091, avg=3612.53, stdev=7035.85
>     clat percentiles (usec):
>      |  1.00th=[  1188],  5.00th=[  1631], 10.00th=[  2245], 20.00th=[  2409],
>      | 30.00th=[  2540], 40.00th=[  2671], 50.00th=[  2802], 60.00th=[  2966],
>      | 70.00th=[  3195], 80.00th=[  3654], 90.00th=[  5080], 95.00th=[  6521],
>      | 99.00th=[ 11469], 99.50th=[ 16450], 99.90th=[100140], 99.95th=[149947],
>      | 99.99th=[291505]
>    bw (  MiB/s): min=  224, max= 2064, per=50.01%, avg=1101.97, stdev=205.16, samples=369
>    iops        : min=   56, max=  516, avg=275.27, stdev=51.29, samples=369
>   lat (usec)   : 1000=0.01%
>   lat (msec)   : 2=7.89%, 4=75.24%, 10=15.42%, 20=1.09%, 50=0.22%
>   lat (msec)   : 100=0.04%, 250=0.08%, 500=0.02%
>   cpu          : usr=2.31%, sys=76.39%, ctx=15743, majf=1, minf=55
>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued rwt: total=0,51200,0, short=0,0,0, dropped=0,0,0
>      latency   : target=0, window=0, percentile=100.00%, depth=1
>
> Run status group 0 (all jobs):
>   WRITE: bw=2203MiB/s (2310MB/s), 2203MiB/s-2203MiB/s (2310MB/s-2310MB/s), io=200GiB (215GB), run=92946-92946msec
>
> Client 2:
> Physical machine running Ubuntu Xenial
> Ceph client: FUSE
> # cpus: 56
> RAM: 512GB
>
> Fio test:
>
> $ sudo fio --name=xx --filename=/mnt/cephfs/test.file2 --filesize=5G --iodepth=1 --rw=write --bs=4M --numjobs=1 --group_reporting
> xx: (g=0): rw=write, bs=4M-4M/4M-4M/4M-4M, ioengine=sync, iodepth=1
> fio-2.2.10
> Starting 1 process
> xx: Laying out IO file(s) (1 file(s) / 5120MB)
> Jobs: 1 (f=1): [W(1)] [91.7% done] [0KB/580.0MB/0KB /s] [0/145/0 iops] [eta 00m:01s]
> xx: (groupid=0, jobs=1): err= 0: pid=6065: Mon Jul 9 17:44:02 2018
>   write: io=5120.0MB, bw=497144KB/s, iops=121, runt= 10546msec
>     clat (msec): min=3, max=159, avg= 7.94, stdev= 4.81
>      lat (msec): min=3, max=159, avg= 8.08, stdev= 4.82
>     clat percentiles (msec):
>      |  1.00th=[    4],  5.00th=[    5], 10.00th=[    6], 20.00th=[    7],
>      | 30.00th=[    7], 40.00th=[    8], 50.00th=[    8], 60.00th=[    9],
>      | 70.00th=[    9], 80.00th=[   10], 90.00th=[   11], 95.00th=[   11],
>      | 99.00th=[   12], 99.50th=[   13], 99.90th=[   61], 99.95th=[  159],
>      | 99.99th=[  159]
>     bw (KB  /s): min=185448, max=726183, per=97.08%, avg=482611.80, stdev=118874.09
>     lat (msec) : 4=1.64%, 10=88.20%, 20=10.00%, 100=0.08%, 250=0.08%
>   cpu          : usr=1.63%, sys=34.44%, ctx=42266, majf=0, minf=1586
>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued    : total=r=0/w=1280/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
>      latency   : target=0, window=0, percentile=100.00%, depth=1
>
> Run status group 0 (all jobs):
>   WRITE: io=5120.0MB, aggrb=497143KB/s, minb=497143KB/s, maxb=497143KB/s, mint=10546msec, maxt=10546msec
>
> NOTE: I did an iperf test from Client 2 to the ceph nodes and the bandwidth is ~25GBs.
>
> QUESTION:
> According to the documentation, FUSE is supposed to run slower, and I do
> find client 2 (FUSE) much slower than client 1 (kernel). Could someone
> advise whether this is expected?
>
> Thank you very much
>
> Manuel Sopena Ballesteros | Big Data Engineer
> Garvan Institute of Medical Research
> The Kinghorn Cancer Centre, 370 Victoria Street, Darlinghurst, NSW 2010
> T: +61 (0)2 9355 5760 | F: +61 (0)2 9295 8507 | E: manuel.sb@xxxxxxxxxxxxx

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com