Try using the kernel client instead of the FUSE client. The FUSE client
is known to be slow for a variety of reasons, and I suspect you'll see
noticeably better performance with the kernel client.
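For example, a minimal kernel mount looks something like this (the
monitor hostnames, mount point, and credentials below are placeholders
for your own setup):

# kernel client:
mount -t ceph mon1,mon2,mon3:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret

# FUSE client, for comparison:
ceph-fuse -n client.admin /mnt/cephfs

Running the same iozone command against the kernel mount should give
you a direct comparison.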
Thanks,
Mark
On 6/2/20 8:00 PM, Derrick Lin wrote:
Hi guys,
We just deployed a Ceph 14.2.9 cluster with the following hardware:

MDS x 1
  Xeon Gold 5122 3.6GHz
  192GB RAM
  Mellanox ConnectX-4 Lx 25GbE

MON x 3
  Xeon Bronze 3103 1.7GHz
  48GB RAM
  Mellanox ConnectX-4 Lx 25GbE
  6 x 600GB 10K SAS

OSD x 5
  2 x Xeon Silver 4110 2.1GHz
  192GB RAM
  Mellanox ConnectX-4 Lx 25GbE
  16 x 10TB 7.2K NL-SAS (block)
  2 x 2TB Intel P4600 NVMe (block.db)
The network is all Mellanox SN2410/SN2700, configured at 25GbE for both
the front and back networks.
Just as a POC at this stage, the cluster was deployed with ceph-ansible
without much customization, and the initial tests of CephFS FUSE mount
performance seem very low. We ran some tests with iozone; the results
are as follows:
# /opt/iozone/bin/iozone -i 0 -i 1-r 128k -s 5G -t 20
Iozone: Performance Test of File I/O
Version $Revision: 3.465 $
Compiled for 64 bit mode.
Build: linux-AMD64
Contributors: William Norcott, Don Capps, Isom Crawford, Kirby Collins,
              Al Slater, Scott Rhine, Mike Wisner, Ken Goss,
              Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
              Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
              Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
              Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
              Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
              Vangel Bojaxhi, Ben England, Vikentsi Lapa,
              Alexey Skidanov.
Run began: Tue Jun 2 16:40:53 2020
File size set to 5242880 kB
Command line used: /opt/iozone/bin/iozone -i 0 -i 1-r 128k -s 5G -t 20
Output is in kBytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 kBytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
Throughput test with 20 processes
Each process writes a 5242880 kByte file in 4 kByte records
Children see throughput for 20 initial writers = 35001.12 kB/sec
Parent sees throughput for 20 initial writers = 34967.65 kB/sec
Min throughput per process = 1748.22 kB/sec
Max throughput per process = 1751.62 kB/sec
Avg throughput per process = 1750.06 kB/sec
Min xfer = 5232724.00 kB
Children see throughput for 20 rewriters = 35704.79 kB/sec
Parent sees throughput for 20 rewriters = 35704.30 kB/sec
Min throughput per process = 1783.44 kB/sec
Max throughput per process = 1786.29 kB/sec
Avg throughput per process = 1785.24 kB/sec
Min xfer = 5234532.00 kB
Children see throughput for 20 readers = 49368539.50 kB/sec
Parent sees throughput for 20 readers = 49317231.38 kB/sec
Min throughput per process = 2414424.00 kB/sec
Max throughput per process = 2599996.25 kB/sec
Avg throughput per process = 2468426.98 kB/sec
Min xfer = 4868708.00 kB
Children see throughput for 20 re-readers = 48675891.50 kB/sec
Parent sees throughput for 20 re-readers = 48617335.67 kB/sec
Min throughput per process = 2316395.25 kB/sec
Max throughput per process = 2703868.75 kB/sec
Avg throughput per process = 2433794.58 kB/sec
Min xfer = 4491704.00 kB
We also ran some dd tests. The write speed of a single dd on our
standard server is ~50MB/s, but on a server with much more memory it is
roughly double, ~80-90MB/s.
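For reference, the dd invocations were of this general form (the path
and sizes here are illustrative, not the exact values we used):

# single-stream sequential write against the CephFS mount;
# conv=fdatasync forces a flush at the end so the page cache
# doesn't inflate the reported rate
dd if=/dev/zero of=/mnt/cephfs/ddtest bs=1M count=4096 conv=fdatasync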
We have zero experience with Ceph and, as mentioned, we haven't done
much tuning at this stage. But is this sort of performance far too low
for hardware of this spec?
Any hints will be appreciated.
Cheers
D
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx