Re: Ceph luminous - throughput performance issue

Steven,

I've recently done some performance testing on Dell hardware. Here are some of my (admittedly messy) results. I was mainly testing the effect of the R0 stripe size on the PERC card. Each disk gets its own single-disk R0 virtual disk so that write-back caching is enabled. The VDs were created like this, varying only the stripesize: `omconfig storage controller controller=1 action=createvdisk raid=r0 size=max pdisk=0:0:0 name=sdb readpolicy=ra writepolicy=wb stripesize=1mb`.
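If it helps anyone script this, here is a minimal sketch of creating one R0 VD per physical disk; the slot range and VD names are assumptions for a 12-slot enclosure, and the stripesize is whatever you happen to be testing:

# One single-disk R0 VD per physical disk on controller 1, read-ahead plus
# write-back cache, 64k stripe for this run (slot range 0-11 is an assumption)
for slot in $(seq 0 11); do
    omconfig storage controller controller=1 action=createvdisk \
        raid=r0 size=max pdisk=0:0:${slot} name="r0-slot${slot}" \
        readpolicy=ra writepolicy=wb stripesize=64kb
done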

I have a few generations of PERC cards in my cluster, and a single-disk R0 with at least a 64k stripe size seems to work well. R0 is better for writes than the non-RAID/JBOD option of some PERC cards because it uses the controller's write-back cache, which matters especially in my situation where there are no SSD journals in place. The stripe size does make a difference; larger seems better up to a point for mixed cluster use. There are a ton of different configurations to test, but I only tried a few, focused on writes.
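Before benchmarking I double-check what policy each VD actually ended up with; either of these works, depending on which tool you have installed (adjust the controller/adapter IDs to your box):

# OMSA: list the VDs on controller 1 with their read/write/cache policies
omreport storage vdisk controller=1

# MegaCli: cache properties of all logical drives on adapter 0
megacli -LDGetProp -Cache -LALL -a0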

Kevin


First setup: R440, PERC H840 with 2 MD1400 enclosures attached, 12 x 10TB NL-SAS drives per MD1400. XFS filestore with a 10GB journal LV on each 10TB disk. The Ceph cluster is set up as a single mon/mgr/OSD server for testing. These tables pasted well in my email client; hopefully they stay that way.
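Roughly how each disk was carved up; the VG/LV names below are placeholders rather than the exact ones used here, and this assumes luminous ceph-volume for building the filestore OSD:

# Per 10TB disk: a VG with a 10GB journal LV and a data LV for the filestore OSD
pvcreate /dev/sdb
vgcreate ceph-sdb /dev/sdb
lvcreate -L 10G -n journal-sdb ceph-sdb
lvcreate -l 100%FREE -n data-sdb ceph-sdb

# Build the filestore OSD from the two LVs
ceph-volume lvm create --filestore --data ceph-sdb/data-sdb --journal ceph-sdb/journal-sdb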


rados bench options                  stripe      avg MB/s   avg IOPS   avg lat (s)
120 write -b 4M -t 16                512 bytes   1109       -          0.057676
120 write -b 4M -t 16 --no-cleanup   512 bytes   1098       -          0.0582565
120 seq -t 16                        512 bytes   993        248        0.0634972
120 rand -t 16                       512 bytes   1089       272        0.05789
120 write -b 4M -t 16                128         1012       252        0.0631924
120 write -b 4M -t 16 --no-cleanup   128         923        230        0.069259
120 seq -t 16                        128         930        232        0.0678104
120 rand -t 16                       128         1076       269        0.0585474
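The rows above abbreviate the commands; written out in full they look like the following (the pool name is whatever you are benchmarking against, rbd in Steven's case):

# 120s of 4M writes with 16 threads, leaving the objects in place for the read tests
rados bench -p rbd 120 write -b 4M -t 16 --no-cleanup

# sequential and random reads over the objects left behind
rados bench -p rbd 120 seq -t 16
rados bench -p rbd 120 rand -t 16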


rados bench options                                    stripe   MB/s      IOPS    latency (s)   notes
120 write -b 4M -t 16                                  1m       1121.9    272     0.0570402
120 write -b 4M -t 16                                  64k      1121.84   280     0.0570439
120 write -b 4M -t 16                                  256k     1122      285     0.0570363
120 write -b 64K -t 16                                 256k     909.451   14551   0.00109852
120 write -b 64K -t 16                                 64k      726.114   11617   0.00137608
120 write -b 64K -t 16                                 1m       879.748   14075   0.00113562
120 rand -t 16 --run-name seph34                       1m       731       182     0.0863446
120 seq -t 16 --run-name seph34                        1m       587       146     0.10759
120 seq -t 16 --run-name seph35                        1m       806       200     0.157         2 hosts at the same time
120 write -b 4M -t 16 --run-name seph34 --no-cleanup   64k      1179      294     0.10848       2 hosts at the same time
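The "2 hosts at the same time" rows came from driving the pool from two client hosts concurrently; a sketch of that, assuming seph34/seph35 are simply the per-host run names:

# client host 1
rados bench -p rbd 120 write -b 4M -t 16 --run-name seph34 --no-cleanup

# client host 2, started at the same time; a distinct --run-name keeps each
# client's benchmark metadata object from clobbering the other's
rados bench -p rbd 120 write -b 4M -t 16 --run-name seph35 --no-cleanup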


Another round of testing used an R740xd with a PERC H740P and 24 x 1.2TB 10K SAS drives, comparing filestore and bluestore; the filestore OSDs again have a 10GB journal LV. The cluster is a single-node mon/mgr/OSD server. This hardware was being evaluated for a small RBD pool, so rbd bench was used.
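The table rows below abbreviate the rbd bench invocations; spelled out they look like this (the benchmark pool and img1 image match the tables, while the PG count and image size are assumptions):

# one-off setup: pool and a test image large enough to hold the 100G of writes
ceph osd pool create benchmark 512
rbd pool init benchmark
rbd create benchmark/img1 --size 200G

# 8K sequential writes, 16 threads, 100G total
rbd bench --io-type write --io-size 8K --io-threads 16 --io-total 100G \
    --io-pattern seq benchmark/img1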


filestore (all rows: rbd bench --io-type write --io-total 100G --io-pattern seq benchmark/img1)

io-size   threads   stripe   IOPS       bytes/s        seconds
8K        16        128k     69972.04   573210992.44   187
8K        32        128k     70382.53   576573665.28   186
8K        16        512k     79604.55   652120481.6    164
8K        32        512k     75002.82   614423091.87   174
8K        16        1m       71811.46   588279455.86   182
8K        32        1m       87000.07   712704574.26   150
4K        16        128k     86682.94   355053334.01   302
4K        32        128k     97065.03   397578370.73   270
4K        16        512k     87254.94   357396223.51   300
4K        32        512k     87607.66   358840973.73   299
4K        16        1m       78349.87   320921084.1    334
4K        32        1m       95970.79   393096346.89   273

bluestore (all rows: rbd bench --io-type write --io-total 100G --io-pattern seq benchmark/img1)

io-size   threads   stripe   IOPS       bytes/s        seconds
8K        16        128k     83534.76   684316715.54   156
8K        32        128k     74905.4    613624999.05   174
8K        16        512k     85308.67   698848604.42   153
8K        32        512k     80554.28   659900691.9    162
8K        16        1m       84095.15   688907491.3    155
8K        32        1m       87225.77   714553534.11   -
4K        16        128k     71581.88   293199373.1    366
4K        32        128k     96539.72   395426709.79   271
4K        16        512k     81795.88   335035938.39   320
4K        32        512k     80033.48   327817115.85   327
4K        16        1m       79217.56   324475135.8    330
4K        32        1m       90807.8    371948765.18   288





On 01/31/2018 09:39 AM, Steven Vacaroaia wrote:
Hi,

Is there anyone using Dell servers with PERC controllers willing to provide advice on configuring them for good throughput performance?

I have 3 servers with 1 SSD and 3 HDDs each
All drives are enterprise grade

                Connector          : 00<Internal><Encl Pos 1 >: Slot 0
                Vendor Id          : TOSHIBA
                Product Id         : PX04SHB040
                State              : Online
                Disk Type          : SAS,Solid State Device
                Capacity           : 372.0 GB
                Power State        : Active

                Connector          : 00<Internal><Encl Pos 1 >: Slot 1
                Vendor Id          : TOSHIBA
                Product Id         : AL13SEB600
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 558.375 GB
                Power State        : Active


Created OSDs with separate WAL (1 GB) and DB (15 GB) partitions on the SSD

rados bench results are abysmal

The interesting part is that testing the drives directly with fio is also pretty bad - that is why I am thinking that my controller config might be the culprit

See below the results using various configs

 Commands used 

 megacli -LDInfo -LALL -a0

fio --filename=/dev/sd[a-b]  --direct=1 --sync=1 --rw=write --bs=4k --numjobs=5 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test



SSD drive
Current Cache Policy: WriteThrough, ReadAheadNone, Cached, No Write Cache if Bad BBU
Jobs: 5 (f=5): [W(5)] [100.0% done] [0KB/125.2MB/0KB /s] [0/32.5K/0 iops] [eta 00m:00s]

Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Jobs: 5 (f=5): [W(5)] [100.0% done] [0KB/224.8MB/0KB /s] [0/57.6K/0 iops] [eta 00m:00s]



HDD drive

Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Jobs: 5 (f=5): [W(5)] [100.0% done] [0KB/77684KB/0KB /s] [0/19.5K/0 iops] [eta 00m:00s]


Current Cache Policy: WriteBack, ReadAdaptive, Cached, No Write Cache if Bad BBU
Jobs: 5 (f=5): [W(5)] [100.0% done] [0KB/89036KB/0KB /s] [0/22.3K/0 iops] [eta 00m:00s]

rados bench -p rbd 120 write -t 64 -b 4096 --no-cleanup && rados bench -p rbd 120 -t 64 seq

Total time run:         120.009091
Total writes made:      630542
Write size:             4096
Object size:            4096
Bandwidth (MB/sec):     20.5239
Stddev Bandwidth:       2.43418
Max bandwidth (MB/sec): 37.0391
Min bandwidth (MB/sec): 15.9336
Average IOPS:           5254
Stddev IOPS:            623
Max IOPS:               9482
Min IOPS:               4079
Average Latency(s):     0.0121797
Stddev Latency(s):      0.0208528
Max latency(s):         0.428262
Min latency(s):         0.000859286


Total time run:       88.954502
Total reads made:     630542
Read size:            4096
Object size:          4096
Bandwidth (MB/sec):   27.6889
Average IOPS:         7088
Stddev IOPS:          1701
Max IOPS:             8923
Min IOPS:             1413
Average Latency(s):   0.00901481
Max latency(s):       0.946848
Min latency(s):       0.000286236



_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
