Regarding split & merge, I have the default values:
filestore_merge_threshold = 10
filestore_split_multiple = 2
According to https://bugzilla.redhat.com/show_bug.cgi?id=1219974, the recommended values are:
filestore_merge_threshold = 40
filestore_split_multiple = 8
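For context, FileStore splits a PG subdirectory once it holds more than filestore_split_multiple * abs(filestore_merge_threshold) * 16 objects, so the defaults split at 2 * 10 * 16 = 320 objects per directory while the recommended values split at 8 * 40 * 16 = 5120. A minimal sketch of applying the recommended values, assuming they go into the [osd] section of ceph.conf and that runtime injection is acceptable:

[osd]
filestore_merge_threshold = 40
filestore_split_multiple = 8

# ceph tell osd.* injectargs '--filestore_merge_threshold 40 --filestore_split_multiple 8'

Note that changing these values only affects future split/merge decisions; directories that have already been split stay split. If I remember correctly, ceph-objectstore-tool --op apply-layout-settings can re-apply the layout to an existing pool offline, but please verify that before relying on it.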
I ran tests on 4 pools: 2 replicated pools (size 3) and 2 EC pools (k=6, m=3).
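For completeness, pools of that shape could be created roughly like this (the profile name and PG counts below are placeholders, not necessarily what I used; the benchmark pool names are the ones that appear in the results further down):

# ceph osd erasure-code-profile set ec63 k=6 m=3 crush-failure-domain=host
# ceph osd pool create benchmark_erasure_coded 1024 1024 erasure ec63
# ceph osd pool create benchmark_replicated 1024 1024 replicated
# ceph osd pool set benchmark_replicated size 3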
The pool with the lowest bandwidth has a PG directory structure on the OSD like the following (a way to inspect the split depth is sketched after the tree):
├── 20.115s1_head
│   └── DIR_5
│       └── DIR_1
│           ├── DIR_1
│           │   ├── DIR_0
│           │   ├── DIR_1
│           │   ├── DIR_2
│           │   │   ├── DIR_0
│           │   │   ├── DIR_1
│           │   │   ├── DIR_2
│           │   │   ├── DIR_3
│           │   │   ├── DIR_4
│           │   │   ├── DIR_5
│           │   │   ├── DIR_6
│           │   │   ├── DIR_7
│           │   │   ├── DIR_8
│           │   │   ├── DIR_9
│           │   │   ├── DIR_A
│           │   │   ├── DIR_B
│           │   │   ├── DIR_C
│           │   │   ├── DIR_D
│           │   │   ├── DIR_E
│           │   │   └── DIR_F
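To check how far the splitting has gone and how many object files sit in the PG, something like the following can be run on the OSD host (the OSD id 0 is just an example; the path follows the standard FileStore layout):

# find /var/lib/ceph/osd/ceph-0/current/20.115s1_head -mindepth 1 -type d | wc -l   # number of split subdirectories
# find /var/lib/ceph/osd/ceph-0/current/20.115s1_head -type f | wc -l               # number of object files in the PG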
Test results:
# rados bench -p default.rgw.buckets.data 10 write
hints = 1
Maintaining 16 concurrent writes of 4194432 bytes to objects of size 4194432 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_sg08-09_180679
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 129 113 451.975 452.014 0.0376714 0.128027
2 16 209 193 385.964 320.01 0.119609 0.138517
3 16 235 219 291.974 104.003 0.0337624 0.13731
4 16 235 219 218.981 0 - 0.13731
5 16 266 250 199.983 62.0019 0.111673 0.238424
6 16 317 301 200.649 204.006 0.0340569 0.298489
7 16 396 380 217.124 316.01 0.0379956 0.283458
8 16 444 428 213.981 192.006 0.0304383 0.274193
9 16 485 469 208.426 164.005 0.391956 0.283421
10 16 496 480 191.983 44.0013 0.104497 0.292074
11 16 497 481 174.894 4.00012 0.999985 0.293545
12 16 497 481 160.32 0 - 0.293545
13 16 497 481 147.987 0 - 0.293545
14 16 497 481 137.417 0 - 0.293545
Total time run: 14.493353
Total writes made: 497
Write size: 4194432
Object size: 4194432
Bandwidth (MB/sec): 137.171
Stddev Bandwidth: 147.001
Max bandwidth (MB/sec): 452.014
Min bandwidth (MB/sec): 0
Average IOPS: 34
Stddev IOPS: 36
Max IOPS: 113
Min IOPS: 0
Average Latency(s): 0.464281
Stddev Latency(s): 1.09388
Max latency(s): 6.3723
Min latency(s): 0.023835
Cleaning up (deleting benchmark objects)
Removed 497 objects
Clean up completed and total clean up time :10.622382
#
# rados bench -p benchmark_erasure_coded 10 write
hints = 1
Maintaining 16 concurrent writes of 4202496 bytes to objects of size 4202496 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_sg08-09_180807
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 424 408 1635.11 1635.19 0.0490434 0.0379616
2 16 828 812 1627.03 1619.16 0.0616501 0.0388467
3 16 1258 1242 1659.06 1723.36 0.0304412 0.0384537
4 16 1659 1643 1646.03 1607.13 0.0155402 0.0387351
5 16 2053 2037 1632.61 1579.08 0.0453354 0.0390236
6 16 2455 2439 1629 1611.14 0.0485313 0.0392376
7 16 2649 2633 1507.34 777.516 0.0148972 0.0393161
8 16 2858 2842 1423.61 837.633 0.0157639 0.0449088
9 16 3245 3229 1437.75 1551.02 0.0200845 0.0444847
10 16 3629 3613 1447.85 1539 0.0654451 0.0441569
Total time run: 10.229591
Total writes made: 3630
Write size: 4202496
Object size: 4202496
Bandwidth (MB/sec): 1422.18
Stddev Bandwidth: 341.609
Max bandwidth (MB/sec): 1723.36
Min bandwidth (MB/sec): 777.516
Average IOPS: 354
Stddev IOPS: 85
Max IOPS: 430
Min IOPS: 194
Average Latency(s): 0.0448612
Stddev Latency(s): 0.0712224
Max latency(s): 1.08353
Min latency(s): 0.0134629
Cleaning up (deleting benchmark objects)
Removed 3630 objects
Clean up completed and total clean up time :2.321669
#
# rados bench -p volumes 10 write
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_sg08-09_180651
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 336 320 1279.89 1280 0.0309006 0.0472524
2 16 653 637 1273.84 1268 0.0465151 0.0495701
3 16 956 940 1253.17 1212 0.0337327 0.0504146
4 16 1256 1240 1239.85 1200 0.0177263 0.0509145
5 16 1555 1539 1231.05 1196 0.0364991 0.0516724
6 16 1868 1852 1234.51 1252 0.0260964 0.0510236
7 16 2211 2195 1254.13 1372 0.040738 0.050847
8 16 2493 2477 1238.35 1128 0.0228582 0.0514979
9 16 2838 2822 1254.07 1380 0.0265224 0.0508641
10 16 3116 3100 1239.85 1112 0.0160151 0.0513104
Total time run: 10.192091
Total writes made: 3117
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 1223.3
Stddev Bandwidth: 89.9383
Max bandwidth (MB/sec): 1380
Min bandwidth (MB/sec): 1112
Average IOPS: 305
Stddev IOPS: 22
Max IOPS: 345
Min IOPS: 278
Average Latency(s): 0.0518144
Stddev Latency(s): 0.0529575
Max latency(s): 0.663523
Min latency(s): 0.0122169
Cleaning up (deleting benchmark objects)
Removed 3117 objects
Clean up completed and total clean up time :0.212296
#
# rados bench -p benchmark_replicated 10 write
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_sg08-09_180779
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 309 293 1171.94 1172 0.0233267 0.0508877
2 16 632 616 1231.87 1292 0.0258237 0.049612
3 16 959 943 1257.19 1308 0.0335615 0.0483464
4 16 1276 1260 1259.85 1268 0.031461 0.0504689
5 16 1643 1627 1301.44 1468 0.0274032 0.0489651
6 16 1991 1975 1316.51 1392 0.0408116 0.0483596
7 16 2328 2312 1320.98 1348 0.0242298 0.048175
8 16 2677 2661 1330.33 1396 0.097513 0.047962
9 16 3042 3026 1344.72 1460 0.0196724 0.0474078
10 16 3384 3368 1347.03 1368 0.0426199 0.0472573
Total time run: 10.482871
Total writes made: 3384
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 1291.25
Stddev Bandwidth: 90.4861
Max bandwidth (MB/sec): 1468
Min bandwidth (MB/sec): 1172
Average IOPS: 322
Stddev IOPS: 22
Max IOPS: 367
Min IOPS: 293
Average Latency(s): 0.048763
Stddev Latency(s): 0.0547666
Max latency(s): 0.938211
Min latency(s): 0.0121556
Cleaning up (deleting benchmark objects)
Removed 3384 objects
Clean up completed and total clean up time :0.239684
#
Luis, why did you advise against increasing pg_num/pgp_num? I'm wondering which option is better: increasing pg_num, or raising filestore_merge_threshold and filestore_split_multiple?
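(For reference, default.rgw.buckets.data managed only ~137 MB/s above, versus ~1.2-1.4 GB/s on the other three pools.) The pg_num route would look roughly like this; the target value 2048 is only a placeholder, pgp_num has to follow pg_num, and the resulting PG splitting and backfill can impact client I/O while it runs:

# ceph osd pool set default.rgw.buckets.data pg_num 2048
# ceph osd pool set default.rgw.buckets.data pgp_num 2048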
Thanks
Jakub
On Thu, Feb 1, 2018 at 9:38 AM, Jaroslaw Owsiewski <jaroslaw.owsiewski@xxxxxxxxxx> wrote:
Hi,

maybe "split is on the floor"?

Regards
--
Jarek
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com