Hi All,
I've been lazily investigating a performance regression in BlueStore for
the last couple of weeks.
Here are some pretty odd results I'd like to share.
Preface.
Test scenario:
(1) 4K random RW over pre-filled BlueStore instance using FIO.
(2) 4K random Write over the same BlueStore instance using FIO.
FIO was executed against a standalone BlueStore instance: 64 parallel
jobs working on 32K objects of 4M each.
Min alloc size = 4K. CSum is off.
Execution time - 360 seconds.
To smooth out the effect of the recent mempool/bluestore caching changes,
the config file has both the legacy and the latest caching settings:
bluestore_buffer_cache_size = 104857600
bluestore_onode_cache_size = 32768
bluestore_cache_meta_ratio = 1
bluestore_cache_size = 3147483648
Other settings are the same.
Note: (1) & (2) were executed in different order with no significant
difference.
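For reference, the FIO job file looks roughly like the following. This is a sketch only: the ioengine path and conf filename are illustrative, and the option names follow the objectstore-plugin example shipped in the ceph tree under src/test/fio. 32K objects across 64 jobs works out to 512 files per job.

```
[global]
; ceph's fio objectstore plugin, loaded as an external engine
ioengine=external:/usr/local/lib/libfio_ceph_objectstore.so
; conf carries the BlueStore settings listed above
conf=ceph-bluestore.conf
; rw=randrw for scenario (1), rw=randwrite for scenario (2)
rw=randrw
bs=4k
numjobs=64
time_based=1
runtime=360s

[bluestore]
; 512 files per job x 64 jobs = 32K objects
nr_files=512
size=4m
```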
Results for specific commits (earlier commits first):
(1) Commit: 4f09892a84da6603fdc42825fcf8c11359c1cc29 (Merge: ba5d61d
36dc236) Oct 24
R/W: aggrb: ~80 Mb/s for both read and write
Write only: aggrb: ~60 Mb/s
(more untested commits here)
(2) Commit: ca1be285f97c6efa2f8aa2cebaf360abb64b78f4 (rgw: support for
x-robots-tag header)
R/W: aggrb: ~108 Mb/s for both read and write
Write only: aggrb: ~28 Mb/s
(3) Commit: 81295c61c4507d26ba3f80c52dd53385a4b9e9d7 (global: introduce
mempool_debug config option, asok command)
R/W: aggrb: ~109 Mb/s for both read and write
Write only: aggrb: ~28 Mb/s
(4) Commit: 030bc063e44e27f2abcf920f4071c4f3bb5ed9ea (os/bluestore: move
most cache types into mempools)
R/W: aggrb: ~98 Mb/s for both read and write
Write only: aggrb: ~27 Mb/s
(5) Commit: bcf20a1ca12ac0a7d4bd51e0beeda2877b4e0125 (os/bluestore:
restructure cache trimming in terms of mempool)
R/W: aggrb: ~48 Mb/s for both read and write
Write only: aggrb: ~42 Mb/s
(more untested commits here)
(6) Commit: eb8b4c8897d5614eccceab741d8c0d469efa7ce7 (Merge: 12d1d0c
8eb2c9d). (Pretty fresh master snapshot on Nov 14)
R/W: aggrb: ~20 Mb/s for both read and write
Write only: aggrb: ~15 Mb/s
Summary:
In the list above commits (2)-(5) are sequential, while there are gaps
between (1)-(2) and (5)-(6).
It looks like we had the best R/W performance at (2) & (3), with gradual
degradation afterwards. (5) looks like the most devastating one;
another regression sits somewhere between (5) and (6).
The odd thing is that the write-only case took a significant negative
hit at the very commits where R/W performance was at its max.
The exact commits causing the perf changes between (1)-(2) and (5)-(6)
weren't investigated.
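The remaining gaps could be narrowed down with a bisect between the known-good and known-bad snapshots above. A sketch for the (5)-(6) gap, assuming a ceph checkout and a hypothetical wrapper script that rebuilds, reruns the FIO scenario, and exits non-zero when aggrb drops well below the known-good figure:

```shell
# (5) = last measured commit with ~48 Mb/s R/W, (6) = Nov 14 master with ~20 Mb/s
git bisect start
git bisect bad  eb8b4c8897d5614eccceab741d8c0d469efa7ce7   # commit (6)
git bisect good bcf20a1ca12ac0a7d4bd51e0beeda2877b4e0125   # commit (5)
# run-fio-regression.sh is a hypothetical wrapper: build, run the job,
# exit 0 if aggrb stays near the known-good number, non-zero otherwise
git bisect run ./run-fio-regression.sh
```

The same procedure would work for the (1)-(2) gap with the respective hashes.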
Any comments?
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html