Mark,
I'll do more investigation and collect some stats.
On 11/16/2016 7:25 AM, Mark Nelson wrote:
On 11/15/2016 05:22 PM, Sage Weil wrote:
On Tue, 15 Nov 2016, Igor Fedotov wrote:
Hi All,
I've been lazily investigating a performance regression in BlueStore for the
last couple of weeks.
Here are some rather odd results I'd like to share.
Preface.
Test scenario:
(1) 4K random R/W over a pre-filled BlueStore instance using FIO.
(2) 4K random write over the same BlueStore instance using FIO.
FIO is executed against a standalone BlueStore. 64 parallel jobs work on 32K
objects of 4M each.
Min alloc size = 4K. Csum is off.
Execution time: 360 seconds.
To smooth the effect of the recent mempool/bluestore caching changes, the
config file has both the legacy and the latest caching settings:
bluestore_buffer_cache_size = 104857600
bluestore_onode_cache_size = 32768
bluestore_cache_meta_ratio = 1
bluestore_cache_size = 3147483648
Other settings are the same.
Note: (1) & (2) were executed in different order with no significant
difference.
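For anyone who wants to reproduce something similar, here is a minimal sketch
of this kind of standalone-BlueStore FIO setup. The ceph.conf values are the
ones listed above; the min_alloc_size/csum option names, the objectstore fio
engine from src/test/fio and the object-count knobs are assumptions from
memory and may need adjusting for your tree:

  ceph.conf (fragment):
      [global]
      osd objectstore = bluestore
      # option names below are assumptions for this vintage of master
      bluestore_min_alloc_size = 4096
      bluestore_csum_type = none
      bluestore_buffer_cache_size = 104857600
      bluestore_onode_cache_size = 32768
      bluestore_cache_meta_ratio = 1
      bluestore_cache_size = 3147483648

  bluestore.fio (job file for the objectstore engine):
      [global]
      ioengine=libfio_ceph_objectstore.so   # external engine built from src/test/fio
      conf=ceph.conf
      rw=randrw                             # rw=randwrite for scenario (2)
      bs=4k
      time_based=1
      runtime=360
      numjobs=64
      [bluestore]
      nrfiles=512                           # 64 jobs x 512 files ~ 32K objects
      filesize=4m

  # run with the engine library on the loader path, e.g.:
  #   LD_LIBRARY_PATH=/path/to/ceph/build/lib fio bluestore.fio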
Results for specific commits (earlier commits first):
(1) Commit: 4f09892a84da6603fdc42825fcf8c11359c1cc29 (Merge: ba5d61d 36dc236), Oct 24
    R/W:        aggrb: ~80 Mb/s for both read and write
    Write only: aggrb: ~60 Mb/s

(more untested commits here)

(2) Commit: ca1be285f97c6efa2f8aa2cebaf360abb64b78f4 (rgw: support for x-robots-tag header)
    R/W:        aggrb: ~108 Mb/s for both read and write
    Write only: aggrb: ~28 Mb/s

(3) Commit: 81295c61c4507d26ba3f80c52dd53385a4b9e9d7 (global: introduce mempool_debug config option, asok command)
    R/W:        aggrb: ~109 Mb/s for both read and write
    Write only: aggrb: ~28 Mb/s

(4) Commit: 030bc063e44e27f2abcf920f4071c4f3bb5ed9ea (os/bluestore: move most cache types into mempools)
    R/W:        aggrb: ~98 Mb/s for both read and write
    Write only: aggrb: ~27 Mb/s

(5) Commit: bcf20a1ca12ac0a7d4bd51e0beeda2877b4e0125 (os/bluestore: restructure cache trimming in terms of mempool)
    R/W:        aggrb: ~48 Mb/s for both read and write
    Write only: aggrb: ~42 Mb/s

(more untested commits here)

(6) Commit: eb8b4c8897d5614eccceab741d8c0d469efa7ce7 (Merge: 12d1d0c 8eb2c9d), pretty fresh master snapshot on Nov 14
    R/W:        aggrb: ~20 Mb/s for both read and write
    Write only: aggrb: ~15 Mb/s
Summary:
In the list above, commits (2)-(5) are sequential, while there are gaps
between (1)-(2) and (5)-(6).
It looks like we had the best R/W performance at (2) & (3), with gradual
degradation afterwards. (5) looks like the most devastating one; another
regression is somewhere between (5) and (6).
The odd thing is that we had a significant negative performance impact for
the write-only case when R/W performance was at its peak.
The exact commits causing the perf changes between (1)-(2) and (5)-(6)
haven't been investigated.
Any comments?
Hrm, I don't see the same regression in my environment... I tested both
030bc063e44e27f2abcf920f4071c4f3bb5ed9ea and
bcf20a1ca12ac0a7d4bd51e0beeda2877b4e0125 and got essentially identical
results (the latter was marginally faster). I suspect my box is neither
saturating the CPU nor bound much by the storage, so it's strictly a
critical-path latency thing. I'm also running a full OSD and using rbd
bench-write, like so:
  make vstart rbd
  MON=1 OSD=1 MDS=0 ../src/vstart.sh -n -x -l --bluestore
  bin/rbd create foo --size 1000
  bin/rbd bench-write foo --io-size 4096 --io-threads 32 \
      --io-total 100000000000 --no-rbd-cache --io-pattern rand
Mark, what are you seeing?
:/
I did a bisect a week or two ago and the biggest thing I saw was the
regression due to rocksdb losing optimization flags when we made it an
external project. That was indeed a large regression, but it should
be fixed as of last week. Otherwise I've seen a bit of variability
across commits, but nothing like what Igor's seeing. Given that (at
least in our setup) rocksdb compaction is basically the bottleneck,
performance can vary pretty greatly. This is especially true for short
runs, depending on whether or not you've hit a major compaction stall
in the test. I imagine disabling csums probably helps, but I haven't
been testing that way.
Igor, if you have time, it might be worth looking at perf and also the
rocksdb compaction statistics in the OSD logs (and throughput over
time plots) for the lowest and highest performing commits. I'm
surprised you are seeing such a large variation. It would be worth
knowing what's going on.
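If it helps, here's a rough sketch of how one might grab that data; the PID
lookup, the log path and the fio logging knobs are just examples for a
typical setup, and the compaction-stats grep assumes rocksdb's periodic stats
dump is being routed into the OSD log (debug rocksdb level high enough):

  # CPU profile of the OSD (or the standalone bluestore fio process) during the run
  perf record -g -p $(pidof ceph-osd) -- sleep 60
  perf report --stdio | head -50

  # rocksdb compaction statistics periodically dumped into the OSD log
  grep -A 30 'Compaction Stats' /var/log/ceph/ceph-osd.0.log   # or out/osd.0.log for vstart

  # throughput over time: have fio emit per-second bandwidth logs by adding
  #   write_bw_log=fio-bw
  #   log_avg_msec=1000
  # to the job file, then plot the resulting fio-bw*_bw.log files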
Mark
sage
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html