On 03/27/2018 08:43 PM, Josh Durgin wrote:
Hi Lisa, your presentation last week at Cephalocon was quite convincing.
Recordings aren't available yet, so perhaps you can share your slides.
For those who weren't there, Lisa tested many configurations of rocksdb
with bluestore to attempt to keep the pg log out of level 0 in rocksdb,
and thus avoid a large source of write amplification.
None of these tunings were successful, so the conclusion was that the pg
log ought to be stored outside of rocksdb.
Lisa, what are your thoughts on how to store the pg log?
For historical reference, it was moved into leveldb originally to make
it easier to program against correctly [0], but the current PGLog code
has grown too complex despite that.
FWIW, here's a link to our discussion on the list a while back regarding
the same topic:
https://www.spinics.net/lists/ceph-devel/msg38975.html
Beyond just rocksdb: Last month when I was on break I forked memstore
and started tweaking it to run faster. I noticed that beyond how bad
pglog/dup_ops (and maybe pginfo) are for rocksdb, they also cause
significant CPU overhead when doing random writes to a completely
in-memory object representation. I saw maybe 10-20% increased
performance while also dropping cpu consumption from ~5 cores to ~3
cores after hacking log_operation (and a couple of other things) out.
We were spending more wallclock time dealing with pglog/dup_ops/pginfo
than we were actually writing out the object data/metadata.
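As a rough back-of-the-envelope reading of the numbers above, the sketch
below works out what they imply per core. It assumes the approximate
figures (10-20% more throughput, ~5 cores down to ~3) can be treated as
exact, which they are not; the function name is made up for illustration.

```python
def per_core_gain(throughput_gain, cores_before, cores_after):
    """Ratio of work done per core after the change vs. before.

    throughput_gain: overall speedup factor (1.10 means 10% faster).
    cores_before / cores_after: approximate CPU cores consumed.
    """
    return throughput_gain * cores_before / cores_after


# Figures from the message, treated as exact for the sake of the estimate.
low = per_core_gain(1.10, 5, 3)
high = per_core_gain(1.20, 5, 3)
print(f"work per core: {low:.2f}x to {high:.2f}x")
```

Under those assumptions each core does a bit under twice as much useful
work once log_operation is out of the path, which is in line with the
wallclock observation above.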
Mark
Josh
[0]
https://github.com/ceph/ceph/commit/1ef94200e9bce5e0f0ac5d1e563421a9d036c203
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html