Re: storing pg logs outside of rocksdb

On Wed, Mar 28, 2018 at 9:43 AM, Josh Durgin <jdurgin@xxxxxxxxxx> wrote:
> Hi Lisa, your presentation last week at Cephalocon was quite convincing.
>
> Recordings aren't available yet, so perhaps you can share your slides.

Here are the slides:
https://drive.google.com/file/d/1WC0id77KWLNVllsEcJCgRgEQ-Xzvzqx8/view?usp=sharing
>
> For those who weren't there, Lisa tested many configurations of rocksdb
> with bluestore to attempt to keep the pg log out of level 0 in rocksdb,
> and thus avoid a large source of write amplification.
>
> None of these tunings were successful, so the conclusion was that the pg
> log ought to be stored outside of rocksdb.
>
> Lisa, what are your thoughts on how to store the pg log?
>
> For historical reference, it was moved into leveldb originally to make
> it easier to program against correctly [0], but the current PGLog code
> has grown too complex despite that.
I once wondered whether we could just put the pg log in standalone log
files. Read performance is not critical, since they are read when an
OSD node recovers. That is, store the other metadata in RocksDB and
store the pg log in standalone journal files, with no transaction
spanning the other metadata and the pg log. But then I noticed that we
can't tell which OSD has the latest data if the 3 OSD nodes that hold
the same pgs all fail during a write request. Some OSDs may have the
updated data and others the un-updated data, while none of them has
the pg log entry appended. In that case, recovery would need to
compare the full objects.
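
To make the failure case above concrete, here is a minimal sketch. The
Replica class and its crash flags are hypothetical names for
illustration only, not Ceph code. It models data and pg log as two
independent writes and shows three replicas ending up with identical
log heads but different object contents.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical model of one OSD's copy of a PG (not Ceph code)."""
    name: str
    objects: dict = field(default_factory=dict)
    pg_log: list = field(default_factory=list)  # (version, object name)

    def write(self, version, obj, data,
              crash_before_data=False, crash_before_log=False):
        # Data and log are two separate writes; no transaction covers both.
        if crash_before_data:
            return
        self.objects[obj] = data
        if crash_before_log:
            return
        self.pg_log.append((version, obj))

    def last_logged_version(self):
        return self.pg_log[-1][0] if self.pg_log else 0


def replicas_after_failed_write():
    osds = [Replica("osd.0"), Replica("osd.1"), Replica("osd.2")]
    for osd in osds:
        osd.write(1, "obj", "old")  # version 1 completes everywhere
    # All three OSDs fail while version 2 is in flight.
    osds[0].write(2, "obj", "new", crash_before_log=True)
    osds[1].write(2, "obj", "new", crash_before_log=True)
    osds[2].write(2, "obj", "new", crash_before_data=True)
    return osds


osds = replicas_after_failed_write()
log_heads = {osd.last_logged_version() for osd in osds}
contents = {osd.name: osd.objects["obj"] for osd in osds}
print(log_heads)  # every replica reports the same log head
print(contents)   # yet the object contents differ
```

Since the logs agree, only a comparison of the objects themselves can
reveal which replicas hold the newer data.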

Another method I am investigating is whether, within RocksDB, we can
use a FIFO scheme just for the pg log. That means we need to handle it
for each pg. This would require changes in RocksDB, and every pg log
entry would be written at most twice (once to the RocksDB log, and
once to level 0).
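
The "written twice at most" bound can be checked with simple
arithmetic. The sketch below is only an accounting model under stated
assumptions: the function names, the entry size, and the
compaction_rewrites parameter are all hypothetical, and it assumes
every entry reaches level 0 (the upper bound).

```python
def bytes_written(entry_bytes, num_entries, compaction_rewrites=0):
    """Total bytes hitting disk for a batch of pg log entries.

    Assumes each entry is written once to the RocksDB log and once to
    level 0, plus compaction_rewrites extra times if later compaction
    rewrites it (hypothetical parameter; 0 models the FIFO idea).
    """
    payload = entry_bytes * num_entries
    wal = payload                      # write to the RocksDB log
    level0 = payload                   # flush to level 0
    compaction = payload * compaction_rewrites
    return wal + level0 + compaction


def write_amplification(entry_bytes, num_entries, compaction_rewrites=0):
    payload = entry_bytes * num_entries
    return bytes_written(entry_bytes, num_entries, compaction_rewrites) / payload


# Illustrative numbers only: 3000 entries of 200 bytes each.
fifo_wa = write_amplification(200, 3000)
one_rewrite_wa = write_amplification(200, 3000, compaction_rewrites=1)
print(fifo_wa, one_rewrite_wa)
```

With no compaction rewrites the factor is 2, matching the bound above;
each additional rewrite of the same entries adds one to it.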

Any suggestions?

>
> Josh
>
> [0]
> https://github.com/ceph/ceph/commit/1ef94200e9bce5e0f0ac5d1e563421a9d036c203
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Best wishes
Lisa


