Re: block.db/block.wal device performance dropped after upgrade to 14.2.10

Yeah, I know various folks have adopted those settings, though I'm not convinced they are better than our defaults.  Basically you have more, smaller buffers and start compacting sooner, so in theory you get a more gradual throttle, along with a bunch of other changes to compaction behavior.  But every time I've tried a setup like that, I see more write amplification in L0, presumably because a larger number of pglog entries are not tombstoned before hitting it (at least on our systems it's no faster at this point, and it imposes more wear on the DB device).  I suspect something closer to those settings will be better, though, if we can change the pglog to create/delete new kv pairs for every pglog entry.
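
For reference, the stock setting these are being compared against looks roughly like this on a Nautilus-era build (it changes between releases, so double-check what your OSDs actually run with "ceph daemon osd.N config get bluestore_rocksdb_options"):

   bluestore rocksdb options = compression=kNoCompression,max_write_buffer_number=4,min_write_buffer_number_to_merge=1,recycle_log_file_num=4,write_buffer_size=268435456,writable_file_max_buffer_size=0,compaction_readahead_size=2097152,max_background_compactions=2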


In any event, that's good to know about compaction not being involved.  I think this may be a case where the double-caching fix could help significantly, if it stops us from thrashing the rocksdb block cache: https://github.com/ceph/ceph/pull/27705
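
If you want a rough way to see whether the block cache is getting thrashed while this is going on (osd.0 below is just a stand-in for one of the affected OSDs), you can snapshot the cache-related settings and the rocksdb perf counters before and after a snap trim window and compare:

   ceph daemon osd.0 config show | egrep 'bluestore_cache|osd_memory_target'
   ceph daemon osd.0 perf dump rocksdb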


Mark


On 8/10/20 2:28 AM, Manuel Lausch wrote:
Hi Mark,

rocksdb compaction was one of my first ideas as well, but it doesn't
correlate. I checked this with ceph_rocksdb_log_parser.py from
https://github.com/ceph/cbt.git.
I saw only a few compactions on the whole cluster. It didn't seem to be
the problem, although the compactions sometimes took several seconds.
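
Roughly what I ran, in case anyone wants to reproduce the check (the exact location of the script inside the cbt checkout may differ, so look around the repo):

   git clone https://github.com/ceph/cbt.git
   # point the parser at an OSD log that contains the rocksdb event lines
   python ceph_rocksdb_log_parser.py /var/log/ceph/ceph-osd.0.log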

BTW: I configured the following rocksdb options.
   bluestore rocksdb options = compression=kNoCompression,max_write_buffer_number=32,min_write_buffer_number_to_merge=2,recycle_log_file_num=32,compaction_style=kCompactionStyleLevel,write_buffer_size=67108864,target_file_size_base=67108864,max_background_compactions=31,level0_file_num_compaction_trigger=8,level0_slowdown_writes_trigger=32,level0_stop_writes_trigger=64,max_bytes_for_level_base=536870912,compaction_threads=32,max_bytes_for_level_multiplier=8,flusher_threads=8,compaction_readahead_size=2MB

This reduced some IO spikes, but the slow-ops issue during snap
trimming was not affected by it.
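
For anyone who wants to try these options: bluestore only reads them when the store is opened, so setting them isn't enough on its own; the OSDs have to be restarted. A minimal sketch (osd.0 as an example):

   ceph config set osd bluestore_rocksdb_options '<options string from above>'
   systemctl restart ceph-osd@0
   # verify what the OSD actually came up with
   ceph daemon osd.0 config get bluestore_rocksdb_options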


Manuel

On Fri, 7 Aug 2020 09:43:51 -0500
Mark Nelson <mnelson@xxxxxxxxxx> wrote:

That is super interesting regarding scrubbing.  I would have expected
that to be affected as well.  Any chance you can check and see if
there is any correlation between rocksdb compaction events, snap
trimming, and increased disk reads?  Also (sorry if you already
answered this): do we know for sure that it's hitting the
block.db/block.wal device?  I suspect it is, just wanted to verify.
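
A crude way to line those up (assuming /dev/nvme0n1 stands in for the db/wal device, and the exact log strings depend on your debug levels) would be something like:

   # per-second extended stats for the db/wal device, with timestamps
   iostat -xmt 1 /dev/nvme0n1 > iostat.log &
   # pull compaction and snap trim events out of the OSD log
   grep -iE 'compaction|snap.trim' /var/log/ceph/ceph-osd.0.log

and then match the timestamps against the read spikes.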


Mark


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



