Re: block.db/block.wal device performance dropped after upgrade to 14.2.10

Hi,

I'm facing this issue too, and I see the same rocksdb log pattern Mark
attached in my own cluster, which means there are burst reads on my
block.db. I've posted some information about my case in this thread [1].
I hope you can help me figure out what's going on in my cluster.

Thanks.

[1]:
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/PHB53F3OD7QN5FG3CXGKTLWE77OHIBBO/

On Mon, Aug 10, 2020 at 8:05 PM Mark Nelson <mnelson@xxxxxxxxxx> wrote:

> Yeah, I know various folks have adopted those settings, though I'm not
> convinced they are better than our defaults.  Basically you have more,
> smaller buffers and start compacting sooner, so theoretically you should
> get a more gradual throttle along with a bunch of changes to compaction
> behavior.  But every time I've tried a setup like that I see more write
> amplification in L0, presumably due to a larger number of pglog entries
> not being tombstoned before hitting it (at least on our systems it's not
> faster at this time, and it imposes more wear on the DB device).  I
> suspect something closer to those settings will be better, though, if we
> can change the pglog to create/delete new kv pairs for every pglog entry.
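>
> For comparison, the defaults we ship look roughly like this (quoting from
> memory here, so please double-check against your build):
>
>    compression=kNoCompression,max_write_buffer_number=4,
>    min_write_buffer_number_to_merge=1,recycle_log_file_num=4,
>    write_buffer_size=268435456,writable_file_max_buffer_size=0,
>    compaction_readahead_size=2097152
>
> (all on one line in practice), i.e. a few big 256MB memtables versus the
> many 64MB ones in the tuned set quoted below.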
>
>
> In any event, that's good to know about compaction not being involved.
> I think this may be a case where the double-caching fix might help
> significantly if we stop thrashing the rocksdb block cache:
> https://github.com/ceph/ceph/pull/27705
>
>
> Mark
>
>
> On 8/10/20 2:28 AM, Manuel Lausch wrote:
> > Hi Mark,
> >
> > rocksdb compactions were one of my first ideas as well, but they don't
> > correlate.  I checked this with ceph_rocksdb_log_parser.py from
> > https://github.com/ceph/cbt.git and saw only a few compactions across
> > the whole cluster.  It didn't seem to be the problem, although the
> > compactions sometimes took several seconds.
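> >
> > In case anyone wants to do the same check without pulling in cbt, a
> > stripped-down version looks roughly like this (just a rough sketch,
> > untested as pasted here; it assumes the rocksdb EVENT_LOG_v1 json lines
> > land in the OSD log):
> >
> > #!/usr/bin/env python3
> > # List rocksdb compaction events found in a ceph-osd log.
> > import json, re, sys
> >
> > pat = re.compile(r'EVENT_LOG_v1\s+(\{.*\})')
> > for line in open(sys.argv[1], errors='replace'):
> >     m = pat.search(line)
> >     if not m:
> >         continue
> >     try:
> >         ev = json.loads(m.group(1))
> >     except ValueError:
> >         continue
> >     if ev.get('event') == 'compaction_finished':
> >         # time_micros = wall clock, compaction_time_micros = duration
> >         print(ev.get('time_micros'), 'L%s' % ev.get('output_level'),
> >               '%.2fs' % (ev.get('compaction_time_micros', 0) / 1e6))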
> >
> > BTW: I configured the following rocksdb options.
> >    bluestore rocksdb options =
> > compression=kNoCompression,max_write_buffer_number=32,min_write_buffer_number_to_merge=2,recycle_log_file_num=32,compaction_style=kCompactionStyleLevel,write_buffer_size=67108864,target_file_size_base=67108864,max_background_compactions=31,level0_file_num_compaction_trigger=8,level0_slowdown_writes_trigger=32,level0_stop_writes_trigger=64,max_bytes_for_level_base=536870912,compaction_threads=32,max_bytes_for_level_multiplier=8,flusher_threads=8,compaction_readahead_size=2MB
> >
> > This reduced some IO spikes, but the slow ops issue during snap trimming
> > was not affected by it.
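> >
> > For anyone who wants to try the same settings: the whole value has to
> > stay on a single line in ceph.conf, and the OSDs only pick it up on
> > restart.  On Nautilus something like this should also work (untested in
> > exactly this form):
> >
> >    ceph config set osd bluestore_rocksdb_options "<the full option string above>"
> >
> > followed by restarting the OSDs, since rocksdb only reads the options
> > when the store is opened.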
> >
> >
> > Manuel
> >
> > On Fri, 7 Aug 2020 09:43:51 -0500
> > Mark Nelson <mnelson@xxxxxxxxxx> wrote:
> >
> >> That is super interesting regarding scrubbing.  I would have expected
> >> that to be affected as well.  Any chance you can check and see if
> >> there is any correlation between rocksdb compaction events, snap
> >> trimming, and increased disk reads?  Also (sorry if you already
> >> answered this): do we know for sure that it's hitting the
> >> block.db/block.wal device?  I suspect it is, just wanted to verify.
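> >>
> >> One quick way to check that I'd try (just a sketch; substitute your own
> >> osd id and device name):
> >>
> >>     ceph osd metadata 0 | grep -E 'bluefs_(db|wal)_partition_path'
> >>     iostat -x nvme0n1 1
> >>
> >> The first shows which partition bluefs is using for the db/wal, and the
> >> second should show whether the read burst lands on that device while a
> >> snap trim is running.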
> >>
> >>
> >> Mark
> >>
> >>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


