Re: osd daemons still reading disks at full speed while there is no pool activity

Hi Nikola,

> yes, some nodes have stray PGs (1..5). Shall I do something about those?

No need to do anything - Ceph will clean those up itself (and is doing
so right now). I just wanted to confirm my hunch.
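
If you want to watch the cleanup progress, here is a minimal sketch
(assuming the default admin socket location under /var/run/ceph/ and
root access on the OSD node) that prints numpg_stray for every OSD
hosted there:

  # Print numpg_stray for each OSD with an admin socket on this host.
  for sock in /var/run/ceph/ceph-osd.*.asok; do
      echo "== $sock =="
      ceph daemon "$sock" perf dump | grep numpg_stray
  done

The counters should trend towards zero as the stray PGs are removed.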

Enabling buffered I/O should have an immediate effect on the read rate
from your disks. I would still recommend upgrading to 14.2.17+, though,
as the improvements to PG cleaning there are pretty substantial.
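
In case it helps, this is roughly how to flip the setting on a Nautilus
cluster (a sketch, not a drop-in recipe - osd.NN is a placeholder, and
depending on your exact version the OSD may need a restart to pick the
new value up):

  # Persist bluefs_buffered_io for all OSDs via the mon config database.
  ceph config set osd bluefs_buffered_io true

  # Check what a given OSD is actually running with.
  ceph daemon osd.NN config get bluefs_buffered_io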

Josh

On Wed, Nov 3, 2021 at 8:13 AM Nikola Ciprich
<nikola.ciprich@xxxxxxxxxxx> wrote:
>
> Hello Josh,
> >
> > Was there PG movement (backfill) happening in this cluster recently?
> > Do the OSDs have stray PGs (e.g. 'ceph daemon osd.NN perf dump | grep
> > numpg_stray' - run this against an affected OSD from the housing
> > node)?
> yes, some nodes have stray PGs (1..5). Shall I do something about those?
>
>
> >
> > I'm wondering if you're running into
> > https://tracker.ceph.com/issues/45765, where cleaning of PGs from OSDs
> hmm, yes, this seems very familiar - the problems started when we began
> using the balancer, forgot to mention that!
>
> > leads to a high read rate from disk due to a combination of rocksdb
> > behaviour and caching issues. Turning on bluefs_buffered_io (on by
> > default in 14.2.22) is a mitigation for this problem, but has some
> > side effects to watch out for (write IOPS amplification, for one).
> > Fixes for that linked issue went into 14.2.17, 14.2.22, and then
> > Pacific; we found the 14.2.17 changes to be quite effective by
> > themselves.
> >
> > Even if you don't have stray PGs, trying bluefs_buffered_io might be
> > an interesting experiment. An alternative would be to compact rocksdb
> > for each of your OSDs and see if that helps; compacting eliminates the
> > tombstoned data that can cause problems during iteration, but if you
> > have a workload that generates a lot of rocksdb tombstones (like PG
> > cleaning does), then the problem will return a while after compaction.
> >
>
> hmm, I'll try enabling bluefs_buffered_io (it was indeed set to false) and
> do the compaction as well anyway.
>
> I'll report back, thanks for the hints!
>
> BR
>
> nik
>
>
> > Josh
> >
>
> --
> -------------------------------------
> Ing. Nikola CIPRICH
> LinuxBox.cz, s.r.o.
> 28.rijna 168, 709 00 Ostrava
>
> tel.:   +420 591 166 214
> fax:    +420 596 621 273
> mobil:  +420 777 093 799
> www.linuxbox.cz
>
> mobil servis: +420 737 238 656
> email servis: servis@xxxxxxxxxxx
> -------------------------------------
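
(For the archives, a sketch of the rocksdb compaction approach mentioned
above - osd.NN is a placeholder, and the offline variant assumes the
default OSD data path:

  # Online compaction of one OSD's rocksdb via the admin socket:
  ceph daemon osd.NN compact

  # Or offline, with the OSD stopped, using ceph-kvstore-tool:
  systemctl stop ceph-osd@NN
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-NN compact
  systemctl start ceph-osd@NN

Either way, as noted above, expect the problem to return after a while
if the workload keeps generating rocksdb tombstones.)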
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


