Re: Ceph reef and (slow) backfilling - how to speed it up

In our case it was with an EC pool as well. I believe the PG state was
degraded+recovering / recovery_wait, and if I recall correctly the PGs simply
sat in the recovering state without any progress (the degraded object count
did not decline). A repeer of the PGs was attempted, but with no success.
Restarting all of the OSDs for the given PGs under mclock was attempted next;
that didn't work either. Switching to wpq for all OSDs in the given PGs did
resolve the issue. This was on a 17.2.7 cluster.
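
For anyone hitting the same thing, the sequence boiled down to roughly the
following (a sketch from memory; the PG ID 2.1a and osd.12 are placeholders,
and the systemctl lines assume a package-based deployment rather than
cephadm):

    # list the stuck PGs and note the OSDs in their acting sets
    ceph pg ls recovering
    ceph pg ls recovery_wait

    # tried first, did not help in our case
    ceph pg repeer 2.1a
    systemctl restart ceph-osd@12

    # this is what resolved it: switch the involved OSDs to wpq
    ceph config set osd.12 osd_op_queue wpq
    systemctl restart ceph-osd@12   # osd_op_queue only takes effect on restart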

Respectfully,

*Wes Dillingham*
LinkedIn <http://www.linkedin.com/in/wesleydillingham>
wes@xxxxxxxxxxxxxxxxx

On Thu, May 2, 2024 at 9:54 AM Sridhar Seshasayee <sseshasa@xxxxxxxxxx>
wrote:

> >
> > Multiple people -- including me -- have also observed backfill/recovery
> > stop completely for no apparent reason.
> >
> > In some cases poking the lead OSD for a PG with `ceph osd down` restores
> > progress; in other cases it doesn't.
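> >
> > (For the archives, that poke is something like the following; the PG ID
> > 2.1a and OSD 12 are placeholders here:
> >
> >     ceph pg map 2.1a   # the first OSD in "acting" is the primary
> >     ceph osd down 12   # mark it down; it rejoins and the PG re-peers
> > )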
> >
> > Anecdotally this *may* only happen for EC pools on HDDs but that sample
> > size is small.
> >
> >
> Thanks for the information. We will try to reproduce this locally with EC
> pools and investigate further.
> I will follow up with a tracker ticket for this.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx