Re: PGs stuck deep-scrubbing for weeks - 16.2.9

Apologies, the backport link should be: https://github.com/ceph/ceph/pull/46845

On Fri, Jul 15, 2022 at 9:14 PM David Orman <ormandj@xxxxxxxxxxxx> wrote:

> I think you may have hit the same bug we encountered. Cory submitted a
> fix; see if it matches what you're seeing:
>
> https://github.com/ceph/ceph/pull/46727 (backport to Pacific here:
> https://github.com/ceph/ceph/pull/46877 )
> https://tracker.ceph.com/issues/54172
>
> On Fri, Jul 15, 2022 at 8:52 AM Wesley Dillingham <wes@xxxxxxxxxxxxxxxxx>
> wrote:
>
>> We have two clusters: one upgraded 14.2.22 -> 16.2.7 -> 16.2.9,
>>
>> another 16.2.7 -> 16.2.9.
>>
>> Both use multi-device OSDs (spinner block / SSD block.db), both serve
>> CephFS, and each has around 600 OSDs with a combination of rep-3 and 8+3
>> EC data pools. We see examples of stuck scrubbing PGs from all of the
>> pools.
>>
>> They have generally been behind on scrubbing, which we attributed simply
>> to large disks (10TB) with a heavy write load and the OSDs having trouble
>> keeping up. On closer inspection it appears we have many PGs that have
>> been lodged in a deep scrubbing state, on one cluster for 2 weeks and on
>> the other for 7 weeks. Wondering if others have been experiencing anything
>> similar. The only case of PGs being stuck scrubbing I have seen in the
>> past has been related to the snaptrim PG state, but we aren't doing
>> anything with snapshots in these new clusters.
>>
>> Granted, my cluster has been warning me with "pgs not deep-scrubbed in
>> time", and it's on me for not looking more closely into why. Perhaps a
>> separate warning of "PG stuck scrubbing for greater than 24 hours" or
>> similar might be helpful to an operator; a rough sketch of such a check
>> follows.
>>
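>> As a minimal sketch of that idea (not something that exists in Ceph
>> today): the script below shells out to "ceph pg dump --format json" and
>> lists PGs currently reporting a scrubbing state, oldest completed deep
>> scrub first. The JSON field names ("pg_map", "pg_stats", "state",
>> "last_deep_scrub_stamp", "acting_primary") are assumptions based on
>> typical output from recent releases and may differ on yours.
>>
>> #!/usr/bin/env python3
>> # Hypothetical helper, not a Ceph tool: list PGs currently in a scrubbing
>> # state along with when their last deep scrub completed, so long-running
>> # scrubs stand out. Field names are assumptions; verify on your release.
>> import json
>> import subprocess
>>
>> def pg_stats():
>>     out = subprocess.run(
>>         ["ceph", "pg", "dump", "--format", "json"],
>>         capture_output=True, check=True, text=True,
>>     )
>>     dump = json.loads(out.stdout)
>>     # Recent releases nest the per-PG stats under "pg_map".
>>     return dump.get("pg_map", dump).get("pg_stats", [])
>>
>> def main():
>>     scrubbing = [p for p in pg_stats() if "scrubbing" in p.get("state", "")]
>>     # Oldest last-completed deep scrub first: PGs that have sat in
>>     # "scrubbing+deep" for weeks will be at the top of the list.
>>     scrubbing.sort(key=lambda p: p.get("last_deep_scrub_stamp", ""))
>>     for p in scrubbing:
>>         print(p["pgid"], p["state"],
>>               "last_deep_scrub:", p.get("last_deep_scrub_stamp"),
>>               "primary: osd.%s" % p.get("acting_primary"))
>>
>> if __name__ == "__main__":
>>     main()
>>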
>> In any case, I was able to get scrubs proceeding again by restarting the
>> primary OSD daemon for each PG that was stuck (see the sketch below). Will
>> monitor closely for additional stuck scrubs.
>>
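>> For reference, a rough sketch of that step, under the assumption of a
>> cephadm-managed cluster (package-based installs would use "systemctl
>> restart ceph-osd@<id>" on the OSD's host instead). It looks up the acting
>> primary of each stuck PG with "ceph pg map" and prints, rather than runs,
>> the restart command so it can be reviewed first:
>>
>> #!/usr/bin/env python3
>> # Hypothetical sketch, not from this thread: map each given PG to its
>> # acting primary OSD and print a restart command for that daemon.
>> import json
>> import subprocess
>> import sys
>>
>> def primary_osd(pgid):
>>     out = subprocess.run(
>>         ["ceph", "pg", "map", pgid, "--format", "json"],
>>         capture_output=True, check=True, text=True,
>>     )
>>     # "acting_primary" is the field name in recent releases; verify on yours.
>>     return json.loads(out.stdout)["acting_primary"]
>>
>> if __name__ == "__main__":
>>     for pgid in sys.argv[1:]:
>>         osd = primary_osd(pgid)
>>         print("# pg %s: acting primary is osd.%s" % (pgid, osd))
>>         print("ceph orch daemon restart osd.%s" % osd)
>>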
>>
>> Respectfully,
>>
>> *Wes Dillingham*
>> wes@xxxxxxxxxxxxxxxxx
>> LinkedIn <http://www.linkedin.com/in/wesleydillingham>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


