Re: Random scrub errors (omap_digest_mismatch) on pgs of RADOSGW metadata pools (bug 53663)

Hi Christian,

Since my last mail in December, we changed our Ceph setup like this:

We added one SSD OSD on each Ceph host (which were pure HDD before). Then we moved the problematic pool "de-dus5.rgw.buckets.index" to those dedicated SSDs (by adding a corresponding CRUSH rule).
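
For anyone wanting to do the same, the commands involved look roughly like this (just a sketch: the rule name is an example of mine, and the root "default" / failure domain "host" have to match your own CRUSH tree):

    # create a replicated rule that only selects OSDs with the ssd device class
    ceph osd crush rule create-replicated rgw-index-ssd default host ssd
    # assign the index pool to that rule; Ceph then migrates the PGs to the SSDs
    ceph osd pool set de-dus5.rgw.buckets.index crush_rule rgw-index-ssd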

Since then, no further PG corruptions occurred.

This has a two-sided result:

On the one hand, we no longer observe the problematic behavior.

On the other hand, this means that something in Ceph is buggy when running on spinning HDDs only. Even if the HDDs cannot keep up with the IO requirements, that should not lead to data/PG corruption…
And, just a blind guess: we only see a few IO requests per second on our RGW gateway, so even with spinning HDDs storing/updating the index pool should not be a problem.

I would guess it correlates with our setup having 7001 shards in the problematic bucket, combined with the "multisite" feature, which issues 7001 "status" requests per second to check and synchronize between the different RGW sites. And _this_ amount of random IO cannot be satisfied by HDDs…
Even so, it should not lead to corrupted PGs.
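
If someone wants to check the shard count and the multisite sync state on their own setup, something along these lines should show it (the bucket name is a placeholder):

    # index shard count and object statistics for a single bucket
    radosgw-admin bucket stats --bucket=<bucket>
    # objects per shard across all buckets, flags buckets over the shard limit
    radosgw-admin bucket limit check
    # current replication state between the RGW zones
    radosgw-admin sync status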

Best
Stefan


> Am 08.02.2022 um 16:39 schrieb Christian Rohmann <christian.rohmann@xxxxxxxxx>:
> 
> Hey there again,
> 
> Neha Ojha has now asked in https://tracker.ceph.com/issues/53663
> for OSD debug logs of a manual deep-scrub on (inconsistent) PGs.
> 
> I did provide the logs of two of those deep-scrubs via ceph-post-file already.
> 
> But since data inconsistencies are the worst kind of bug, and the unpredictability of their
> occurrence makes them hard to pin down, we likely need more evidence to have a chance to
> narrow this down. Since you seem to observe something similar, could you maybe gather
> and post debug info about them to the ticket as well?
> 
> 
> Regards
> 
> Christian
> 
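
P.S.: For anyone else who wants to attach debug data to the tracker, the sequence should look roughly like this (OSD id, PG id and log path are placeholders; the debug levels are only what seems reasonable to me, not an official recommendation):

    # raise logging on the primary OSD of the inconsistent PG
    ceph config set osd.<id> debug_osd 20
    ceph config set osd.<id> debug_ms 1
    # trigger the deep-scrub and dump the inconsistency report
    ceph pg deep-scrub <pgid>
    rados list-inconsistent-obj <pgid> --format=json-pretty
    # upload the OSD log for the ticket, then reset the debug levels
    ceph-post-file /var/log/ceph/ceph-osd.<id>.log
    ceph config rm osd.<id> debug_osd
    ceph config rm osd.<id> debug_ms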

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



