Re: Continuous spurious repairs without cause?

Hi,

Thanks for the hint. We’re definitely running the exact same binaries for all daemons. :)
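For reference, one quick way to confirm this: `ceph versions` reports the running version per daemon type, and a one-liner can check that only a single distinct version string appears. A minimal sketch below; the sample output is illustrative (hypothetical build hash), not from our cluster — on a live system you would pipe the real command output instead.

```shell
# Hypothetical excerpt of `ceph versions` output -- illustrative only.
# On a live cluster, replace the variable with: sample=$(ceph versions)
sample='"ceph version 14.2.22 (abcdef0123456789) nautilus (stable)": 3
"ceph version 14.2.22 (abcdef0123456789) nautilus (stable)": 48'

# Extract the version number after "ceph version", deduplicate, and count.
# A uniform cluster yields exactly 1 distinct version.
distinct=$(printf '%s\n' "$sample" | grep -o 'ceph version [^ ]*' | sort -u | wc -l | tr -d ' ')
echo "distinct versions: $distinct"
```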

> On 5. Sep 2023, at 16:14, Eugen Block <eblock@xxxxxx> wrote:
> 
> Hi,
> 
> it sounds like you have auto-repair enabled (osd_scrub_auto_repair). I guess you could disable that to see what's going on with the PGs and their replicas, and/or enable debug logs. Are all daemons running the same ceph (minor) version? I remember a customer case where different ceph minor versions (but all Octopus) caused damaged PGs; a repair fixed them every time. After they updated all daemons to the same minor version, those errors were gone.
> 
> Regards,
> Eugen
> 
> Quoting Christian Theune <ct@xxxxxxxxxxxxxxx>:
> 
>> Hi,
>> 
>> this is a bit older cluster (Nautilus, bluestore only).
>> 
>> We’ve noticed that the cluster is almost continuously repairing PGs. However, they all finish successfully with “0 fixed”. We do not see what triggers Ceph to repair them, and it’s happening for a lot of PGs, not any specific one.
>> 
>> Deep-scrubs are generally running, but are currently a bit behind schedule, as we had some recoveries in the last week.
>> 
>> Logs look regular aside from the number of repairs. Here are the last few weeks from the perspective of a single PG. There’s one repair at the end, but the same thing seems to happen for all PGs.
>> 
>> 2023-08-06 16:08:17.870 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-06 16:08:18.270 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-07 21:52:22.299 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-07 21:52:22.711 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-09 00:33:42.587 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-09 00:33:43.049 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-10 09:36:00.590 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub starts
>> 2023-08-10 09:36:28.811 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub ok
>> 2023-08-11 12:59:14.219 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-11 12:59:14.567 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-12 13:52:44.073 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-12 13:52:44.483 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-14 01:51:04.774 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub starts
>> 2023-08-14 01:51:33.113 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub ok
>> 2023-08-15 05:18:16.093 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-15 05:18:16.520 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-16 09:47:38.520 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-16 09:47:38.930 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-17 19:25:45.352 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-17 19:25:45.775 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-19 05:40:43.663 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-19 05:40:44.073 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-20 12:06:54.343 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-20 12:06:54.809 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-21 19:23:10.801 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub starts
>> 2023-08-21 19:23:39.936 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub ok
>> 2023-08-23 03:43:21.391 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-23 03:43:21.844 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-24 04:21:17.004 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub starts
>> 2023-08-24 04:21:47.972 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub ok
>> 2023-08-25 06:55:13.588 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-25 06:55:14.087 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-26 09:26:01.174 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-26 09:26:01.561 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-27 11:18:10.828 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-27 11:18:11.264 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-28 19:05:42.104 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-28 19:05:42.693 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-30 07:03:10.327 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-30 07:03:10.805 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-31 14:43:23.849 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub starts
>> 2023-08-31 14:43:50.723 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub ok
>> 2023-09-01 20:53:42.749 7f37ca268640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-09-01 20:53:43.389 7f37c6260640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-09-02 22:57:49.542 7f37ca268640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-09-02 22:57:50.065 7f37c6260640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-09-04 03:16:14.754 7f37ca268640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-09-04 03:16:15.295 7f37ca268640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-09-05 14:50:36.064 7f37ca268640  0 log_channel(cluster) log [DBG] : 278.2f3 repair starts
>> 2023-09-05 14:51:04.407 7f37c6260640  0 log_channel(cluster) log [DBG] : 278.2f3 repair ok, 0 fixed
>> 
>> Googling didn’t help, unfortunately, and the bug tracker doesn’t appear to have any relevant issue either.
>> 
>> Any ideas?
>> 
>> Kind regards,
>> Christian Theune
>> 
>> --
>> Christian Theune · ct@xxxxxxxxxxxxxxx · +49 345 219401 0
>> Flying Circus Internet Operations GmbH · https://flyingcircus.io
>> Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
>> HR Stendal HRB 21169 · Managing Directors: Christian Theune, Christian Zagrodnick
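In case it helps anyone chasing the same pattern: a rough way to quantify how often repairs fire is to tally “repair starts” events per PG from the cluster log. A minimal sketch below; the sample lines mirror the log format quoted above (on a real system you would feed in your mon/OSD cluster log instead).

```shell
# Sample cluster-log lines in the same format as quoted above (illustrative).
log='2023-09-04 03:16:14.754 7f37ca268640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
2023-09-05 14:50:36.064 7f37ca268640  0 log_channel(cluster) log [DBG] : 278.2f3 repair starts
2023-09-05 14:51:04.407 7f37c6260640  0 log_channel(cluster) log [DBG] : 278.2f3 repair ok, 0 fixed'

# Count "repair starts" per PG id; the PG id is the field just before "repair".
repairs=$(printf '%s\n' "$log" | awk '/repair starts/ { n[$(NF-2)]++ }
    END { for (pg in n) print pg, n[pg] }')
echo "$repairs"   # prints: 278.2f3 1
```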

Kind regards,
Christian Theune

-- 
Christian Theune · ct@xxxxxxxxxxxxxxx · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Managing Directors: Christian Theune, Christian Zagrodnick
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx