Re: Reef (18.2): Some PG not scrubbed/deep scrubbed for 1 month


 



Hi,

I have the same issue.
Deep scrubs haven't finished on some PGs.

I am using Ceph 18.2.2.
The initially installed version was 18.0.0.


In the logs I see a lot of scrub/deep-scrub starts:

Mar 21 14:21:09 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.b deep-scrub starts
Mar 21 14:21:10 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.1a deep-scrub starts
Mar 21 14:21:17 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.1c deep-scrub starts
Mar 21 14:21:19 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 11.1 scrub starts
Mar 21 14:21:27 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 14.6 scrub starts
Mar 21 14:21:30 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 10.c deep-scrub starts
Mar 21 14:21:35 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 12.3 deep-scrub starts
Mar 21 14:21:41 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 6.0 scrub starts
Mar 21 14:21:44 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 8.5 deep-scrub starts
Mar 21 14:21:45 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 5.66 deep-scrub starts
Mar 21 14:21:49 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 5.30 deep-scrub starts
Mar 21 14:21:50 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.b deep-scrub starts
Mar 21 14:21:52 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.1a deep-scrub starts
Mar 21 14:21:54 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.1c deep-scrub starts
Mar 21 14:21:55 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 11.1 scrub starts
Mar 21 14:21:58 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 14.6 scrub starts
Mar 21 14:22:01 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 10.c deep-scrub starts
Mar 21 14:22:04 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 12.3 scrub starts
Mar 21 14:22:13 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 6.0 scrub starts
Mar 21 14:22:15 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 8.5 deep-scrub starts
Mar 21 14:22:20 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 5.66 deep-scrub starts
Mar 21 14:22:27 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 5.30 scrub starts
Mar 21 14:22:30 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.b deep-scrub starts
Mar 21 14:22:32 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.1a deep-scrub starts
Mar 21 14:22:33 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.1c deep-scrub starts
Mar 21 14:22:35 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 11.1 deep-scrub starts
Mar 21 14:22:37 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 14.6 scrub starts
Mar 21 14:22:38 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 10.c scrub starts
Mar 21 14:22:39 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 12.3 scrub starts
Mar 21 14:22:41 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 6.0 deep-scrub starts
Mar 21 14:22:43 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 8.5 deep-scrub starts
Mar 21 14:22:46 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 5.66 deep-scrub starts
Mar 21 14:22:49 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 5.30 scrub starts
Mar 21 14:22:55 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.b deep-scrub starts
Mar 21 14:22:57 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.1a deep-scrub starts
Mar 21 14:22:58 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.1c deep-scrub starts
Mar 21 14:23:03 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 11.1 deep-scrub starts
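
The same handful of PGs re-announce a scrub start every 30-40 seconds. To count how often each PG restarted a scrub, and to compare against the number of "scrub ok" / "deep-scrub ok" completion lines, something like the following works (just a sketch; the OSD id is an example, and for cephadm deployments the journal unit is named ceph-<fsid>@osd.<id> instead):

[root@ceph-node10 ~]# journalctl -u ceph-osd@3 --since "1 hour ago" \
    | grep -Eo '[0-9]+\.[0-9a-f]+ (deep-)?scrub starts' \
    | sort | uniq -c | sort -rn | head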



The number of scrubbing/deep-scrubbing PGs changes every few seconds:

[root@ceph-node10 ~]# ceph -s | grep active+clean
   pgs:     214 active+clean
            50 active+clean+scrubbing+deep
            25 active+clean+scrubbing
[root@ceph-node10 ~]# ceph -s | grep active+clean
   pgs:     208 active+clean
            53 active+clean+scrubbing+deep
            28 active+clean+scrubbing
[root@ceph-node10 ~]# ceph -s | grep active+clean
   pgs:     208 active+clean
            53 active+clean+scrubbing+deep
            28 active+clean+scrubbing
[root@ceph-node10 ~]# ceph -s | grep active+clean
   pgs:     207 active+clean
            54 active+clean+scrubbing+deep
            28 active+clean+scrubbing
[root@ceph-node10 ~]# ceph -s | grep active+clean
   pgs:     202 active+clean
            56 active+clean+scrubbing+deep
            31 active+clean+scrubbing
[root@ceph-node10 ~]# ceph -s | grep active+clean
   pgs:     213 active+clean
            45 active+clean+scrubbing+deep
            31 active+clean+scrubbing
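
Instead of running ceph -s in a loop, the fluctuation is easier to follow with a periodic per-state PG count, e.g. (a sketch; pgs_brief prints one line per PG with the state in the second column, and the first awk line skips the header):

[root@ceph-node10 ~]# watch -n 5 'ceph pg stat'
[root@ceph-node10 ~]# ceph pg dump pgs_brief 2>/dev/null | awk 'NR>1 {print $2}' | sort | uniq -c | sort -rn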

ceph pg dump shows PGs that have not been deep scrubbed since January.
Some PGs have been deep scrubbing for over 700000 seconds.

[ceph: root@ceph-node10 /]# ceph pg dump pgs | grep -e 'scrubbing f'
5.6e  221223  0  0  0  0  927795290112  0  0  4073  3000  4073  active+clean+scrubbing+deep  2024-03-20T01:07:21.196293+0000  128383'15766927  128383:20517419  [2,4,18,16,14,21]  2  [2,4,18,16,14,21]  2  125519'12328877  2024-01-23T11:25:35.503811+0000  124844'11873951  2024-01-21T22:24:12.620693+0000  0  5  deep scrubbing for 270790s  53772  0
5.6c  221317  0  0  0  0  928173256704  0  0  6332  0  6332  active+clean+scrubbing+deep  2024-03-18T09:29:29.233084+0000  128382'15788196  128383:20727318  [6,9,12,14,1,4]  6  [6,9,12,14,1,4]  6  127180'14709746  2024-03-06T12:47:57.741921+0000  124817'11821502  2024-01-20T20:59:40.566384+0000  0  13452  deep scrubbing for 273519s  122803  0
5.6a  221325  0  0  0  0  928184565760  0  0  4649  3000  4649  active+clean+scrubbing+deep  2024-03-13T03:48:54.065125+0000  128382'16031499  128383:21221685  [13,11,1,2,9,8]  13  [13,11,1,2,9,8]  13  127181'14915404  2024-03-06T13:16:58.635982+0000  125967'12517899  2024-01-28T09:13:08.276930+0000  0  10078  deep scrubbing for 726001s  184819  0
5.54  221050  0  0  0  0  927036203008  0  0  4864  3000  4864  active+clean+scrubbing+deep  2024-03-18T00:17:48.086231+0000  128383'15584012  128383:20293678  [0,20,18,19,11,12]  0  [0,20,18,19,11,12]  0  127195'14651908  2024-03-07T09:22:31.078448+0000  124816'11813857  2024-01-20T16:43:15.755200+0000  0  9808  deep scrubbing for 306667s  142126  0
5.47  220849  0  0  0  0  926233448448  0  0  5592  0  5592  active+clean+scrubbing+deep  2024-03-12T08:10:39.413186+0000  128382'15653864  128383:20403071  [16,15,20,0,13,21]  16  [16,15,20,0,13,21]  16  127183'14600433  2024-03-06T18:21:03.057165+0000  124809'11792397  2024-01-20T05:27:07.617799+0000  0  13066  deep scrubbing for 796697s  209193  0
dumped pgs
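
To see which PGs are the most overdue, and to try re-queueing one of them by hand, something along these lines should work (a sketch only; I believe the jq field names match the Reef pg dump JSON but please double-check on your version, and the pg id is just an example). ceph pg deep-scrub asks the primary OSD to queue a deep scrub for that PG:

[ceph: root@ceph-node10 /]# ceph pg dump pgs --format json 2>/dev/null \
    | jq -r '.pg_stats[] | [.pgid, .last_deep_scrub_stamp] | @tsv' \
    | sort -k2 | head
[ceph: root@ceph-node10 /]# ceph pg deep-scrub 5.6e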




Regards,
Bernhard






On 20/03/2024 21:12, Bandelow, Gunnar wrote:
Hi,

I just wanted to mention that I am running a cluster with Reef 18.2.1 with the same issue.

4 PGs have been starting their deep scrub but not finishing it since mid-February. In the pg dump they are shown as scheduled for deep scrub. They sometimes change their status from active+clean to active+clean+scrubbing+deep and back.
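
For what it's worth, a simple way to follow the flip-flopping of a single affected PG over time, including the scrub-scheduling column from the pg dump, is something like this (the pg id is only a placeholder):

watch -n 10 'ceph pg dump pgs 2>/dev/null | grep "^5\.6e "'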

Best regards,
Gunnar

=======================================================

Gunnar Bandelow
Universitätsrechenzentrum (URZ)
Universität Greifswald
Felix-Hausdorff-Straße 18
17489 Greifswald
Germany

Tel.: +49 3834 420 1450




--- Original Message ---
Subject: Re: Reef (18.2): Some PG not scrubbed/deep scrubbed for 1 month
From: "Michel Jouvin" <michel.jouvin@xxxxxxxxxxxxxxx>
To: ceph-users@xxxxxxx
Date: 20-03-2024 20:00



    Hi Rafael,

    Good to know I am not alone!

    Additional information ~6h after the OSD restart: of the 20 PGs
    impacted, 2 have been processed successfully... I don't have a clear
    picture of how Ceph prioritizes the scrub of one PG over another; I
    had thought that the oldest/expired scrubs are taken first, but that
    may not be the case. Anyway, I have seen a very significant decrease
    in scrub activity this afternoon and the cluster is not loaded at all
    (almost no users yet)...

    Michel

    On 20/03/2024 at 17:55, quaglio@xxxxxxxxxx wrote:
    > Hi,
    >      I upgraded a cluster 2 weeks ago here. The situation is the same
    > as Michel's.
    >      A lot of PGs are not scrubbed/deep-scrubbed.
    >
    > Rafael.
    >



_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



