Re: Reef (18.2): Some PG not scrubbed/deep scrubbed for 1 month

Pierre,

Yes, as mentioned in my initial email, I checked the OSD state and found nothing wrong, either in the OSD logs or in the system logs (no SMART errors reported).

Thanks for the advice about increasing osd_max_scrubs; I may try it, but I doubt it is a contention problem because it only affects a fixed set of PGs (no new PGs have a "stuck scrub") and there is significant scrubbing activity going on continuously (~10K PGs in the cluster).
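
For reference, if I do try it, I would expect something like the commands below (a sketch only; osd.29 is the suspect OSD from the pg dump further down, and the value is just an example):

    ceph config set osd osd_max_scrubs 2       # cluster-wide
    ceph config set osd.29 osd_max_scrubs 2    # or only on the suspect OSD
    ceph config get osd osd_max_scrubs         # check the value in use
    ceph config rm osd osd_max_scrubs          # revert to the default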

Again, it is not a problem for me to try kicking out the suspect OSD and see if it fixes the issue. But this cluster is pretty simple and lightly loaded, and I see nothing that explains why we have this situation on a fairly new cluster (9 months old, created in Quincy) and not on our 2 other production clusters, which are much more heavily used; one of them is the backend storage of a significant OpenStack cloud, a cluster created 10 years ago with Infernalis and upgraded ever since, a much better candidate for this kind of problem! So I'm happy to contribute to troubleshooting a potential issue in Reef if somebody finds it useful and can help. Otherwise I'll try the approach that worked for Gunnar.
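
If it comes to that, my rough plan would be along these lines (a sketch only, again assuming osd.29 is the common OSD):

    ceph pg ls-by-osd 29     # list the PGs hosted by the suspect OSD
    ceph osd out 29          # drain it and let the PGs remap/backfill
    ceph -s                  # watch recovery and the stuck scrubs
    ceph osd in 29           # bring it back later, or replace the disk instead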

Best regards,

Michel

On 22/03/2024 at 09:59, Pierre Riteau wrote:
Hello Michel,

It might be worth mentioning that the next releases of Reef and Quincy should increase the default value of osd_max_scrubs from 1 to 3; see the Reef pull request https://github.com/ceph/ceph/pull/55173. You could try increasing this configuration setting if you haven't already, but note that it can impact client I/O performance.

Also, if the delays appear to be related to a single OSD, have you checked the health and performance of this device?
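
For example, something along these lines (the OSD id and device names are just placeholders):

    ceph osd perf                            # commit/apply latency per OSD
    ceph device ls-by-daemon osd.29          # find the device backing the OSD
    ceph device get-health-metrics <devid>   # SMART data collected by Ceph
    smartctl -a /dev/sdX                     # directly on the host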

On Fri, 22 Mar 2024 at 09:29, Michel Jouvin <michel.jouvin@xxxxxxxxxxxxxxx> wrote:

    Hi,

    As I said in my initial message, I had in mind to do exactly the same,
    as my initial analysis identified that all the PGs with this problem
    were sharing one OSD (but only 20 PGs had the problem out of the ~200
    hosted by this OSD). But as I don't feel I'm in an urgent situation, I
    was wondering whether collecting more information on the problem could
    have some value, and which information... If it helps, I add below the
    `pg dump` output for the 17 PGs still with a "stuck scrub".
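
    A rough way to extract such a list (a sketch, not exactly what I ran;
    the JSON layout of "ceph pg dump pgs" may put pg_stats under pg_map
    depending on the release, so the jq path may need adjusting):

        # PGs whose last deep scrub is older than a given date, with their acting set
        ceph pg dump pgs -f json 2>/dev/null \
          | jq -r '.pg_stats[]
                   | select(.last_deep_scrub_stamp < "2024-02-22")
                   | "\(.pgid) \(.last_deep_scrub_stamp) \(.acting)"'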

    I observed that the number of "stuck scrubs" is decreasing very
    slowly: in the last 12 hours, 1 more PG was successfully scrubbed/deep
    scrubbed. In case it was not clear in my initial message, the list of
    PGs with a too old scrub and the list with a too old deep scrub are
    the same.

    Without an answer, next week I may consider doing what you did: remove
    the suspect OSD (instead of just restarting it) and see if it unblocks
    the stuck scrubs.

    Best regards,

    Michel

    --------------------------------- "ceph pg dump pgs" for the 17 PGs with a too old scrub and deep scrub (same list) ------------------------------------------------------------

    PG_STAT  OBJECTS  MISSING_ON_PRIMARY  DEGRADED  MISPLACED  UNFOUND  BYTES  OMAP_BYTES*  OMAP_KEYS*  LOG  LOG_DUPS  DISK_LOG  STATE  STATE_STAMP  VERSION  REPORTED  UP  UP_PRIMARY  ACTING  ACTING_PRIMARY  LAST_SCRUB  SCRUB_STAMP  LAST_DEEP_SCRUB  DEEP_SCRUB_STAMP  SNAPTRIMQ_LEN  LAST_SCRUB_DURATION  SCRUB_SCHEDULING  OBJECTS_SCRUBBED  OBJECTS_TRIMMED
    29.7e3  260  0  0  0  0  1090519040  0  0  1978  500  1978  active+clean  2024-03-21T18:28:53.369789+0000  39202'2478  83812:97136  [29,141,64,194]  29  [29,141,64,194]  29  39202'2478  2024-02-17T19:56:34.413412+0000  39202'2478  2024-02-17T19:56:34.413412+0000  0  3  queued for deep scrub  0  0
    25.7cc  0  0  0  0  0  0  0  0  0  1076  0  active+clean  2024-03-21T18:09:48.104279+0000  46253'548  83812:89843  [29,50,173]  29  [29,50,173]  29  39159'536  2024-02-17T18:14:54.950401+0000  39159'536  2024-02-17T18:14:54.950401+0000  0  1  queued for deep scrub  0  0
    25.70c  0  0  0  0  0  0  0  0  0  918  0  active+clean  2024-03-21T18:00:57.942902+0000  46253'514  83812:95212  [29,195,185]  29  [29,195,185]  29  39159'530  2024-02-18T03:56:17.559531+0000  39159'530  2024-02-16T17:39:03.281785+0000  0  1  queued for deep scrub  0  0
    29.70c  249  0  0  0  0  1044381696  0  0  1987  600  1987  active+clean  2024-03-21T18:35:36.848189+0000  39202'2587  83812:99628  [29,138,63,12]  29  [29,138,63,12]  29  39202'2587  2024-02-17T21:34:22.042560+0000  39202'2587  2024-02-17T21:34:22.042560+0000  0  1  queued for deep scrub  0  0
    29.705  231  0  0  0  0  968884224  0  0  1959  500  1959  active+clean  2024-03-21T18:18:22.028551+0000  39202'2459  83812:91258  [29,147,173,61]  29  [29,147,173,61]  29  39202'2459  2024-02-17T16:41:40.421763+0000  39202'2459  2024-02-17T16:41:40.421763+0000  0  1  queued for deep scrub  0  0
    29.6b9  236  0  0  0  0  989855744  0  0  1956  500  1956  active+clean  2024-03-21T18:11:29.912132+0000  39202'2456  83812:95607  [29,199,74,16]  29  [29,199,74,16]  29  39202'2456  2024-02-17T11:46:06.706625+0000  39202'2456  2024-02-17T11:46:06.706625+0000  0  1  queued for deep scrub  0  0
    25.56e  0  0  0  0  0  0  0  0  0  1158  0  active+clean+scrubbing+deep  2024-03-22T08:09:38.840145+0000  46253'514  83812:637482  [111,29,128]  111  [111,29,128]  111  39159'579  2024-03-06T17:57:53.158936+0000  39159'579  2024-03-06T17:57:53.158936+0000  0  1  queued for deep scrub  0  0
    25.56a  0  0  0  0  0  0  0  0  0  1055  0  active+clean  2024-03-21T18:00:57.940851+0000  46253'545  83812:93475  [29,19,211]  29  [29,19,211]  29  46253'545  2024-03-07T11:12:45.881545+0000  46253'545  2024-03-07T11:12:45.881545+0000  0  28  queued for deep scrub  0  0
    25.55a  0  0  0  0  0  0  0  0  0  1022  0  active+clean  2024-03-21T18:10:24.124914+0000  46253'565  83812:89876  [29,58,195]  29  [29,58,195]  29  46253'561  2024-02-17T06:54:35.320454+0000  46253'561  2024-02-17T06:54:35.320454+0000  0  28  queued for deep scrub  0  0
    29.c0  256  0  0  0  0  1073741824  0  0  1986  600  1986  active+clean+scrubbing+deep  2024-03-22T08:09:12.849868+0000  39202'2586  83812:603625  [22,150,29,56]  22  [22,150,29,56]  22  39202'2586  2024-03-07T18:53:22.952868+0000  39202'2586  2024-03-07T18:53:22.952868+0000  0  1  queued for deep scrub  0  0
    18.6  15501  0  0  0  0  63959444676  0  0  2068  3000  2068  active+clean+scrubbing+deep  2024-03-22T02:29:24.508889+0000  81688'663900  83812:1272160  [187,29,211]  187  [187,29,211]  187  52735'663878  2024-03-06T16:36:32.080259+0000  52735'663878  2024-03-06T16:36:32.080259+0000  0  684445  deep scrubbing for 20373s  449  0
    16.15  0  0  0  0  0  0  0  0  0  0  0  active+clean  2024-03-21T18:20:29.632554+0000  0'0  83812:104893  [29,165,85]  29  [29,165,85]  29  0'0  2024-02-17T06:54:06.370647+0000  0'0  2024-02-17T06:54:06.370647+0000  0  28  queued for deep scrub  0  0
    25.45  0  0  0  0  0  0  0  0  0  1036  0  active+clean  2024-03-21T18:10:24.125134+0000  39159'561  83812:93649  [29,13,58]  29  [29,13,58]  29  39159'512  2024-02-27T12:27:35.728176+0000  39159'512  2024-02-27T12:27:35.728176+0000  0  1  queued for deep scrub  0  0
    29.249  260  0  0  0  0  1090519040  0  0  1970  500  1970  active+clean  2024-03-21T18:29:22.588805+0000  39202'2470  83812:96016  [29,191,18,143]  29  [29,191,18,143]  29  39202'2470  2024-02-17T13:32:42.910335+0000  39202'2470  2024-02-17T13:32:42.910335+0000  0  1  queued for deep scrub  0  0
    29.25a  248  0  0  0  0  1040187392  0  0  1952  600  1952  active+clean  2024-03-21T18:20:29.623422+0000  39202'2552  83812:99157  [29,200,85,164]  29  [29,200,85,164]  29  39202'2552  2024-02-17T08:33:14.326087+0000  39202'2552  2024-02-17T08:33:14.326087+0000  0  1  queued for deep scrub  0  0
    25.3cf  0  0  0  0  0  0  0  0  0  1343  0  active+clean  2024-03-21T18:16:00.933375+0000  46253'598  83812:91659  [29,75,175]  29  [29,75,175]  29  46253'598  2024-02-17T11:48:51.840600+0000  46253'598  2024-02-17T11:48:51.840600+0000  0  28  queued for deep scrub  0  0
    29.4ec  243  0  0  0  0  1019215872  0  0  1933  500  1933  active+clean  2024-03-21T18:15:35.389598+0000  39202'2433  83812:101501  [29,206,63,17]  29  [29,206,63,17]  29  39202'2433  2024-02-17T15:10:41.027755+0000  39202'2433  2024-02-17T15:10:41.027755+0000  0  3  queued for deep scrub  0  0


    On 22/03/2024 at 08:16, Bandelow, Gunnar wrote:
    > Hi Michel,
    >
    > I think yesterday I found the culprit in my case.
    >
    > After inspecting "ceph pg dump", and especially the column
    > "last_scrub_duration", I found that every PG without proper scrubbing
    > was located on one of three OSDs (and all these OSDs share the same
    > SSD for their DB). I put them "out" and now, after backfill and
    > remapping, everything seems to be fine.
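    >
    > (Roughly, as a sketch only; the OSD ids below are made up for the
    > example:
    >
    >     ceph pg ls-by-osd 10      # PGs hosted by each suspect OSD
    >     ceph osd out 10 11 12     # mark all three out at once
    >     ceph -s                   # watch backfill/remapping complete
    > )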
    >
    > Only the log is still flooded with "scrub starts" messages, and I
    > have no clue why these OSDs are causing the problems.
    > I will investigate further.
    >
    > Best regards,
    > Gunnar
    >
    > ===================================
    >
    >  Gunnar Bandelow
    >  Universitätsrechenzentrum (URZ)
    >  Universität Greifswald
    >  Felix-Hausdorff-Straße 18
    >  17489 Greifswald
    >  Germany
    >
    >  Tel.: +49 3834 420 1450
    >
    >
    > --- Original Message ---
    > Subject: Re: Reef (18.2): Some PG not scrubbed/deep scrubbed for 1 month
    > From: "Michel Jouvin" <michel.jouvin@xxxxxxxxxxxxxxx>
    > To: ceph-users@xxxxxxx
    > Date: 21-03-2024 23:40
    >
    >
    >
    >     Hi,
    >
    >     Today we decided to upgrade from 18.2.0 to 18.2.2. No real hope of
    >     a direct impact (nothing in the change log related to something
    >     similar), but at least all daemons were restarted, so we thought
    >     that maybe this would clear the problem at least temporarily.
    >     Unfortunately it has not been the case: the same PGs are still
    >     stuck, despite continuous scrubbing/deep scrubbing activity in the
    >     cluster...
    >
    >     I'm happy to provide more information if somebody tells me what to
    >     look at...
    >
    >     Cheers,
    >
    >     Michel
    >
    >     On 21/03/2024 at 14:40, Bernhard Krieger wrote:
    >     > Hi,
    >     >
    >     > I have the same issues: deep scrubs have not finished on some PGs.
    >     >
    >     > Using Ceph 18.2.2; the initially installed version was 18.0.0.
    >     >
    >     > In the logs I see a lot of scrub/deep-scrub starts:
    >     >
    >     > Mar 21 14:21:09 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.b deep-scrub starts
    >     > Mar 21 14:21:10 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.1a deep-scrub starts
    >     > Mar 21 14:21:17 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.1c deep-scrub starts
    >     > Mar 21 14:21:19 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 11.1 scrub starts
    >     > Mar 21 14:21:27 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 14.6 scrub starts
    >     > Mar 21 14:21:30 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 10.c deep-scrub starts
    >     > Mar 21 14:21:35 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 12.3 deep-scrub starts
    >     > Mar 21 14:21:41 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 6.0 scrub starts
    >     > Mar 21 14:21:44 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 8.5 deep-scrub starts
    >     > Mar 21 14:21:45 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 5.66 deep-scrub starts
    >     > Mar 21 14:21:49 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 5.30 deep-scrub starts
    >     > Mar 21 14:21:50 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.b deep-scrub starts
    >     > Mar 21 14:21:52 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.1a deep-scrub starts
    >     > Mar 21 14:21:54 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.1c deep-scrub starts
    >     > Mar 21 14:21:55 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 11.1 scrub starts
    >     > Mar 21 14:21:58 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 14.6 scrub starts
    >     > Mar 21 14:22:01 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 10.c deep-scrub starts
    >     > Mar 21 14:22:04 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 12.3 scrub starts
    >     > Mar 21 14:22:13 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 6.0 scrub starts
    >     > Mar 21 14:22:15 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 8.5 deep-scrub starts
    >     > Mar 21 14:22:20 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 5.66 deep-scrub starts
    >     > Mar 21 14:22:27 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 5.30 scrub starts
    >     > Mar 21 14:22:30 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.b deep-scrub starts
    >     > Mar 21 14:22:32 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.1a deep-scrub starts
    >     > Mar 21 14:22:33 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.1c deep-scrub starts
    >     > Mar 21 14:22:35 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 11.1 deep-scrub starts
    >     > Mar 21 14:22:37 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 14.6 scrub starts
    >     > Mar 21 14:22:38 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 10.c scrub starts
    >     > Mar 21 14:22:39 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 12.3 scrub starts
    >     > Mar 21 14:22:41 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 6.0 deep-scrub starts
    >     > Mar 21 14:22:43 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 8.5 deep-scrub starts
    >     > Mar 21 14:22:46 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 5.66 deep-scrub starts
    >     > Mar 21 14:22:49 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 5.30 scrub starts
    >     > Mar 21 14:22:55 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.b deep-scrub starts
    >     > Mar 21 14:22:57 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.1a deep-scrub starts
    >     > Mar 21 14:22:58 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 13.1c deep-scrub starts
    >     > Mar 21 14:23:03 ceph-node10 ceph-osd[3804193]: log_channel(cluster) log [DBG] : 11.1 deep-scrub starts
    >     >
    >     > The number of scrubbing/deep-scrubbing PGs changes every few seconds:
    >     >
    >     > [root@ceph-node10 ~]# ceph -s | grep active+clean
    >     >    pgs:     214 active+clean
    >     >             50 active+clean+scrubbing+deep
    >     >             25 active+clean+scrubbing
    >     > [root@ceph-node10 ~]# ceph -s | grep active+clean
    >     >    pgs:     208 active+clean
    >     >             53 active+clean+scrubbing+deep
    >     >             28 active+clean+scrubbing
    >     > [root@ceph-node10 ~]# ceph -s | grep active+clean
    >     >    pgs:     208 active+clean
    >     >             53 active+clean+scrubbing+deep
    >     >             28 active+clean+scrubbing
    >     > [root@ceph-node10 ~]# ceph -s | grep active+clean
    >     >    pgs:     207 active+clean
    >     >             54 active+clean+scrubbing+deep
    >     >             28 active+clean+scrubbing
    >     > [root@ceph-node10 ~]# ceph -s | grep active+clean
    >     >    pgs:     202 active+clean
    >     >             56 active+clean+scrubbing+deep
    >     >             31 active+clean+scrubbing
    >     > [root@ceph-node10 ~]# ceph -s | grep active+clean
    >     >    pgs:     213 active+clean
    >     >             45 active+clean+scrubbing+deep
    >     >             31 active+clean+scrubbing
    >     >
    >     > "ceph pg dump" shows PGs which have not been deep scrubbed since
    >     > January. Some PGs have been deep scrubbing for over 700000 seconds.
    >     >
    >     > [ceph: root@ceph-node10 /]# ceph pg dump pgs | grep -e 'scrubbing f'
    >     > 5.6e  221223  0  0  0  0  927795290112  0  0  4073  3000  4073  active+clean+scrubbing+deep  2024-03-20T01:07:21.196293+0000  128383'15766927  128383:20517419  [2,4,18,16,14,21]  2  [2,4,18,16,14,21]  2  125519'12328877  2024-01-23T11:25:35.503811+0000  124844'11873951  2024-01-21T22:24:12.620693+0000  0  5  deep scrubbing for 270790s  53772  0
    >     > 5.6c  221317  0  0  0  0  928173256704  0  0  6332  0  6332  active+clean+scrubbing+deep  2024-03-18T09:29:29.233084+0000  128382'15788196  128383:20727318  [6,9,12,14,1,4]  6  [6,9,12,14,1,4]  6  127180'14709746  2024-03-06T12:47:57.741921+0000  124817'11821502  2024-01-20T20:59:40.566384+0000  0  13452  deep scrubbing for 273519s  122803  0
    >     > 5.6a  221325  0  0  0  0  928184565760  0  0  4649  3000  4649  active+clean+scrubbing+deep  2024-03-13T03:48:54.065125+0000  128382'16031499  128383:21221685  [13,11,1,2,9,8]  13  [13,11,1,2,9,8]  13  127181'14915404  2024-03-06T13:16:58.635982+0000  125967'12517899  2024-01-28T09:13:08.276930+0000  0  10078  deep scrubbing for 726001s  184819  0
    >     > 5.54  221050  0  0  0  0  927036203008  0  0  4864  3000  4864  active+clean+scrubbing+deep  2024-03-18T00:17:48.086231+0000  128383'15584012  128383:20293678  [0,20,18,19,11,12]  0  [0,20,18,19,11,12]  0  127195'14651908  2024-03-07T09:22:31.078448+0000  124816'11813857  2024-01-20T16:43:15.755200+0000  0  9808  deep scrubbing for 306667s  142126  0
    >     > 5.47  220849  0  0  0  0  926233448448  0  0  5592  0  5592  active+clean+scrubbing+deep  2024-03-12T08:10:39.413186+0000  128382'15653864  128383:20403071  [16,15,20,0,13,21]  16  [16,15,20,0,13,21]  16  127183'14600433  2024-03-06T18:21:03.057165+0000  124809'11792397  2024-01-20T05:27:07.617799+0000  0  13066  deep scrubbing for 796697s  209193  0
    >     > dumped pgs
    >     >
    >     > regards
    >     > Bernhard
    >     >
    >     >
    >     >
    >     >
    >     >
    >     >
    >     > On 20/03/2024 21:12, Bandelow, Gunnar wrote:
    >     >> Hi,
    >     >>
    >     >> I just wanted to mention that I am running a cluster with Reef
    >     >> 18.2.1 with the same issue.
    >     >>
    >     >> 4 PGs have been starting to deep scrub but not finishing since
    >     >> mid-February. In the pg dump they are shown as scheduled for deep
    >     >> scrub. They sometimes change their status from active+clean to
    >     >> active+clean+scrubbing+deep and back.
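    >     >>
    >     >> (One way to watch a single PG, as a sketch only, with <pgid>
    >     >> being one of the four affected PGs:
    >     >>
    >     >>     ceph pg <pgid> query | grep -i scrub   # scrub-related state and stamps
    >     >>     ceph pg deep-scrub <pgid>              # request a deep scrub explicitly
    >     >> )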
    >     >>
    >     >> Best regards,
    >     >> Gunnar
    >     >>
    >     >> =======================================================
    >     >>
    >     >> Gunnar Bandelow
    >     >> Universitätsrechenzentrum (URZ)
    >     >> Universität Greifswald
    >     >> Felix-Hausdorff-Straße 18
    >     >> 17489 Greifswald
    >     >> Germany
    >     >>
    >     >> Tel.: +49 3834 420 1450
    >     >>
    >     >>
    >     >>
    >     >>
    >     >> --- Original Message ---
    >     >> Subject: Re: Reef (18.2): Some PG not scrubbed/deep scrubbed for 1 month
    >     >> From: "Michel Jouvin" <michel.jouvin@xxxxxxxxxxxxxxx>
    >     >> To: ceph-users@xxxxxxx
    >     >> Date: 20-03-2024 20:00
    >     >>
    >     >>
    >     >>
    >     >>     Hi Rafael,
    >     >>
    >     >>     Good to know I am not alone!
    >     >>
    >     >>     Additional information ~6h after the OSD restart: of the 20 PGs
    >     >>     impacted, 2 have been processed successfully... I don't have a
    >     >>     clear picture of how Ceph prioritizes the scrub of one PG over
    >     >>     another; I had thought that the oldest/expired scrubs are taken
    >     >>     first, but it may not be the case. Anyway, I have seen a very
    >     >>     significant decrease of the scrub activity this afternoon and
    >     >>     the cluster is not loaded at all (almost no users yet)...
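    >     >>
    >     >>     (If it helps, the scrub-related settings actually in effect on
    >     >>     an OSD can be checked with something like the following, osd.29
    >     >>     being just the suspect OSD in my case:
    >     >>
    >     >>         ceph config show osd.29 | grep scrub
    >     >>         ceph daemon osd.29 config show | grep scrub   # on the OSD host, via the admin socket
    >     >>     )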
    >     >>
    >     >>     Michel
    >     >>
    >     >>     On 20/03/2024 at 17:55, quaglio@xxxxxxxxxx wrote:
    >     >>     > Hi,
    >     >>     >      I upgraded a cluster 2 weeks ago here. The situation is
    >     >>     > the same as Michel's.
    >     >>     >      A lot of PGs are not scrubbed/deep-scrubbed.
    >     >>     >
    >     >>     > Rafael.
    >     >>     >

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



