Now they are increasing. On Friday I deep-scrubbed them manually and the scrubs completed successfully, but on Monday morning I found that the count had increased to 37. Is it best to deep-scrub manually while we are using the cluster? If not, what is the best thing to do in order to address that?

Best Regards,

Michel

ceph -s
  cluster:
    id:     cb0caedc-eb5b-42d1-a34f-96facfda8c27
    health: HEALTH_WARN
            37 pgs not deep-scrubbed in time

  services:
    mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 11M)
    mgr: ceph-mon2(active, since 11M), standbys: ceph-mon3, ceph-mon1
    osd: 48 osds: 48 up (since 11M), 48 in (since 11M)
    rgw: 6 daemons active (6 hosts, 1 zones)

  data:
    pools:   10 pools, 385 pgs
    objects: 6.00M objects, 23 TiB
    usage:   151 TiB used, 282 TiB / 433 TiB avail
    pgs:     381 active+clean
             4   active+clean+scrubbing+deep

  io:
    client: 265 MiB/s rd, 786 MiB/s wr, 3.87k op/s rd, 699 op/s wr
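
A general note on the question above: deep scrubbing is designed to run while the cluster is serving clients; the cost is extra read load on the OSDs. A rough sketch of the usual ways to let the cluster catch up (the values below are examples to adapt, not recommendations for this particular cluster):

    # re-issue a deep scrub for every PG currently flagged in the warning
    ceph health detail | awk '/not deep-scrubbed since/ {print $2}' \
        | while read pg; do ceph pg deep-scrub "$pg"; done

    # allow each OSD to run more scrubs in parallel (the default is 1 on Pacific)
    ceph config set osd osd_max_scrubs 2

    # confine routine scrubbing to quieter hours if client load is a concern
    ceph config set osd osd_scrub_begin_hour 20
    ceph config set osd osd_scrub_end_hour 6

    # or, if the OSDs cannot complete a full pass within the default week,
    # extend the deep-scrub interval (in seconds; 1209600 = two weeks)
    ceph config set osd osd_deep_scrub_interval 1209600

None of this helps, though, if one slow device is holding the scrubs back, which is what the rest of the thread below is about.
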
On Sun, Jan 28, 2024 at 6:14 PM E Taka <0etaka0@xxxxxxxxx> wrote:

> OSD 22 is there more often than the others. Other operations may be
> blocked because a deep-scrub has not finished yet. I would remove OSD 22,
> just to be sure about this: ceph orch osd rm osd.22
>
> If this does not help, just add it again.
>
> On Fri, Jan 26, 2024 at 08:05, Michel Niyoyita <micou12@xxxxxxxxx> wrote:
>
>> It seems they are different OSDs, as shown here. How did you manage to
>> sort this out?
>>
>> ceph pg dump | grep -F 6.78
>> dumped all
>> 6.78   44268  0  0  0  0  178679640118  0  0  10099  10099
>>   active+clean  2024-01-26T03:51:26.781438+0200  107547'115445304
>>   107547:225274427  [12,36,37]  12  [12,36,37]  12
>>   106977'114532385  2024-01-24T08:37:53.597331+0200  101161'109078277
>>   2024-01-11T16:07:54.875746+0200  0
>> root@ceph-osd3:~# ceph pg dump | grep -F 6.60
>> dumped all
>> 6.60   44449  0  0  0  0  179484338742  716  36  10097  10097
>>   active+clean  2024-01-26T03:50:44.579831+0200  107547'153238805
>>   107547:287193139  [32,5,29]  32  [32,5,29]  32
>>   107231'152689835  2024-01-25T02:34:01.849966+0200  102171'147920798
>>   2024-01-13T19:44:26.922000+0200  0
>> 6.3a   44807  0  0  0  0  180969005694  0  0  10093  10093
>>   active+clean  2024-01-26T03:53:28.837685+0200  107547'114765984
>>   107547:238170093  [22,13,11]  22  [22,13,11]  22
>>   106945'113739877  2024-01-24T04:10:17.224982+0200  102863'109559444
>>   2024-01-15T05:31:36.606478+0200  0
>> root@ceph-osd3:~# ceph pg dump | grep -F 6.5c
>> 6.5c   44277  0  0  0  0  178764978230  0  0  10051  10051
>>   active+clean  2024-01-26T03:55:23.339584+0200  107547'126480090
>>   107547:264432655  [22,37,30]  22  [22,37,30]  22
>>   107205'125858697  2024-01-24T22:32:10.365869+0200  101941'120957992
>>   2024-01-13T09:07:24.780936+0200  0
>> dumped all
>> root@ceph-osd3:~# ceph pg dump | grep -F 4.12
>> dumped all
>> 4.12   0  0  0  0  0  0  0  0  0  0
>>   active+clean  2024-01-24T08:36:48.284388+0200  0'0
>>   107546:152711  [22,19,7]  22  [22,19,7]  22
>>   0'0  2024-01-24T08:36:48.284307+0200  0'0
>>   2024-01-13T09:09:22.176240+0200  0
>> root@ceph-osd3:~# ceph pg dump | grep -F 10.d
>> dumped all
>> 10.d   0  0  0  0  0  0  0  0  0  0
>>   active+clean  2024-01-24T04:04:33.641541+0200  0'0
>>   107546:142651  [14,28,1]  14  [14,28,1]  14
>>   0'0  2024-01-24T04:04:33.641451+0200  0'0
>>   2024-01-12T08:04:02.078062+0200  0
>> root@ceph-osd3:~# ceph pg dump | grep -F 5.f
>> dumped all
>> 5.f    0  0  0  0  0  0  0  0  0  0
>>   active+clean  2024-01-25T08:19:04.148941+0200  0'0
>>   107546:161331  [11,24,35]  11  [11,24,35]  11
>>   0'0  2024-01-25T08:19:04.148837+0200  0'0
>>   2024-01-12T06:06:00.970665+0200  0
>>
>> On Fri, Jan 26, 2024 at 8:58 AM E Taka <0etaka0@xxxxxxxxx> wrote:
>>
>>> We had the same problem. It turned out that one disk was slowly dying.
>>> It was easy to identify with these commands (in your case):
>>>
>>> ceph pg dump | grep -F 6.78
>>> ceph pg dump | grep -F 6.60
>>> …
>>>
>>> This command shows the OSDs of a PG in square brackets. If the same
>>> number always appears there, then you have found the OSD which is
>>> causing the slow scrubs.
>>>
>>> On Fri, Jan 26, 2024 at 07:45, Michel Niyoyita <micou12@xxxxxxxxx> wrote:
>>>
>>>> Hello team,
>>>>
>>>> I have a production cluster composed of 3 OSD servers with 20 disks
>>>> each, deployed with ceph-ansible on Ubuntu, running Pacific. These
>>>> days it is in WARN state because of PGs which are not deep-scrubbed in
>>>> time. I tried to deep-scrub some PGs manually, but it seems this can
>>>> slow the cluster down. I would like your assistance so that my cluster
>>>> can be in HEALTH_OK state as before, without any interruption of
>>>> service. The cluster is used as OpenStack backend storage.
>>>>
>>>> Best Regards
>>>>
>>>> Michel
>>>>
>>>> ceph -s
>>>>   cluster:
>>>>     id:     cb0caedc-eb5b-42d1-a34f-96facfda8c27
>>>>     health: HEALTH_WARN
>>>>             6 pgs not deep-scrubbed in time
>>>>
>>>>   services:
>>>>     mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 11M)
>>>>     mgr: ceph-mon2(active, since 11M), standbys: ceph-mon3, ceph-mon1
>>>>     osd: 48 osds: 48 up (since 11M), 48 in (since 11M)
>>>>     rgw: 6 daemons active (6 hosts, 1 zones)
>>>>
>>>>   data:
>>>>     pools:   10 pools, 385 pgs
>>>>     objects: 5.97M objects, 23 TiB
>>>>     usage:   151 TiB used, 282 TiB / 433 TiB avail
>>>>     pgs:     381 active+clean
>>>>              4   active+clean+scrubbing+deep
>>>>
>>>>   io:
>>>>     client: 59 MiB/s rd, 860 MiB/s wr, 155 op/s rd, 665 op/s wr
>>>>
>>>> root@ceph-osd3:~# ceph health detail
>>>> HEALTH_WARN 6 pgs not deep-scrubbed in time
>>>> [WRN] PG_NOT_DEEP_SCRUBBED: 6 pgs not deep-scrubbed in time
>>>>     pg 6.78 not deep-scrubbed since 2024-01-11T16:07:54.875746+0200
>>>>     pg 6.60 not deep-scrubbed since 2024-01-13T19:44:26.922000+0200
>>>>     pg 6.5c not deep-scrubbed since 2024-01-13T09:07:24.780936+0200
>>>>     pg 4.12 not deep-scrubbed since 2024-01-13T09:09:22.176240+0200
>>>>     pg 10.d not deep-scrubbed since 2024-01-12T08:04:02.078062+0200
>>>>     pg 5.f not deep-scrubbed since 2024-01-12T06:06:00.970665+0200
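
A rough way to automate the check suggested above (looking for an OSD that recurs in the acting sets shown in square brackets) is to tally the acting OSDs of the flagged PGs and then inspect the suspect drive before removing it. This is only a sketch: it assumes the "pg ... not deep-scrubbed since ..." lines look like the ceph health detail output quoted above, and osd.22, <devid> and /dev/sdX are the thread's example and placeholders to substitute.

    # count how often each OSD appears in the acting set of the PGs that
    # are flagged as not deep-scrubbed in time
    ceph health detail | awk '/not deep-scrubbed since/ {print $2}' \
        | while read pg; do ceph pg map "$pg"; done \
        | grep -o 'acting \[[^]]*\]' \
        | tr -dc '0-9,\n' | tr ',' '\n' | sort -n | uniq -c | sort -rn

    # before removing a suspect OSD, compare per-OSD latency and pull the
    # SMART-based health metrics for its backing device
    ceph osd perf
    ceph device ls-by-daemon osd.22
    ceph device get-health-metrics <devid>    # <devid> from the previous command

    # and on the OSD host itself, check the mapped device directly
    smartctl -a /dev/sdX

If one OSD dominates the tally and its latency or SMART data also stand out, that supports the dying-disk theory before taking the heavier step of removing and re-adding the OSD.
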