OSD 22 shows up there more often than the others. Other operations may be
blocked because a deep scrub is not finished yet. I would remove OSD 22,
just to be sure about this:

ceph orch osd rm 22

If this does not help, just add it again.
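For what it's worth, the counting does not have to be done by eye. A rough
sketch, assuming GNU grep on the host and taking the six PG ids from the
health detail quoted below (osd.22 and /dev/sdX are placeholders for
whatever the count and "ceph device ls-by-daemon" actually report):

    # Count how often each OSD appears in the acting sets of the affected
    # PGs; the OSD with the highest count is the prime suspect:
    for pg in 6.78 6.60 6.5c 4.12 10.d 5.f; do ceph pg map $pg; done \
      | grep -oP 'acting \[\K[0-9,]+' | tr ',' '\n' | sort -n | uniq -c | sort -rn

    # Cross-check the suspect with per-OSD latency and the disk's SMART
    # data (smartmontools must be installed on the OSD host):
    ceph osd perf | sort -nk2 | tail
    ceph device ls-by-daemon osd.22
    smartctl -a /dev/sdX

A slowly dying disk usually shows clearly elevated latency in "ceph osd
perf" long before it fails hard.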
On Fri, Jan 26, 2024 at 08:05, Michel Niyoyita <micou12@xxxxxxxxx> wrote:

> It seems that there are different OSDs, as shown here. How have you
> managed to sort this out?
>
> ceph pg dump | grep -F 6.78
> dumped all
> 6.78 44268 0 0 0 0 178679640118 0 0 10099 10099 active+clean 2024-01-26T03:51:26.781438+0200 107547'115445304 107547:225274427 [12,36,37] 12 [12,36,37] 12 106977'114532385 2024-01-24T08:37:53.597331+0200 101161'109078277 2024-01-11T16:07:54.875746+0200 0
>
> root@ceph-osd3:~# ceph pg dump | grep -F 6.60
> dumped all
> 6.60 44449 0 0 0 0 179484338742 716 36 10097 10097 active+clean 2024-01-26T03:50:44.579831+0200 107547'153238805 107547:287193139 [32,5,29] 32 [32,5,29] 32 107231'152689835 2024-01-25T02:34:01.849966+0200 102171'147920798 2024-01-13T19:44:26.922000+0200 0
> 6.3a 44807 0 0 0 0 180969005694 0 0 10093 10093 active+clean 2024-01-26T03:53:28.837685+0200 107547'114765984 107547:238170093 [22,13,11] 22 [22,13,11] 22 106945'113739877 2024-01-24T04:10:17.224982+0200 102863'109559444 2024-01-15T05:31:36.606478+0200 0
>
> root@ceph-osd3:~# ceph pg dump | grep -F 6.5c
> 6.5c 44277 0 0 0 0 178764978230 0 0 10051 10051 active+clean 2024-01-26T03:55:23.339584+0200 107547'126480090 107547:264432655 [22,37,30] 22 [22,37,30] 22 107205'125858697 2024-01-24T22:32:10.365869+0200 101941'120957992 2024-01-13T09:07:24.780936+0200 0
> dumped all
>
> root@ceph-osd3:~# ceph pg dump | grep -F 4.12
> dumped all
> 4.12 0 0 0 0 0 0 0 0 0 0 active+clean 2024-01-24T08:36:48.284388+0200 0'0 107546:152711 [22,19,7] 22 [22,19,7] 22 0'0 2024-01-24T08:36:48.284307+0200 0'0 2024-01-13T09:09:22.176240+0200 0
>
> root@ceph-osd3:~# ceph pg dump | grep -F 10.d
> dumped all
> 10.d 0 0 0 0 0 0 0 0 0 0 active+clean 2024-01-24T04:04:33.641541+0200 0'0 107546:142651 [14,28,1] 14 [14,28,1] 14 0'0 2024-01-24T04:04:33.641451+0200 0'0 2024-01-12T08:04:02.078062+0200 0
>
> root@ceph-osd3:~# ceph pg dump | grep -F 5.f
> dumped all
> 5.f 0 0 0 0 0 0 0 0 0 0 active+clean 2024-01-25T08:19:04.148941+0200 0'0 107546:161331 [11,24,35] 11 [11,24,35] 11 0'0 2024-01-25T08:19:04.148837+0200 0'0 2024-01-12T06:06:00.970665+0200 0
>
> On Fri, Jan 26, 2024 at 8:58 AM E Taka <0etaka0@xxxxxxxxx> wrote:
>
>> We had the same problem. It turned out that one disk was slowly dying.
>> It was easy to identify by these commands (in your case):
>>
>> ceph pg dump | grep -F 6.78
>> ceph pg dump | grep -F 6.60
>> …
>>
>> These commands show the OSDs of a PG in square brackets. If there is
>> always the same number, then you've found the OSD which causes the
>> slow scrubs.
>>
>> On Fri, Jan 26, 2024 at 07:45, Michel Niyoyita <micou12@xxxxxxxxx> wrote:
>>
>>> Hello team,
>>>
>>> I have a cluster in production composed of 3 OSD servers with 20 disks
>>> each, deployed using ceph-ansible on Ubuntu, and the version is
>>> Pacific. These days it is in WARN state, caused by PGs which are not
>>> deep-scrubbed in time. I tried to deep-scrub some PGs manually, but it
>>> seems that the cluster becomes slow. I would like your assistance so
>>> that my cluster can be in HEALTH_OK state as before, without any
>>> interruption of service. The cluster is used as OpenStack backend
>>> storage.
>>>
>>> Best Regards
>>>
>>> Michel
>>>
>>> ceph -s
>>>   cluster:
>>>     id:     cb0caedc-eb5b-42d1-a34f-96facfda8c27
>>>     health: HEALTH_WARN
>>>             6 pgs not deep-scrubbed in time
>>>
>>>   services:
>>>     mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 11M)
>>>     mgr: ceph-mon2(active, since 11M), standbys: ceph-mon3, ceph-mon1
>>>     osd: 48 osds: 48 up (since 11M), 48 in (since 11M)
>>>     rgw: 6 daemons active (6 hosts, 1 zones)
>>>
>>>   data:
>>>     pools:   10 pools, 385 pgs
>>>     objects: 5.97M objects, 23 TiB
>>>     usage:   151 TiB used, 282 TiB / 433 TiB avail
>>>     pgs:     381 active+clean
>>>              4   active+clean+scrubbing+deep
>>>
>>>   io:
>>>     client:   59 MiB/s rd, 860 MiB/s wr, 155 op/s rd, 665 op/s wr
>>>
>>> root@ceph-osd3:~# ceph health detail
>>> HEALTH_WARN 6 pgs not deep-scrubbed in time
>>> [WRN] PG_NOT_DEEP_SCRUBBED: 6 pgs not deep-scrubbed in time
>>>     pg 6.78 not deep-scrubbed since 2024-01-11T16:07:54.875746+0200
>>>     pg 6.60 not deep-scrubbed since 2024-01-13T19:44:26.922000+0200
>>>     pg 6.5c not deep-scrubbed since 2024-01-13T09:07:24.780936+0200
>>>     pg 4.12 not deep-scrubbed since 2024-01-13T09:09:22.176240+0200
>>>     pg 10.d not deep-scrubbed since 2024-01-12T08:04:02.078062+0200
>>>     pg 5.f not deep-scrubbed since 2024-01-12T06:06:00.970665+0200
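Once the suspect OSD is dealt with, the scrub backlog itself can be worked
through manually instead of waiting for the scheduler. A rough sketch,
using the PG ids from the health detail above; raising osd_max_scrubs
beyond the Pacific default of 1 is a judgment call, not something from
this thread, and trades client I/O for faster catch-up:

    # Queue a deep scrub on each overdue PG:
    for pg in 6.78 6.60 6.5c 4.12 10.d 5.f; do ceph pg deep-scrub $pg; done

    # Temporarily allow more scrubs to run in parallel:
    ceph config set osd osd_max_scrubs 2
    # ...and set it back once the cluster reports HEALTH_OK again:
    ceph config set osd osd_max_scrubs 1

Deep scrubs on PGs holding ~44k objects each are not cheap, so expect some
impact on client latency while the backlog drains.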