Now they are increasing. On Friday I deep-scrubbed them manually and the scrubs completed successfully, but on Monday morning I found that the count had increased to 37. Is it best to deep-scrub manually while we are using the cluster? If not, what is the best thing to do in order to address that?

Best Regards,

Michel

ceph -s
  cluster:
    id:     cb0caedc-eb5b-42d1-a34f-96facfda8c27
    health: HEALTH_WARN
            37 pgs not deep-scrubbed in time

  services:
    mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 11M)
    mgr: ceph-mon2(active, since 11M), standbys: ceph-mon3, ceph-mon1
    osd: 48 osds: 48 up (since 11M), 48 in (since 11M)
    rgw: 6 daemons active (6 hosts, 1 zones)

  data:
    pools:   10 pools, 385 pgs
    objects: 6.00M objects, 23 TiB
    usage:   151 TiB used, 282 TiB / 433 TiB avail
    pgs:     381 active+clean
             4   active+clean+scrubbing+deep

  io:
    client: 265 MiB/s rd, 786 MiB/s wr, 3.87k op/s rd, 699 op/s wr
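
A general note on the question above: deep scrubbing is designed to run while the cluster is serving clients; the cost is extra read load on the OSDs. A rough sketch of the usual ways to let the cluster catch up (the values below are examples to adapt, not recommendations for this particular cluster):

    # re-issue a deep scrub for every PG currently flagged in the warning
    ceph health detail | awk '/not deep-scrubbed since/ {print $2}' \
        | while read pg; do ceph pg deep-scrub "$pg"; done

    # allow each OSD to run more scrubs in parallel (the default is 1 on Pacific)
    ceph config set osd osd_max_scrubs 2

    # confine routine scrubbing to quieter hours if client load is a concern
    ceph config set osd osd_scrub_begin_hour 20
    ceph config set osd osd_scrub_end_hour 6

    # or, if the OSDs cannot complete a full pass within the default week,
    # extend the deep-scrub interval (in seconds; 1209600 = two weeks)
    ceph config set osd osd_deep_scrub_interval 1209600

None of this helps, though, if one slow device is holding the scrubs back, which is what the rest of the thread below is about.
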
On Sun, Jan 28, 2024 at 6:14 PM E Taka <0etaka0@xxxxxxxxx> wrote:

> OSD 22 is there more often than the others. Other operations may be
> blocked because a deep-scrub has not finished yet. I would remove OSD 22,
> just to be sure about this: ceph orch osd rm osd.22
>
> If this does not help, just add it again.
>
> On Fri, Jan 26, 2024 at 08:05, Michel Niyoyita <micou12@xxxxxxxxx> wrote:
>
>> It seems they are different OSDs, as shown here. How did you manage to
>> sort this out?
>>
>> ceph pg dump | grep -F 6.78
>> dumped all
>> 6.78   44268  0  0  0  0  178679640118  0  0  10099  10099
>>   active+clean  2024-01-26T03:51:26.781438+0200  107547'115445304
>>   107547:225274427  [12,36,37]  12  [12,36,37]  12
>>   106977'114532385  2024-01-24T08:37:53.597331+0200  101161'109078277
>>   2024-01-11T16:07:54.875746+0200  0
>> root@ceph-osd3:~# ceph pg dump | grep -F 6.60
>> dumped all
>> 6.60   44449  0  0  0  0  179484338742  716  36  10097  10097
>>   active+clean  2024-01-26T03:50:44.579831+0200  107547'153238805
>>   107547:287193139  [32,5,29]  32  [32,5,29]  32
>>   107231'152689835  2024-01-25T02:34:01.849966+0200  102171'147920798
>>   2024-01-13T19:44:26.922000+0200  0
>> 6.3a   44807  0  0  0  0  180969005694  0  0  10093  10093
>>   active+clean  2024-01-26T03:53:28.837685+0200  107547'114765984
>>   107547:238170093  [22,13,11]  22  [22,13,11]  22
>>   106945'113739877  2024-01-24T04:10:17.224982+0200  102863'109559444
>>   2024-01-15T05:31:36.606478+0200  0
>> root@ceph-osd3:~# ceph pg dump | grep -F 6.5c
>> 6.5c   44277  0  0  0  0  178764978230  0  0  10051  10051
>>   active+clean  2024-01-26T03:55:23.339584+0200  107547'126480090
>>   107547:264432655  [22,37,30]  22  [22,37,30]  22
>>   107205'125858697  2024-01-24T22:32:10.365869+0200  101941'120957992
>>   2024-01-13T09:07:24.780936+0200  0
>> dumped all
>> root@ceph-osd3:~# ceph pg dump | grep -F 4.12
>> dumped all
>> 4.12   0  0  0  0  0  0  0  0  0  0
>>   active+clean  2024-01-24T08:36:48.284388+0200  0'0
>>   107546:152711  [22,19,7]  22  [22,19,7]  22
>>   0'0  2024-01-24T08:36:48.284307+0200  0'0
>>   2024-01-13T09:09:22.176240+0200  0
>> root@ceph-osd3:~# ceph pg dump | grep -F 10.d
>> dumped all
>> 10.d   0  0  0  0  0  0  0  0  0  0
>>   active+clean  2024-01-24T04:04:33.641541+0200  0'0
>>   107546:142651  [14,28,1]  14  [14,28,1]  14
>>   0'0  2024-01-24T04:04:33.641451+0200  0'0
>>   2024-01-12T08:04:02.078062+0200  0
>> root@ceph-osd3:~# ceph pg dump | grep -F 5.f
>> dumped all
>> 5.f    0  0  0  0  0  0  0  0  0  0
>>   active+clean  2024-01-25T08:19:04.148941+0200  0'0
>>   107546:161331  [11,24,35]  11  [11,24,35]  11
>>   0'0  2024-01-25T08:19:04.148837+0200  0'0
>>   2024-01-12T06:06:00.970665+0200  0
>>
>> On Fri, Jan 26, 2024 at 8:58 AM E Taka <0etaka0@xxxxxxxxx> wrote:
>>
>>> We had the same problem. It turned out that one disk was slowly dying.
>>> It was easy to identify with these commands (in your case):
>>>
>>> ceph pg dump | grep -F 6.78
>>> ceph pg dump | grep -F 6.60
>>> …
>>>
>>> This command shows the OSDs of a PG in square brackets. If the same
>>> number always appears there, then you have found the OSD which is
>>> causing the slow scrubs.
>>>
>>> On Fri, Jan 26, 2024 at 07:45, Michel Niyoyita <micou12@xxxxxxxxx> wrote:
>>>
>>>> Hello team,
>>>>
>>>> I have a production cluster composed of 3 OSD servers with 20 disks
>>>> each, deployed with ceph-ansible on Ubuntu, running Pacific. These
>>>> days it is in WARN state because of PGs which are not deep-scrubbed in
>>>> time. I tried to deep-scrub some PGs manually, but it seems this can
>>>> slow the cluster down. I would like your assistance so that my cluster
>>>> can be in HEALTH_OK state as before, without any interruption of
>>>> service. The cluster is used as OpenStack backend storage.
>>>>
>>>> Best Regards
>>>>
>>>> Michel
>>>>
>>>> ceph -s
>>>>   cluster:
>>>>     id:     cb0caedc-eb5b-42d1-a34f-96facfda8c27
>>>>     health: HEALTH_WARN
>>>>             6 pgs not deep-scrubbed in time
>>>>
>>>>   services:
>>>>     mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 11M)
>>>>     mgr: ceph-mon2(active, since 11M), standbys: ceph-mon3, ceph-mon1
>>>>     osd: 48 osds: 48 up (since 11M), 48 in (since 11M)
>>>>     rgw: 6 daemons active (6 hosts, 1 zones)
>>>>
>>>>   data:
>>>>     pools:   10 pools, 385 pgs
>>>>     objects: 5.97M objects, 23 TiB
>>>>     usage:   151 TiB used, 282 TiB / 433 TiB avail
>>>>     pgs:     381 active+clean
>>>>              4   active+clean+scrubbing+deep
>>>>
>>>>   io:
>>>>     client: 59 MiB/s rd, 860 MiB/s wr, 155 op/s rd, 665 op/s wr
>>>>
>>>> root@ceph-osd3:~# ceph health detail
>>>> HEALTH_WARN 6 pgs not deep-scrubbed in time
>>>> [WRN] PG_NOT_DEEP_SCRUBBED: 6 pgs not deep-scrubbed in time
>>>>     pg 6.78 not deep-scrubbed since 2024-01-11T16:07:54.875746+0200
>>>>     pg 6.60 not deep-scrubbed since 2024-01-13T19:44:26.922000+0200
>>>>     pg 6.5c not deep-scrubbed since 2024-01-13T09:07:24.780936+0200
>>>>     pg 4.12 not deep-scrubbed since 2024-01-13T09:09:22.176240+0200
>>>>     pg 10.d not deep-scrubbed since 2024-01-12T08:04:02.078062+0200
>>>>     pg 5.f not deep-scrubbed since 2024-01-12T06:06:00.970665+0200
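
A rough way to automate the check suggested above (looking for an OSD that recurs in the acting sets shown in square brackets) is to tally the acting OSDs of the flagged PGs and then inspect the suspect drive before removing it. This is only a sketch: it assumes the "pg ... not deep-scrubbed since ..." lines look like the ceph health detail output quoted above, and osd.22, <devid> and /dev/sdX are the thread's example and placeholders to substitute.

    # count how often each OSD appears in the acting set of the PGs that
    # are flagged as not deep-scrubbed in time
    ceph health detail | awk '/not deep-scrubbed since/ {print $2}' \
        | while read pg; do ceph pg map "$pg"; done \
        | grep -o 'acting \[[^]]*\]' \
        | tr -dc '0-9,\n' | tr ',' '\n' | sort -n | uniq -c | sort -rn

    # before removing a suspect OSD, compare per-OSD latency and pull the
    # SMART-based health metrics for its backing device
    ceph osd perf
    ceph device ls-by-daemon osd.22
    ceph device get-health-metrics <devid>    # <devid> from the previous command

    # and on the OSD host itself, check the mapped device directly
    smartctl -a /dev/sdX

If one OSD dominates the tally and its latency or SMART data also stand out, that supports the dying-disk theory before taking the heavier step of removing and re-adding the OSD.
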