Hi Laimis,

I apologize for not paying attention to the Reddit link/discussion in your previous message.

Forget about osd_scrub_chunk_max. It is very unlikely to explain why scrubbing is so slow, or does not progress at all, for many v19.2 users.

Given the number of reports and the recent scrub-related changes in the code, I would encourage you to create a bug report in the tracker so that this issue can be investigated.

Cheers,
Frédéric.

________________________________
From: Frédéric Nass
Sent: Wednesday, November 27, 2024 17:17
To: Laimis Juzeliūnas
Cc: ceph-users
Subject: Re: Squid: deep scrub issues

Hi Laimis,

Might be the result of osd_scrub_chunk_max now being 15 instead of 25 previously. See [1] and [2].

Cheers,
Frédéric.

[1] https://tracker.ceph.com/issues/68057
[2] https://github.com/ceph/ceph/pull/59791/commits/0841603023ba53923a986f2fb96ab7105630c9d3

----- On 26 Nov 24, at 23:36, Laimis Juzeliūnas laimis.juzeliunas@xxxxxxxxxx wrote:

> Hello Ceph community,
>
> Wanted to highlight one observation and hear from any Squid users having similar
> experiences.
> Since upgrading to 19.2.0 (from 18.4.0) we have observed that pg deep scrubbing
> times have drastically increased. Some pgs take 2-5 days to complete deep
> scrubbing while others increase to 20+ days. This causes the deep scrubbing
> queue to fill up and the cluster almost constantly has 'pgs not deep-scrubbed
> in time' alerts.
> We have on average 67 pgs/osd: running on 15TB hdd disks this results in
> 200GB-ish pgs. While fairly large, these pgs did not cause such an increase in
> deep scrub times when on Reef.
>
> "ceph pg dump | grep 'deep scrubbing for'" will always have a few entries of
> quite morbid scrubs like the following:
> 7.3e 121289 0 0 0 0 225333247207
> 0 0 127 0 127 active+clean+scrubbing+deep
> 2024-11-13T09:37:42.549418+0000 490179'5220664 490179:23902923
> [268,27,122] 268 [268,27,122] 268 483850'5203141
> 2024-11-02T11:33:57.835277+0000 472713'5197481
> 2024-10-11T04:30:00.639763+0000 0 21873 deep
> scrubbing for 1169147s
> 34.247 62618 0 0 0 0 179797964677
> 0 0 101 50 101 active+clean+scrubbing+deep
> 2024-11-05T06:27:52.288785+0000 490179'22729571 490179:80672442
> [34,97,25] 34 [34,97,25] 34 481331'22436869
> 2024-10-23T16:06:50.092439+0000 471395'22289914
> 2024-10-07T19:29:26.115047+0000 0 204864 deep
> scrubbing for 1871733s
>
> Not pointing any fingers, but the Squid release announced "better scrub
> scheduling".
> This is not directly a scheduling problem, but maybe that change had an impact
> and caused this behaviour?
>
> Scrubbing configurations:
> ceph config get osd | grep scrub
> global  advanced  osd_deep_scrub_interval                         2678400.000000
> global  advanced  osd_deep_scrub_large_omap_object_key_threshold  500000
> global  advanced  osd_max_scrubs                                  5
> global  advanced  osd_scrub_auto_repair                           true
> global  advanced  osd_scrub_max_interval                          2678400.000000
> global  advanced  osd_scrub_min_interval                          172800.000000
>
>
> Cluster details (backfilling expected and caused by some manual reweights):
>   cluster:
>     id:     96df99f6-fc1a-11ea-90a4-6cb3113cb732
>     health: HEALTH_WARN
>             24 pgs not deep-scrubbed in time
>
>   services:
>     mon:        5 daemons, quorum ceph-node004,ceph-node003,ceph-node001,ceph-node005,ceph-node002 (age 4d)
>     mgr:        ceph-node001.hgythj(active, since 11d), standbys: ceph-node002.jphtvg
>     mds:        20/20 daemons up, 12 standby
>     osd:        384 osds: 384 up (since 25h), 384 in (since 5d); 5 remapped pgs
>     rbd-mirror: 2 daemons active (2 hosts)
>     rgw:        64 daemons active (32 hosts, 1 zones)
>
>   data:
>     volumes: 1/1 healthy
>     pools:   14 pools, 8681 pgs
>     objects: 758.42M objects, 1.5 PiB
>     usage:   4.6 PiB used, 1.1 PiB / 5.7 PiB avail
>     pgs:     275177/2275254543 objects misplaced (0.012%)
>              6807 active+clean
>              989  active+clean+scrubbing+deep
>              880  active+clean+scrubbing
>              5    active+remapped+backfilling
>
>   io:
>     client:   37 MiB/s rd, 59 MiB/s wr, 1.72k op/s rd, 439 op/s wr
>     recovery: 70 MiB/s, 38 objects/s
>
>
> One thread of other users experiencing the same prolonged deep scrub issues on 19.2.0:
> https://www.reddit.com/r/ceph/comments/1guynak/strange_issue_where_scrubdeep_scrub_never_finishes/
> Any hints or help would be greatly appreciated!
>
>
> Thanks in advance,
> Laimis J.
> laimis.juzeliunas@xxxxxxxxxx
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
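
For anyone wanting to read the "deep scrubbing for <N>s" figures from the pg dump entries in days rather than seconds, a minimal Python sketch follows. It assumes the text comes from "ceph pg dump | grep 'deep scrubbing for'" with one PG per line, the PG id as the first field, and the duration at the end of the line, as in the two entries quoted in this thread. The function name scrub_durations and the SAMPLE constant are hypothetical, introduced here only for illustration.

```python
import re

# Matches the trailing "scrubbing for <N>s" part of a pg dump line.
DURATION_RE = re.compile(r"scrubbing for (\d+)s")

# The two entries quoted in the thread, each joined back onto one line.
SAMPLE = """\
7.3e 121289 0 0 0 0 225333247207 0 0 127 0 127 active+clean+scrubbing+deep 2024-11-13T09:37:42.549418+0000 490179'5220664 490179:23902923 [268,27,122] 268 [268,27,122] 268 483850'5203141 2024-11-02T11:33:57.835277+0000 472713'5197481 2024-10-11T04:30:00.639763+0000 0 21873 deep scrubbing for 1169147s
34.247 62618 0 0 0 0 179797964677 0 0 101 50 101 active+clean+scrubbing+deep 2024-11-05T06:27:52.288785+0000 490179'22729571 490179:80672442 [34,97,25] 34 [34,97,25] 34 481331'22436869 2024-10-23T16:06:50.092439+0000 471395'22289914 2024-10-07T19:29:26.115047+0000 0 204864 deep scrubbing for 1871733s
"""


def scrub_durations(dump_text):
    """Return a list of (pgid, seconds) for lines reporting an ongoing scrub.

    Assumption: the PG id is the first whitespace-separated field of the line.
    Lines without a "scrubbing for <N>s" marker are skipped.
    """
    results = []
    for line in dump_text.splitlines():
        match = DURATION_RE.search(line)
        if match is None:
            continue
        pgid = line.split()[0]
        results.append((pgid, int(match.group(1))))
    return results


def seconds_to_days(seconds):
    """Convert a duration in seconds to days (86400 seconds per day)."""
    return seconds / 86400


for pgid, seconds in scrub_durations(SAMPLE):
    print(f"{pgid}: {seconds_to_days(seconds):.2f} days")
```

On the two quoted entries this gives roughly 13.5 days for pg 7.3e and 21.7 days for pg 34.247, which is in line with the "20+ days" mentioned in the original message.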