Thank you, Anthony. I did have an empty pool that I had provisioned for developers and that was never used. I’ve removed that pool and the 0-object PGs are gone; I don’t know why I didn’t notice that myself. Removing that pool halved the number of PGs not scrubbed in time.

This is entirely an HDD cluster. I don’t constrain my scrubs to a time window, and I had already set osd_deep_scrub_interval to 2 weeks and increased osd_scrub_load_threshold to 5 (example commands at the bottom of this message), but that didn’t help much. I’ve moved our operations to our failover cluster, so hopefully this one can catch up now. I don’t understand how this started out of the blue, but at least the number is decreasing now.

Jeff

> On Jan 3, 2023, at 12:57 AM, Anthony D'Atri <anthony.datri@xxxxxxxxx> wrote:
> 
> Look closely at your output: the PGs with 0 objects are only “every other” because of how the command happened to order the output.
> 
> Note that the empty PGs all have IDs matching “3.*”. The numeric prefix of a PG ID is the ID of the pool to which it belongs, so I strongly suspect that you have a pool with no data.
> 
>>> Strangely, ceph pg dump shows every other PG with 0 objects. An attempt to perform a deep scrub (or scrub) on one of these PGs does nothing. The cluster appears to be running fine, but obviously there’s an issue. What should my next steps be to troubleshoot?
>>>> PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES OMAP_BYTES* OMAP_KEYS* LOG DISK_LOG STATE STATE_STAMP VERSION REPORTED UP UP_PRIMARY ACTING ACTING_PRIMARY LAST_SCRUB SCRUB_STAMP LAST_DEEP_SCRUB DEEP_SCRUB_STAMP SNAPTRIMQ_LEN
>>>> 3.e9b 0 0 0 0 0 0 0 0 0 0 active+clean 2022-12-31 22:49:07.629579 0'0 23686:19820 [28,79] 28 [28,79] 28 0'0 2022-12-31 22:49:07.629508 0'0 2022-12-31 22:49:07.629508 0
>>>> 1.e99 60594 0 0 0 0 177433523272 0 0 3046 3046 active+clean 2022-12-21 14:35:08.175858 23686'268137 23686:1732399 [178,115] 178 [178,115] 178 23675'267613 2022-12-21 11:01:10.403525 23675'267613 2022-12-21 11:01:10.403525 0
>>>> 3.e9a 0 0 0 0 0 0 0 0 0 0 active+clean 2022-12-31 09:16:48.644619 0'0 23686:22855 [51,140] 51 [51,140] 51 0'0 2022-12-31 09:16:48.644568 0'0 2022-12-30 02:35:23.367344 0
>>>> 1.e98 59962 0 0 0 0 177218669411 0 0 3035 3035 active+clean 2022-12-28 14:14:49.908560 23686'265576 23686:1357499 [92,86] 92 [92,86] 92 23686'265445 2022-12-28 14:14:49.908522 23686'265445 2022-12-28 14:14:49.908522 0
>>>> 3.e95 0 0 0 0 0 0 0 0 0 0 active+clean 2022-12-31 06:09:39.442932 0'0 23686:22757 [48,83] 48 [48,83] 48 0'0 2022-12-31 06:09:39.442879 0'0 2022-12-18 09:33:47.892142 0
> 
> As to your PGs not scrubbed in time, what sort of hardware are your OSDs? Here are some thoughts, especially if they’re HDDs.
> 
> * If you don’t need that empty pool, delete it, then evaluate how many PGs your OSDs hold on average (eg. `ceph osd df`). If you have an unusually high number of PGs per OSD, you may be running afoul of osd_scrub_extended_sleep / osd_scrub_sleep: individual scrubs of empty PGs are naturally very fast, but a large number of them can still crowd out other scrubs because of the sleeps Ceph inserts to spread out scrub impact.
> 
> * Do you limit scrubs to certain times via osd_scrub_begin_hour, osd_scrub_end_hour, osd_scrub_begin_week_day, or osd_scrub_end_week_day? I’ve seen operators constrain scrubs to only a few overnight / weekend hours, but doing so can hobble Ceph’s ability to get through them all in time.
> 
> * Similarly, a value of osd_scrub_load_threshold that’s too low can also result in starvation. The load average statistic can be misleading on modern SMP systems with lots of cores: I’ve witnessed 32c/64t OSD nodes report a load average of around 40 while tools like htop showed they were barely breaking a sweat.
> 
> * If you have osd_scrub_during_recovery disabled and experience a lot of backfill / recovery / rebalance traffic, that can starve scrubs too. IMHO with recent releases this should almost always be enabled, YMMV.
> 
> * Back when I ran busy (read: underspecced) HDD clusters I had to bump osd_deep_scrub_interval by a factor of 4 due to how slow and seek-bound the LFF spinners were. Of course, the longer one spaces out scrubs, the less effective they are at detecting problems before they become impactful.
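For anyone who finds this thread later, here is roughly how to confirm which pool the empty PGs belong to and how PGs are spread across OSDs. These commands are illustrative only; pool ID 3 is taken from the `ceph pg dump` output above:

    ceph osd pool ls detail | grep "^pool 3 "   # map pool ID 3 (the "3.*" PGs) to its name
    ceph osd df                                 # the PGS column shows how many PGs each OSD carries
    ceph health detail | grep -i scrub          # list the PGs behind on scrubs / deep scrubs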
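The scrub knobs discussed above can be adjusted at runtime through the centralized config database on Mimic and later releases. A rough sketch only, using the two-week deep scrub interval and load threshold mentioned in this thread; the same syntax applies to osd_scrub_begin_hour and friends:

    ceph config dump | grep scrub                         # see which scrub options are already overridden
    ceph config set osd osd_deep_scrub_interval 1209600   # 2 weeks, expressed in seconds
    ceph config set osd osd_scrub_load_threshold 5        # permit scrubs at a higher load average
    ceph config set osd osd_scrub_during_recovery true    # don't let backfill/recovery starve scrubs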