That is super useful. Thank you so much for sharing! :)

Kevin

________________________________________
From: Frank Schilder <frans@xxxxxx>
Sent: Friday, October 25, 2024 8:03 AM
To: ceph-users@xxxxxxx
Subject: Re: pgs not deep-scrubbed in time and pgs not scrubbed in time

Hi, you might want to take a look here:

https://github.com/frans42/ceph-goodies/blob/main/doc/TuningScrub.md

Don't set max_scrubs > 1 on HDD OSDs; you will almost certainly regret it like I did.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Joachim Kraftmayer <joachim.kraftmayer@xxxxxxxxx>
Sent: Wednesday, October 23, 2024 10:49 AM
To: Eugen Block
Cc: ceph-users@xxxxxxx
Subject: Re: pgs not deep-scrubbed in time and pgs not scrubbed in time

Hi Götz,

we have solved several scenarios like this in the past. Do you see a trend: is the number of PGs not scrubbed in time decreasing, remaining the same, or increasing?

There are a few OSD configuration parameters that can be used to adjust the scrubbing behavior of the cluster:

https://docs.ceph.com/en/latest/rados/configuration/osd-config-ref/
https://docs.clyso.com/blog/ceph-blocked-requests-in-the-cluster-caused-by-deep-scrubing-operations/

Regards, Joachim

joachim.kraftmayer@xxxxxxxxx
http://www.clyso.com/
Hohenzollernstr. 27, 80801 Munich
Utting a. A. | HR: Augsburg | HRB: 25866 | USt. ID-Nr.: DE2754306

On Wed, 23 Oct 2024 at 09:37, Eugen Block <eblock@xxxxxx> wrote:

> Hi Götz,
>
> usually, OSDs start (deep-)scrubbing PGs after they have been powered
> on. You should see PGs in (deep-)scrubbing state right now. Depending
> on your PG sizes, number of OSDs etc., that can take some time, of
> course. But eventually the number should decrease over time. If you
> have the defaults for deep_scrub_interval (1 week) and your cluster
> hasn't complained before, you'll probably get rid of the warning
> within a week or so. If you want to speed things up and your cluster
> can handle the load, you could temporarily increase osd_max_scrubs
> (max concurrent scrubs on a single OSD, default 1, can be updated at
> runtime):
>
> ceph config set osd osd_max_scrubs 2
>
> It doesn't sound like you had this warning before, so I assume it will
> eventually clear. If not, you can check out the docs [0] and my recent
> blog post [1] about this topic.
>
> Regards,
> Eugen
>
> [0] https://docs.ceph.com/en/latest/rados/operations/health-checks/#pg-not-deep-scrubbed
> [1] https://heiterbiswolkig.blogs.nde.ag/2024/09/06/pgs-not-deep-scrubbed-in-time/
>
> Quoting Götz Reinicke <goetz.reinicke@xxxxxxxxxxxxxxx>:
>
> > Hello Ceph Community,
> >
> > My cluster was hit by a power outage some months ago. Luckily no data
> > was destroyed, and powering up the nodes and services went well.
> >
> > But since then, some PGs are still shown as not scrubbed in time.
> > Googling and searching the list turned up some debugging hints, like
> > running „ceph pg deep-scrub“ on the PGs or restarting OSD daemons.
> >
> > Nothing „solved“ that issue here. I'm on ceph version 18.2.4 now.
> >
> > Is there anything special I can do to have those PGs scrubbed? I like
> > having the cluster health state OK rather than warning :) Or will time
> > solve the problem once the PGs come around again in their regular
> > scrubbing cycle?
> >
> > Thanks for hints and suggestions.
> > Best regards,
> > Götz
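
Frank's warning about HDDs and the osd_max_scrubs command quoted above can be combined by scoping the option with a device-class mask, so spinning OSDs keep one scrub at a time while flash OSDs are allowed a second one. A minimal sketch, assuming the stock hdd and ssd device classes; osd.0 and pg 5.1f are placeholders, and the values are examples rather than recommendations:

    # keep HDD OSDs at a single concurrent scrub, allow two on flash
    ceph config set osd/class:hdd osd_max_scrubs 1
    ceph config set osd/class:ssd osd_max_scrubs 2

    # check what an individual OSD ends up with (osd.0 is a placeholder)
    ceph config show osd.0 | grep osd_max_scrubs

    # list the overdue PGs and trigger one deep scrub by hand (5.1f is a placeholder)
    ceph health detail | grep -E 'not (deep-)?scrubbed since'
    ceph pg deep-scrub 5.1f

A value set with a class: mask should apply only to OSDs of that device class and take precedence over the plain osd setting, so the cluster-wide default does not need to change.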