Hi Christian,

thank you for your time.

The problem is deep scrub only; we are running Jewel 10.2.2.

Thank you for your hint about running deep scrubs manually on specific
OSDs; I had not come up with that idea myself. (A sketch of how I would
schedule that is below the quoted mail.)

-----

Where did you learn about osd_scrub_sleep? Lately I have seen many
"hidden" config options mentioned on this mailing list ("hidden"
meaning everything that is not covered in the documentation at
ceph.com).

ceph.com does not know about the osd_scrub_sleep config option (except
for mentions in past release notes); a search engine finds it mainly on
GitHub or in the bug tracker.

Is there any source for a (complete) list of the available config
options, usable by normal admins? Or is it really necessary to dig
through the source code and release notes to collect that kind of
information on your own?
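For what it's worth, the closest thing to a complete list I have found
so far is asking a running daemon itself; as far as I understand, this
dumps every option the daemon knows about, including the undocumented
ones (osd.0 is just an example, any daemon with a reachable admin
socket should do):

    # every option with its current value, via the admin socket
    ceph daemon osd.0 config show

    # the full option list as the command line client parses it
    ceph --show-config

That only gives names and values though, not any explanation of what
the options actually do.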
--
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
IP-Interactive

mailto:info@xxxxxxxxxxxxxxxxx

Anschrift:

IP Interactive UG ( haftungsbeschraenkt )
Zum Sonnenberg 1-3
63571 Gelnhausen

HRB 93402 beim Amtsgericht Hanau
Geschäftsführung: Oliver Dzombic

Steuer Nr.: 35 236 3622 1
UST ID: DE274086107


On 20.10.2016 at 14:39, Christian Balzer wrote:
>
> Hello,
>
> On Thu, 20 Oct 2016 11:23:54 +0200 Oliver Dzombic wrote:
>
>> Hi,
>>
>> we have here globally:
>>
>> osd_client_op_priority = 63
>> osd_disk_thread_ioprio_class = idle
>> osd_disk_thread_ioprio_priority = 7
>> osd_max_scrubs = 1
>>
> If you google for osd_max_scrubs you will find plenty of threads, bug
> reports, etc.
>
> The most significant and beneficial impact for client I/O can be achieved
> by telling scrub to release its deadly grip on the OSDs with something like
> osd_scrub_sleep = 0.1
>
> Also, which version? Hammer IIRC?
> Jewel's unified queue should help as well, but no first-hand experience
> here.
>
>> to influence the scrubbing performance, and
>>
>> osd_scrub_begin_hour = 1
>> osd_scrub_end_hour = 7
>>
>> to influence the scrubbing time frame.
>>
>> Now, as it seems, this time frame is/was not enough, so ceph started
>> scrubbing all the time, I assume because of the age of the objects.
>>
> You may want to line things up so that OSDs/PGs are evenly spread out.
> For example, with 6 OSDs, manually initiate a deep scrub each day (at 01:00
> in your case), so that only a specific subset is doing the deep scrub conga.
>
>> And it does it with:
>>
>> 4 active+clean+scrubbing+deep
>>
>> ( instead of the configured 1 )
>>
> That's per OSD, not global; see above, google.
>
>> So now we experience a situation where the spinning drives are so
>> busy that the IO performance got too bad.
>>
>> The only reason that it's not a catastrophe is that we have a cache tier
>> in front of it, which lowers the IO needs on the spinning drives.
>>
>> Unluckily, we also have some pools going directly to the spinning drives.
>>
>> So these pools experience very bad IO performance.
>>
>> So we had to disable scrubbing during business hours ( which is not
>> really a solution ).
>>
> It is, unfortunately, for many people.
> As mentioned many times, if your cluster is having issues with deep scrubs
> during peak hours, it will also be unhappy if you lose an OSD and
> backfills happen.
> If it is unhappy with normal scrubs, you need to upgrade/expand HW
> immediately.
>
>> So any idea why
>>
>> 1. we can see 4-5 scrubs, while osd_max_scrubs = 1 is set?
>>
> See above.
>
> With BlueStore in the wings and a reduced (negated?) need for deep scrubs, I
> doubt this will see much coding effort.
>
>> 2. the impact on the spinning drives is so hard, while we lowered
>> the IO priority for it?
>>
> That has only a small impact; deep scrub by its very nature reads all
> objects and thus kills I/Os by seeking and polluting caches.
>
>
> Christian
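Regarding the manual deep scrubs: if I understood the idea correctly,
it would boil down to a small cron schedule along these lines (untested
sketch; the OSD ids, the 01:00 start and the one-OSD-per-weekday layout
are only placeholders):

    # /etc/cron.d/ceph-deep-scrub - one OSD per night, inside the scrub window
    0 1 * * 1  root  ceph osd deep-scrub 0
    0 1 * * 2  root  ceph osd deep-scrub 1
    0 1 * * 3  root  ceph osd deep-scrub 2
    0 1 * * 4  root  ceph osd deep-scrub 3
    0 1 * * 5  root  ceph osd deep-scrub 4
    0 1 * * 6  root  ceph osd deep-scrub 5

We will also try the osd_scrub_sleep suggestion, injected at runtime
first before putting it into ceph.conf, e.g.:

    ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'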