Slow requests are not exactly tied to the PG number, but we were getting slow requests whenever backfills or recoveries fired up - increasing the number of PGs helped with this, as the “blocks” of work are much smaller than before.

We have roughly the same number of OSDs as you but only one really important pool (“volumes”); we ended up with 16384 PGs for this one. The number of threads increased exponentially, some latencies went down, some went up; in the end it works just as well as before, with the added benefit of better data distribution and a better-behaved cluster.

But YMMV - once you go up you can’t go down.

Jan

> On 01 Jun 2015, at 10:57, huang jun <hjwsm1989@xxxxxxxxx> wrote:
>
> hi,jan
>
> 2015-06-01 15:43 GMT+08:00 Jan Schermer <jan@xxxxxxxxxxx>:
>> We had to disable deep scrub or the cluster would be unusable - we need to turn it back on sooner or later, though.
>> With minimal scrubbing and recovery settings, everything is mostly good. It turned out many issues we had were due to too few PGs - once we increased them from 4K to 16K everything sped up nicely (because the chunks are smaller), but during heavy activity we are still getting some “slow IOs”.
>
> How many PGs do you set? We get "slow requests" many times, but
> didn't relate it to the PG number.
> And we follow the equation below for every pool:
>
>              (OSDs * 100)
> Total PGs = --------------
>               pool size
>
> Our cluster has 157 OSDs and 3 pools; we set pg_num to 8192 for every pool,
> but OSD CPU utilization is up to 300% after restart; we think
> it's loading PGs during that period.
> We will try a different PG number when we get "slow requests".
>
> thanks!
>
>> I believe there is an ionice knob in newer versions (we still run Dumpling), and that should do the trick no matter how much additional “load” is put on the OSDs.
>> Everybody’s bottleneck will be different - we run all flash so disk IO is not a problem but an OSD daemon is - no ionice setting will help with that, it just needs to be faster ;-)
>>
>> Jan
>>
>>
>>> On 30 May 2015, at 01:17, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>>
>>> On Fri, May 29, 2015 at 2:47 PM, Samuel Just <sjust@xxxxxxxxxx> wrote:
>>>> Many people have reported that they need to lower the osd recovery config options to minimize the impact of recovery on client io. We are talking about changing the defaults as follows:
>>>>
>>>> osd_max_backfills to 1 (from 10)
>>>> osd_recovery_max_active to 3 (from 15)
>>>> osd_recovery_op_priority to 1 (from 10)
>>>> osd_recovery_max_single_start to 1 (from 5)
>>>
>>> I'm under the (possibly erroneous) impression that reducing the number
>>> of max backfills doesn't actually reduce recovery speed much (but will
>>> reduce memory use), but that dropping the op priority can. I'd rather
>>> we make users manually adjust values which can have a material impact
>>> on their data safety, even if most of them choose to do so.
>>>
>>> After all, even under our worst behavior we're still doing a lot
>>> better than a resilvering RAID array. ;)
>>> -Greg
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@xxxxxxxxxxxxxx
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
>
> --
> thanks
> huangjun
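
The PG sizing rule quoted in huang jun's message, (OSDs * 100) / pool size, can be worked through in a few lines of Python. This is only a rough sketch under stated assumptions: the thread never gives the replica count, so a pool size of 3 is assumed; rounding up to the next power of two is a common convention rather than something the thread states; and the function names are hypothetical, not part of any Ceph tool. It also treats "Total PGs" as a budget across all pools, which is one reading of the formula.

```python
def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (for n >= 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 << (n - 1).bit_length()


def suggested_total_pgs(num_osds: int, pool_size: int, pgs_per_osd: int = 100) -> int:
    """Apply (OSDs * pgs_per_osd) / pool_size, rounded up to a power of two.

    pool_size is the replica count of the pool (assumed, not given in the thread).
    """
    # Ceiling division in integers, to avoid floating-point rounding.
    raw = -(-(num_osds * pgs_per_osd) // pool_size)
    return next_power_of_two(raw)


def pg_copies_per_osd(pg_nums: list[int], pool_size: int, num_osds: int) -> float:
    """Average number of PG replicas each OSD hosts across all pools."""
    return sum(pg_nums) * pool_size / num_osds


if __name__ == "__main__":
    osds = 157          # from the thread
    size = 3            # ASSUMPTION: replica count is not stated in the thread
    print(suggested_total_pgs(osds, size))
    # Three pools at pg_num = 8192 each, as described in the thread.
    print(round(pg_copies_per_osd([8192, 8192, 8192], size, osds)))
```

With 157 OSDs and the assumed size of 3, the rule gives about 5234, which rounds up to 8192. If that figure is applied to each of three pools instead of being split among them, every OSD ends up hosting roughly 470 PG replicas on average rather than about 100, which may be related to the high OSD CPU use after restart reported above.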