> On 27 Apr 2018, at 05:27, Joseph Qi <jiangqi903@xxxxxxxxx> wrote:
>
> Hi Paolo,
>
> On 18/4/27 01:27, Paolo Valente wrote:
>>
>>> On 25 Apr 2018, at 14:13, Joseph Qi <jiangqi903@xxxxxxxxx> wrote:
>>>
>>> Hi Paolo,
>>>
>>
>> Hi Joseph,
>>
>>> ...
>>> Could you run blktrace as well when testing your case? There are several
>>> throtl traces to help analyze whether it is caused by frequent
>>> upgrades/downgrades.
>>
>> Certainly. You can find a trace attached. Unfortunately, I'm not
>> familiar with the internals of blk-throttle and the low limit, so, if you
>> want me to analyze the trace, give me some hints on what I have to
>> look for. Otherwise, I'll be happy to learn from your analysis.
>>
>
> I've taken a glance at the blktrace you attached. It only upgrades at first
> and then downgrades frequently (just adjusting the limit, not falling back to
> LIMIT_LOW). But I don't know why it always considers the throttle group
> not idle.
>
> For example:
> fio-2336 [004] d... 428.458249: 8,16 m N throtl avg_idle=90, idle_threshold=1000, bad_bio=10, total_bio=84, is_idle=0, scale=9
> fio-2336 [004] d... 428.458251: 8,16 m N throtl downgrade, scale 4
>
> In throtl_tg_is_idle():
>     is_idle = ... ||
>               (tg->latency_target && tg->bio_cnt &&
>                tg->bad_bio_cnt * 5 < tg->bio_cnt);
>
> It should be idle and allowed to run more bandwidth. But here the result
> shows not idle (is_idle=0). I have to do more investigation to figure out why.
>

Hi Joseph,

actually this doesn't surprise me much; for this scenario I expected
exactly that blk-throttle would consider the random-I/O group, for most
of the time,
1) non-idle,
2) above the 100usec target latency, and
3) below the low limit.

In fact,
1) The group can evidently issue I/O at a much higher rate than the rate
at which it is served, so, immediately after its last pending I/O has
been served, the group issues new I/O; in the end, it is non-idle most
of the time.
2) To try to enforce the 10MB/s limit, blk-throttle necessarily makes
the group oscillate around 10MB/s, which means that the group is
frequently below the limit (this would not hold only if the group had
actually received much more than 10MB/s, which is not the case).
3) For each of the group's 4k random I/Os, the time the drive needs to
serve that I/O is already around 40-50usec. So, since the group is of
course not constantly in service, it is very easy for the latency of
most of the group's I/Os to go beyond 100usec because of throttling.

But, as is often the case for me, I might have simply misunderstood the
blk-throttle parameters, and I might just be wrong here.

Thanks,
Paolo

> You can also filter these logs using:
> grep throtl trace | grep -E 'upgrade|downgrade|is_idle'
>
> Thanks,
> Joseph
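
A note on Joseph's throtl_tg_is_idle() observation above: plugging the
values from the quoted trace line into the latency-target clause gives
bad_bio * 5 = 10 * 5 = 50, which is smaller than total_bio = 84, so that
clause alone should make the group count as idle, which is exactly the
discrepancy Joseph points out. The snippet below is only a minimal
standalone sketch of that arithmetic, with the trace values hard-coded
and latency_target assumed to be the 100usec target mentioned in the
thread; it is not the kernel function, which also ORs in the
avg_idle/idle_threshold checks and reads the counters from the
throttle group structure.

    /* Standalone re-check of the latency-target clause of is_idle,
     * using the numbers from the trace line above. Values are
     * hard-coded for illustration; the kernel reads them from its
     * per-group state instead. */
    #include <stdbool.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned long latency_target = 100; /* usec; assumed from the 100usec target */
        unsigned long bio_cnt = 84;         /* total_bio in the trace line */
        unsigned long bad_bio_cnt = 10;     /* bad_bio in the trace line */

        /* Same shape as the last clause quoted from throtl_tg_is_idle() */
        bool idle_by_latency = latency_target && bio_cnt &&
                               bad_bio_cnt * 5 < bio_cnt;

        /* Prints 1: 10 * 5 = 50 < 84, so this clause evaluates true */
        printf("idle_by_latency = %d\n", idle_by_latency);
        return 0;
    }

If this clause really does evaluate true with those counters, the
reported is_idle=0 would have to come from how or when the counters are
sampled and reset, which is what Joseph says he still needs to
investigate.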