Re: very inaccurate %util of iostat

On Mon, Mar 23 2020 at 11:19pm -0400,
Ming Lei <ming.lei@xxxxxxxxxx> wrote:

> Hi Guys,
> 
> Commit 5b18b5a73760 ("block: delete part_round_stats and switch to less precise counting")
> significantly changes how 'io_ticks' is calculated.
> 
> In theory, io_ticks counts the time during which any IO is in-flight or
> in-queue, so it has to rely on in-flight IO counting.
> 
> However, commit 5b18b5a73760 changes the io_ticks accounting to the
> following:
> 
> 	stamp = READ_ONCE(part->stamp);
> 	if (unlikely(stamp != now)) {
> 		if (likely(cmpxchg(&part->stamp, stamp, now) == stamp))
> 			__part_stat_add(part, io_ticks, 1);
> 	}
> 
> So this approach doesn't use any in-flight IO information; it simply adds 1
> whenever the stamp differs from the previous stamp, regardless of whether
> any IO is in flight.
> 
> Now, under very heavy IO, %util is still much lower than 100%, especially
> on HDDs. The likely reason is that IO latency can be much longer than one
> jiffy (1ms at HZ=1000), so the above calculation is very inaccurate.
> 
> Another extreme example: if IOs take a long time to complete, such as
> during an IO stall, %util may show 0% utilization instead of 100%.

Hi Ming,

Your email triggered a memory of someone else (Konstantin Khlebnikov)
having reported and fixed this relatively recently. Please see this
patchset: https://lkml.org/lkml/2020/3/2/336

Obviously this needs fixing.  If you have time to review/polish the
proposed patches, that'd be great.

Mike



