Re: [PATCH v1 1/4] io_uring: only account cqring wait time as iowait if enabled for a ring

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/24/24 8:31 AM, Pavel Begunkov wrote:
> On 2/24/24 05:07, David Wei wrote:
>> Currently we unconditionally account time spent waiting for events in CQ
>> ring as iowait time.
>>
>> Some userspace tools consider iowait time to be CPU util/load which can
>> be misleading as the process is sleeping. High iowait time might be
>> indicative of issues for storage IO, but for network IO e.g. socket
>> recv() we do not control when the completions happen so its value
>> misleads userspace tooling.
>>
>> This patch gates the previously unconditional iowait accounting behind a
>> new IORING_REGISTER opcode. By default time is not accounted as iowait,
>> unless this is explicitly enabled for a ring. Thus userspace can decide,
>> depending on the type of work it expects to do, whether it wants to
>> consider cqring wait time as iowait or not.
> 
> I don't believe it's a sane approach. I think we agree that per
> cpu iowait is a silly and misleading metric. I have hard time to
> define what it is, and I'm sure most probably people complaining
> wouldn't be able to tell as well. Now we're taking that metric
> and expose even more knobs to userspace.

For sure, it's a stupid metric. But at the same time, educating people
on this can be like talking to a brick wall, and it'll be years of doing
that before we're making a dent in it. Hence I do think that just
exposing the knob and letting the storage side use it, if they want, is
the path of least resistance. I'm personally not going to do a crusade
on iowait to eliminate it, I don't have the time for that. I'll educate
people when it comes up, like I have been doing, but pulling this to
conclusion would be 10+ years easily.

> Another argument against is that per ctx is not the right place
> to have it. It's a system metric, and you can imagine some system
> admin looking for it. Even in cases when had some meaning w/o
> io_uring now without looking at what flags io_uring has it's
> completely meaningless, and it's too much to ask.
> 
> I don't understand why people freak out at seeing hi iowait,
> IMHO it perfectly fits the definition of io_uring waiting for
> IO / completions, but at this point it might be better to just
> revert it to the old behaviour of not reporting iowait at all.
> And if we want to save the cpu freq iowait optimisation, we
> should just split notion of iowait reporting and iowait cpufreq
> tuning.

For io_uring, splitting the cpufreq from iowait is certainly the right
approach. And then just getting rid of iowait completely on the io_uring
side. This can be done without preaching about iowait to everyone that
has bad metrics for their healt monitoring, which is why I like that a
lot. I did ponder that the other day as well.

You still kind of run into a problem with that in terms of when short vs
long waits are expected. On the io_uring side, we use the "do I have
any requests pending" for that, which is obviously not fine grained
enough. We could apply it on just "do I have any requests against
regular files" instead, which would then translate to needing further
tracking on our side. Probably fine to just apply it for the existing
logic, imho.

-- 
Jens Axboe





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux