On 10/24/24 18:31, Jens Axboe wrote:
> On Sat, Oct 12, 2024 at 3:30 AM Ruyi Zhang <ruyi.zhang@xxxxxxxxxxx> wrote:
> ...
>>> I don't think there is any difference, it'd be a matter of
>>> doubling the number of in-flight timeouts to achieve the same
>>> timings. Tell me, do you really have a good case where you
>>> need that (pretty verbose)? Why not drgn / bpftrace it out
>>> of the kernel instead?
>>
>> Of course, this information is available through existing tools.
>> But I think most of the io_uring metadata is already exported
>> through the fdinfo file, and the purpose of adding the timeout
>> information is the same as before: ease of use. This way, I
>> don't have to write additional scripts to get all kinds of data.
>> And as far as I know, io_uring_show_fdinfo() is only called
>> when the user actually reads the /proc/xxx/fdinfo/x file. I
>> don't think we normally need to look at this file often; we only
>> look at it when the program misbehaves, and the timeout_list is
>> very long only in extreme cases, so I think the performance
>> impact of adding this code is limited.
> I do think it's useful, sometimes the only thing you have to poke at
> after-the-fact is the fdinfo information. At the same time, would it be
If you have an fd to print fdinfo for, you can just as well run drgn
or any other debugging tool. We keep pushing into the kernel more
debugging code that could be extracted with bpf and other tools, and
not only does it bloat the code, it can potentially cripple the
entire kernel.
> more useful to dump _some_ of the info, even if we can't get all of it?
> Would not be too hard to just stop dumping if need_resched() is set, and
Waiting for need_resched() takes an eternity in the eyes of hard irqs;
that is surely one way to make the system unusable. Will we even get
the request for rescheduling, considering that irqs are off and
therefore timers can't run?
> even note that - you can always retry, as this info is generally grabbed
> from the console anyway, not programmatically. That avoids the worst
> possible scenario, which is a malicious setup with a shit ton of pending
> timers, while still allowing it to be useful for a normal setup. And
> this patch could just do that, rather than attempt to re-architect how
> the timers are tracked and which locking it uses.
Or it can be done with one of the existing tools built specifically
for that purpose, which need no custom handling in the kernel, and
which users can run right away instead of waiting for the patch to
land in their kernel.
--
Pavel Begunkov