On Sat, Oct 12, 2024 at 3:30?AM Ruyi Zhang <ruyi.zhang@xxxxxxxxxxx> wrote: > > --- > On 2024-10-10 15:35 Pavel Begunkov wrote: > >> Two questions: > >> > >> 1. I agree with you, we shouldn't walk a potentially very > >> long list under spinlock. but i can't find any other way > >> to get all the timeout > > > If only it's just under the spin, but with disabled irqs... > > >> information than to walk the timeout_list. Do you have any > >> good ideas? > > > In the long run it'd be great to replace the spinlock > > with a mutex, i.e. just ->uring_lock, but that would might be > > a bit involving as need to move handling to the task context. > > Yes, it makes more sense to replace spin_lock, but that would > require other related logic to be modified, and I don't think > it's wise to do that for the sake of a piece of debugging > information. > > >> 2. I also agree seq_printf heavier, if we use > >> seq_put_decimal_ull and seq_puts to concatenate strings, > >> I haven't tested whether it's more efficient or not, but > >> the code is certainly not as readable as the former. It's > >> also possible that I don't fully understand what you mean > >> and want to hear your opinion. > > > I don't think there is any difference, it'd be a matter of > > doubling the number of in flight timeouts to achieve same > > timings. Tell me, do you really have a good case where you > > need that (pretty verbose)? Why not drgn / bpftrace it out > > of the kernel instead? > > Of course, this information is available through existing tools. > But I think that most of the io_uring metadata has been exported > from the fdinfo file, and the purpose of adding the timeout > information is the same as before, easier to use. This way, > I don't have to write additional scripts to get all kinds of data. > > And as far as I know, the io_uring_show_fdinfo function is > only called once when the user is viewing the > /proc/xxx/fdinfo/x file once. I don't think we normally need to > look at this file as often, and only look at it when the program > is abnormal, and the timeout_list is very long in the extreme case, > so I think the performance impact of adding this code is limited. I do think it's useful, sometimes the only thing you have to poke at after-the-fact is the fdinfo information. At the same time, would it be more useful to dump _some_ of the info, even if we can't get all of it? Would not be too hard to just stop dumping if need_resched() is set, and even note that - you can always retry, as this info is generally grabbed from the console anyway, not programmatically. That avoids the worst possible scenario, which is a malicious setup with a shit ton of pending timers, while still allowing it to be useful for a normal setup. And this patch could just do that, rather than attempt to re-architect how the timers are tracked and which locking it uses. -- Jens Axboe