On Tue, 2021-08-10 at 17:48 -0400, Tony Battersby wrote:
> 
> I just ran into this problem also - coredumps from an io_uring program
> to a pipe are truncated.  But I am using kernel 5.10.57, which does NOT
> have commit 12db8b690010 ("entry: Add support for TIF_NOTIFY_SIGNAL") or
> commit 06af8679449d ("coredump: Limit what can interrupt coredumps").
> Kernel 5.4 works though, so I bisected the problem to commit
> f38c7e3abfba ("io_uring: ensure async buffered read-retry is setup
> properly") in kernel 5.9.  Note that my io_uring program uses only async
> buffered reads, which may be why this particular commit makes a
> difference to my program.
> 
> My io_uring program is a multi-purpose long-running program with many
> threads.  Most threads don't use io_uring but a few of them do. 
> Normally, my core dumps are piped to a program so that they can be
> compressed before being written to disk, but I can also test writing the
> core dumps directly to disk.  This is what I have found:
> 
> *) Unpatched 5.10.57: if a thread that doesn't use io_uring triggers a
> coredump, the core file is written correctly, whether it is written to
> disk or piped to a program, even if another thread is using io_uring at
> the same time.
> 
> *) Unpatched 5.10.57: if a thread that uses io_uring triggers a
> coredump, the core file is truncated, whether written directly to disk
> or piped to a program.
> 
> *) 5.10.57+backport 06af8679449d: if a thread that uses io_uring
> triggers a coredump, and the core is written directly to disk, then it
> is written correctly.
> 
> *) 5.10.57+backport 06af8679449d: if a thread that uses io_uring
> triggers a coredump, and the core is piped to a program, then it is
> truncated.
> 
> *) 5.10.57+revert f38c7e3abfba: core dumps are written correctly,
> whether written directly to disk or piped to a program.
> 
> Tony Battersby
> Cybernetics
> 

Tony, these are super interesting details.
I'm leaving for a few days, so I will not be able to look into it until
I am back, but here is my interpretation of your findings:

f38c7e3abfba makes it more likely that your task ends up in an fd read
wait queue. Previously, the io_uring request queuing was failing and
returning EAGAIN. Now it ends up using io_uring fast poll.

When the core dump gets written through a pipe, pipe_write must block
waiting for some event. If the task gets woken up by the io_uring wait
queue entry instead, that must somehow make pipe_write fail.

So the problem must be a mix of TIF_NOTIFY_SIGNAL and the fact that
io_uring wait queue entries aren't cleaned up while doing the core dump.

I have a new modification to try out. I'll hopefully be able to submit
a patch to fix that once I come back (I cannot do it now, or else I'll
never leave ;-))
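
To make the suspected sequence a bit more concrete, here is a toy
user-space model in Python. It is only a sketch of the hypothesis above,
not kernel code: Task, Pipe, pipe_write(), dump_core() and the
"io_uring_poll" / "reader" event names are all made up for illustration,
and the assumption that a pending notification makes the blocked write
return short is exactly the part that still needs to be confirmed.

```python
import itertools
from dataclasses import dataclass

CHUNK = 4096  # size of each write issued by the (modelled) dumper


@dataclass
class Task:
    # Hypothetical stand-in for "a notification/signal is pending".
    notify_pending: bool = False


@dataclass
class Pipe:
    capacity: int
    buffered: int = 0

    def drain(self):
        # The reader at the other end consumes everything buffered.
        self.buffered = 0


def pipe_write(task, pipe, nbytes, wake_events):
    """Write nbytes, blocking when the pipe is full.

    Returns the number of bytes written. The write stops early if the
    task is woken up with a notification pending (assumed behaviour).
    """
    written = 0
    while written < nbytes:
        room = pipe.capacity - pipe.buffered
        if room == 0:
            # The pipe is full: "sleep" until the next wake-up arrives.
            event = next(wake_events)
            if event == "io_uring_poll":
                # Stale io_uring wait queue entry fired during the dump.
                task.notify_pending = True
            else:
                pipe.drain()
            if task.notify_pending:
                break
            continue
        n = min(room, nbytes - written)
        pipe.buffered += n
        written += n
    return written


def dump_core(task, pipe, core_size, wake_events):
    """Emit the core in CHUNK-sized writes; give up on a short write."""
    emitted = 0
    while emitted < core_size:
        want = min(CHUNK, core_size - emitted)
        n = pipe_write(task, pipe, want, wake_events)
        emitted += n
        if n != want:
            break  # short write: the dump stops and the core is truncated
    return emitted


# Only the pipe reader ever wakes the dumper: the whole core gets out.
ok = dump_core(Task(), Pipe(8192), 40960, itertools.repeat("reader"))

# One stray io_uring wake-up while blocked on a full pipe: truncated core.
events = itertools.chain(["reader", "io_uring_poll"],
                         itertools.repeat("reader"))
bad = dump_core(Task(), Pipe(8192), 40960, events)
```

In this model a dump that never has to sleep on a full pipe is never
exposed to the stray wake-up, which would be consistent with the
pipe-only truncation seen with 06af8679449d backported, but again this
is a guess until I can test it.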