On Wed, 2021-06-16 at 15:00 -0500, Eric W. Biederman wrote:
> Olivier Langlois <olivier@xxxxxxxxxxxxxx> writes:
>
> > I redid my test but this time instead of dumping directly into a
> > file, I did let the coredump be piped to the systemd coredump
> > module and the coredump generation isn't working as expected when
> > piping.
> >
> > So your code review conclusions are correct.
>
> Thank you for confirming that.
>
> Do you know how your test program is using io_uring?
>
> I have been trying to put the pieces together on what io_uring is
> doing that stops the coredump.  The fact that it takes a little
> while before it kills the coredump is a little puzzling.  The code
> looks like all of the io_uring operations should have been canceled
> before the coredump starts.
>

With a very simple setup, I guess that this could easily be
reproduced. Make a TCP connection to a server that streams data
non-stop and enter a loop where you keep initiating async
IORING_OP_READ operations on your TCP fd.

Once you have that, manually sending a SIGSEGV is a surefire way to
stumble into the problem. This is how I am testing the patches.

IRL, it is possible to call io_uring_enter() to submit operations and
return from the syscall without waiting for all of them to complete.

Once the process is back in userspace, if it stumbles into a bug that
triggers a coredump, any remaining pending I/O operations can set
TIF_NOTIFY_SIGNAL while the coredump is being generated.

I have read the part of your previous email where you share the
results of your ongoing investigation. I didn't comment because the
definitive references on io_uring matters are Jens and Pavel, but I am
going to share my opinion on the matter anyway.

I think that you did put your finger on the code that cleans up the
pending operations of an io_uring instance. It seems to be
io_uring_release(), which is probably called from exit_files(), which
happens to run after the call to exit_mm().

At first, I entertained the idea of duplicating some of the operations
performed by io_uring_release() that relate to the infamous
TIF_NOTIFY_SIGNAL setting in io_uring_files_cancel(), which is called
before exit_mm().

But the idea is useless, as it is not the other threads of the group
that are causing the TIF_NOTIFY_SIGNAL problem. It is the thread
calling do_coredump(), which is invoked by the signal handling code
even before that thread enters do_exit() and starts to be cleaned up.
When that thread enters do_coredump(), it is still fully loaded and
operational in terms of io_uring functionality.

I guess that this "cancel all pending io_uring operations" hook would
have to be called from do_coredump() or from get_signal(), but if that
is the way to go, I feel that the change is major enough that I
wouldn't dare go there without the blessing of the maintainers
concerned....
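
For illustration, the reproduction scenario described earlier in this
message (queue async reads on a streaming TCP fd, return to userspace
without waiting, then take a SIGSEGV) could be sketched roughly as
below. This is only a sketch under assumptions, not the actual test
program: it uses the raw io_uring syscalls rather than liburing, the
names struct sq_ring, ring_init(), prep_read(), submit_read() and
reproduce() are hypothetical helpers, sock_fd is assumed to be an
already-connected TCP socket, and raise() stands in for sending the
signal manually. Whether this exact variant hits the truncated
coredump has not been verified.

```c
#define _GNU_SOURCE
#include <linux/io_uring.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper type: just enough of the mapped SQ to submit. */
struct sq_ring {
    int fd;                        /* io_uring instance fd */
    unsigned *tail, *mask, *array; /* pointers into the SQ ring mapping */
    struct io_uring_sqe *sqes;     /* submission queue entries */
};

/* Create a ring and map its submission side. Returns 0 on success. */
int ring_init(struct sq_ring *r, unsigned entries)
{
    struct io_uring_params p;
    memset(&p, 0, sizeof(p));
    r->fd = (int)syscall(__NR_io_uring_setup, entries, &p);
    if (r->fd < 0)
        return -1;

    size_t sq_sz = p.sq_off.array + p.sq_entries * sizeof(unsigned);
    char *sq = mmap(NULL, sq_sz, PROT_READ | PROT_WRITE, MAP_SHARED,
                    r->fd, IORING_OFF_SQ_RING);
    if (sq == MAP_FAILED)
        return -1;
    r->sqes = mmap(NULL, p.sq_entries * sizeof(struct io_uring_sqe),
                   PROT_READ | PROT_WRITE, MAP_SHARED,
                   r->fd, IORING_OFF_SQES);
    if (r->sqes == MAP_FAILED)
        return -1;

    r->tail  = (unsigned *)(sq + p.sq_off.tail);
    r->mask  = (unsigned *)(sq + p.sq_off.ring_mask);
    r->array = (unsigned *)(sq + p.sq_off.array);
    return 0;
}

/* Fill one SQE describing an async read of len bytes from fd into buf. */
void prep_read(struct io_uring_sqe *sqe, int fd, void *buf, unsigned len)
{
    memset(sqe, 0, sizeof(*sqe));
    sqe->opcode = IORING_OP_READ;
    sqe->fd = fd;
    sqe->addr = (uint64_t)(uintptr_t)buf;
    sqe->len = len;
}

/* Queue one read and enter the kernel without waiting for completion. */
int submit_read(struct sq_ring *r, int fd, void *buf, unsigned len)
{
    unsigned tail = *r->tail;
    unsigned idx = tail & *r->mask;

    prep_read(&r->sqes[idx], fd, buf, len);
    r->array[idx] = idx;
    __atomic_store_n(r->tail, tail + 1, __ATOMIC_RELEASE);

    /* to_submit = 1, min_complete = 0: returns as soon as it is queued */
    return (int)syscall(__NR_io_uring_enter, r->fd, 1, 0, 0, NULL, 0);
}

/* Leave reads in flight on sock_fd, then crash to request a coredump. */
void reproduce(int sock_fd)
{
    static char bufs[8][4096];
    struct sq_ring r;

    if (ring_init(&r, 8) < 0)
        return;
    for (int i = 0; i < 8; i++)
        submit_read(&r, sock_fd, bufs[i], sizeof(bufs[i]));
    raise(SIGSEGV); /* default action: terminate and dump core */
}
```

To try it, call reproduce() with a socket connected to a server that
keeps sending data, with core_pattern set to pipe to a handler, and
compare the resulting coredump against one written directly to a
file. Passing min_complete = 0 to io_uring_enter() is the detail that
lets the process get back to userspace with reads still pending.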