Apologies, I have sent the wrong mail. Here is the mail I really wanted to send, with answers to some of the questions Paolo raised the last time I sent it. -----------------------------------8<------------------------------ >From 566bb198546423c024cdebc50d0aade7ed638a40 Mon Sep 17 00:00:00 2001 From: Sascha Hauer <s.hauer@xxxxxxxxxxxxxx> Date: Mon, 23 Oct 2023 14:13:46 +0200 Subject: [PATCH v2] net: Do not break out of sk_stream_wait_memory() with TIF_NOTIFY_SIGNAL It can happen that a socket sends the remaining data at close() time. With io_uring and KTLS it can happen that sk_stream_wait_memory() bails out with -512 (-ERESTARTSYS) because TIF_NOTIFY_SIGNAL is set for the current task. This flag has been set in io_req_normal_work_add() by calling task_work_add(). It seems signal_pending() is too broad, so this patch replaces it with task_sigpending(), thus ignoring the TIF_NOTIFY_SIGNAL flag. A discussion of this issue can be found at https://lore.kernel.org/20231010141932.GD3114228@xxxxxxxxxxxxxx Suggested-by: Jens Axboe <axboe@xxxxxxxxx> Fixes: 12db8b690010c ("entry: Add support for TIF_NOTIFY_SIGNAL") Link: https://lore.kernel.org/r/20231023121346.4098160-1-s.hauer@xxxxxxxxxxxxxx Signed-off-by: Sascha Hauer <s.hauer@xxxxxxxxxxxxxx> --- Changes since v1: - only replace signal_pending() with task_sigpending() where we need it, in sk_stream_wait_memory() I'd like to pick up the discussion on this patch as it is still needed for our usecase. Paolo Abeni raised some concerns about this patch for which I didn't have good answers. I am referencing them here again with an attempts to answer them. Jens, maybe you also have a few words here. Paolo raised some concerns in https://lore.kernel.org/all/e1e15554bfa5cfc8048d6074eedbc83c4d912c98.camel@xxxxxxxxxx/: > To be more explicit: why this will not cause user-space driven > connect() from missing relevant events? Note I dropped the hunk in sk_stream_wait_connect() and sk_stream_wait_close() in this version. Userspace driven signals are still catched with task_sigpending() which tests for TIF_SIGPENDING. signal_pending() will additionally check for TIF_NOTIFY_SIGNAL which is exclusively used by task_work_add() to add work to a task. > Why this is needed only here and not in all the many others event loops waiting > for signals? There might be other cases too, but this one particularly bites us. I guess in other places the work can be picked up again after being interrupted by a signal, but here it can't. > Why can't the issue be addressed in any place more closely tied to the > scenario, e.g. io_uring or ktls? At least I don't have an idea how this could be done. All components seem to do the right thing on their own, but the pieces do not seem to fit together for this particular case. Jens, please correct me if I am wrong or if you have something to add, this whole thing is outside my comfort zone, but I'd be very happy if we could solve this issue somehow. Thanks, Sascha net/core/stream.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/core/stream.c b/net/core/stream.c index b16dfa568a2d5..55d90251205a6 100644 --- a/net/core/stream.c +++ b/net/core/stream.c @@ -134,7 +134,7 @@ int sk_stream_wait_memory(struct sock *sk, long *timeo_p) goto do_error; if (!*timeo_p) goto do_eagain; - if (signal_pending(current)) + if (task_sigpending(current)) goto do_interrupted; sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk); if (sk_stream_memory_free(sk) && !vm_wait) -- 2.39.2 -- Pengutronix e.K. | | Steuerwalder Str. 21 | http://www.pengutronix.de/ | 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 | Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |