On 3/15/24 10:01, Sascha Hauer wrote:
It can happen that a socket sends the remaining data at close() time.
With io_uring and KTLS it can happen that sk_stream_wait_memory() bails
out with -512 (-ERESTARTSYS) because TIF_NOTIFY_SIGNAL is set for the
current task. This flag has been set in io_req_normal_work_add() by
calling task_work_add().
The whole point of task_work is to interrupt syscalls and let io_uring
do its job; otherwise io_uring wouldn't free resources it might be
holding, and could even block the syscall forever.
I'm not sure about connect / close (are they not restartable?), but
it doesn't seem like a good idea for sk_stream_wait_memory(), which is
the normal TCP blocking send path. I'm thinking of cases such as a
local TCP socket pair where the tx queue is full, the rx queue of the
other end is full as well, and io_uring has to run to receive the
data.
If interruptions are not welcome, you can use different io_uring flags;
see IORING_SETUP_COOP_TASKRUN and/or IORING_SETUP_DEFER_TASKRUN.
Maybe I'm missing something, but why not restart your syscall?
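
For illustration only, here is a minimal sketch of how an application
might choose between the two setup flags named above. The helper
taskrun_setup_flags() is hypothetical. The fallback defines mirror the
values in include/uapi/linux/io_uring.h and only take effect if the
installed headers lack them. As far as I know, IORING_SETUP_DEFER_TASKRUN
must be combined with IORING_SETUP_SINGLE_ISSUER, so the sketch sets
both.

```c
#include <stdbool.h>

/* Values as in include/uapi/linux/io_uring.h; defined here only if the
 * installed headers do not already provide them. */
#ifndef IORING_SETUP_COOP_TASKRUN
#define IORING_SETUP_COOP_TASKRUN	(1U << 8)
#endif
#ifndef IORING_SETUP_SINGLE_ISSUER
#define IORING_SETUP_SINGLE_ISSUER	(1U << 12)
#endif
#ifndef IORING_SETUP_DEFER_TASKRUN
#define IORING_SETUP_DEFER_TASKRUN	(1U << 13)
#endif

/*
 * Hypothetical helper: return the ring setup flags to request.
 * defer == true  -> task work is deferred until the application waits
 *                   for completions (needs a single submitting task).
 * defer == false -> cooperative task running only.
 */
static unsigned int taskrun_setup_flags(bool defer)
{
	if (defer)
		return IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN;
	return IORING_SETUP_COOP_TASKRUN;
}
```

The result would be passed as the flags argument when creating the
ring, e.g. to liburing's io_uring_queue_init(). Whether either flag is
acceptable depends on how the application submits and reaps requests.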
This patch replaces signal_pending() with task_sigpending(), thus ignoring
the TIF_NOTIFY_SIGNAL flag.
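
To make the difference between the two checks concrete, here is a
small userspace model. It is a sketch, not kernel code: the struct,
the flag values and the model_ prefixed names are made up for
illustration. It only mirrors the relationship that signal_pending()
also reports TIF_NOTIFY_SIGNAL, while task_sigpending() looks at
TIF_SIGPENDING alone.

```c
#include <stdbool.h>

/* Toy stand-ins for the kernel's thread flags (values are arbitrary). */
enum {
	MODEL_TIF_SIGPENDING	= 1u << 0,
	MODEL_TIF_NOTIFY_SIGNAL	= 1u << 1,
};

/* Hypothetical, minimal replacement for struct task_struct. */
struct model_task {
	unsigned int flags;
};

/* Models task_sigpending(): true only for a real pending signal. */
static bool model_task_sigpending(const struct model_task *t)
{
	return (t->flags & MODEL_TIF_SIGPENDING) != 0;
}

/* Models signal_pending(): also true when task_work was queued with
 * signal-style notification, i.e. TIF_NOTIFY_SIGNAL is set. */
static bool model_signal_pending(const struct model_task *t)
{
	if (t->flags & MODEL_TIF_NOTIFY_SIGNAL)
		return true;
	return model_task_sigpending(t);
}
```

With only MODEL_TIF_NOTIFY_SIGNAL set, model_signal_pending() returns
true and model_task_sigpending() returns false, which is the case the
wait loops in this patch would stop breaking out of.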
A discussion of this issue can be found at
https://lore.kernel.org/20231010141932.GD3114228@xxxxxxxxxxxxxx
Suggested-by: Jens Axboe <axboe@xxxxxxxxx>
Signed-off-by: Sascha Hauer <s.hauer@xxxxxxxxxxxxxx>
---
net/core/stream.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/core/stream.c b/net/core/stream.c
index 96fbcb9bbb30a..e9e17b48e0122 100644
--- a/net/core/stream.c
+++ b/net/core/stream.c
@@ -67,7 +67,7 @@ int sk_stream_wait_connect(struct sock *sk, long *timeo_p)
return -EPIPE;
if (!*timeo_p)
return -EAGAIN;
- if (signal_pending(tsk))
+ if (task_sigpending(tsk))
return sock_intr_errno(*timeo_p);
add_wait_queue(sk_sleep(sk), &wait);
@@ -103,7 +103,7 @@ void sk_stream_wait_close(struct sock *sk, long timeout)
do {
if (sk_wait_event(sk, &timeout, !sk_stream_closing(sk), &wait))
break;
- } while (!signal_pending(current) && timeout);
+ } while (!task_sigpending(current) && timeout);
remove_wait_queue(sk_sleep(sk), &wait);
}
@@ -134,7 +134,7 @@ int sk_stream_wait_memory(struct sock *sk, long *timeo_p)
goto do_error;
if (!*timeo_p)
goto do_eagain;
- if (signal_pending(current))
+ if (task_sigpending(current))
goto do_interrupted;
sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
if (sk_stream_memory_free(sk) && !vm_wait)
--
Pavel Begunkov