Re: signals not reliably interrupting io_uring_enter anymore

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi, 

On July 3, 2020 5:48:21 PM PDT, Jens Axboe <axboe@xxxxxxxxx> wrote:
>On 7/3/20 6:15 PM, Andres Freund wrote:
>> Hi,
>> 
>> On 2020-07-03 17:00:49 -0700, Andres Freund wrote:
>>> I haven't yet fully analyzed the problem, but after updating to
>>> cdd3bb54332f82295ed90cd0c09c78cd0c0ee822 io_uring using postgres
>does
>>> not work reliably anymore.
>>>
>>> The symptom is that io_uring_enter(IORING_ENTER_GETEVENTS) isn't
>>> interrupted by signals anymore. The signal handler executes, but
>>> afterwards the syscall is restarted. Previously io_uring_enter
>reliably
>>> returned EINTR in that case.
>>>
>>> Currently postgres relies on signals interrupting io_uring_enter().
>We
>>> probably can find a way to not do so, but it'd not be entirely
>trivial.
>>>
>>> I suspect the issue is
>>>
>>> commit ce593a6c480a22acba08795be313c0c6d49dd35d (tag:
>io_uring-5.8-2020-07-01, linux-block/io_uring-5.8)
>>> Author: Jens Axboe <axboe@xxxxxxxxx>
>>> Date:   2020-06-30 12:39:05 -0600
>>>
>>>     io_uring: use signal based task_work running
>>>
>>> as that appears to have changed the error returned by
>>> io_uring_enter(GETEVENTS) after having been interrupted by a signal
>from
>>> EINTR to ERESTARTSYS.
>>>
>>>
>>> I'll check to make sure that the issue doesn't exist before the
>above
>>> commit.
>> 
>> Indeed, on cd77006e01b3198c75fb7819b3d0ff89709539bb the PG issue
>doesn't
>> exist, which pretty much confirms that the above commit is the issue.
>> 
>> What was the reason for changing EINTR to ERESTARTSYS in the above
>> commit? I assume trying to avoid returning spurious EINTRs to
>userland?
>
>Yeah, for when it's running task_work. I wonder if something like the
>below will do the trick?
>
>
>diff --git a/fs/io_uring.c b/fs/io_uring.c
>index 700644a016a7..0efa73d78451 100644
>--- a/fs/io_uring.c
>+++ b/fs/io_uring.c
>@@ -6197,11 +6197,11 @@ static int io_cqring_wait(struct io_ring_ctx
>*ctx, int min_events,
> 	do {
> 		prepare_to_wait_exclusive(&ctx->wait, &iowq.wq,
> 						TASK_INTERRUPTIBLE);
>-		/* make sure we run task_work before checking for signals */
>-		if (current->task_works)
>-			task_work_run();
> 		if (signal_pending(current)) {
>-			ret = -ERESTARTSYS;
>+			if (current->task_works)
>+				ret = -ERESTARTSYS;
>+			else
>+				ret = -EINTR;
> 			break;
> 		}
> 		if (io_should_wake(&iowq, false))
>@@ -6210,7 +6210,7 @@ static int io_cqring_wait(struct io_ring_ctx
>*ctx, int min_events,
> 	} while (1);
> 	finish_wait(&ctx->wait, &iowq.wq);
> 
>-	restore_saved_sigmask_unless(ret == -ERESTARTSYS);
>+	restore_saved_sigmask_unless(ret == -EINTR);
> 
>	return READ_ONCE(rings->cq.head) == READ_ONCE(rings->cq.tail) ? ret :
>0;
> }

I'll try in a bit. Suspect however that there'd be trouble if there were both an actual signal and task work pending?

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux