Re: bug with fastpoll accept and sqpoll + IOSQE_FIXED_FILE

On 2/3/21 4:49 AM, Pavel Begunkov wrote:
> On 02/02/2021 20:56, Pavel Begunkov wrote:
>> On 02/02/2021 20:48, Jens Axboe wrote:
>>> On 2/2/21 1:34 PM, Pavel Begunkov wrote:
>>>> On 02/02/2021 17:41, Pavel Begunkov wrote:
>>>>> On 02/02/2021 17:24, Jens Axboe wrote:
>>>>>> On 2/2/21 10:10 AM, Victor Stewart wrote:
>>>>>>>> Can you send the updated test app?
>>>>>>>
>>>>>>> https://gist.github.com/victorstewart/98814b65ed702c33480487c05b40eb56
>>>>>>>
>>>>>>> same link i just updated the same gist
>>>>>>
>>>>>> And how are you running it?
>>>>>
>>>>> with SQPOLL    with    FIXED FLAG -> FAILURE: failed with error = ???
>>>>> 	-> io_uring_wait_cqe_timeout() strangely returns -1, (-EPERM??)
>>>>
>>>> Ok, _io_uring_get_cqe() is just screwed twice
>>>>
>>>> TL;DR
>>>> we enter into it with submit=0, do an iteration, which decrements it,
>>>> then a second iteration passes submit=-1, which is returned back by
>>>> the kernel as a result and propagated back from liburing...
>>>
>>> Yep, that's what I came up with too. We really just need a clear way
>>> of knowing when to break out, and when to keep going. Eg if we've
>>> done a loop and don't end up calling the system call, then there's
>>> no point in continuing.
>>
>> We can bodge something up (and forget about it), and do much cleaner
>> for IORING_FEAT_EXT_ARG, because we don't have LIBURING_UDATA_TIMEOUT
>> reqs for it and so can remove peek and so on.
> 
> This version looks reasonably simple, and even passes tests and all
> issues found by Victor's test. Didn't test it yet, but should behave
> similarly in regard of internal timeouts (pre IORING_FEAT_EXT_ARG).
> 
> static int _io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_ptr,
> 			     struct get_data *data)
> {
> 	struct io_uring_cqe *cqe = NULL;
> 	int ret = 0, err;
> 
> 	do {
> 		unsigned flags = 0;
> 		unsigned nr_available;
> 		bool enter = false;
> 
> 		err = __io_uring_peek_cqe(ring, &cqe, &nr_available);
> 		if (err)
> 			break;
> 
> 		/* IOPOLL won't proceed when there're not reaped CQEs */
> 		if (cqe && (ring->flags & IORING_SETUP_IOPOLL))
> 			data->wait_nr = 0;
> 
> 		if (data->wait_nr > nr_available || cq_ring_needs_flush(ring)) {
> 			flags = IORING_ENTER_GETEVENTS | data->get_flags;
> 			enter = true;
> 		}
> 		if (data->submit) {
> 			sq_ring_needs_enter(ring, &flags);
> 			enter = true;
> 		}
> 		if (!enter)
> 			break;
> 
> 		ret = __sys_io_uring_enter2(ring->ring_fd, data->submit,
> 					    data->wait_nr, flags, data->arg,
> 					    data->sz);
> 		if (ret < 0) {
> 			err = -errno;
> 			break;
> 		}
> 		data->submit -= ret;
> 	} while (1);
> 
> 	*cqe_ptr = cqe;
> 	return err;
> }

So here's my take on this: any rewrite of _io_uring_get_cqe() is going
to end up adding special cases; that's unfortunately just the nature of
the game. And since we're going to be doing a new liburing release very
shortly, this isn't a great time to rewrite it. A rewrite will certainly
introduce more bugs than it solves, and hence regressions, no matter how
careful we are.

Hence my suggestion is to just patch this in a trivial fashion, even if
it doesn't make the function any prettier. That's safer for a release,
and we can rework the function afterwards.

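With that in mind, here's my suggestion. The premise is that if we go
through an iteration of the loop without calling io_uring_enter(), then
there's no point in continuing. The patch initializes ret to -2 on each
iteration; since a failed enter comes back as -1 with errno set, ret
still being -2 means the call was skipped, and we break. That's the
trivial fix.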


diff --git a/src/queue.c b/src/queue.c
index 94f791e..4161aa7 100644
--- a/src/queue.c
+++ b/src/queue.c
@@ -89,12 +89,13 @@ static int _io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_pt
 {
 	struct io_uring_cqe *cqe = NULL;
 	const int to_wait = data->wait_nr;
-	int ret = 0, err;
+	int err;
 
 	do {
 		bool cq_overflow_flush = false;
 		unsigned flags = 0;
 		unsigned nr_available;
+		int ret = -2;
 
 		err = __io_uring_peek_cqe(ring, &cqe, &nr_available);
 		if (err)
@@ -117,7 +118,9 @@ static int _io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_pt
 			ret = __sys_io_uring_enter2(ring->ring_fd, data->submit,
 					data->wait_nr, flags, data->arg,
 					data->sz);
-		if (ret < 0) {
+		if (ret == -2) {
+			break;
+		} else if (ret < 0) {
 			err = -errno;
 		} else if (ret == (int)data->submit) {
 			data->submit = 0;
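
To see the premise in isolation, here is a toy model of the loop. It is
only a sketch and not the liburing code: fake_enter(), toy_get_cqe() and
NOT_ENTERED are hypothetical names made up for illustration, and the
"kernel" simply pretends to consume every submitted entry and complete
every awaited one. It shows that a per-iteration sentinel is enough to
stop the loop as soon as an iteration skips the enter call.

```c
/* Sentinel meaning "no enter call was made in this iteration".
 * Same value as in the patch; the name is made up for this sketch. */
#define NOT_ENTERED (-2)

/* Counts how often the fake syscall was invoked. */
static int fake_enter_calls;

/* Hypothetical stand-in for the enter syscall: pretends the kernel
 * consumed all 'submit' entries and returns that count. */
static int fake_enter(unsigned submit)
{
	fake_enter_calls++;
	return (int)submit;
}

/* Toy version of the loop. Returns the number of iterations run. */
static int toy_get_cqe(unsigned submit, unsigned wait_nr,
		       unsigned nr_available)
{
	int iterations = 0;

	do {
		int ret = NOT_ENTERED;	/* reset on every iteration */

		iterations++;

		/* Only enter if there is something to wait for or submit. */
		if (wait_nr > nr_available || submit)
			ret = fake_enter(submit);

		if (ret == NOT_ENTERED)
			break;		/* nothing was done, so stop looping */
		if (ret < 0)
			break;		/* a real error would be handled here */

		submit -= ret;
		nr_available += wait_nr; /* pretend the completions arrived */
	} while (1);

	return iterations;
}
```

With nothing to submit and nothing to wait for, the model leaves after
one iteration without touching fake_enter(). With submit=2 and
wait_nr=1 it enters once, then breaks on the second iteration.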

-- 
Jens Axboe
