On Thu, Feb 6, 2020 at 8:51 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
>
> On 2/6/20 12:15 PM, Jens Axboe wrote:
> > On 2/6/20 10:33 AM, Stefano Garzarella wrote:
> >>
> >> On Fri, Jan 31, 2020 at 4:39 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
> >>>
> >>> On 1/31/20 7:29 AM, Stefano Garzarella wrote:
> >>>> Hi Jens,
> >>>> this is v2 of the epoll test.
> >>>>
> >>>> v1 -> v2:
> >>>> - if IORING_FEAT_NODROP is not available, avoid overflowing the CQ
> >>>> - add 2 new tests to test epoll with IORING_FEAT_NODROP
> >>>> - cleanups
> >>>>
> >>>> There are 4 sub-tests:
> >>>> 1. test_epoll
> >>>> 2. test_epoll_sqpoll
> >>>> 3. test_epoll_nodrop
> >>>> 4. test_epoll_sqpoll_nodrop
> >>>>
> >>>> In the first 2 tests, I try to avoid queueing more requests than we
> >>>> have room for in the CQ ring. These work fine; I see no failures.
> >>>
> >>> Thanks!
> >>>
> >>>> In tests 3 and 4, if IORING_FEAT_NODROP is supported, I try to submit
> >>>> as much as I can until I get -EBUSY, but they often fail in this way:
> >>>> the submitter manages to submit everything and the receiver receives
> >>>> all the submitted bytes, but the cleaner loses completion events. (I
> >>>> also tried putting a timeout on epoll_wait() in the cleaner to be sure
> >>>> it is not related to the patch I sent some weeks ago, but the
> >>>> situation doesn't change; it looks like the CQ is still overflowing.)
> >>>>
> >>>> Next week I'll try to investigate the problem further.
> >>>
> >>> Does it change if you have an io_uring_enter() with GETEVENTS set? I
> >>> wonder if you just pruned the CQ ring but didn't flush the internal
> >>> side.
> >>
> >> If I do io_uring_enter() with GETEVENTS set and wait_nr = 0, it solves
> >> the issue, I think because it calls io_cqring_events(), which flushes
> >> the overflow list.
> >>
> >> At this point, should we call io_cqring_events() (which flushes the
> >> overflow list) in io_uring_poll()?
> >> I mean something like this:
> >>
> >> diff --git a/fs/io_uring.c b/fs/io_uring.c
> >> index 77f22c3da30f..2769451af89a 100644
> >> --- a/fs/io_uring.c
> >> +++ b/fs/io_uring.c
> >> @@ -6301,7 +6301,7 @@ static __poll_t io_uring_poll(struct file *file, poll_table *wait)
> >>  	if (READ_ONCE(ctx->rings->sq.tail) - ctx->cached_sq_head !=
> >>  	    ctx->rings->sq_ring_entries)
> >>  		mask |= EPOLLOUT | EPOLLWRNORM;
> >> -	if (READ_ONCE(ctx->rings->cq.head) != ctx->cached_cq_tail)
> >> +	if (io_cqring_events(ctx, false))
> >>  		mask |= EPOLLIN | EPOLLRDNORM;
> >>
> >>  	return mask;
> >
> > That's not a bad idea; we'd just have to verify that it is indeed safe
> > to always call the flushing variant from there.
>
> Double-checked, and it should be fine. We may be invoked with
> ctx->uring_lock held, but that's fine.

It seems so; I'll check it more carefully and send a patch :-)

Thanks,
Stefano
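
For reference, the workaround Stefano describes can be sketched in
userspace as follows. This is a minimal sketch, not part of the original
thread: it assumes kernel headers that define __NR_io_uring_enter and
IORING_ENTER_GETEVENTS, and a ring_fd previously obtained from
io_uring_setup().

/*
 * With to_submit = 0 and min_complete = 0, this call neither submits
 * new requests nor blocks; IORING_ENTER_GETEVENTS alone makes the
 * kernel run its completion path (io_cqring_events()), which moves
 * entries from the internal overflow list back into the CQ ring.
 */
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/io_uring.h>

static int flush_cq_overflow(int ring_fd)
{
	return syscall(__NR_io_uring_enter, ring_fd, 0, 0,
		       IORING_ENTER_GETEVENTS, NULL, 0);
}

Calling flush_cq_overflow() after draining the visible CQ entries, and
before the next epoll_wait(), should make any overflowed completions
visible again, matching the behavior seen in tests 3 and 4.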