On 5/6/21 4:46 PM, Olivier Langlois wrote: > On Thu, 2021-05-06 at 04:42 -0400, Olivier Langlois wrote: >> On Wed, 2021-05-05 at 23:17 -0400, Olivier Langlois wrote: >>> Note that the poll remove sqe and the following poll add sqe don't >>> have >>> exactly the same user_data. >>> >>> I have this statement in between: >>> /* increment generation counter to avoid handling old events */ >>> ++anfds [fd].egen; >>> >>> poll remove cancel the previous poll add having gen 1 in its user >>> data. >>> the next poll add has it user_data storing gen var set to 2: >>> >>> 1 3 remove 85 1 >>> 1 3 add 85 2 >>> >>> 85 gen 1 res -125 >>> 85 gen 1 res 4 >>> >> Good news! >> >> I have used the io_uring tracepoints and they confirm that io_uring >> works as expected: >> >> For the above printfs, I get the following perf traces: >> >> 11940.259 Execution SVC/134675 io_uring:io_uring_submit_sqe(ctx: >> 0xffff9d3c4368c000, opcode: 7, force_nonblock: 1) >> 11940.270 Execution SVC/134675 io_uring:io_uring_complete(ctx: >> 0xffff9d3c4368c000, user_data: 4294967382, res: -125) >> 11940.272 Execution SVC/134675 io_uring:io_uring_complete(ctx: >> 0xffff9d3c4368c000) >> 11940.275 Execution SVC/134675 io_uring:io_uring_file_get(ctx: >> 0xffff9d3c4368c000, fd: 86) >> 11940.277 Execution SVC/134675 io_uring:io_uring_submit_sqe(ctx: >> 0xffff9d3c4368c000, opcode: 6, user_data: 4294967382, force_nonblock: >> 1) >> 11940.279 Execution SVC/134675 io_uring:io_uring_complete(ctx: >> 0xffff9d3c4368c000, user_data: 4294967382, res: 4) >> >> So, it seems the compiler is playing games on me. It prints the correct >> gen 2 value but is passing 1 to io_uring_sqe_set_data()... >> >> I'll try to turn optimization off to see if it helps. >> >> thx a lot again for your exceptional work! >> >> > The more I fool around trying to find the problem, the more I think > that my problem is somewhere in the liburing (v2.0) code because of a > possibly missing memory barrier. > > The function that I do use to submit the sqes is > io_uring_wait_cqe_timeout(). > > My problem did appear when I did replace libev original boilerplate > code for liburing (v2.0) used for filling and submitting the sqes. > > Do you remember when you pointed out that I wasn't setting the > user_data field for my poll remove request in: > > io_uring_prep_poll_remove(sqe, > iouring_build_user_data(IOURING_POLL, fd, anfds [fd].egen)); > // io_uring_sqe_set_data(sqe, > iouring_build_user_data(IOURING_POLL, fd, anfds [fd].egen)); > > ? > > The last call to remove the non-existant 'add 85 2' is replied by > ENOENT by io_uring and this was caught by an assert in my case > IOURING_POLL cqe handler. > > iouring_decode_user_data(cqe->user_data, &type, &fd, &gen); > > switch (type) { > > This is impossible to end up there because if you do not explicitly set > user_data, io_uring_prep_rw() is setting it to 0. > > In order for my assert to be hit, user_data would have to be set with > the commented out io_uring_sqe_set_data(), and it happens to also be > the value of the previously sent sqe user_data... > > I did replace the code above with: > > io_uring_prep_poll_remove(sqe, > iouring_build_user_data(IOURING_POLL_ADD, fd, anfds [fd].egen)); > io_uring_sqe_set_data(sqe, iouring_build_user_data(IOURING_POLL_REMOVE, > fd, anfds [fd].egen)); > > and my assert for cqe->res != -ENOENT has stopped being hit. > > There is clearly something messing with the sqe user_data communication > between user and kernel and I start to suspect that this might be > somewhere inside io_uring_wait_cqe_timeout()... What's your kernel? IORING_FEAT_EXT_ARG? e.g. ring->features & IORING_FEAT_EXT_ARG Because: /* * Like io_uring_wait_cqe(), except it accepts a timeout value as well. Note * that an sqe is used internally to handle the timeout. For kernel doesn't * support IORING_FEAT_EXT_ARG, applications using this function must never * set sqe->user_data to LIBURING_UDATA_TIMEOUT! * * For kernels without IORING_FEAT_EXT_ARG (5.10 and older), if 'ts' is * specified, the application need not call io_uring_submit() before * calling this function, as we will do that on its behalf. From this it also * follows that this function isn't safe to use for applications that split SQ * and CQ handling between two threads and expect that to work without * synchronization, as this function manipulates both the SQ and CQ side. * * For kernels with IORING_FEAT_EXT_ARG, no implicit submission is done and * hence this function is safe to use for applications that split SQ and CQ * handling between two threads. */ -- Pavel Begunkov