On 10/19/23 2:06 AM, Ming Lei wrote:
> Hello Jens,
>
> Guang Wu found that tests::net::test_tcp_recv_multi in rust:io_uring
> hangs, and no such issue in RH test kernel.
>
> - git clone https://github.com/tokio-rs/io-uring.git
> - cd io-uring
> - cargo run --package io-uring-test
>
> I figured out that it is made by missing the last CQE with -ENOBUFS,
> which is caused by commit a2741c58ac67 ("io_uring/net: don't retry recvmsg()
> unnecessarily").
>
> I am not sure if the last CQE should be returned and that depends how normal
> recv_multi is written, but IORING_CQE_F_MORE in the previous CQE shouldn't be
> returned at least.

Is this because it depends on this spurious retry? IOW, it adds N
buffers and triggers N receives, then depends on an internal extra retry
which would then yield -ENOBUFS? Because that sounds like a broken test.

As long as the recv triggers successfully, IORING_CQE_F_MORE will be
set. Only if it hits some terminating condition would it trigger a CQE
without the MORE flag set. If it remains armed and ready to trigger
again, it will have MORE set.

I'll take a look, this is pure guesswork on my side right now. We've
done quite a lot of testing with recv multishot with this change, and
haven't had any issues. Which is why I'm a bit skeptical.

-- 
Jens Axboe