On 10/19/23 5:47 AM, Ming Lei wrote: > On Thu, Oct 19, 2023 at 05:31:11AM -0600, Jens Axboe wrote: >> On 10/19/23 2:06 AM, Ming Lei wrote: >>> Hello Jens, >>> >>> Guang Wu found that tests::net::test_tcp_recv_multi in rust:io_uring >>> hangs, and no such issue in RH test kernel. >>> >>> - git clone https://github.com/tokio-rs/io-uring.git >>> - cd io-uring >>> - cargo run --package io-uring-test >>> >>> I figured out that it is made by missing the last CQE with -ENOBUFS, >>> which is caused by commit a2741c58ac67 ("io_uring/net: don't retry recvmsg() >>> unnecessarily"). >>> >>> I am not sure if the last CQE should be returned and that depends how normal >>> recv_multi is written, but IORING_CQE_F_MORE in the previous CQE shouldn't be >>> returned at least. >> >> Is this because it depends on this spurious retry? IOW, it adds N >> buffers and triggers N receives, then depends on an internal extra retry >> which would then yield -ENOBUFS? Because that sounds like a broken test. > > Yeah, that is basically what the test does. > > The test gets two recv CQEs, both have IORING_CQE_F_MORE. And it waits for 3 > CQEs, and never return because there isn't the 3rd CQE after > a2741c58ac67 ("io_uring/net: don't retry recvmsg() unnecessarily") > is merged. Right, and this is why it's invalid. If you send two, you will get two. This is a misunderstanding of how recv multishot works, and the test relied on an odd quirk where we'd sometimes re-trigger a recv even though we did not have to. >> As long as the recv triggers successfully, IORING_CQE_F_MORE will be >> set. Only if it his some terminating condition would it trigger a CQE >> without the MORE flag set. If it remains armed and ready to trigger >> again, it will have MORE set. I'll take a look, this is pure guesswork >> on my side right now. > > .B IORING_CQE_F_MORE > If set, the application should expect more completions from the request. This > is used for requests that can generate multiple completions, such as multi-shot > requests, receive, or accept. > > I understand that if one CQE is received with IORING_CQE_F_MORE, it is > reasonable for userspace to wait for one extra CQE, is this expectation > correct? Or the documentation needs to be updated? This is correct, if IORING_CQE_F_MORE is set, the request remains armed and will trigger an even again. That next event may not have IORING_CQE_F_MORE set, but there will always be that next even _as long as something causes a new cqe to be issued_. The test sets up two buffers, and arms a recv multishot for them. It then proceeds to send two buffers, which are received and completed. Each of those CQEs will have MORE set, because they both completed successfully, and if someone sends more data, it will trigger again. What the test should do is ensure that another recv is triggered, I've attached a dummy patch below. This is simply a broken test. No errors have occurred (eg running out of buffers, or receive being shorter than it should be), hence there's no reason for io_uring to terminate the multishot request. diff --git a/io-uring-test/src/tests/net.rs b/io-uring-test/src/tests/net.rs index 18beb20773cf..82208443f2df 100644 --- a/io-uring-test/src/tests/net.rs +++ b/io-uring-test/src/tests/net.rs @@ -1100,7 +1100,9 @@ pub fn test_tcp_recv_multi<S: squeue::EntryMarker, C: cqueue::EntryMarker>( // Send one package made of two segments, and receive as two buffers, each max length 1024 // so the first buffer received should be length 1024 and the second length 256. let mut input = vec![0xde; 1024]; - input.extend_from_slice(&[0xad; 256]); + input.extend_from_slice(&[0xad; 1024]); + let mut enobufs = vec![0xff; 256]; + input.append(&mut enobufs); let mut bufs = vec![0; 2 * 1024]; // provide bufs @@ -1118,7 +1120,7 @@ pub fn test_tcp_recv_multi<S: squeue::EntryMarker, C: cqueue::EntryMarker>( assert_eq!(cqe.user_data(), 0x21); assert_eq!(cqe.result(), 0); - // write all 1024 + 256 + // write all 1024 + 1024 + 256 send_stream.write_all(&input)?; // multishot recv using a buf_group with 1024 length buffers @@ -1143,7 +1145,7 @@ pub fn test_tcp_recv_multi<S: squeue::EntryMarker, C: cqueue::EntryMarker>( assert_eq!(&bufs[..1024], &input[..1024]); assert_eq!(cqes[1].user_data(), 0x22); - assert_eq!(cqes[1].result(), 256); // length 256 + assert_eq!(cqes[1].result(), 1024); // length 1024 assert!(cqueue::more(cqes[1].flags())); assert_eq!(cqueue::buffer_select(cqes[1].flags()), Some(1)); assert_eq!(&bufs[1024..(1024 + 256)], &input[1024..(1024 + 256)]); -- Jens Axboe