There was a bug in the echo server code with re-registering the buffer:
https://github.com/frevib/io_uring-echo-server/commit/aa6f2a09ca14c6aa17779a22343b9e7d4b3c7994

Please try the latest master branch, maybe it'll help you with your own code as well.

--
Hielke de Vries

On Fri, Jul 3, 2020, at 22:22, Daniele Salvatore Albano wrote:
> On Fri, 3 Jul 2020 at 21:12, Jens Axboe <axboe@xxxxxxxxx> wrote:
> >
> > On 7/3/20 1:09 PM, Daniele Salvatore Albano wrote:
> > > On Fri, 3 Jul 2020, 20:57 Jens Axboe, <axboe@xxxxxxxxx> wrote:
> > >
> > > On 7/3/20 12:48 PM, Daniele Salvatore Albano wrote:
> > > > Hi,
> > > >
> > > > I have recently started to play with io_uring and liburing but I am
> > > > facing an odd issue, of course initially I thought it was my code but
> > > > after further investigation and testing some other code (
> > > > https://github.com/frevib/io_uring-echo-server/tree/io-uring-op-provide-buffers
> > > > ) I faced the same behaviour.
> > > >
> > > > When using the IOSQE_BUFFER_SELECT with RECV I always get the first
> > > > read right but all the subsequent return a buffer id different from
> > > > what was used by the kernel.
> > > >
> > > > The problem starts to happen only after io_uring_prep_provide_buffers
> > > > is invoked to put back the buffer, the bid set is the one from
> > > > cflags >> 16.
> > > >
> > > > The logic is as follows:
> > > > - io_uring_prep_provide_buffers + io_uring_submit + io_uring_wait_cqe
> > > > initialize all the buffers at the beginning
> > > > - within io_uring_for_each_cqe, when accepting a new connection a recv
> > > > sqe is submitted with the IOSQE_BUFFER_SELECT flag
> > > > - within io_uring_for_each_cqe, when recv a send sqe is submitted
> > > > using as buffer the one specified in cflags >> 16
> > > > - within io_uring_for_each_cqe, when send a provide buffers for the
> > > > bid used to send the data and a recv sqe are submitted.
> > > >
> > > > If I drop io_uring_prep_provide_buffers both in my code and in the
> > > > code I referenced above it just works, but of course at some point
> > > > there are no more buffers available.
> > > >
> > > > To further debug the issue I reduced the amount of provided buffers
> > > > and started to print out the entire bufferset and I noticed that after
> > > > the first correct RECV the kernel stores the data in the first buffer
> > > > of the group id but always returns the last buffer id.
> > > > It is like after calling io_uring_prep_provide_buffers the information
> > > > on the kernel side gets corrupted, I tried to follow the logic on the
> > > > kernel side but there is nothing apparent that would make me
> > > > understand why I am facing this behaviour.
> > > >
> > > > The original author of that code told me on SO that he wrote & tested
> > > > it on the kernel 5.6 + the provide buffers branch, I am facing this
> > > > issue with 5.7.6, 5.8-rc1 and 5.8-rc3. The liburing library is built
> > > > out of the branch, I didn't do too much testing with different
> > > > versions but I tried to figure out where the issue was for the last
> > > > week and within this period I have pulled multiple times the repo.
> > > >
> > > > Any hint or suggestion?
> > >
> > > Do you have a simple test case for this that can be run standalone?
> > > I'll take a look, but I'd rather not spend time re-creating a test case
> > > if you already have one.
> > >
> > > --
> > > Jens Axboe
> > >
> > >
> > > I will shrink down the code to produce a simple test case but not sure
> > > how much code I will be able to lift because it's showing this
> > > behaviour on a second recv of a connection so I will need to keep all
> > > the boilerplate code to get there.
> >
> > That's fine, I'm just looking to avoid having to write it from scratch.
> > Plus a test case is easier to deal with than trying to write a test case
> > based on your description, less room for interpretative errors.
> >
> > --
> > Jens Axboe
> >
>
> Hi,
>
> attached the test case, to make it as compact as possible I also
> dropped the error code checking.
>
> I have added some fprintf around the code, just connect to localhost
> port 5001 using telnet (it will send a line, it will be a bit easier
> to check the output).
>
> On the first message you will see a line like
> [CQE][RECV] fd 5, cqe flags high: 9, cqe flags low: 1
>
> and a number of lines to show the content of the buffers with the last
> buffer containing the message sent via telnet.
>
> On the second message you will instead see again
> [CQE][RECV] fd 5, cqe flags high: 5, cqe flags low: 1
>
> but the buffer actually containing the sent line will be the number 0.
>
> On all the successive submits the used buffer will still be 0 but the
> high part of cqe->flags will still contain 9.
>
> Or at least this is what I am experiencing.
>
> If you comment out lines 110, 111 and 112 it will work as (I think)
> expected but of course you will finish the buffers (and get an
> undefined behaviour because the code is not managing the errors at
> all).
>
>
> Thanks!
> Daniele
>
> Attachments:
> * test.c
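
For readers following the thread, here is a minimal sketch of how the "cflags >> 16" buffer id mentioned above maps back to a buffer address. It is not the code from test.c or from the echo server. The two constants mirror IORING_CQE_F_BUFFER and IORING_CQE_BUFFER_SHIFT from <linux/io_uring.h> so that the snippet builds without liburing, and the pool layout (NR_BUFS, BUF_LEN, pool) and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors of the kernel UAPI values, redefined so this builds standalone. */
#define CQE_F_BUFFER     (1U << 0)  /* IORING_CQE_F_BUFFER: a buffer was selected */
#define CQE_BUFFER_SHIFT 16         /* IORING_CQE_BUFFER_SHIFT: bid is in the upper 16 bits */

/* Hypothetical pool: NR_BUFS buffers of BUF_LEN bytes, buffer id == array index. */
#define NR_BUFS 10
#define BUF_LEN 64

static char pool[NR_BUFS][BUF_LEN];

/* Return the buffer id carried in cqe->flags, or -1 if no buffer was selected. */
static int cqe_buffer_id(uint32_t cqe_flags)
{
    if (!(cqe_flags & CQE_F_BUFFER))
        return -1;
    return (int)(cqe_flags >> CQE_BUFFER_SHIFT);
}

/*
 * Return the start of the buffer that belongs to a buffer id. When a buffer
 * is handed back to the kernel, the address and the bid passed together
 * should refer to the same slot, so both are derived from the same index.
 */
static char *buffer_for_bid(int bid)
{
    assert(bid >= 0 && bid < NR_BUFS); /* a bid outside the pool means the bookkeeping is off */
    return pool[bid];
}
```

With the "cqe flags high: 9, cqe flags low: 1" line from the test output, cqe_buffer_id((9U << 16) | 1U) gives 9, and buffer_for_bid(9) is the address that would go together with bid 9 when that buffer is provided again.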