Re: Keep getting the same buffer ID when RECV with IOSQE_BUFFER_SELECT

There was a bug in the echo server code with re-registering the buffer: https://github.com/frevib/io_uring-echo-server/commit/aa6f2a09ca14c6aa17779a22343b9e7d4b3c7994 

Please try the latest master branch, maybe it'll help you with your own code as well. 

--
Hielke de Vries
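
For readers following the thread, here is a minimal sketch of the addressing involved when a single buffer is handed back to the kernel. It is an assumption, not something stated in this thread, that the re-registering bug was of this kind; the linked commit is the authority on what actually changed. What is certain from the liburing API is that io_uring_prep_provide_buffers(sqe, addr, len, nr, bgid, bid) ties `addr` to `bid`, so passing the base of the buffer array together with a non-zero bid would make the kernel write into buffer 0 while still reporting that bid. The names `sketch_bufs`, `SKETCH_BUF_SIZE`, `SKETCH_BUF_COUNT` and `buffer_addr_for_bid` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pool layout: sizes and names are illustrative only. */
#define SKETCH_BUF_SIZE  32
#define SKETCH_BUF_COUNT 10

static char sketch_bufs[SKETCH_BUF_COUNT][SKETCH_BUF_SIZE];

/*
 * Return the start address of the buffer that belongs to `bid`.
 * This is the address that would go with `bid` when re-providing one buffer:
 *
 *   io_uring_prep_provide_buffers(sqe, buffer_addr_for_bid(bid),
 *                                 SKETCH_BUF_SIZE, 1, group_id, bid);
 *
 * Passing `sketch_bufs` (the base of the array) instead would pair
 * buffer 0's memory with whatever bid is given.
 */
static char *buffer_addr_for_bid(unsigned bid)
{
    assert(bid < SKETCH_BUF_COUNT);
    return sketch_bufs[bid];
}
```

With this layout, bid 9 maps to an address 9 * SKETCH_BUF_SIZE bytes past the start of the pool, not to the start of the pool itself.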


On Fri, Jul 3, 2020, at 22:22, Daniele Salvatore Albano wrote:
> On Fri, 3 Jul 2020 at 21:12, Jens Axboe <axboe@xxxxxxxxx> wrote:
> >
> > On 7/3/20 1:09 PM, Daniele Salvatore Albano wrote:
> > > > On Fri, 3 Jul 2020, 20:57 Jens Axboe, <axboe@xxxxxxxxx> wrote:
> > >
> > >     On 7/3/20 12:48 PM, Daniele Salvatore Albano wrote:
> > >     > Hi,
> > >     >
> > >     > I have recently started to play with io_uring and liburing but I am
> > >     > facing an odd issue. Initially I thought it was my code, of course,
> > >     > but after further investigation and testing some other code (
> > >     > https://github.com/frevib/io_uring-echo-server/tree/io-uring-op-provide-buffers
> > >     > ) I saw the same behaviour.
> > >     >
> > >     > When using IOSQE_BUFFER_SELECT with RECV, the first read always
> > >     > comes back right, but all subsequent ones return a buffer id
> > >     > different from the one the kernel actually used.
> > >     >
> > >     > The problem starts to happen only after io_uring_prep_provide_buffers
> > >     > is invoked to put back the buffer; the bid set is the one from
> > >     > cflags >> 16.
> > >     >
> > >     > The logic is as follows:
> > >     > - io_uring_prep_provide_buffers + io_uring_submit + io_uring_wait_cqe
> > >     > initialize all the buffers at the beginning
> > >     > - within io_uring_for_each_cqe, when accepting a new connection, a
> > >     > recv sqe is submitted with the IOSQE_BUFFER_SELECT flag
> > >     > - within io_uring_for_each_cqe, on a recv completion, a send sqe is
> > >     > submitted using the buffer specified in cflags >> 16
> > >     > - within io_uring_for_each_cqe, on a send completion, a provide
> > >     > buffers sqe for the bid used to send the data and a recv sqe are
> > >     > submitted.
> > >     >
> > >     > If I drop io_uring_prep_provide_buffers both in my code and in the
> > >     > code I referenced above it just works, but of course at some point
> > >     > there are no more buffers available.
> > >     >
> > >     > To further debug the issue I reduced the amount of provided buffers
> > >     > and started to print out the entire bufferset and I noticed that after
> > >     > the first correct RECV the kernel stores the data in the first buffer
> > >     > of the group id but always returns the last buffer id.
> > >     > It is as if, after calling io_uring_prep_provide_buffers, the
> > >     > information on the kernel side gets corrupted. I tried to follow the
> > >     > logic on the kernel side, but nothing obvious there explains why I
> > >     > am facing this behaviour.
> > >     >
> > >     > The original author of that code told me on SO that he wrote and
> > >     > tested it on kernel 5.6 plus the provide buffers branch; I am facing
> > >     > this issue with 5.7.6, 5.8-rc1 and 5.8-rc3. The liburing library is
> > >     > built out of the branch. I didn't do much testing with different
> > >     > versions, but I have been trying to figure out where the issue is for
> > >     > the last week and have pulled the repo multiple times in that period.
> > >     >
> > >     > Any hint or suggestion?
> > >
> > >     Do you have a simple test case for this that can be run standalone?
> > >     I'll take a look, but I'd rather not spend time re-creating a test case
> > >     if you already have one.
> > >
> > >     --
> > >     Jens Axboe
> > >
> > >
> > > I will shrink down the code to produce a simple test case but not sure
> > > how much code I will be able to lift because it's showing this
> > > behaviour on a second recv of a connection so I will need to keep all
> > > the boilerplate code to get there.
> >
> > That's fine, I'm just looking to avoid having to write it from scratch.
> > Plus a test case is easier to deal with than trying to write a test case
> > based on your description, less room for interpretative errors.
> >
> > --
> > Jens Axboe
> >
> 
> Hi,
> 
> attached is the test case; to make it as compact as possible I also
> dropped the error code checking.
> 
> I have added some fprintf around the code, just connect to localhost
> port 5001 using telnet (it will send a line, it will be a bit easier
> to check the output).
> 
> On the first message you will see a line like
> [CQE][RECV] fd 5, cqe flags high: 9, cqe flags low: 1
> 
> and a number of lines to show the content of the buffers with the last
> buffer containing the message sent via telnet.
> 
> On the second message you will instead see again
> [CQE][RECV] fd 5, cqe flags high: 5, cqe flags low: 1
> 
> but the buffer actually containing the sent line will be the number 0.
> 
> On all subsequent submits the buffer used will still be 0 but the
> high part of cqe->flags will still contain 9.
> 
> Or at least this is what I am experiencing.
> 
> If you comment out lines 110, 111 and 112 it will work as (I think)
> expected, but of course you will run out of buffers (and get undefined
> behaviour because the code is not handling errors at all).
> 
> 
> Thanks!
> Daniele
> 
> Attachments:
> * test.c
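
The "cqe flags high" and "cqe flags low" values quoted above can be read as the two halves of cqe->flags: the upper 16 bits carry the selected buffer id, and bit 0 says whether a buffer was selected at all. A minimal sketch of that decoding follows. The constants mirror IORING_CQE_F_BUFFER and IORING_CQE_BUFFER_SHIFT from <linux/io_uring.h> but are redefined under hypothetical SKETCH_ names so the fragment builds without kernel headers; the helper names are also hypothetical and this is not the code from the attached test.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same values as IORING_CQE_F_BUFFER and IORING_CQE_BUFFER_SHIFT in
 * <linux/io_uring.h>, under local (hypothetical) names. */
#define SKETCH_CQE_F_BUFFER     (1U << 0)
#define SKETCH_CQE_BUFFER_SHIFT 16

/* True if the completion says a provided buffer was picked. */
static bool cqe_has_buffer(uint32_t cqe_flags)
{
    return (cqe_flags & SKETCH_CQE_F_BUFFER) != 0;
}

/* Buffer id from the upper 16 bits; only meaningful when the
 * buffer flag is set, so that is asserted first. */
static uint16_t cqe_buffer_id(uint32_t cqe_flags)
{
    assert(cqe_has_buffer(cqe_flags));
    return (uint16_t)(cqe_flags >> SKETCH_CQE_BUFFER_SHIFT);
}
```

For the first completion quoted above (high: 9, low: 1), the flags word would be (9 << 16) | 1, which decodes to "buffer selected, bid 9".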