Re: [PATCH] io_uring/net: cache provided buffer group value for multishot receives

Jens Axboe <axboe@xxxxxxxxx> · Mon, 23 Jan 2023 07:06:04 -0700

On 1/23/23 7:04 AM, Dylan Yudaken wrote:
> On Sun, 2023-01-22 at 10:13 -0700, Jens Axboe wrote:
>> If we're using ring provided buffers with multishot receive, and we
>> end
>> up doing an io-wq based issue at some points that also needs to
>> select
>> a buffer, we'll lose the initially assigned buffer group as
>> io_ring_buffer_select() correctly clears the buffer group list as the
>> issue isn't serialized by the ctx uring_lock. This is fine for normal
>> receives as the request puts the buffer and finishes, but for
>> multishot,
>> we will re-arm and do further receives. On the next trigger for this
>> multishot receive, the receive will try and pick from a buffer group
>> whose value is the same as the buffer ID of the las receive. That is
>> obviously incorrect, and will result in a premature -ENOUFS error for
>> the receive even if we had available buffers in the correct group.
>>
>> Cache the buffer group value at prep time, so we can restore it for
>> future receives. This only needs doing for the above mentioned case,
>> but
>> just do it by default to keep it easier to read.
>>
>> Cc: stable@xxxxxxxxxxxxxxx
>> Fixes: b3fdea6ecb55 ("io_uring: multishot recv")
>> Fixes: 9bb66906f23e ("io_uring: support multishot in recvmsg")
>> Cc: Dylan Yudaken <dylany@xxxxxxxx>
>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>>
>> ---
>>
>> diff --git a/io_uring/net.c b/io_uring/net.c
>> index fbc34a7c2743..07a6aa39ab6f 100644
>> --- a/io_uring/net.c
>> +++ b/io_uring/net.c
>> @@ -62,6 +62,7 @@ struct io_sr_msg {
>>         u16                             flags;
>>         /* initialised and used only by !msg send variants */
>>         u16                             addr_len;
>> +       u16                             buf_group;
>>         void __user                     *addr;
>>         /* used only for send zerocopy */
>>         struct io_kiocb                 *notif;
>> @@ -580,6 +581,15 @@ int io_recvmsg_prep(struct io_kiocb *req, const
>> struct io_uring_sqe *sqe)
>>                 if (req->opcode == IORING_OP_RECV && sr->len)
>>                         return -EINVAL;
>>                 req->flags |= REQ_F_APOLL_MULTISHOT;
>> +               /*
>> +                * Store the buffer group for this multishot receive
>> separately,
>> +                * as if we end up doing an io-wq based issue that
>> selects a
>> +                * buffer, it has to be committed immediately and
>> that will
>> +                * clear ->buf_list. This means we lose the link to
>> the buffer
>> +                * list, and the eventual buffer put on completion
>> then cannot
>> +                * restore it.
>> +                */
>> +               sr->buf_group = req->buf_index;
>>         }
>>  
>>  #ifdef CONFIG_COMPAT
>> @@ -816,8 +826,10 @@ int io_recvmsg(struct io_kiocb *req, unsigned
>> int issue_flags)
>>         if (kmsg->msg.msg_inq)
>>                 cflags |= IORING_CQE_F_SOCK_NONEMPTY;
>>  
>> -       if (!io_recv_finish(req, &ret, cflags, mshot_finished,
>> issue_flags))
>> +       if (!io_recv_finish(req, &ret, cflags, mshot_finished,
>> issue_flags)) {
>> +               req->buf_index = sr->buf_group;
> 
> I think this is better placed in io_recv_prep_retry()? It would remove
> the duplicated logic below

True, let's move it there instead, and then perhaps also add a comment.
I'll make that change.

-- 
Jens Axboe