On Sun, 2023-01-22 at 10:13 -0700, Jens Axboe wrote:
> If we're using ring provided buffers with multishot receive, and we
> end up doing an io-wq based issue at some point that also needs to
> select a buffer, we'll lose the initially assigned buffer group as
> io_ring_buffer_select() correctly clears the buffer group list as the
> issue isn't serialized by the ctx uring_lock. This is fine for normal
> receives as the request puts the buffer and finishes, but for
> multishot, we will re-arm and do further receives. On the next
> trigger for this multishot receive, the receive will try and pick
> from a buffer group whose value is the same as the buffer ID of the
> last receive. That is obviously incorrect, and will result in a
> premature -ENOBUFS error for the receive even if we had available
> buffers in the correct group.
>
> Cache the buffer group value at prep time, so we can restore it for
> future receives. This only needs doing for the above mentioned case,
> but just do it by default to keep it easier to read.
>
> Cc: stable@xxxxxxxxxxxxxxx
> Fixes: b3fdea6ecb55 ("io_uring: multishot recv")
> Fixes: 9bb66906f23e ("io_uring: support multishot in recvmsg")
> Cc: Dylan Yudaken <dylany@xxxxxxxx>
> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>
> ---
>
> diff --git a/io_uring/net.c b/io_uring/net.c
> index fbc34a7c2743..07a6aa39ab6f 100644
> --- a/io_uring/net.c
> +++ b/io_uring/net.c
> @@ -62,6 +62,7 @@ struct io_sr_msg {
>  	u16				flags;
>  	/* initialised and used only by !msg send variants */
>  	u16				addr_len;
> +	u16				buf_group;
>  	void __user			*addr;
>  	/* used only for send zerocopy */
>  	struct io_kiocb			*notif;
> @@ -580,6 +581,15 @@ int io_recvmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>  		if (req->opcode == IORING_OP_RECV && sr->len)
>  			return -EINVAL;
>  		req->flags |= REQ_F_APOLL_MULTISHOT;
> +		/*
> +		 * Store the buffer group for this multishot receive separately,
> +		 * as if we end up doing an io-wq based issue that selects a
> +		 * buffer, it has to be committed immediately and that will
> +		 * clear ->buf_list. This means we lose the link to the buffer
> +		 * list, and the eventual buffer put on completion then cannot
> +		 * restore it.
> +		 */
> +		sr->buf_group = req->buf_index;
>  	}
>
>  #ifdef CONFIG_COMPAT
> @@ -816,8 +826,10 @@ int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags)
>  	if (kmsg->msg.msg_inq)
>  		cflags |= IORING_CQE_F_SOCK_NONEMPTY;
>
> -	if (!io_recv_finish(req, &ret, cflags, mshot_finished, issue_flags))
> +	if (!io_recv_finish(req, &ret, cflags, mshot_finished, issue_flags)) {
> +		req->buf_index = sr->buf_group;

I think this is better placed in io_recv_prep_retry()? It would remove
the duplicated logic below - a rough sketch of what I mean is at the
bottom of this mail.

>  		goto retry_multishot;
> +	}
>
>  	if (mshot_finished) {
>  		/* fast path, check for non-NULL to avoid function call */
> @@ -918,8 +930,10 @@ int io_recv(struct io_kiocb *req, unsigned int issue_flags)
>  	if (msg.msg_inq)
>  		cflags |= IORING_CQE_F_SOCK_NONEMPTY;
>
> -	if (!io_recv_finish(req, &ret, cflags, ret <= 0, issue_flags))
> +	if (!io_recv_finish(req, &ret, cflags, ret <= 0, issue_flags)) {
> +		req->buf_index = sr->buf_group;
>  		goto retry_multishot;
> +	}
>
>  	return ret;
>  }
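
For reference, something like the below (untested, and written against
my reading of the current io_uring/net.c - the exact body of
io_recv_prep_retry() may differ) is what I had in mind. Since
io_recv_finish() runs io_recv_prep_retry() before every multishot
re-arm, restoring ->buf_index there covers both io_recvmsg() and
io_recv() in one place:

static inline void io_recv_prep_retry(struct io_kiocb *req)
{
	struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);

	sr->done_io = 0;
	sr->len = 0; /* get from the provided buffer */
	/*
	 * Restore the buffer group cached at prep time; an io-wq based
	 * issue may have left ->buf_index holding the buffer ID of the
	 * previous receive rather than the group to pick from.
	 */
	req->buf_index = sr->buf_group;
}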