On 2/24/20 8:50 AM, Jann Horn wrote:
> On Mon, Feb 24, 2020 at 3:56 AM Jens Axboe <axboe@xxxxxxxxx> wrote:
> [...]
>> With IORING_OP_PROVIDE_BUFFER, an application can register buffers to
>> use for any request. The request then sets IOSQE_BUFFER_SELECT in the
>> sqe, and a given group ID in sqe->buf_group. When the fd becomes ready,
>> a free buffer from the specified group is selected. If none are
>> available, the request is terminated with -ENOBUFS. If successful, the
>> CQE on completion will contain the buffer ID chosen in the cqe->flags
>> member, encoded as:
> [...]
>> +static struct io_buffer *io_buffer_select(struct io_kiocb *req, int gid,
>> +					   void *buf)
>> +{
>> +	struct list_head *list;
>> +	struct io_buffer *kbuf;
>> +
>> +	if (req->flags & REQ_F_BUFFER_SELECTED)
>> +		return buf;
>> +
>> +	list = idr_find(&req->ctx->io_buffer_idr, gid);
>> +	if (!list || list_empty(list))
>> +		return ERR_PTR(-ENOBUFS);
>> +
>> +	kbuf = list_first_entry(list, struct io_buffer, list);
>> +	list_del(&kbuf->list);
>> +	return kbuf;
>> +}
>
> This function requires some sort of external locking, right? Since
> it's removing an entry from a list.
>
> [...]
>> +static struct io_buffer *io_send_recv_buffer_select(struct io_kiocb *req,
>> +						     struct io_buffer **kbuf,
>> +						     int *cflags)
>> +{
>> +	struct io_sr_msg *sr = &req->sr_msg;
>> +
>> +	if (!(req->flags & REQ_F_BUFFER_SELECT))
>> +		return req->sr_msg.buf;
>> +
>> +	*kbuf = io_buffer_select(req, sr->gid, sr->kbuf);
>
> But we use it here in io_send_recv_buffer_select(), not taking any
> extra locks before calling it...
>
>> +	if (IS_ERR(*kbuf))
>> +		return *kbuf;
>> +
>> +	sr->kbuf = *kbuf;
>> +	if (sr->len > (*kbuf)->len)
>> +		sr->len = (*kbuf)->len;
>> +	req->flags |= REQ_F_BUFFER_SELECTED;
>> +
>> +	*cflags = (*kbuf)->bid << IORING_CQE_BUFFER_SHIFT;
>> +	*cflags |= IORING_CQE_F_BUFFER;
>> +	return u64_to_user_ptr((*kbuf)->addr);
>> +}
>> +
>>  static int io_send(struct io_kiocb *req, struct io_kiocb **nxt,
>>  		   bool force_nonblock)
>>  {
>>  #if defined(CONFIG_NET)
>> +	struct io_buffer *kbuf = NULL;
>>  	struct socket *sock;
>> -	int ret;
>> +	int ret, cflags = 0;
>>
>>  	if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
>>  		return -EINVAL;
>> @@ -3217,9 +3323,16 @@ static int io_send(struct io_kiocb *req, struct io_kiocb **nxt,
>>  		struct io_sr_msg *sr = &req->sr_msg;
>>  		struct msghdr msg;
>>  		struct iovec iov;
>> +		void __user *buf;
>>  		unsigned flags;
>>
>> -		ret = import_single_range(WRITE, sr->buf, sr->len, &iov,
>> +		buf = io_send_recv_buffer_select(req, &kbuf, &cflags);
>
> ... and call io_send_recv_buffer_select() from here without holding
> any extra locks in this function. This function is then called from
> io_issue_sqe() (no extra locks held there either), which is called
> from __io_queue_sqe() (no extra locks AFAICS), which is called from
> places like task work.
>
> Am I missing something?

Submission is all done under the ctx->uring_lock mutex, but the async
offload path does not hold it. So for the normal submission path we
should be fine, as we'll be serialized there. We just need to ensure we
do the buffer select when we prep the request for async execution, so
the workers never have to do it. Either that, or ensure the workers grab
the mutex before doing the buffer selection (less ideal).

> It might in general be helpful to have more lockdep assertions in this
> code, both to help the reader understand the locking rules and to help
> verify locking correctness.

Agree, that'd be a good addition.

-- 
Jens Axboe
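
As an illustration of the lockdep assertion being discussed, here is a
minimal sketch (not part of the posted patch) of io_buffer_select() with
an annotation added, assuming buffer selection only ever happens on the
submission side while ctx->uring_lock is held:

static struct io_buffer *io_buffer_select(struct io_kiocb *req, int gid,
					  void *buf)
{
	struct list_head *list;
	struct io_buffer *kbuf;

	/*
	 * Assumption for this sketch: the io_buffer_idr and the per-group
	 * free lists are only serialized by the submission-side
	 * ctx->uring_lock, so document and enforce that here.
	 */
	lockdep_assert_held(&req->ctx->uring_lock);

	if (req->flags & REQ_F_BUFFER_SELECTED)
		return buf;

	list = idr_find(&req->ctx->io_buffer_idr, gid);
	if (!list || list_empty(list))
		return ERR_PTR(-ENOBUFS);

	kbuf = list_first_entry(list, struct io_buffer, list);
	list_del(&kbuf->list);
	return kbuf;
}

With something like that in place, an io-wq worker reaching buffer
selection without the mutex would trip lockdep immediately, which also
documents the "select at async prep time" rule described above.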
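
For the userspace side, a hedged sketch of how a consumer might decode
the chosen buffer ID from cqe->flags, based on the IORING_CQE_F_BUFFER /
IORING_CQE_BUFFER_SHIFT encoding visible in io_send_recv_buffer_select()
above; the helper name is made up for illustration:

#include <linux/io_uring.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true and fills *bid if the CQE carries a provided-buffer ID. */
static bool cqe_get_buffer_id(const struct io_uring_cqe *cqe, uint16_t *bid)
{
	/* Only valid if the kernel actually consumed a provided buffer. */
	if (!(cqe->flags & IORING_CQE_F_BUFFER))
		return false;

	/* The buffer ID lives in the upper bits of cqe->flags. */
	*bid = cqe->flags >> IORING_CQE_BUFFER_SHIFT;
	return true;
}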