On 3/4/25 15:40, Pavel Begunkov wrote:
> Add io_import_reg_vec(), which will be responsible for importing
> vectored registered buffers. iovecs are overlapped with the resulting
> bvec in memory, which is why the iovec is expected to be padded in
> iou_vec.
>
> Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
> ---
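(A sketch for context, not from the quoted patch: the "overlap" in the
message is easiest to picture as a union view over a single allocation.
The field names match the hunk below, but the actual struct iou_vec
definition isn't quoted here, so take this as an assumption.)

	/* sketch only; the real struct iou_vec in the series may differ */
	struct iou_vec {
		union {
			struct iovec	*iovec;	/* caller segments, kept at the tail */
			struct bio_vec	*bvec;	/* import result, filled from the front */
		};
		unsigned		nr;	/* capacity in iovec-sized slots */
	};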
> ...
> +int io_import_reg_vec(int ddir, struct iov_iter *iter,
> +		       struct io_kiocb *req, struct iou_vec *vec,
> +		       unsigned nr_iovs, unsigned iovec_off,
> +		       unsigned issue_flags)
> +{
> +	struct io_rsrc_node *node;
> +	struct io_mapped_ubuf *imu;
> +	struct iovec *iov;
> +	unsigned nr_segs;
> +
> +	node = io_find_buf_node(req, issue_flags);
> +	if (!node)
> +		return -EFAULT;
> +	imu = node->buf;
> +	if (imu->is_kbuf)
> +		return -EOPNOTSUPP;
> +	if (!(imu->dir & (1 << ddir)))
> +		return -EFAULT;
> +
> +	iov = vec->iovec + iovec_off;
> +	nr_segs = io_estimate_bvec_size(iov, nr_iovs, imu);
	if (sizeof(struct bio_vec) > sizeof(struct iovec)) {
		size_t entry_sz = sizeof(struct iovec);
		size_t bvec_bytes = nr_segs * sizeof(struct bio_vec);
		size_t iovec_off = (bvec_bytes + entry_sz - 1) / entry_sz;

		nr_segs += iovec_off;
	}

How about fixing it up like this for now? Instead of overlapping the bvec
array with the iovec, it'd put them back to back and waste some memory on
32-bit. I can try to make it a bit tighter: remove the if and let the
compiler optimise it into a no-op for x64, or allocate max(bvec, iovec) * nr
and see where it leads. But either way, IMHO it's better left until I get
more time.
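(To make the wasted-space point concrete, a tiny standalone example of the
sizing math. The 12-byte bio_vec and 8-byte iovec sizes are assumptions for
a typical 32-bit ABI, purely for illustration.)

	#include <stdio.h>

	int main(void)
	{
		/* assumed sizes for a 32-bit build; real values are ABI-dependent */
		unsigned bvec_sz = 12, iovec_sz = 8;
		unsigned nr_iovs = 4, nr_segs = 10;	/* example segment estimate */
		unsigned bvec_bytes = nr_segs * bvec_sz;
		/* round the bvec area up to a whole number of iovec-sized slots */
		unsigned iovec_off = (bvec_bytes + iovec_sz - 1) / iovec_sz;

		/* back to back only needs iovec_off + nr_iovs slots, while
		 * nr_segs += iovec_off allocates nr_segs + iovec_off of them */
		printf("bvec area: %u bytes = %u slots\n", bvec_bytes, iovec_off);
		printf("needed: %u slots, allocated: %u slots\n",
		       iovec_off + nr_iovs, nr_segs + iovec_off);
		return 0;
	}

With these numbers: 120 bytes of bvecs round up to 15 slots, 19 slots would
suffice, 25 get allocated, which is the slack mentioned above.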
> +
> +	if (WARN_ON_ONCE(iovec_off + nr_iovs != vec->nr) ||
> +	    nr_segs > vec->nr) {
> +		struct iou_vec tmp_vec = {};
> +		int ret;
> +
> +		ret = io_vec_realloc(&tmp_vec, nr_segs);
> +		if (ret)
> +			return ret;
> +
> +		iovec_off = tmp_vec.nr - nr_iovs;
> +		memcpy(tmp_vec.iovec + iovec_off, iov, sizeof(*iov) * nr_iovs);
> +		io_vec_free(vec);
> +
> +		*vec = tmp_vec;
> +		iov = vec->iovec + iovec_off;
> +		req->flags |= REQ_F_NEED_CLEANUP;
> +	}
> +
> +	return io_vec_fill_bvec(ddir, iter, imu, iov, nr_iovs, vec);
> +}
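(Again just for orientation, not part of the patch: my reading of why the
hunk above memcpy()s the iovecs to the tail of the reallocated vec.
Proportions and annotations are illustrative.)

	/*
	 * vec->nr slots of sizeof(struct iovec) each:
	 *
	 *   [ bvec array, filled from the front ... ][ nr_iovs iovecs ]
	 *   ^ vec->bvec                              ^ vec->iovec + iovec_off
	 *
	 * Placing the iovecs at the tail first lets io_vec_fill_bvec()
	 * expand them into bvecs from the front without clobbering source
	 * segments it hasn't consumed yet.
	 */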
--
Pavel Begunkov