On 3/28/23 3:21 PM, Jens Axboe wrote:
> On 3/28/23 1:16 PM, Linus Torvalds wrote:
>> On Tue, Mar 28, 2023 at 12:05 PM Linus Torvalds
>> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> But it's not like adding a 'struct iovec' explicitly to the members
>>> just as extra "code documentation" would be wrong.
>>>
>>> I don't think it really helps, though, since you have to have that
>>> other explicit structure there anyway to get the member names right.
>>
>> Actually, thinking a bit more about it, adding a
>>
>>    const struct iovec xyzzy;
>>
>> member might be a good idea just to avoid a cast. Then that
>> iter_ubuf_to_iov() macro becomes just
>>
>>    #define iter_ubuf_to_iov(iter) (&(iter)->xyzzy)
>>
>> and that looks much nicer (plus still acts kind of as a "code comment"
>> to clarify things).
>
> I went down this path, and it _mostly_ worked out. You can view the
> series here, I'll send it out when I've actually tested it:
>
> https://git.kernel.dk/cgit/linux-block/log/?h=iter-ubuf
>
> A few mental notes I made along the way:
>
> - The IB/sound changes are now just replacing an inappropriate
>   iter_is_iovec() with iter->user_backed. That's nice and simple.
>
> - The iov_iter_iovec() case becomes a bit simpler. Or so I thought,
>   because we still need to add in the offset so we can't just use
>   our embedded iovec for that. The above branch is just using the
>   iovec, but I don't think this is right.
>
> - Looks like it exposed a block bug, where the copy in
>   bio_alloc_map_data() was obvious garbage but happened to work
>   before.
>
> I'm still inclined to favor this approach over the previous, even if the
> IB driver is a pile of garbage and lighting it a bit more on fire would
> not really hurt.
>
> Opinions? Or do you want me to just send it out for easier reading

While cleaning up that stuff, we only have a few users of
iov_iter_iovec(). Why don't we just kill them off and the helper too?
That drops that part of it and it kind of works out nicely beyond that.

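For reference when reading the conversions, here is a minimal user-space
sketch of what a helper like iov_iter_iovec() computes: the unconsumed
part of the current segment, clamped to the bytes left in the iterator.
struct model_iter and model_iter_iovec() are hypothetical names made up
for this illustration, not kernel API, and the struct is a heavily
simplified stand-in for struct iov_iter.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>	/* struct iovec */

/* Hypothetical, simplified stand-in for the kernel's struct iov_iter. */
struct model_iter {
	const struct iovec *iov;	/* current segment */
	size_t iov_offset;		/* bytes already consumed in *iov */
	size_t count;			/* total bytes left in the iterator */
};

/*
 * Model of the helper: skip what was already consumed in the current
 * segment, and never report more than the iterator has left.
 */
static struct iovec model_iter_iovec(const struct model_iter *it)
{
	size_t left;
	struct iovec v;

	assert(it->iov_offset <= it->iov->iov_len);
	left = it->iov->iov_len - it->iov_offset;

	v.iov_base = (char *)it->iov->iov_base + it->iov_offset;
	v.iov_len = left < it->count ? left : it->count;
	return v;
}
```

Reading iter->iov->iov_base and iter->iov->iov_len directly gives the
same answer only when iov_offset is zero and count covers the whole
segment. Whether any of the converted callers can actually see another
state is not established here.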
diff --git a/fs/read_write.c b/fs/read_write.c
index 7a2ff6157eda..fb932d0997d4 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -749,15 +749,15 @@ static ssize_t do_loop_readv_writev(struct file *filp, struct iov_iter *iter,
 		return -EOPNOTSUPP;

 	while (iov_iter_count(iter)) {
-		struct iovec iovec = iov_iter_iovec(iter);
+		const struct iovec *iov = iter->iov;
 		ssize_t nr;

 		if (type == READ) {
-			nr = filp->f_op->read(filp, iovec.iov_base,
-					      iovec.iov_len, ppos);
+			nr = filp->f_op->read(filp, iov->iov_base,
+					      iov->iov_len, ppos);
 		} else {
-			nr = filp->f_op->write(filp, iovec.iov_base,
-					       iovec.iov_len, ppos);
+			nr = filp->f_op->write(filp, iov->iov_base,
+					       iov->iov_len, ppos);
 		}

 		if (nr < 0) {
@@ -766,7 +766,7 @@ static ssize_t do_loop_readv_writev(struct file *filp, struct iov_iter *iter,
 			break;
 		}
 		ret += nr;
-		if (nr != iovec.iov_len)
+		if (nr != iov->iov_len)
 			break;
 		iov_iter_advance(iter, nr);
 	}
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 4c233910e200..585461a6f6a0 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -454,7 +454,8 @@ static ssize_t loop_rw_iter(int ddir, struct io_rw *rw, struct iov_iter *iter)
 			iovec.iov_base = iter->ubuf + iter->iov_offset;
 			iovec.iov_len = iov_iter_count(iter);
 		} else if (!iov_iter_is_bvec(iter)) {
-			iovec = iov_iter_iovec(iter);
+			iovec.iov_base = iter->iov->iov_base;
+			iovec.iov_len = iter->iov->iov_len;
 		} else {
 			iovec.iov_base = u64_to_user_ptr(rw->addr);
 			iovec.iov_len = rw->len;
diff --git a/mm/madvise.c b/mm/madvise.c
index 340125d08c03..0701a3bd530b 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1456,7 +1456,8 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
 		size_t, vlen, int, behavior, unsigned int, flags)
 {
 	ssize_t ret;
-	struct iovec iovstack[UIO_FASTIOV], iovec;
+	struct iovec iovstack[UIO_FASTIOV];
+	const struct iovec *iovec;
 	struct iovec *iov = iovstack;
 	struct iov_iter iter;
 	struct task_struct *task;
@@ -1503,12 +1504,12 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
 	total_len = iov_iter_count(&iter);

 	while (iov_iter_count(&iter)) {
-		iovec = iov_iter_iovec(&iter);
-		ret = do_madvise(mm, (unsigned long)iovec.iov_base,
-					iovec.iov_len, behavior);
+		iovec = iter.iov;
+		ret = do_madvise(mm, (unsigned long)iovec->iov_base,
+					iovec->iov_len, behavior);
 		if (ret < 0)
 			break;
-		iov_iter_advance(&iter, iovec.iov_len);
+		iov_iter_advance(&iter, iovec->iov_len);
 	}

 	ret = (total_len - iov_iter_count(&iter)) ? : ret;

-- 
Jens Axboe