On Fri, Jun 18, 2021 at 07:42:41PM +0000, Al Viro wrote: > On Fri, Jun 18, 2021 at 11:50:25AM -0700, Linus Torvalds wrote: > > > I think that you may need it to be based on Al's series for that to > > work, which might be inconvenient, though. > > > > One other non-code issue: particularly since you only handle a subset > > of the iov_iter cases, it would be nice to have an explanation for > > _why_ those particular cases. > > > > IOW, have some trivial explanation for each of the cases. "iovec" is > > for regular read/write, what triggers the kvec and bvec cases? > > > > But also, the other way around. Why doesn't the pipe case trigger? No > > splice support? > > Pipe ones are strictly destinations - they can't be sources. So if you > see it called for one of those, you've a bug. > > Xarray ones are *not* - they can be sources, and that's missing here. Ah, ITER_XARRAY was added recently so I missed it. > Much more unpleasant, though, is that this thing has hard dependency on > nr_seg == 1 *AND* openly suggests the use of iov_iter_single_seg_count(), > which is completely wrong. That sucker has some weird users left (as > of #work.iov_iter), but all of them are actually due to API deficiencies > and I very much hope to kill that thing off. > > Why not simply add iov_iter_check_zeroes(), that would be called after > copy_from_iter() and verified that all that's left in the iterator > consists of zeroes? Then this copy_struct_from_...() would be > trivial to express through those two. And check_zeroes would also > be trivial, especially on top of #work.iov_iter. With no calls of > iov_iter_advance() at all, while we are at it... > > IDGI... Omar, what semantics do you really want from that primitive? RWF_ENCODED is intended to be used like this: struct encoded_iov encoded_iov = { /* compression metadata */ ... }; char compressed_data[] = ...; struct iovec iov[] = { { &encoded_iov, sizeof(encoded_iov) }, { compressed_data, sizeof(compressed_data) }, }; pwritev2(fd, iov, 2, -1, RWF_ENCODED); Basically, we squirrel away the compression metadata in the first element of the iovec array, and we use iov[0].iov_len so that we can support future extensions of struct encoded_iov in the style of copy_struct_from_user(). So this doesn't require nr_seg == 1. On the contrary, it's expected that the rest of the iovec has the compressed payload. And to support the copy_struct_from_user()-style versioning, we need to know the size of the struct encoded_iov that userspace gave us, which is the reason for the iov_iter_single_seg_count(). I know this interface isn't the prettiest. It started as a Btrfs-specific ioctl, but this approach was suggested as a way to avoid having a whole new I/O path: https://lore.kernel.org/linux-fsdevel/20190905021012.GL7777@xxxxxxxxxxxxxxxxxxx/ The copy_struct_from_iter() thing was proposed as a way to allow future extensions here: https://lore.kernel.org/linux-btrfs/20191022020215.csdwgi3ky27rfidf@xxxxxxxxxxxxxxxxxxxx/ Please let me know if you have any suggestions for how to improve this. Thanks, Omar