On Fri, Jun 18, 2021 at 05:43:21PM -0700, Omar Sandoval wrote:
> On Fri, Jun 18, 2021 at 10:32:54PM +0000, Al Viro wrote:
> > On Fri, Jun 18, 2021 at 03:10:03PM -0700, Omar Sandoval wrote:
> > 
> > > Or do the same reverting thing that Al did, but with copy_from_iter()
> > > instead of copy_from_iter_full() and being careful with the copied count
> > > (which I'm not 100% sure I got correct here):
> > > 
> > > 	size_t copied = copy_from_iter(&encoded, sizeof(encoded), &i);
> > > 	if (copied < offsetofend(struct encoded_iov, size))
> > > 		return -EFAULT;
> > > 	if (encoded.size > PAGE_SIZE)
> > > 		return -E2BIG;
> > > 	if (encoded.size < ENCODED_IOV_SIZE_VER0)
> > > 		return -EINVAL;
> > > 	if (encoded.size > sizeof(encoded)) {
> > > 		if (copied < sizeof(encoded)
> > > 			return -EFAULT;
> > > 		if (!iov_iter_check_zeroes(&i, encoded.size - sizeof(encoded))
> > > 			return -EINVAL;
> > > 	} else if (encoded.size < sizeof(encoded)) {
> > > 		// older than what we expect
> > > 		if (copied < encoded.size)
> > > 			return -EFAULT;
> > > 		iov_iter_revert(&i, copied - encoded.size);
> > > 		memset((void *)&encoded + encoded.size, 0, sizeof(encoded) - encoded.size);
> > > 	}
> > 
> > simpler than that, actually -
> > 
> > 	copied = copy_from_iter(&encoded, sizeof(encoded), &i);
> > 	if (unlikely(copied < sizeof(encoded))) {
> > 		if (copied < offsetofend(struct encoded_iov, size) ||
> > 		    copied < encoded.size)
> > 			return iov_iter_count(i) ? -EFAULT : -EINVAL;
> > 	}
> > 	if (encoded.size > sizeof(encoded)) {
> > 		if (!iov_iter_check_zeroes(&i, encoded.size - sizeof(encoded))
> > 			return -EINVAL;
> > 	} else if (encoded.size < sizeof(encoded)) {
> > 		// copied can't be less than encoded.size here - otherwise
> > 		// we'd have copied < sizeof(encoded) and the check above
> > 		// would've buggered off
> > 		iov_iter_revert(&i, copied - encoded.size);
> > 		memset((void *)&encoded + encoded.size, 0, sizeof(encoded) - encoded.size);
> > 	}
> > 
> > should do it.
> 
> Thanks, Al, I'll send an updated version with this approach next week.

Okay, so this works for the write side of RWF_ENCODED, but it causes
problems for the read side. That currently works like so:

	struct encoded_iov encoded_iov;
	char compressed_data[...];
	struct iovec iov[] = {
		{ &encoded_iov, sizeof(encoded_iov) },
		{ compressed_data, sizeof(compressed_data) },
	};
	preadv2(fd, iov, 2, -1, RWF_ENCODED);

The kernel fills in the encoded_iov with the compression metadata and
the remaining buffers with the compressed data. The kernel needs to know
how much of the iovec is for the encoded_iov. The backwards
compatibility is similar to the write side: if the kernel size is less
than the userspace size, then we can fill in extra zeroes. If the kernel
size is greater than the userspace size and all of the extra metadata is
zero, then we can omit it. If the extra metadata is non-zero, then we
return an error.

How do we get the userspace size with the encoded_iov.size approach?
We'd have to read the size from the iov_iter before writing to the rest
of the iov_iter. Is it okay to mix the iov_iter as a source and
destination like this? From what I can tell, it's not intended to be
used like this.
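
To make the read-side compatibility rule above concrete, here is a rough
userspace sketch of just the size handling, with plain buffers standing
in for the iov_iter. It is not kernel code and not what the patch does:
struct encoded_iov_sketch and fill_user_encoded_iov() are made-up names,
the struct layout is invented, and -E2BIG for the "extra metadata is
non-zero" case is only my assumption. It also takes the userspace size
as a parameter, so it says nothing about the open question of how to get
that size out of the iov_iter in the first place.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct encoded_iov; the layout is invented. */
struct encoded_iov_sketch {
	uint64_t size;       /* size of the struct as the writer knows it */
	uint64_t len;
	uint64_t new_field;  /* pretend a newer version added this field */
};

/*
 * Copy ksize bytes of kernel-side metadata into a user buffer of usize
 * bytes, following the three cases described above.
 *
 * Returns 0 on success, or -E2BIG (assumed error code) if the user
 * buffer is too small to hold metadata that is actually non-zero.
 */
static int fill_user_encoded_iov(void *ubuf, size_t usize,
				 const void *kbuf, size_t ksize)
{
	const unsigned char *k = kbuf;
	size_t i;

	assert(ubuf && kbuf);

	if (ksize > usize) {
		/* Newer kernel: the tail may be dropped only if all zero. */
		for (i = usize; i < ksize; i++) {
			if (k[i])
				return -E2BIG;
		}
		memcpy(ubuf, kbuf, usize);
	} else {
		/* Older or same-size kernel: copy, then zero-fill the rest. */
		memcpy(ubuf, kbuf, ksize);
		memset((unsigned char *)ubuf + ksize, 0, usize - ksize);
	}
	return 0;
}
```

With a 24-byte sketch struct, a 16-byte user buffer succeeds as long as
new_field is zero and fails once it is set, while a 32-byte user buffer
always succeeds and gets its last 8 bytes zeroed.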