Re: [PATCH v8 00/10] fs: interface for directly reading/writing compressed data

Omar Sandoval <osandov@xxxxxxxxxxx> · Fri, 19 Mar 2021 14:11:56 -0700

On Fri, Mar 19, 2021 at 01:55:18PM -0700, Linus Torvalds wrote:
> On Fri, Mar 19, 2021 at 1:27 PM Omar Sandoval <osandov@xxxxxxxxxxx> wrote:
> >
> > For RWF_ENCODED, iov[0] is always used as the entirety of the struct. I
> > made the helper more generic to support other use cases, but if that's
> > the main objection I can easily make it specifically use iov[0].
> 
> Honestly, with new interfaces, I'd prefer to always start off as
> limited as possible.
> 
> And read/write is not very limited (but O_ALLOW_ENCODED and
> RWF_ENCODED at least helps with the "fool suid program to do it"). But
> at least we could make sure that the structure then has to be as
> strict as humanly possible.
> 
> So it's not so much a "main objection" as more about trying to make
> the rules stricter in the hope that that at least makes only one very
> particular way of doing things valid. I'd hate for user space to start
> 'streaming" struct data.

After spending a few minutes trying to simplify copy_struct_from_iter(),
it's honestly easier to just use the iterate_all_kinds() craziness than
open coding it to only operate on iov[0]. But that's an implementation
detail, and we can trivially make the interface stricter:

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 395ca89e5d9b..41b6b0325d18 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -969,9 +969,9 @@ EXPORT_SYMBOL(copy_page_from_iter);
  * On success, the iterator is advanced @usize bytes. On error, the iterator is
  * not advanced.
  */
-int copy_struct_from_iter(void *dst, size_t ksize, struct iov_iter *i,
-			  size_t usize)
+int copy_struct_from_iter(void *dst, size_t ksize, struct iov_iter *i)
 {
+	size_t usize = iov_iter_single_seg_count(i);
 	if (usize <= ksize) {
 		if (!copy_from_iter_full(dst, usize, i))
 			return -EFAULT;

I.e., the size of the userspace structure is always the remaining size
of the current segment. Maybe we can even throw in a check that we're
either at the beginning of the current segment or the very beginning of
the whole iter, what do you think?

(Again, this is what RWF_ENCODED already does, it was just easier to
write copy_struct_from_iter() more generically).

> > > Also I see references to the man-page, but honestly, that's not how
> > > the kernel UAPI should be defined ("just read the man-page"), plus I
> > > refuse to live in the 70's, and consider troff to be an atrocious
> > > format.
> >
> > No disagreement here, troff is horrible to read.
> >
> > > So make the UAPI explanation for this horror be in a legible format
> > > that is actually part of the kernel so that I can read what the intent
> > > is, instead of having to decode hieroglypics.
> >
> > I didn't want to document the UAPI in two places that would need to be
> > kept in sync
> 
> Honestly, I would suggest that nobody ever write troff format stuff.
> I don't think it supports UTF-8 properly, for example, which means
> that you can't even give credit to people properly etc.
> 
> RST isn't perfect, but at least it's human-legible, and it's obviously
> what we're encouraging for kernel use. You can use rst2man to convert
> to bad formats (and yes, you might lose something in the translation,
> eg proper names etc).

It looks like there's precedent for man pages being generated from the
kernel documentation [1], so I'll try that.

1: https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=53666f6c30451cde022f65d35a8d448f5a7132ba

> Almost anything else(*) is better than troff. But at least I can read
> the formatted version.
> 
>           Linus
> 
> (*) With the possible exception of "info" files. Now *there* is a
> truly pointless format maximally designed to make it inconvenient for
> users.