Re: linux-next test error

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Thu, 6 Sep 2018 09:00:51 -0700

On Thu, Sep 06, 2018 at 09:12:12AM -0400, Theodore Y. Ts'o wrote:
> So I don't see the point of changing the return value of
> block_page_mkwrite() (although to be honest I haven't seen the value
> of the vm_fault_t change at all in the first place, at least not
> compared to the pain it has caused), but no, I don't think it's worth it.

You have a sampling bias though; you've only seen the filesystem patches.
Filesystem fault handlers are generally more complex and written by
people who have more Linux expertise.  For the device drivers, it's
been far more useful; bugs have been fixed and a lot of cargo-culted
code has been deleted.

> So what we do for functions that need to either return an error or a
> pointer is to encode the error as a "pointer" by using ERR_PTR(),
> and the caller can determine whether or not it is a valid pointer or
> an error code by using IS_ERR_VALUE() and turning it back into an
> error by using PTR_ERR().   See include/linux/err.h.

That's _usually_ the convention when a function might return a pointer
or an error.  Sometimes we return NULL to mean "an error happened".
Sometimes that NULL means -ENOMEM.  Sometimes we return ZERO_SIZE_PTR
instead of -EINVAL.  Sometimes we return a POISON value.  It's all pretty
ad-hoc, which wouldn't be as bad if it were better documented.
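
For readers who haven't met the usual convention, here is a minimal
userspace sketch of it.  The three helpers are simplified stand-ins for
the ones in include/linux/err.h, not the kernel's actual definitions, and
struct foo and foo_create() are hypothetical names made up for this
example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins for the helpers in include/linux/err.h. */
#define MAX_ERRNO 4095

/* Encode a negative errno as a pointer in the top 4095 addresses. */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

/* Recover the negative errno from an encoded pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

/* True if the pointer lies in the range reserved for encoded errnos. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical object type, only here to give the example a payload. */
struct foo {
	int value;
};

/* Hypothetical constructor: a valid pointer on success, ERR_PTR() on failure. */
static struct foo *foo_create(int value)
{
	struct foo *f;

	if (value < 0)
		return ERR_PTR(-EINVAL);

	f = malloc(sizeof(*f));
	if (!f)
		return ERR_PTR(-ENOMEM);

	f->value = value;
	return f;
}
```

Note that IS_ERR(NULL) is false in this scheme, which is one reason the
"NULL means an error happened" variants are easy to mishandle.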

> Similarly, all valid vm_fault_t's composed of VM_FAULT_xxx are
> positive integers, and all errors are passed using the kernel's
> convention of using a negative error code.  So going through lots of
> machinations to return both an error code and a vm_fault_t *really*
> wasn't necessary.

Not necessary from the point of view that there are enough bits to be able
to distinguish the two, I agree.  But from the mm point of view, it rather
does matter that you can distinguish between SIGBUS, SIGSEGV, HWPOISON
and OOM (although -ENOMEM and VM_FAULT_OOM do have the same meaning).
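
To illustrate, here is a rough userspace sketch of what converting an
errno into a fault code looks like.  It is loosely modelled on the
kernel's vmf_error() helper, but errno_to_fault() is a hypothetical name
and the bit values are illustrative rather than copied from a header.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative type and bit values; assumptions for this sketch only. */
typedef unsigned int vm_fault_t;

#define VM_FAULT_OOM      0x01u
#define VM_FAULT_SIGBUS   0x02u
#define VM_FAULT_HWPOISON 0x10u
#define VM_FAULT_SIGSEGV  0x40u

/*
 * Hypothetical helper: map a negative errno onto a fault code.
 * Only -ENOMEM has an obvious counterpart; every other errno
 * collapses into VM_FAULT_SIGBUS.
 */
static inline vm_fault_t errno_to_fault(int err)
{
	if (err == -ENOMEM)
		return VM_FAULT_OOM;
	return VM_FAULT_SIGBUS;
}
```

In this sketch, nothing passed in as an errno can come out as
VM_FAULT_SIGSEGV or VM_FAULT_HWPOISON; a handler that wants one of those
has to say so explicitly.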

> The issue, as near as I can understand things, for why we're going
> through all of this churn, was there was a concern that in the mm
> code, that all of the places which received a vm_fault_t would
> sometimes see a negative error code.  The proposal here is to just
> *accept* that this will happen, and just simply have them *check* to
> see if it's a negative error code, and convert it to the appropriate
> vm_fault_t in that case.  It puts the onus of the change on the mm
> layer, whereas the "blast radius" of the vm_fault_t "cleanup" is
> spread out across a large number of subsystems.
> 
> Which I wouldn't mind, if it wasn't causing pain.  But it *is* causing
> pain.

As I said earlier, your sampling bias shows you only the pain, but there
are genuine improvements in the patches you haven't seen and don't care
about.

> And it's common kernel convention to overload an error and a pointer
> using the exact same trick.  We do it *all* over the place, and quite
> frankly, it's less error prone than changing functions to return a
> pointer and an error.  No one has said, "let's do to the ERR_PTR
> convention what we've done to the vm_fault_t -- it's too confusing
> that a pointer might be an error, since people might forget to check
> for it."  If they did that, it would be NACK'ed right, left and
> center.  But yet it's a good idea for vm_fault_t?

I actually think it would be a good idea to mark functions which return
either-an-errno-or-a-pointer as returning an errptr_t.  The downside is
that we'd lose the type information (we'd only know that it's a void *
or an errno, not that it's a struct ext4_foo * or an errno).  It could be
introduced gradually, just as we introduced 'bool' in place of 'int' for
functions which only return true/false.
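
A rough sketch of what that might look like, to make the trade-off
concrete.  Nothing here exists in the kernel: errptr_t, errptr_is_err(),
lookup_foo() and the fields of struct ext4_foo are all hypothetical, and
this is plain userspace C.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical annotation type: "a pointer or an encoded errno". */
typedef void *errptr_t;

/* Placeholder structure; the real type would carry actual data. */
struct ext4_foo {
	int id;
};

static struct ext4_foo the_foo = { .id = 7 };

/* Hypothetical lookup: returns the object or an encoded -ENOENT. */
static errptr_t lookup_foo(int id)
{
	if (id != the_foo.id)
		return (errptr_t)(intptr_t)-ENOENT;
	return &the_foo;
}

/* True if the value lies in the range reserved for encoded errnos. */
static int errptr_is_err(errptr_t p)
{
	return (uintptr_t)p >= (uintptr_t)-4095;
}
```

The return type now tells the reader that an error check is needed, but
the caller has to know from somewhere else that the pointer is really a
struct ext4_foo *; the compiler will happily let it be assigned to any
other pointer type.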