Re: [PATCH v2 0/2] Replace and improve "mcsafe" with copy_safe()

Dan Williams <dan.j.williams@xxxxxxxxx> · Thu, 30 Apr 2020 18:21:45 -0700

On Thu, Apr 30, 2020 at 5:10 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, Apr 30, 2020 at 4:52 PM Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> >
> > You had me until here. Up to this point I was grokking that Andy's
> > "_fallible" suggestion does help explain better than "_safe", because
> > the copy is doing extra safety checks. copy_to_user() and
> > copy_to_user_fallible() mean *something* where copy_to_user_safe()
> > does not.
>
> It's a horrible word, btw. The word doesn't actually mean what Andy
> means it to mean. "fallible" means "can make mistakes", not "can
> fault".
>
> So "fallible" is a horrible name.
>
> But anyway, I don't hate something like "copy_to_user_fallible()"
> conceptually. The naming needs to be fixed, in that "user" can always
> take a fault, so it's the _source_ that can fault, not the "user"
> part.
>
> It was the "copy_safe()" model that I find unacceptable, that uses
> _one_ name for what is at the very least *four* different operations:
>
>  - copy from faulting memory to user
>
>  - copy from faulting memory to kernel
>
>  - copy from kernel to faulting memory
>
>  - copy within faulting memory
>
> No way can you do that with one single function. A kernel address and
> a user address may literally have the exact same bit representation.
> So the user vs kernel distinction _has_ to be in the name.
>
> The "kernel vs faulting" doesn't necessarily have to be there from an
> implementation standpoint, but it *should* be there, because
>
>  - it might affect implemmentation
>
>  - but even if it DOESN'T affect implementation, it should be separate
> just from the standpoint of being self-documenting code.
>
> > However you lose me on this "broken nvdimm semantics" contention.
> > There is nothing nvdimm-hardware specific about the copy_safe()
> > implementation, zero, nada, nothing new to the error model that DRAM
> > did not also inflict on the Linux implementation.
>
> Ok, so good. Let's kill this all, and just use memcpy(), and copy_to_user().
>
> Just make sure that the nvdimm code doesn't use invalid kernel
> addresses or other broken poisoning.
>
> Problem solved.
>
> You can't have it both ways. Either memcpy just works, or it doesn't.

It doesn't, but copy_to_user() is frustratingly close and you can see
in the patch that I went ahead and used copy_user_generic() to
implement the backend of the default "fast" implementation.

However now I see that copy_user_generic() works for the wrong reason.
It works because the exception on the source address due to poison
looks no different than a write fault on the user address to the
caller, it's still just a short copy. So it makes copy_to_user() work
for the wrong reason relative to the name.

How about, following your suggestion, introduce copy_mc_to_user() (can
just use copy_user_generic() internally) and copy_mc_to_kernel() for
the other the helpers that the copy_to_iter() implementation needs?
That makes it clear that no mmu-faults are expected on reads, only
exceptions, and no protection-faults are expected at all for
copy_mc_to_kernel() even if it happens to accidentally handle it.
Following Jann's ex_handler_uaccess() example I could arrange for
copy_mc_to_kernel() to use a new _ASM_EXTABLE_MC() to validate that
the only type of exception meant to be handled is MC and warn
otherwise?