Re: [PATCH -next v4 4/7] arm64: add copy_{to, from}_user to machine check safe

Catalin Marinas <catalin.marinas@xxxxxxx> · Thu, 5 May 2022 14:41:48 +0100

On Thu, May 05, 2022 at 02:39:43PM +0800, Tong Tiangen wrote:
> On 2022/5/4 18:26, Catalin Marinas wrote:
> > On Wed, Apr 20, 2022 at 03:04:15AM +0000, Tong Tiangen wrote:
> > > Make copy_{to, from}_user() machine check safe.
> > > 
> > > If the copy fails due to a hardware memory error, only the relevant
> > > processes are affected, so killing the user process and isolating the
> > > user page with hardware memory errors is a more reasonable choice than
> > > a kernel panic.
> > 
> > Just to make sure I understand - we can only recover if the fault is in
> > a user page. That is, for a copy_from_user(), we can only handle the
> > faults in the source address, not the destination.
> 
> At first, I also thought we could only recover if the fault is in a
> user page.
> After discussing this with Mark [1], I think that whether it is a user page
> or a kernel page, as long as the error is triggered by the user process,
> only the related processes will be affected. By this understanding, it
> seems that all uaccess can be recovered.
> 
> [1] https://patchwork.kernel.org/project/linux-arm-kernel/patch/20220406091311.3354723-6-tongtiangen@xxxxxxxxxx/

We can indeed safely skip this copy and return an error, just as if
there had been a user page fault. However, my point was more about
the "isolate the user page with hardware memory errors" part. If the
fault is on a kernel address, there's not much you can do about it. You'll
likely trigger it later when you try to access that address (maybe it
was freed and re-allocated). Do we hope we won't get the same error
again on that kernel address?
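
To make the distinction concrete, here is a rough user-space model of the
two cases. It is only a sketch and not kernel code: model_copy_from_user(),
enum mc_fault_side and struct mc_result are made-up names for illustration,
and the fault is injected by the caller rather than reported by hardware.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: which side of the copy consumed the memory error. */
enum mc_fault_side {
	MC_NONE,        /* no hardware memory error */
	MC_USER_SRC,    /* error on the user (source) page */
	MC_KERNEL_DST,  /* error on the kernel (destination) address */
};

/* Hypothetical: outcome of the modelled copy. */
struct mc_result {
	long ret;          /* 0 on success, -EFAULT when the copy is skipped */
	bool can_isolate;  /* true only if the bad page is a user page */
};

/*
 * Model of a machine-check-safe copy_from_user(). In both error cases
 * the copy is skipped and an error is returned, but only the user-page
 * case leaves us with a page that could be isolated afterwards.
 */
static struct mc_result model_copy_from_user(void *dst, const void *src,
					      size_t n,
					      enum mc_fault_side fault)
{
	struct mc_result r = { 0, false };

	if (fault == MC_NONE) {
		memcpy(dst, src, n);
		return r;
	}

	/* Skip the copy, as if a user page fault had occurred. */
	r.ret = -EFAULT;

	/* A kernel address cannot be isolated the way a user page can. */
	r.can_isolate = (fault == MC_USER_SRC);
	return r;
}
```

Both error cases return -EFAULT to the caller; the difference is only in
can_isolate, which is false for MC_KERNEL_DST. That is the case my question
is about, since the same kernel address may well be touched again later.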

-- 
Catalin