On Wed, 12 Apr 2023 11:13:50 -0700 Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:

> On 04/11/23 17:27, Liu Shixin wrote:
> > Patch a873dfe1032a ("mm, hwpoison: try to recover from copy-on write faults")
> > introduced a new copy_mc_user_highpage() function, and fixed the kernel crash
> > when the kernel is copying a normal page as the result of a copy-on-write
> > fault and runs into an uncorrectable error. But it doesn't work for HugeTLB.
>
> Andrew asked about user-visible effects. Perhaps a better way of
> stating this in the commit message might be:
>
> Commit a873dfe1032a ("mm, hwpoison: try to recover from copy-on write
> faults") introduced the routine copy_mc_user_highpage() to gracefully
> handle copying of user pages with uncorrectable errors. Previously,
> such copies would result in a kernel crash. hugetlb has separate code
> paths for copy-on-write and does not benefit from the changes made in
> commit a873dfe1032a.
>
> Modify hugetlb copy-on-write code paths to use copy_mc_user_highpage()
> so that they can also gracefully handle uncorrectable errors in user
> pages. This involves changing the hugetlb-specific routine
> copy_user_folio() from type void to int so that it can return an error.
> Modify the hugetlb userfaultfd code in the same way so that it can return
> -EHWPOISON if it encounters an uncorrectable error.

Thanks, but... what are the runtime effects? What does hugetlb
presently do when encountering these uncorrectable errors?