On 2023/4/13 5:57, Andrew Morton wrote:
> On Wed, 12 Apr 2023 11:13:50 -0700 Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
>
>> On 04/11/23 17:27, Liu Shixin wrote:
>>> Patch a873dfe1032a ("mm, hwpoison: try to recover from copy-on write faults")
>>> introduced a new copy_user_highpage_mc() function, and fix the kernel crash
>>> when the kernel is copying a normal page as the result of a copy-on-write
>>> fault and runs into an uncorrectable error. But it doesn't work for HugeTLB.
>> Andrew asked about user-visible effects. Perhaps, a better way of
>> stating this in the commit message might be:
>>
>> Commit a873dfe1032a ("mm, hwpoison: try to recover from copy-on write
>> faults") introduced the routine copy_user_highpage_mc() to gracefully
>> handle copying of user pages with uncorrectable errors. Previously,
>> such copies would result in a kernel crash. hugetlb has separate code
>> paths for copy-on-write and does not benefit from the changes made in
>> commit a873dfe1032a.
>>
>> Modify hugetlb copy-on-write code paths to use copy_mc_user_highpage()
>> so that they can also gracefully handle uncorrectable errors in user
>> pages. This involves changing the hugetlb specific routine
>> copy_user_folio() from type void to int so that it can return an error.
>> Modify the hugetlb userfaultfd code in the same way so that it can return
>> -EHWPOISON if it encounters an uncorrectable error.
> Thanks, but... what are the runtime effects? What does hugetlb
> presently do when encountering these uncorrectable error?

I have tested the HugeTLB case using Tony's test case [1] (with MAP_HUGETLB
added). Before this patch, the kernel crashes on the uncorrectable error.
After this patch, if the error occurs during copy-on-write, the process is
killed; if the error occurs in userfaultfd, -EHWPOISON is returned.

Link: https://git.kernel.org/pub/scm/linux/kernel/git/aegl/ras-tools.git [1]
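
For readers following along, the void-to-int change described in the
proposed commit message can be illustrated with a userspace-only sketch.
This is not the kernel code: copy_subpage_checked(), copy_region_checked(),
SUBPAGE_SIZE and the poisoned[] flag array are hypothetical names made up
for this example, and an uncorrectable error is simulated with a flag rather
than a real machine check. It only shows the control flow: the copy loop
stops at the first bad subpage and hands -EHWPOISON back to the caller
instead of having no way to report failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#ifndef EHWPOISON
#define EHWPOISON 133 /* Linux value; fallback only if the libc headers lack it */
#endif

#define SUBPAGE_SIZE 4096 /* assumption: 4 KiB base pages */

/*
 * Hypothetical stand-in for a machine-check-safe copy of one subpage.
 * Returns nonzero when the source subpage is flagged as poisoned,
 * otherwise copies it and returns 0.
 */
static int copy_subpage_checked(unsigned char *dst, const unsigned char *src,
                                const unsigned char *poisoned, size_t idx)
{
    if (poisoned && poisoned[idx])
        return -1;
    memcpy(dst, src, SUBPAGE_SIZE);
    return 0;
}

/*
 * Copy nr_subpages subpages from src to dst. Because the return type is
 * int rather than void, the caller learns about a failed copy and can
 * choose how to react.
 */
static int copy_region_checked(unsigned char *dst, const unsigned char *src,
                               size_t nr_subpages,
                               const unsigned char *poisoned)
{
    assert(dst != NULL && src != NULL);

    for (size_t i = 0; i < nr_subpages; i++) {
        if (copy_subpage_checked(dst + i * SUBPAGE_SIZE,
                                 src + i * SUBPAGE_SIZE, poisoned, i))
            return -EHWPOISON; /* stop at the first bad subpage */
    }
    return 0;
}
```

A caller checks the return value of copy_region_checked() and takes its own
recovery action on -EHWPOISON. In the patch under discussion, the
corresponding outcomes are killing the faulting process in the copy-on-write
case and returning -EHWPOISON in the userfaultfd case.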