On 2023/4/13 6:21, Mike Kravetz wrote:
> On 04/12/23 14:57, Andrew Morton wrote:
>> On Wed, 12 Apr 2023 11:13:50 -0700 Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
>>
>>> On 04/11/23 17:27, Liu Shixin wrote:
>>>> Patch a873dfe1032a ("mm, hwpoison: try to recover from copy-on write faults")
>>>> introduced a new copy_user_highpage_mc() function, and fixed the kernel crash
>>>> when the kernel is copying a normal page as the result of a copy-on-write
>>>> fault and runs into an uncorrectable error. But it doesn't work for HugeTLB.
>>> Andrew asked about user-visible effects. Perhaps, a better way of
>>> stating this in the commit message might be:
>>>
>>> Commit a873dfe1032a ("mm, hwpoison: try to recover from copy-on write
>>> faults") introduced the routine copy_user_highpage_mc() to gracefully
>>> handle copying of user pages with uncorrectable errors. Previously,
>>> such copies would result in a kernel crash. hugetlb has separate code
>>> paths for copy-on-write and does not benefit from the changes made in
>>> commit a873dfe1032a.
> I was just going to suggest adding the line,
>
> Hence, copy-on-write of hugetlb user pages with uncorrectable errors
> will result in a kernel crash as was the case with 'normal' pages before
> commit a873dfe1032a.
>
> However, I'm guessing it might be more clear if we start with the
> runtime effects. Something like:
>
> copy-on-write of hugetlb user pages with uncorrectable errors will result
> in a kernel crash. This is because the copy is performed in kernel mode
> and in general we can not handle accessing memory with such errors while
> in kernel mode. Commit a873dfe1032a ("mm, hwpoison: try to recover from
> copy-on write faults") introduced the routine copy_user_highpage_mc() to
> gracefully handle copying of user pages with uncorrectable errors. However,
> the separate hugetlb copy-on-write code paths were not modified as part
> of commit a873dfe1032a.

Thanks for your advice, I will add this explanation.
>
>>> Modify hugetlb copy-on-write code paths to use copy_mc_user_highpage()
>>> so that they can also gracefully handle uncorrectable errors in user
>>> pages. This involves changing the hugetlb specific routine
>>> copy_user_folio() from type void to int so that it can return an error.
>>> Modify the hugetlb userfaultfd code in the same way so that it can return
>>> -EHWPOISON if it encounters an uncorrectable error.
>> Thanks, but... what are the runtime effects? What does hugetlb
>> presently do when encountering these uncorrectable errors?
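
The void-to-int change described in the quoted commit message can be illustrated with a minimal userspace sketch. This is not the patch and not kernel code: struct toy_page, toy_copy_mc_page() and toy_copy_folio() are hypothetical names invented for this illustration, and the "poisoned" flag merely simulates an uncorrectable memory error. It only shows the shape of the idea, assuming a per-subpage copy that can fail: the folio-level copy returns the first error so the caller can bail out with -EHWPOISON instead of crashing.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#ifndef EHWPOISON
#define EHWPOISON 133 /* fallback only; Linux <errno.h> normally defines it */
#endif

#define TOY_SUBPAGE_SIZE 64 /* toy size, not a real page size */

/* Hypothetical stand-in for one subpage of a huge page. */
struct toy_page {
    unsigned char data[TOY_SUBPAGE_SIZE];
    bool poisoned; /* simulates an uncorrectable memory error */
};

/*
 * Toy analogue of a machine-check-safe page copy: returns 0 on success,
 * or -EHWPOISON when the source subpage is marked poisoned.
 */
int toy_copy_mc_page(struct toy_page *dst, const struct toy_page *src)
{
    if (src->poisoned)
        return -EHWPOISON;
    memcpy(dst->data, src->data, TOY_SUBPAGE_SIZE);
    return 0;
}

/*
 * Toy analogue of a folio copy that returns int rather than void, so the
 * first failing subpage copy is reported to the caller.
 */
int toy_copy_folio(struct toy_page *dst, const struct toy_page *src,
                   size_t nr_pages)
{
    assert(dst != NULL && src != NULL);

    for (size_t i = 0; i < nr_pages; i++) {
        int ret = toy_copy_mc_page(&dst[i], &src[i]);

        if (ret)
            return ret; /* stop at the first bad subpage */
    }
    return 0;
}
```

A caller in this sketch would check the return value of toy_copy_folio() and abandon the copy-on-write attempt on a nonzero result, which is the role the quoted text assigns to the int return type.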