On Tue, Feb 01, 2022 at 05:40:57PM +0000, Joao Martins wrote: > On 2/1/22 16:01, Joao Martins wrote: > > On 2/1/22 15:46, Matthew Wilcox wrote: > >> On Mon, Jan 31, 2022 at 08:54:39PM +0000, Joao Martins wrote: > >>> On 1/31/22 20:29, Matthew Wilcox wrote: > >>>> Unless I am mistaken, you have to pass the compound head of the page > >>>> which has the error to collect_procs(). Am I mistaken? > >>>> > >>> -rc2 already has a fix for it: > >>> > >>> https://lore.kernel.org/linux-mm/20220129021420.PgBIZm-q9%25akpm@xxxxxxxxxxxxxxxxxxxx/ > >>> > >>> Earlier in that function there's a: > >>> > >>> page = compound_head(page); > >>> > >>> So the @page passed to collect_procs() already is a head page. > >> > >> It's wrong though ;-( You set the HWPoison bit on the page after > >> calling compound_head(), so you set the bit on the head page instead > >> of the precise page that had the poison. > >> > > Considering that on device-dax we would unmap the whole 2M page regardless > > of the poisoned subpage isn't that actually representative still? > > > > To say this another way. We do set the HWPoison on the head page, > and not the subpage as you say, but we end up propagating the resultant > MCE action on the superset of pages in the whole PMD or PUD. > > What I was trying to say in perhaps a convoluted way is that device-dax > case isn't different than HugeTLB that only wants to poison head. If there's > a head page, there's likely a PMD or PUD populated (depending what the device > was onlined with) and thus that's what gets unmapped. There's no idea of > subpages being treated any differently, at least as far as device-dax is > concerned -- unless I miss auditing some other code path. Using HugeTLB as a model is not a good idea. THP is the model to aim for; one can choose to map pages askew (ie not aligned with a PMD), and I don't think you'll find all the mappings with the current code (eg if someone has mapped a single page of the hugepage). > fsdax IIUC seems to rely more on the subpage bit being flagged but no > functional change here for fsdax as there's only base pages there (no heads). > > >> I'm fixing this up as part of the folio patches, but you may wish to > >> fix it earlier than that. > (+Dan in case I misrepresented or missed something) > > Should we deem it a problem, I'll fix for the next -rc. > Just in case, here's the diff stashed: > > diff --git a/mm/memory-failure.c b/mm/memory-failure.c > index 2e2f740c63dc..661c23df8115 100644 > --- a/mm/memory-failure.c > +++ b/mm/memory-failure.c > @@ -1577,7 +1577,7 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags) > static int memory_failure_dev_pagemap(unsigned long pfn, int flags, > struct dev_pagemap *pgmap) > { > - struct page *page = pfn_to_page(pfn); > + struct page *page = pfn_to_page(pfn), *subpage = page; > unsigned long size = 0; > struct to_kill *tk; > LIST_HEAD(tokill); > @@ -1631,7 +1631,7 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags, > * Use this flag as an indication that the dax page has been > * remapped UC to prevent speculative consumption of poison. > */ > - SetPageHWPoison(page); > + SetPageHWPoison(subpage); > > /* > * Unlike System-RAM there is no possibility to swap in a