On 2/1/2022 2:20 PM, Dan Williams wrote:
> On Tue, Feb 1, 2022 at 2:05 PM Joao Martins <joao.m.martins@xxxxxxxxxx> wrote:
>>
>> On 2/1/22 21:11, Jane Chu wrote:
>>> On 2/1/2022 7:46 AM, Matthew Wilcox wrote:
>>>> On Mon, Jan 31, 2022 at 08:54:39PM +0000, Joao Martins wrote:
>>>>> On 1/31/22 20:29, Matthew Wilcox wrote:
>>>>>> Unless I am mistaken, you have to pass the compound head of the page
>>>>>> which has the error to collect_procs(). Am I mistaken?
>>>>>>
>>>>> -rc2 already has a fix for it:
>>>>>
>>>>> https://lore.kernel.org/linux-mm/20220129021420.PgBIZm-q9%25akpm@xxxxxxxxxxxxxxxxxxxx/
>>>>>
>>>>> Earlier in that function there's a:
>>>>>
>>>>>     page = compound_head(page);
>>>>>
>>>>> So the @page passed to collect_procs() already is a head page.
>>>>
>>>> It's wrong though ;-( You set the HWPoison bit on the page after
>>>> calling compound_head(), so you set the bit on the head page instead
>>>> of the precise page that had the poison.
>>>
>>> Indeed. The rest of the kernel including pmem driver still deal with
>>> base page on clearing poison, bookkeeping etc. So the HWpoison bit needs
>>> to be set precisely on the poisoned base page such that we pass the
>>> correct 'pfn' to set_mce_nospec() to discourage speculative access.
>>>
>> set_mce_nospec() machinery makes no use of the HWPoison bit as far as
>> my reading goes. And the PFN that is passed to set_mce_nospec() is already
>> the subpage PFN that eventually lands on set_memory_np()/set_memory_uc() when
>> it changes the kernel page tables mapping (which also don't use the poison bit).
>>
>> I still can't see how device-dax machinery makes use of that bit? At least
>> the one which could use it (clear_mce_nospec()) doesn't actually go through
>> device-dax nvdimm-specific code only fsdax which I reiterate that the patch
>> does not change as there's no compound head there. Am I missing something?
>
> device-dax does not use that bit, because there is no kernel mediated
> access to the backing range. In the fsdax case that bit is used to
> determine when to run clear_mce_nospec(). Jane is in the process of
> reworking this path.

Thanks Dan!

Sorry for confusing set_mce_nospec() with the PageHWpoison bit. Another
look at the code leads me to realize that, for devdax, the PageHWpoison
bit is perhaps never cleared during the lifetime of the page. That is
okay, since the kernel parts that deal with devdax pages do not care.

As for fsdax, fsdax uses base pages and Joao's code doesn't change that,
hence the PageHWpoison bit would be set precisely on the poisoned base
page.

thanks!
-jane
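
For readers following the thread, the head-page versus base-page distinction Matthew raises can be sketched with a small user-space toy model. This is not kernel code: `struct toy_page` and every `toy_*` helper below are hypothetical stand-ins invented for illustration, and the boolean `hwpoison` field only imitates the PageHWPoison flag. The sketch assumes a compound page is a contiguous run of entries that all record the index of their head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page; NOT the kernel's definition. */
struct toy_page {
    size_t head_index; /* index of this page's compound head */
    bool hwpoison;     /* imitates the PageHWPoison flag */
};

/* Hypothetical helper: mark pages[first .. first+nr-1] as one compound page. */
static void toy_init_compound(struct toy_page *pages, size_t first, size_t nr)
{
    assert(nr > 0);
    for (size_t i = 0; i < nr; i++) {
        pages[first + i].head_index = first;
        pages[first + i].hwpoison = false;
    }
}

/* Toy analogue of compound_head(): map any subpage to its head. */
static size_t toy_compound_head(const struct toy_page *pages, size_t idx)
{
    return pages[idx].head_index;
}

/*
 * The ordering objected to in the thread: resolve the head first,
 * then set the flag, so the flag lands on the head page.
 * Returns the head index (what a collect_procs()-like lookup would use).
 */
static size_t toy_poison_head(struct toy_page *pages, size_t idx)
{
    size_t head = toy_compound_head(pages, idx);
    pages[head].hwpoison = true;
    return head;
}

/*
 * The precise variant: flag the base page that actually had the error,
 * and still hand back the head index for the process lookup.
 */
static size_t toy_poison_precise(struct toy_page *pages, size_t idx)
{
    pages[idx].hwpoison = true;
    return toy_compound_head(pages, idx);
}
```

With an 8-entry compound page and an error on entry 5, `toy_poison_head()` flags entry 0 while `toy_poison_precise()` flags entry 5; both return 0 as the head. As the thread concludes, the difference matters only where something later reads the flag per base page (fsdax), not for device-dax.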