On 4/13/2022 7:32 PM, Dan Williams wrote: > On Wed, Apr 13, 2022 at 4:36 PM Jane Chu <jane.chu@xxxxxxxxxx> wrote: >> >> On 4/11/2022 4:27 PM, Dan Williams wrote: >>> On Tue, Apr 5, 2022 at 12:48 PM Jane Chu <jane.chu@xxxxxxxxxx> wrote: >>>> >>>> The set_memory_uc() approach doesn't work well in all cases. >>>> For example, when "The VMM unmapped the bad page from guest >>>> physical space and passed the machine check to the guest." >>>> "The guest gets virtual #MC on an access to that page. >>>> When the guest tries to do set_memory_uc() and instructs >>>> cpa_flush() to do clean caches that results in taking another >>>> fault / exception perhaps because the VMM unmapped the page >>>> from the guest." >>>> >>>> Since the driver has special knowledge to handle NP or UC, >>> >>> I think a patch is needed before this one to make this statement true? I.e.: >>> >>> diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c >>> index ee8d9973f60b..11641f55025a 100644 >>> --- a/drivers/acpi/nfit/mce.c >>> +++ b/drivers/acpi/nfit/mce.c >>> @@ -32,6 +32,7 @@ static int nfit_handle_mce(struct notifier_block >>> *nb, unsigned long val, >>> */ >>> mutex_lock(&acpi_desc_lock); >>> list_for_each_entry(acpi_desc, &acpi_descs, list) { >>> + unsigned int align = 1UL << MCI_MISC_ADDR_LSB(mce->misc); >>> struct device *dev = acpi_desc->dev; >>> int found_match = 0; >>> >>> @@ -63,8 +64,7 @@ static int nfit_handle_mce(struct notifier_block >>> *nb, unsigned long val, >>> >>> /* If this fails due to an -ENOMEM, there is little we can do */ >>> nvdimm_bus_add_badrange(acpi_desc->nvdimm_bus, >>> - ALIGN(mce->addr, L1_CACHE_BYTES), >>> - L1_CACHE_BYTES); >>> + ALIGN(mce->addr, align), align); >>> nvdimm_region_notify(nfit_spa->nd_region, >>> NVDIMM_REVALIDATE_POISON); >>> >> >> Dan, I tried the above change, and this is what I got after injecting 8 >> back-to-back poisons, then read them and received SIGBUS/BUS_MCEERR_AR, >> then repair via the v7 patch which works until this change is added. >> >> [ 6240.955331] nfit ACPI0012:00: XXX, align = 100 >> [ 6240.960300] nfit ACPI0012:00: XXX, ALIGN(mce->addr, >> L1_CACHE_BYTES)=1851600400, L1_CACHE_BYTES=40, ALIGN(mce->addr, >> align)=1851600400 >> [..] >> [ 6242.052277] nfit ACPI0012:00: XXX, align = 100 >> [ 6242.057243] nfit ACPI0012:00: XXX, ALIGN(mce->addr, >> L1_CACHE_BYTES)=1851601000, L1_CACHE_BYTES=40, ALIGN(mce->addr, >> align)=1851601000 >> [..] >> [ 6244.917198] nfit ACPI0012:00: XXX, align = 1000 >> [ 6244.922258] nfit ACPI0012:00: XXX, ALIGN(mce->addr, >> L1_CACHE_BYTES)=1851601200, L1_CACHE_BYTES=40, ALIGN(mce->addr, >> align)=1851602000 >> [..] >> >> All 8 poisons remain uncleared. >> >> Without further investigation, I don't know why the failure. >> Could we mark this change to a follow-on task? > > Perhaps a bit more debug before kicking this can down the road... > > I'm worried that this means that the driver is not accurately tracking > poison data For example, that last case the hardware is indicating a > full page clobber, but the old code would only track poison from > 1851601200 to 1851601400 (i.e. the first 512 bytes from the base error > address). I would appear so, but the old code tracking seems to be working correctly. For example, injecting 8 back-to-back poison, the /sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region0/badblocks cpatures all 8 of them, that spans 2K range, right? I never had issue about missing poison in my tests. > > Oh... wait, I think there is a second bug here, that ALIGN should be > ALIGN_DOWN(). Does this restore the result you expect? That's it, ALIGN_DOWN fixed the issue, thanks!! I'll add a patch, suggested-by you. > > diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c > index ee8d9973f60b..d7a52238a741 100644 > --- a/drivers/acpi/nfit/mce.c > +++ b/drivers/acpi/nfit/mce.c > @@ -63,8 +63,7 @@ static int nfit_handle_mce(struct notifier_block > *nb, unsigned long val, > > /* If this fails due to an -ENOMEM, there is little we can do */ > nvdimm_bus_add_badrange(acpi_desc->nvdimm_bus, > - ALIGN(mce->addr, L1_CACHE_BYTES), > - L1_CACHE_BYTES); > + ALIGN_DOWN(mce->addr, align), align); > nvdimm_region_notify(nfit_spa->nd_region, > NVDIMM_REVALIDATE_POISON); > > >> The driver knows a lot about how to clear poisons besides hardcoding >> poison alignment to 0x40 bytes. > > It does, but the badblocks tracking should still be reliable, and if > it's not reliable I expect there are cases where recovery_write() will > not be triggered because the driver will not fail the > dax_direct_access() attempt. thanks! -jane