> I think that one point not mentioned yet is how the in-kernel scanner finds
> a broken page before the page is marked by PG_hwpoison. Some mechanism
> similar to mcsafe-memcpy could be used, but maybe memcpy is not necessary
> because we just want to check the healthiness of pages. So a core routine
> like mcsafe-read would be introduced in the first patchset (or we already
> have it)?

I don't think that there is an existing routine to do the mcsafe-read. But it
should be easy enough to write one. If an architecture supports a way to do
this without evicting other data from caches, that would be a bonus.

X86 has a non-temporal read that could be interesting ... but I'm not sure
that it would detect poison synchronously. I could be wrong, but I expect
that you won't see a machine check, but you should see the memory controller
log a UCNA error reported by a CMCI.

-Tony