Hi naoya, tony:

> > Idea for what we should do next ... Now that x86 is calling memory_failure()
> > from user context ... maybe parallel calls for the same page should
> > be blocked until the first caller completes so we can:
> > a) know that pages are unmapped (if that happens)
> > b) all get the same success/fail status
>
> One memory_failure() call changes the target page's status and
> affects all mappings to all affected processes, so I think that
> (ideally) we don't have to block other threads (letting them
> early return seems fine). Sometimes memory_failure() fails,
> but even in such case, PG_hwpoison is set on the page and other
> threads properly get SIGBUSs with this patch, so I think that
> we can avoid the worst scenario (like system stall by MCE loop).

I agree with naoya's point. If we do block for this issue, does that
change the result that the process should be killed? Or is there
something else that still needs to be considered?

Thanks!
Aili Yao