On 5/8/2024 5:06 AM, Oscar Salvador wrote:
On Tue, May 07, 2024 at 10:54:10AM -0700, Jane Chu wrote:
I actually managed to hit the re-access case with an older version of Linux:
an MCE occurred but unmap failed, no SIGBUS was delivered, and the test process
re-accessed the same address over and over (hence MCE after MCE), as the CPU
was unable to make forward progress. In reality, this issue is fixed with
kill_accessing_process(). The comment for this patch refers to comment
made
So we get a faulty page and we try to unmap it from all processes that
might have it mapped in their pgtables.
Prior to this patch we would kill the processes right away and now we
deliver a SIGBUS.
Seems safe, as upon re-access kill_accessing_process() will be called
for already hwpoisoned pages.
I think the changelog could be made more explicit about this scenario
and state the role of kill_accessing_process() more clearly.
With that: Reviewed-by: Oscar Salvador <osalvador@xxxxxxx>
I will revise the changelog and mention kill_accessing_process().
Thanks!
-jane