Re: [RFC] Make the memory failure blast radius more precise

On 6/23/2020 3:40 PM, Matthew Wilcox wrote:
On Tue, Jun 23, 2020 at 03:26:58PM -0700, Luck, Tony wrote:
On Tue, Jun 23, 2020 at 11:17:41PM +0100, Matthew Wilcox wrote:
It might also be nice to have an madvise() MADV_ZERO option so the
application doesn't have to look up the fd associated with that memory
range, but we haven't floated that idea with the customer yet; I just
thought of it now.

So the conversation between application and kernel goes like this?

1) machine check
2) Kernel unmaps the 4K page surrounding the poison and sends
    SIGBUS to the application to say that one cache line is gone
3) App says madvise(MADV_ZERO, that cache line)
4) Kernel says ... "oh, you know how to deal with this" and allocates
    a new page, copying the 63 good cache lines from the old page and
    zeroing the missing one. New page is mapped to user.
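
As a rough user-space model of step 4 above (not kernel code), the sketch below rebuilds a 4K page from 64 cache lines of 64 bytes each, copying the 63 good lines and zero-filling the lost one without ever reading it. The function name and constants are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES     4096u
#define LINE_BYTES     64u
#define LINES_PER_PAGE (PAGE_BYTES / LINE_BYTES)   /* 64 lines per 4K page */

/*
 * Illustrative model only: fill new_page from src_page one cache line at a
 * time. The line at index bad_line is never read from src_page (reading a
 * poisoned line is what raises the machine check); it is zero-filled instead.
 * Returns 0 on success, -1 if bad_line is out of range.
 */
int rebuild_page_skipping_line(unsigned char *new_page,
                               const unsigned char *src_page,
                               size_t bad_line)
{
    if (bad_line >= LINES_PER_PAGE)
        return -1;

    for (size_t i = 0; i < LINES_PER_PAGE; i++) {
        unsigned char *dst = new_page + i * LINE_BYTES;

        if (i == bad_line)
            memset(dst, 0, LINE_BYTES);                          /* the lost line */
        else
            memcpy(dst, src_page + i * LINE_BYTES, LINE_BYTES);  /* a good line */
    }
    return 0;
}
```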

That could be one way of implementing it.  My understanding is that
pmem devices will reallocate bad cachelines on writes, so a better
implementation would be:

1) Kernel receives machine check
2) Kernel sends SIGBUS to the application
3) App sends madvise(MADV_ZERO, addr, 1 << granularity)
4) Kernel does special writes to ensure the cacheline is zeroed
5) App does whatever it needs to recover (reconstructs the data or marks
it as gone)
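
From the application side, steps 2 and 3 above might look roughly like the sketch below. It assumes Linux with glibc, where a memory-failure SIGBUS carries si_code BUS_MCEERR_AR or BUS_MCEERR_AO and reports the damaged granularity in si_addr_lsb. MADV_ZERO is only an idea floated in this thread and does not exist in mainline; the value defined here is a placeholder, so on a current kernel the madvise() call is expected to fail and the handler aborts. The helper names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/* HYPOTHETICAL: proposed in this thread only, not a real advice value. */
#ifndef MADV_ZERO
#define MADV_ZERO 0x7fff
#endif

struct poison_range {
    uintptr_t start;   /* address rounded down to the reported granularity */
    size_t    len;     /* 1 << granularity, e.g. 64 for one cache line */
};

/* Turn (si_addr, si_addr_lsb) into the aligned range that was lost. */
struct poison_range poison_range_from(uintptr_t addr, unsigned lsb)
{
    struct poison_range r;

    r.len   = (size_t)1 << lsb;
    r.start = addr & ~((uintptr_t)r.len - 1);
    return r;
}

/* Step 2 arrives here; step 3 is the madvise() call. */
void sigbus_handler(int sig, siginfo_t *info, void *ucontext)
{
    (void)sig;
    (void)ucontext;

    if (info->si_code != BUS_MCEERR_AR && info->si_code != BUS_MCEERR_AO) {
        /* Not a memory failure: restore the default action and return. */
        signal(SIGBUS, SIG_DFL);
        return;
    }

    struct poison_range r = poison_range_from((uintptr_t)info->si_addr,
                                              (unsigned)info->si_addr_lsb);

    /* Assumption: a real MADV_ZERO would accept sub-page ranges. */
    if (madvise((void *)r.start, r.len, MADV_ZERO) != 0)
        abort();

    /* Step 5 (reconstruct the data or mark it as gone) is up to the app. */
}

/* Install the handler with SA_SIGINFO so si_code and si_addr are delivered. */
int install_sigbus_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = sigbus_handler;
    sa.sa_flags     = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}
```

For example, poison_range_from(0x1234, 6) yields start 0x1200 and len 64, i.e. the single cache line containing the reported address.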

Thanks Matthew!

Both the RFC patch and the 5-step recovery plan above look neat. Step 4) is nice to carry forward on Ice Lake, when a single instruction to clear
poison is available.

Next, what are the preferred ways to deal with the signal handling race when multiple processes are sharing the poisoned pmem page?

Also, is it advisable for an application to ignore SIGBUS with MCEERR_AO?


Do you have folks lined up to use that?  I don't know that many
folks are even catching the SIGBUS :-(

Had a 75 minute meeting with some people who want to use pmem this
afternoon ...

thanks!
-jane



