> So who actually cares about recovering poisoned volatile memory?
> I'd like to understand more on how significant a use case this is.
> Whilst I can conjecture that it's an extreme case of wanting to avoid
> losing the ability to create 1GiB or larger pages due to poison
> is that a real problem for anyone today? Note this is just the case
> where you've reached an actual uncorrectable error and probably
> / possibly killed something, not the more common soft offlining
> of memory due to correctable errors being detected.

I guess you really need a reply from someone with a data center with
thousands of machines, since that's where this question may be
important.

My humble opinion is that, outside of the huge page issue, nobody
should try to recover a poisoned page.

Systems that can report and recover from poison have tens, hundreds,
or more GBytes of memory. Dropping 4K pages will not have any
measurable impact on a system (even if there are hundreds of pages
dropped).

There's no reliable way to determine whether the poisoned page was due
to some transient issue or a permanent defect. Recovering a poisoned
page runs the risk that the poison will re-occur. Perhaps the next use
of the page will be in some unrecoverable (kernel) context.

So recovery has some risk, but very little upside.

-Tony