On Fri, Dec 21, 2018 at 06:52:20PM +0000, James Morse wrote:
> Do we need to ghes_ack_error() too?

That's GHES v2 AFAICT.

> With the location cleared the new kernel will never find the records, and
> firmware can never re-use that location because it wasn't ack'd. The upshot is
> RAS records can't be generated for the kdump kernel. The acpi spec talks about
> use of the memory, so I don't think its fair for it to use this to disarm a
> watchdog.
>
> I think we can live with this as the kdump kernel isn't going to handle RAS
> errors for the bulk of memory anyway.

Usually, handling hw errors is always better than not but the second
kernel can't do anything better in that respect than the first, right?
If it panics, it panics - no matter the kernel. Generally.

Therefore I think the role of the second kernel should be to be as
resilient as possible to hw errors - like, not even see them :-) - dump
the memory of the first kernel as quickly as possible and reboot for
analysis.

IMHO, of course.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.