On Wed, Aug 30, 2017 at 11:31:06AM -0400, Sinan Kaya wrote:
> I see. We should probably try to do something only if GHES_SEV_CORRECTED or
> GHES_SEV_RECOVERABLE.
>
> If somebody wants to crash the system with GHES_SEV_PANIC, there is no point
> in doing additional work.

Makes sense. Whatever we do, I'd like to have this all nicely documented
*why* we're doing the recovery policy we're doing.

> Sounds good. Do you still want to do PCIe recovery in the case of
> GHES_SEV_PANIC or if some FW returns GHES_SEV_NO?

So I read GHES_SEV_PANIC as: we should panic and stop any processing
whatsoever ASAP in order to avoid further error propagation. So doing
recovery there might *actually* be a bad idea.

GHES_SEV_NO would map to AER_CORRECTABLE and I think that would mean, print
the error to let the user know but no need to recover because no harm was
done.

I *think*.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
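
For reference, the policy discussed in this thread could be sketched roughly
as below. This is only an illustration, not a patch: the enum values are local
stand-ins so it builds outside the kernel tree, and both the
pcie_err_action type and the ghes_pcie_err_action() helper are hypothetical
names made up for this sketch. Treating an unknown severity like
GHES_SEV_PANIC is also an assumption, not something settled above.

```c
/*
 * Local stand-ins for the kernel's GHES_SEV_* constants so that this
 * sketch compiles on its own; real code would use the kernel header.
 */
enum ghes_sev {
	GHES_SEV_NO,
	GHES_SEV_CORRECTED,
	GHES_SEV_RECOVERABLE,
	GHES_SEV_PANIC,
};

/* Hypothetical policy outcomes; not an existing kernel type. */
enum pcie_err_action {
	PCIE_ERR_LOG_ONLY,	/* print the error, no recovery needed */
	PCIE_ERR_TRY_RECOVERY,	/* attempt PCIe error recovery */
	PCIE_ERR_NO_RECOVERY,	/* about to panic, do no extra work */
};

/* Hypothetical helper: map a GHES severity to the action discussed. */
static enum pcie_err_action ghes_pcie_err_action(enum ghes_sev sev)
{
	switch (sev) {
	case GHES_SEV_NO:
		/* Would map to AER_CORRECTABLE: no harm done, just report. */
		return PCIE_ERR_LOG_ONLY;
	case GHES_SEV_CORRECTED:
	case GHES_SEV_RECOVERABLE:
		/* The only severities where doing something makes sense. */
		return PCIE_ERR_TRY_RECOVERY;
	case GHES_SEV_PANIC:
	default:
		/*
		 * Stop processing ASAP to avoid further error propagation.
		 * Handling unknown values the same way is an assumption.
		 */
		return PCIE_ERR_NO_RECOVERY;
	}
}
```

A caller would start PCIe recovery only when the helper returns
PCIE_ERR_TRY_RECOVERY, and the comment block above the switch is where the
"why" of the policy would be documented.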