On Mon, Oct 21, 2013 at 05:14:05PM +0000, Luck, Tony wrote:
> Even if we recovered from a UC error (which is by no means a sure
> thing) ... I don't think the "requires no further action" message
> applies.
>
> Soft single bit errors are common (well, common-ish ... they should
> still be somewhat rare by most objective standard). Double bit errors
> are much rarer ... and are very unlikely to be the result of two
> single bit errors happening to be inside the same cache line. I'd
> recommend further investigation of the source of a UC error (even one
> that is "recovered" in software).

Btw, do we even need to make this distinction? I mean, do we even reach
this path on an error where we need to raise a #MC exception? In the
initial design we were called from machine_check_poll which is not the
exception path and now we're on the decode_chain which gets all errors.
Are we ready to handle all?

And also, why do we even need to differentiate the error types on
reporting? I mean, if it is, say, a contained UC error and we can start
a recovery action from userspace like killing the process, we probably
want to have that same detailed report too?

[ This is purely hypothetical, of course, as we do the poisoning game
  and killing of processes from kernel space now but still... ]

Thanks.