Hi Boris,

On 11/01/2019 17:45, Borislav Petkov wrote:
> On Fri, Jan 11, 2019 at 10:32:23AM -0500, Tyler Baicar wrote:
>> The kernel would have no way of knowing what to do here.
>
> What do you mean, there's no way of knowing what to do? It needs to
> clear registers so that the next error can get reported properly.
>
> Or of the status read failed and it doesn't need to do anything, then it
> shouldn't.

I think we're speaking at cross-purposes. If the error-detecting-hardware has
some state, that's firmware's problem to deal with. What we're dealing with here
is the memory we read the error records from.

> Whatever it is, the kernel either needs to do something in the error
> case to clean up, or nothing if the firmware doesn't need anything done
> in the error case; *or* ack the error in the success case.

We ack it in the corrupt-record case too, because we are done with the memory.

> This should all be written down somewhere in that GHES v2
> spec/doc/writeup whatever, explaining what the OS is supposed to do to
> signal the error has been read by the OS.

I think it is. 18.3.2.8 of ACPI v6.2 (search for "Generic Hardware Error Source
version 2", then below the table):
* OSPM detects error (via interrupt/exception or polling the block status)
* OSPM copies the error status block
* OSPM clears the block status field of the error status block
* OSPM acknowledges the error via Read Ack register

The ENOENT case is excluded by 'polling the block status'.

Unsurprisingly the spec doesn't consider the case that firmware generates
corrupt records!

Thanks,

James
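
For illustration only, the four OSPM steps listed above could be sketched
roughly as below. This is not the kernel's ghes.c code: struct status_block,
struct fake_ghes, poll_once() and the POLL_* values are all hypothetical names,
the shared memory is an ordinary struct, and the Read Ack register is reduced
to a plain write of 1 (the real register is written through the preserve and
write masks the table describes). The point it tries to show is that the ack
happens once the copy is made, whether or not the copied record turns out to
be usable.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an error status block. */
struct status_block {
	uint32_t block_status;	/* non-zero while a record is pending */
	uint32_t data_length;	/* bytes of payload firmware claims to have written */
	uint8_t  data[64];
};

/* Hypothetical stand-in for the memory and register shared with firmware. */
struct fake_ghes {
	struct status_block shared;	/* memory firmware writes records into */
	uint64_t read_ack;		/* simplified Read Ack register */
};

enum poll_result { POLL_NOTHING, POLL_HANDLED, POLL_CORRUPT };

enum poll_result poll_once(struct fake_ghes *g, struct status_block *copy)
{
	assert(g && copy);

	/* 1. Detect: an empty block status means nothing to copy, clear or ack. */
	if (g->shared.block_status == 0)
		return POLL_NOTHING;

	/* 2. Copy the error status block out of the shared memory. */
	memcpy(copy, &g->shared, sizeof(*copy));

	/* 3. Clear the block status field in the shared memory. */
	g->shared.block_status = 0;

	/* 4. Ack: we are done with the memory, even if the copy is unusable. */
	g->read_ack = 1;

	/* Only the private copy is validated; the memory is already handed back. */
	if (copy->data_length > sizeof(copy->data))
		return POLL_CORRUPT;

	return POLL_HANDLED;
}
```

With this shape, a caller that polls an empty block gets POLL_NOTHING and the
ack register is left untouched, which corresponds to the ENOENT case being
excluded by 'polling the block status'. A record whose claimed length is
nonsense still gets cleared and acked before POLL_CORRUPT is returned.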