Hi Boris, On 11/01/2019 19:58, Borislav Petkov wrote: > On Fri, Jan 11, 2019 at 06:25:21PM +0000, James Morse wrote: >> We ack it in the corrupt-record case too, because we are done with the >> memory. > > Ok, so the only thing that we need to do unconditionally is ACK in order > to free the memory. Or is there an exception to that set of steps in > error handling? Do you consider ENOENT an error? We don't ack in that case as the memory wasn't in use. For the other cases its because the records are bogus, but we still unconditionally tell firmware we're done with them. >> I think it is. 18.3.2.8 of ACPI v6.2 (search for Generic Hardware Error Source >> version 2", then below the table): >> * OSPM detects error (via interrupt/exception or polling the block status) >> * OSPM copies the error status block >> * OSPM clears the block status field of the error status block >> * OSPM acknowledges the error via Read Ack register >> >> The ENOENT case is excluded by 'polling the block status'. > > Ok, so we signal the absence of an error record with ENOENT. > > if (!buf_paddr) > return -ENOENT; > > Can that even happen? Yes, for NOTIFY_POLLED its the norm. For the IRQ flavours that walk a list of GHES, all but one of them will return ENOENT. > Also, in that case, what would happen if we ACK the error anyway? We'd > confuse the firmware? No idea. > I sure hope firmware is prepared for spurious ACKs :) We could try it and see. It depends if firmware shares ack locations between multiple GHES. We could ack an empty GHES, and it removes the records of one we haven't looked at yet. >> Unsurprisingly the spec doesn't consider the case that firmware generates >> corrupt records! > > You mean the EIO case? Yup, > Not surprised at all. But we do not report that record so all good. Thanks, James