On Wed, Jan 23, 2019 at 06:36:38PM +0000, James Morse wrote: > Do you consider ENOENT an error? We don't ack in that case as the > memory wasn't in use. So let's see: if (!*buf_paddr) return -ENOENT; can happen when apei_read() has returned 0 but it has managed to do *val = 0; Now, that function returns error values which we should be checking but we're checking the buf_paddr pointed to value for being 0. Are we fearing that even if acpi_os_read_memory() or acpi_os_read_port() succeed, *buf_paddr could still be 0 ? Because if not, we should be checking whether rc == -EINVAL and then convert it to -ENOENT. But ghes_read_estatus() handles the error case first *and* *then* checks buf_paddr too, to make really really sure we won't be reading from address 0. > For the other cases its because the records are bogus, but we still > unconditionally tell firmware we're done with them. ... to free the memory, yes, ok. > >> I think it is. 18.3.2.8 of ACPI v6.2 (search for Generic Hardware Error Source > >> version 2", then below the table): > >> * OSPM detects error (via interrupt/exception or polling the block status) > >> * OSPM copies the error status block > >> * OSPM clears the block status field of the error status block > >> * OSPM acknowledges the error via Read Ack register > >> > >> The ENOENT case is excluded by 'polling the block status'. > > > > Ok, so we signal the absence of an error record with ENOENT. > > > > if (!buf_paddr) > > return -ENOENT; > > > > Can that even happen? > > Yes, for NOTIFY_POLLED its the norm. For the IRQ flavours that walk a list of > GHES, all but one of them will return ENOENT. Lemme get this straight: when we do apei_read(&buf_paddr, &g->error_status_address); in the polled case, buf_paddr can be 0? > We could try it and see. It depends if firmware shares ack locations between > multiple GHES. We could ack an empty GHES, and it removes the records of one we > haven't looked at yet. Yeah, OTOH, we shouldn't be pushing our luck here, I guess. So let's sum up: we'll ack the GHES error in all but the -ENOENT cases in order to free the memory occupied by the error record. The slightly "pathological" -ENOENT case is I guess how the fw behaves when it is being polled and also for broken firmware which could report a 0 buf_paddr. Btw, that last thing I'm assuming because d334a49113a4 ("ACPI, APEI, Generic Hardware Error Source memory error support") doesn't say what that check was needed for. Thx. -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply.