Re: [PATCH] arm64/acpi: Add fixup for HPE m400 quirks

Hi Mark,

On 18/06/18 23:18, Mark Salter wrote:
> On Mon, 2018-06-18 at 11:04 -0700, Geoff Levand wrote:
>> Thanks for all the comments, but my lack of access to an m400 platform, and
>> my lack of knowledge about the m400 limits what I can comment on and what I
>> can do.  
> 
> I can take another look at this on an m400 here.

Thanks!


> I don't believe it is a
> memory access to physical space with nothing attached to it.

That is what the CPER records are describing though.


> I seem to recall
> an errata with xgene-1 where such accesses cause the cpu to halt. But I could
> be misremembering that. I have no trouble believing the firmware ras code was
> untested. It is probably some boilerplate code built in before ras was supported
> in kernel.

It would be interesting to know which GHES this error is being found in, and
whether the Error Status Block points anywhere (or at an empty block) when Linux
is started from UEFI.

If there is something in the Error Status Block out of UEFI, then this must be
something triggered by UEFI, or a bug that can be fixed by UEFI clearing out the
CPER records.
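
As a rough illustration of the "is there anything in the Error Status Block"
check: a Generic Error Status Block starts with a 32-bit Block Status field,
and a value of zero means no error is recorded. The sketch below is standalone
C, not kernel code; the struct and function names are made up for the example,
and it assumes the block has already been mapped or copied somewhere readable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Header of an ACPI Generic Error Status Block.
 * The struct name is hypothetical; the field order follows the ACPI layout.
 */
struct generic_error_status {
	uint32_t block_status;    /* zero when no error is recorded */
	uint32_t raw_data_offset;
	uint32_t raw_data_length;
	uint32_t data_length;
	uint32_t error_severity;
};

/*
 * Return true if the block at 'block' holds an error record.
 * 'block' must point to at least sizeof(struct generic_error_status)
 * readable bytes (assumption: the caller already mapped the region).
 */
bool estatus_block_is_populated(const void *block)
{
	struct generic_error_status hdr;

	assert(block != NULL);
	/* memcpy avoids alignment assumptions about the mapped buffer. */
	memcpy(&hdr, block, sizeof(hdr));
	return hdr.block_status != 0;
}
```

Running a check like this on each GHES before the kernel touches anything
else would show whether the records were already there when UEFI handed over.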

https://bugzilla.redhat.com/show_bug.cgi?id=1285107
suggests Red Hat can rebuild the UEFI firmware for this box.


If there is nothing in the Error Status Block when Linux is started, surely
Linux is doing something to cause this to happen. I'd like to find out what, as
it's probably a software bug.


(The case where disabling HEST would be the right thing to do is if there is a
bogus GHES->GAS entry in GHES.0, the access to which causes GHES.1 to be
populated with 'Access to an address not mapped to any component', which we find
next. If this is the case it would be better to validate the GHES entries
against the UEFI memory map, checking that each address is memory and that it
was reserved.)
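
A minimal sketch of that memory map check, in standalone C rather than against
the real kernel interfaces. Everything here is hypothetical: 'struct
mem_region' and its type enum are simplified stand-ins for UEFI memory map
entries, and the function name is invented. It only shows the shape of the
test: the address range taken from a GHES entry must sit wholly inside one
region that is memory and that firmware reserved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified classification of a memory map entry. */
enum region_type {
	REGION_USABLE_RAM,	/* memory the OS may allocate from */
	REGION_RESERVED_RAM,	/* memory reserved by firmware */
	REGION_MMIO,		/* not memory at all */
};

/* Hypothetical stand-in for one UEFI memory map entry. */
struct mem_region {
	uint64_t base;
	uint64_t size;
	enum region_type type;
};

/*
 * Return true if [addr, addr + len) lies entirely inside a single
 * region that is memory and was reserved by firmware.
 */
bool ghes_addr_is_reserved_memory(const struct mem_region *map, size_t n,
				  uint64_t addr, uint64_t len)
{
	assert(map != NULL || n == 0);

	/* Reject empty ranges and ranges that wrap the address space. */
	if (len == 0 || addr + len < addr)
		return false;

	for (size_t i = 0; i < n; i++) {
		uint64_t end = map[i].base + map[i].size;

		if (map[i].type == REGION_RESERVED_RAM &&
		    addr >= map[i].base && addr + len <= end)
			return true;
	}
	return false;
}
```

A GHES whose address fails this test could then be skipped individually,
instead of disabling HEST as a whole.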


> But the problem occurs early enough in boot where there can't be
> that many things that would cause a problem on m400 and not mustang so I'll
> look again.

Playing spot the difference in the dmesg, I'd check for smoke coming out of:
| acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
| xgene-gpio APMC0D14:00: X-Gene GPIO driver registered.
| pcie_pme: probe of 0000:00:00.0:pcie001 failed with error -22

If the firmware description of the GIC is wrong in some way, disabling KVM may
be worth testing too.


Thanks,

James