Re: [PATCH] acpi, nfit: fix the memory error check in nfit_handle_mce


 



On Fri, Apr 21, 2017 at 1:16 PM, Luck, Tony <tony.luck@xxxxxxxxx> wrote:
>>> > +       if (!(mce->status & 0xef80) == BIT(7))
>>>
>>> Can we get a define for this, or a comment explaining all the magic
>>> that's happening on that one line?
>>
>> Yes - also like lkp pointed out, the check isn't correct at all. Let me
>> figure out what really needs to be done, and I will resend with a better
>> comment.
>
> Needs extra parentheses to make it right. Vishal, sorry I led you astray.
>
>         if (!((mce->status & 0xef80) == BIT(7)))
>
> The magic is shown in table 15-9 of the Intel Software Developers Manual
> (but perhaps not well explained there).
>
> mce->status in the above code is a value plucked from a machine check
> bank status register. See figure 15-6 in the SDM.  The important bits for this
> are {15:0} which are the "MCA Error code".  Table 15-9 shows how these
> are grouped into types, where the type is defined by the most significant '1'
> bit in the field (excluding bit 12 which is the Correction Report Filtering bit,
> see section 15.9.2.1).
>
> So if BIT(3) is the most significant bit, then this is a "Generic Cache Hierarchy"
> error, BIT(4) denotes a TLB error, BIT(7) a Memory error, and so on.
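
For anyone following along, here is a minimal standalone sketch of the check described above. The macro and function names are made up for illustration and are not taken from mce.h. The mask 0xef80 selects bits 15:7 of the error code with bit 12 left out, so comparing the masked value against BIT(7) asks whether bit 7 is the most significant '1'. The second function spells out how the unparenthesized line from the patch is evaluated: the '!' applies first, so the left-hand side is only ever 0 or 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1ULL << (n))

/* Hypothetical names for illustration; not taken from mce.h. */
#define MCACOD_TYPE_MASK 0xef80ULL	/* bits 15:7, minus bit 12 */
#define MCACOD_TYPE_MEM  BIT(7)		/* bit 7 is the top '1' bit */

/* Corrected form: true when the MCA error code is a memory error. */
static bool mcacod_is_mem_err(uint64_t status)
{
	return (status & MCACOD_TYPE_MASK) == MCACOD_TYPE_MEM;
}

/*
 * The original patch line with its implicit grouping made explicit:
 * '!' binds tighter than '==', so the negated value is 0 or 1 and
 * can never equal BIT(7).
 */
static bool mcacod_buggy_check(uint64_t status)
{
	uint64_t negated = !(status & MCACOD_TYPE_MASK);

	return negated == BIT(7);
}

/* A few sample error codes run through both forms. */
static void mcacod_selfcheck(void)
{
	assert(mcacod_is_mem_err(0x0090));	/* bit 7 on top: memory */
	assert(mcacod_is_mem_err(0x1090));	/* bit 12 is ignored */
	assert(!mcacod_is_mem_err(0x0010));	/* bit 4 on top: TLB */
	assert(!mcacod_is_mem_err(0x0135));	/* bit 8 on top */
	assert(!mcacod_buggy_check(0x0090));	/* never true */
	assert(!mcacod_buggy_check(0x0010));
}
```

With the buggy form the comparison is false for every status value, so wrapping it in if (...) gives a branch that is never taken.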

Ah, ok.

> Maybe we should have defines in mce.h for them?  It gets a bit more complicated
> as all the above only applies to Intel branded X86 CPUs ... on AMD different
> decoding rules apply.

Yeah, this code is x86_64 generic so should call into helpers that do
the right thing per cpu type.
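
Roughly what I have in mind, as a userspace sketch only. The enum, the function names and the way the vendor is passed in are all assumptions for illustration; a real helper would use whatever vendor information the kernel already tracks. The AMD branch is a deliberate placeholder, since the AMD decoding rules are not worked out in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1ULL << (n))

/* Hypothetical vendor tag, for illustration only. */
enum cpu_vendor { VENDOR_INTEL, VENDOR_AMD, VENDOR_OTHER };

/* Intel rule from this thread: bit 7 is the top '1' of bits 15:7, bit 12 ignored. */
static bool intel_is_mem_err(uint64_t status)
{
	return (status & 0xef80) == BIT(7);
}

/* Hypothetical per-vendor dispatch so callers never see the magic numbers. */
static bool is_memory_error(enum cpu_vendor vendor, uint64_t status)
{
	switch (vendor) {
	case VENDOR_INTEL:
		return intel_is_mem_err(status);
	case VENDOR_AMD:
		/* Placeholder: AMD uses different decoding rules, not covered here. */
		return false;
	default:
		return false;
	}
}
```

The caller in nfit_handle_mce would then bail out early when the helper returns false, instead of open-coding the mask.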


