Re: [PATCH v2 00/11] mm: Teach memory_failure() about ZONE_DEVICE pages

On Mon 04-06-18 07:31:25, Dan Williams wrote:
[...]
> I'm trying to solve this real world problem when real poison is
> consumed through a dax mapping:
> 
>         mce: Uncorrected hardware memory error in user-access at af34214200
>         {1}[Hardware Error]: It has been corrected by h/w and requires no further action
>         mce: [Hardware Error]: Machine check events logged
>         {1}[Hardware Error]: event severity: corrected
>         Memory failure: 0xaf34214: reserved kernel page still referenced by 1 users
>         [..]
>         Memory failure: 0xaf34214: recovery action for reserved kernel page: Failed
>         mce: Memory error not recovered
> 
> ...i.e. currently all poison consumed through dax mappings is
> needlessly system fatal.

Thanks. That should be part of the changelog. It would also be great to
describe why this cannot simply be handled by the hwpoison code without
any ZONE_DEVICE-specific hacks. The error is recoverable, so why does the
hwpoison code even care?

-- 
Michal Hocko
SUSE Labs



