Re: [PATCH v2 00/11] mm: Teach memory_failure() about ZONE_DEVICE pages

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon 04-06-18 07:31:25, Dan Williams wrote:
[...]
> I'm trying to solve this real world problem when real poison is
> consumed through a dax mapping:
> 
>         mce: Uncorrected hardware memory error in user-access at af34214200
>         {1}[Hardware Error]: It has been corrected by h/w and requires
> no further action
>         mce: [Hardware Error]: Machine check events logged
>         {1}[Hardware Error]: event severity: corrected
>         Memory failure: 0xaf34214: reserved kernel page still
> referenced by 1 users
>         [..]
>         Memory failure: 0xaf34214: recovery action for reserved kernel
> page: Failed
>         mce: Memory error not recovered
> 
> ...i.e. currently all poison consumed through dax mappings is
> needlessly system fatal.

Thanks. That should be a part of the changelog. It would be great to
describe why this cannot be simply handled by hwpoison code without any
ZONE_DEVICE specific hacks? The error is recoverable so why does
hwpoison code even care?

-- 
Michal Hocko
SUSE Labs



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux