On Wed, Aug 18, 2021 at 12:52 AM ruansy.fnst@xxxxxxxxxxx <ruansy.fnst@xxxxxxxxxxx> wrote: > > > > > -----Original Message----- > > From: Jane Chu <jane.chu@xxxxxxxxxx> > > Subject: Re: [PATCH RESEND v6 1/9] pagemap: Introduce ->memory_failure() > > > > > > On 8/17/2021 10:43 PM, Jane Chu wrote: > > > More information - > > > > > > On 8/16/2021 10:20 AM, Jane Chu wrote: > > >> Hi, ShiYang, > > >> > > >> So I applied the v6 patch series to my 5.14-rc3 as it's what you > > >> indicated is what v6 was based at, and injected a hardware poison. > > >> > > >> I'm seeing the same problem that was reported a while ago after the > > >> poison was consumed - in the SIGBUS payload, the si_addr is missing: > > >> > > >> ** SIGBUS(7): canjmp=1, whichstep=0, ** > > >> ** si_addr(0x(nil)), si_lsb(0xC), si_code(0x4, BUS_MCEERR_AR) ** > > >> > > >> The si_addr ought to be 0x7f6568000000 - the vaddr of the first page > > >> in this case. > > > > > > The failure came from here : > > > > > > [PATCH RESEND v6 6/9] xfs: Implement ->notify_failure() for XFS > > > > > > +static int > > > +xfs_dax_notify_failure( > > > ... > > > + if (!xfs_sb_version_hasrmapbt(&mp->m_sb)) { > > > + xfs_warn(mp, "notify_failure() needs rmapbt enabled!"); > > > + return -EOPNOTSUPP; > > > + } > > > > > > I am not familiar with XFS, but I have a few questions I hope to get > > > answers - > > > > > > 1) What does it take and cost to make > > > xfs_sb_version_hasrmapbt(&mp->m_sb) to return true? > > Enable rmpabt feature when making xfs filesystem > `mkfs.xfs -m rmapbt=1 /path/to/device` > BTW, reflink is enabled by default. > > > > > > > 2) For a running environment that fails the above check, is it > > > okay to leave the poison handle in limbo and why? > It will fall back to the old handler. I think you have already known it. > > > > > > > 3) If the above regression is not acceptable, any potential remedy? > > > > How about moving the check to prior to the notifier registration? > > And register only if the check is passed? This seems better than an > > alternative which is to fall back to the legacy memory_failure handling in case > > the filesystem returns -EOPNOTSUPP. > > Sounds like a nice solution. I think I can add an is_notify_supported() interface in dax_holder_ops and check it when register dax_holder. Shouldn't the fs avoid registering a memory failure handler if it is not prepared to take over? For example, shouldn't this case behave identically to ext4 that will not even register a callback?