Re: [PATCH v15] mm, pmem, xfs: Introduce MF_MEM_PRE_REMOVE for unbind

On 2023/10/23 14:40, Chandan Babu R wrote:

On Fri, Oct 20, 2023 at 08:40:09 AM -0700, Darrick J. Wong wrote:
On Fri, Oct 20, 2023 at 03:26:32PM +0530, Chandan Babu R wrote:
On Thu, Sep 28, 2023 at 06:32:27 PM +0800, Shiyang Ruan wrote:
====
Changes since v14:
  1. added/fixed code comments per Dan's comments
====

Currently, if we suddenly remove a PMEM device (by calling unbind) that
contains an FSDAX filesystem while programs are still accessing data on
it, e.g.:
```
  $FSSTRESS_PROG -d $SCRATCH_MNT -n 99999 -p 4 &
  # $FSX_PROG -N 1000000 -o 8192 -l 500000 $SCRATCH_MNT/t001 &
  echo "pfn1.1" > /sys/bus/nd/drivers/nd_pmem/unbind
```
the system can end up in an unacceptable state:
   1. the device is gone but the mount point still exists, and umount
        fails with "target is busy"
   2. programs hang and cannot be killed
   3. the kernel may crash with a NULL pointer dereference

To fix this, we introduce a MF_MEM_PRE_REMOVE flag to let the filesystem
know that we are going to remove the whole device, and make sure all
related processes are notified so that they can exit gracefully.

This patch is inspired by Dan's "mm, dax, pmem: Introduce
dev_pagemap_failure()"[1].  With the help of the dax_holder and
->notify_failure() mechanisms, the pmem driver is able to ask the
filesystem on it to unmap all files in use, and to notify the processes
that are using those files.

Call trace:
trigger unbind
  -> unbind_store()
   -> ... (skip)
    -> devres_release_all()
     -> kill_dax()
      -> dax_holder_notify_failure(dax_dev, 0, U64_MAX, MF_MEM_PRE_REMOVE)
       -> xfs_dax_notify_failure()
       `-> freeze_super()             // freeze (kernel call)
       `-> do xfs rmap
       ` -> mf_dax_kill_procs()
       `  -> collect_procs_fsdax()    // all associated processes
       `  -> unmap_and_kill()
       ` -> invalidate_inode_pages2_range() // drop file's cache
       `-> thaw_super()               // thaw (both kernel & user call)
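
The ordering in the trace above can be sketched as a small userspace
model.  This is only an illustration and not kernel code: every
identifier in it (model_fs, model_pre_remove(), model_try_map(), and so
on) is hypothetical, and each step merely stands in for the kernel
function named in its comment.  It assumes, as the trace suggests, that
processes are killed and the cache dropped for each file found by the
rmap walk, all between freeze and thaw.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model only: every name below is hypothetical and none of
 * this is kernel code.  It mirrors the order of steps in the call trace.
 */
enum step {
	STEP_FREEZE,      /* stands in for freeze_super() */
	STEP_RMAP,        /* stands in for the xfs rmap query */
	STEP_KILL_PROCS,  /* stands in for mf_dax_kill_procs() */
	STEP_INVALIDATE,  /* stands in for invalidate_inode_pages2_range() */
	STEP_THAW,        /* stands in for thaw_super() */
};

#define MAX_STEPS 16

struct model_fs {
	int frozen;               /* nonzero while the model fs is frozen */
	size_t nr_steps;          /* number of entries used in log[] */
	enum step log[MAX_STEPS]; /* steps in the order they ran */
};

/* Append one step to the log; the log holds at most MAX_STEPS entries. */
void model_record(struct model_fs *fs, enum step s)
{
	assert(fs->nr_steps < MAX_STEPS);
	fs->log[fs->nr_steps++] = s;
}

void model_freeze(struct model_fs *fs)
{
	fs->frozen = 1;
	model_record(fs, STEP_FREEZE);
}

void model_thaw(struct model_fs *fs)
{
	fs->frozen = 0;
	model_record(fs, STEP_THAW);
}

/* A new dax mapping is refused while the model fs is frozen. */
int model_try_map(const struct model_fs *fs)
{
	return fs->frozen ? -1 : 0;
}

/* Run the pre-remove sequence for nr_files files found by the rmap walk. */
int model_pre_remove(struct model_fs *fs, size_t nr_files)
{
	size_t i;

	model_freeze(fs);
	model_record(fs, STEP_RMAP);
	for (i = 0; i < nr_files; i++) {
		model_record(fs, STEP_KILL_PROCS);
		model_record(fs, STEP_INVALIDATE);
	}
	model_thaw(fs);
	return 0;
}
```

Calling model_pre_remove() on a zero-initialised struct model_fs and
then walking log[] shows the freeze first and the thaw last, with the
per-file work in between; model_try_map() fails only inside that window.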

Introduce MF_MEM_PRE_REMOVE to let the filesystem know this is a remove
event.  Use the exclusive freeze/thaw[2] to lock the filesystem to prevent
new dax mappings from being created.  Do not shut down the filesystem
directly if the configuration is not supported, or if the failure range
includes the metadata area.  Make sure all files and processes (not only
the current process) are handled correctly.  Also drop the cache of
associated files before pmem is removed.

[1]: https://lore.kernel.org/linux-mm/161604050314.1463742.14151665140035795571.stgit@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
[2]: https://lore.kernel.org/linux-xfs/169116275623.3187159.16862410128731457358.stg-ugh@frogsfrogsfrogs/

Signed-off-by: Shiyang Ruan <ruansy.fnst@xxxxxxxxxxx>
Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx>
Acked-by: Dan Williams <dan.j.williams@xxxxxxxxx>

Hi Andrew,

Shiyang had indicated that this patch has been added to the
akpm/mm-hotfixes-unstable branch. However, I don't see the patch listed in
that branch.

I am about to start collecting XFS patches for the v6.7 cycle. Please let me
know if you have any objections to me taking this patch via the XFS tree.

V15 was dropped from his tree on 28 Sept., you might as well pull it
into your own tree for 6.7.  It's been testing fine on my trees for the
past 3 weeks.

https://lore.kernel.org/mm-commits/20230928172815.EE6AFC433C8@xxxxxxxxxxxxxxx/

Shiyang, this patch does not apply cleanly on v6.6-rc7. Can you please rebase
the patch on v6.6-rc7 and send it to the mailing list?

Sure.  I have rebased it and sent a v15.1.  Please check it:

https://lore.kernel.org/linux-xfs/20231023072046.1626474-1-ruansy.fnst@xxxxxxxxxxx/


--
Thanks,
Ruan.





