On Fri, Oct 20, 2023 at 03:26:32PM +0530, Chandan Babu R wrote:
> On Thu, Sep 28, 2023 at 06:32:27 PM +0800, Shiyang Ruan wrote:
> > ====
> > Changes since v14:
> >  1. added/fixed code comments per Dan's comments
> > ====
> >
> > Now, if we suddenly remove a PMEM device(by calling unbind) which
> > contains FSDAX while programs are still accessing data in this device,
> > e.g.:
> > ```
> >  $FSSTRESS_PROG -d $SCRATCH_MNT -n 99999 -p 4 &
> >  # $FSX_PROG -N 1000000 -o 8192 -l 500000 $SCRATCH_MNT/t001 &
> >  echo "pfn1.1" > /sys/bus/nd/drivers/nd_pmem/unbind
> > ```
> > it could come into an unacceptable state:
> >   1. device has gone but mount point still exists, and umount will fail
> >        with "target is busy"
> >   2. programs will hang and cannot be killed
> >   3. may crash with NULL pointer dereference
> >
> > To fix this, we introduce a MF_MEM_PRE_REMOVE flag to let it know that we
> > are going to remove the whole device, and make sure all related processes
> > could be notified so that they could end up gracefully.
> >
> > This patch is inspired by Dan's "mm, dax, pmem: Introduce
> > dev_pagemap_failure()"[1].  With the help of dax_holder and
> > ->notify_failure() mechanism, the pmem driver is able to ask filesystem
> > on it to unmap all files in use, and notify processes who are using
> > those files.
> >
> > Call trace:
> > trigger unbind
> >  -> unbind_store()
> >   -> ... (skip)
> >    -> devres_release_all()
> >     -> kill_dax()
> >      -> dax_holder_notify_failure(dax_dev, 0, U64_MAX, MF_MEM_PRE_REMOVE)
> >       -> xfs_dax_notify_failure()
> >       `-> freeze_super()             // freeze (kernel call)
> >       `-> do xfs rmap
> >       ` -> mf_dax_kill_procs()
> >       `  -> collect_procs_fsdax()    // all associated processes
> >       `  -> unmap_and_kill()
> >       ` -> invalidate_inode_pages2_range() // drop file's cache
> >       `-> thaw_super()               // thaw (both kernel & user call)
> >
> > Introduce MF_MEM_PRE_REMOVE to let filesystem know this is a remove
> > event.  Use the exclusive freeze/thaw[2] to lock the filesystem to prevent
> > new dax mapping from being created.  Do not shutdown filesystem directly
> > if configuration is not supported, or if failure range includes metadata
> > area.  Make sure all files and processes(not only the current progress)
> > are handled correctly.  Also drop the cache of associated files before
> > pmem is removed.
> >
> > [1]: https://lore.kernel.org/linux-mm/161604050314.1463742.14151665140035795571.stgit@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> > [2]: https://lore.kernel.org/linux-xfs/169116275623.3187159.16862410128731457358.stg-ugh@frogsfrogsfrogs/
> >
> > Signed-off-by: Shiyang Ruan <ruansy.fnst@xxxxxxxxxxx>
> > Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> > Acked-by: Dan Williams <dan.j.williams@xxxxxxxxx>
>
> Hi Andrew,
>
> Shiyang had indicated that this patch has been added to
> akpm/mm-hotfixes-unstable branch. However, I don't see the patch listed in
> that branch.
>
> I am about to start collecting XFS patches for v6.7 cycle. Please let me know
> if you have any objections with me taking this patch via the XFS tree.

V15 was dropped from his tree on 28 Sept., you might as well pull it
into your own tree for 6.7.  It's been testing fine on my trees for the
past 3 weeks.

https://lore.kernel.org/mm-commits/20230928172815.EE6AFC433C8@xxxxxxxxxxxxxxx/

--D

>
> --
> Chandan