On Fri, Oct 20, 2023 at 08:40:09 AM -0700, Darrick J. Wong wrote:
> On Fri, Oct 20, 2023 at 03:26:32PM +0530, Chandan Babu R wrote:
>> On Thu, Sep 28, 2023 at 06:32:27 PM +0800, Shiyang Ruan wrote:
>> > ====
>> > Changes since v14:
>> >  1. added/fixed code comments per Dan's comments
>> > ====
>> >
>> > Now, if we suddenly remove a PMEM device(by calling unbind) which
>> > contains FSDAX while programs are still accessing data in this device,
>> > e.g.:
>> > ```
>> >  $FSSTRESS_PROG -d $SCRATCH_MNT -n 99999 -p 4 &
>> >  # $FSX_PROG -N 1000000 -o 8192 -l 500000 $SCRATCH_MNT/t001 &
>> >  echo "pfn1.1" > /sys/bus/nd/drivers/nd_pmem/unbind
>> > ```
>> > it could come into an unacceptable state:
>> >   1. device has gone but mount point still exists, and umount will fail
>> >        with "target is busy"
>> >   2. programs will hang and cannot be killed
>> >   3. may crash with NULL pointer dereference
>> >
>> > To fix this, we introduce a MF_MEM_PRE_REMOVE flag to let it know that we
>> > are going to remove the whole device, and make sure all related processes
>> > could be notified so that they could end up gracefully.
>> >
>> > This patch is inspired by Dan's "mm, dax, pmem: Introduce
>> > dev_pagemap_failure()"[1].  With the help of dax_holder and
>> > ->notify_failure() mechanism, the pmem driver is able to ask filesystem
>> > on it to unmap all files in use, and notify processes who are using
>> > those files.
>> >
>> > Call trace:
>> > trigger unbind
>> >  -> unbind_store()
>> >   -> ... (skip)
>> >    -> devres_release_all()
>> >     -> kill_dax()
>> >      -> dax_holder_notify_failure(dax_dev, 0, U64_MAX, MF_MEM_PRE_REMOVE)
>> >       -> xfs_dax_notify_failure()
>> >       `-> freeze_super()             // freeze (kernel call)
>> >       `-> do xfs rmap
>> >       ` -> mf_dax_kill_procs()
>> >       `  -> collect_procs_fsdax()    // all associated processes
>> >       `  -> unmap_and_kill()
>> >       ` -> invalidate_inode_pages2_range() // drop file's cache
>> >       `-> thaw_super()               // thaw (both kernel & user call)
>> >
>> > Introduce MF_MEM_PRE_REMOVE to let filesystem know this is a remove
>> > event.  Use the exclusive freeze/thaw[2] to lock the filesystem to prevent
>> > new dax mapping from being created.  Do not shutdown filesystem directly
>> > if configuration is not supported, or if failure range includes metadata
>> > area.  Make sure all files and processes(not only the current progress)
>> > are handled correctly.  Also drop the cache of associated files before
>> > pmem is removed.
>> >
>> > [1]: https://lore.kernel.org/linux-mm/161604050314.1463742.14151665140035795571.stgit@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>> > [2]: https://lore.kernel.org/linux-xfs/169116275623.3187159.16862410128731457358.stg-ugh@frogsfrogsfrogs/
>> >
>> > Signed-off-by: Shiyang Ruan <ruansy.fnst@xxxxxxxxxxx>
>> > Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx>
>> > Acked-by: Dan Williams <dan.j.williams@xxxxxxxxx>
>>
>> Hi Andrew,
>>
>> Shiyang had indicated that this patch has been added to
>> akpm/mm-hotfixes-unstable branch. However, I don't see the patch listed in
>> that branch.
>>
>> I am about to start collecting XFS patches for v6.7 cycle. Please let me know
>> if you have any objections with me taking this patch via the XFS tree.
>
> V15 was dropped from his tree on 28 Sept., you might as well pull it
> into your own tree for 6.7.  It's been testing fine on my trees for the
> past 3 weeks.
>
> https://lore.kernel.org/mm-commits/20230928172815.EE6AFC433C8@xxxxxxxxxxxxxxx/

Shiyang, this patch does not apply cleanly on v6.6-rc7. Can you please rebase
the patch on v6.6-rc7 and send it to the mailing list?

--
Chandan