On Mon, Oct 23, 2023 at 03:26:52 PM +0800, Shiyang Ruan wrote:
> On 2023/10/23 14:40, Chandan Babu R wrote:
>> On Fri, Oct 20, 2023 at 08:40:09 AM -0700, Darrick J. Wong wrote:
>>> On Fri, Oct 20, 2023 at 03:26:32PM +0530, Chandan Babu R wrote:
>>>> On Thu, Sep 28, 2023 at 06:32:27 PM +0800, Shiyang Ruan wrote:
>>>>> ====
>>>>> Changes since v14:
>>>>>  1. added/fixed code comments per Dan's comments
>>>>> ====
>>>>>
>>>>> Now, if we suddenly remove a PMEM device (by calling unbind) which
>>>>> contains FSDAX while programs are still accessing data in this device,
>>>>> e.g.:
>>>>> ```
>>>>>  $FSSTRESS_PROG -d $SCRATCH_MNT -n 99999 -p 4 &
>>>>>  # $FSX_PROG -N 1000000 -o 8192 -l 500000 $SCRATCH_MNT/t001 &
>>>>>  echo "pfn1.1" > /sys/bus/nd/drivers/nd_pmem/unbind
>>>>> ```
>>>>> it could come into an unacceptable state:
>>>>>   1. device has gone but mount point still exists, and umount will fail
>>>>>      with "target is busy"
>>>>>   2. programs will hang and cannot be killed
>>>>>   3. may crash with NULL pointer dereference
>>>>>
>>>>> To fix this, we introduce a MF_MEM_PRE_REMOVE flag to let it know that we
>>>>> are going to remove the whole device, and make sure all related processes
>>>>> could be notified so that they could end up gracefully.
>>>>>
>>>>> This patch is inspired by Dan's "mm, dax, pmem: Introduce
>>>>> dev_pagemap_failure()"[1].  With the help of dax_holder and
>>>>> ->notify_failure() mechanism, the pmem driver is able to ask filesystem
>>>>> on it to unmap all files in use, and notify processes who are using
>>>>> those files.
>>>>>
>>>>> Call trace:
>>>>> trigger unbind
>>>>>  -> unbind_store()
>>>>>   -> ... (skip)
>>>>>    -> devres_release_all()
>>>>>     -> kill_dax()
>>>>>      -> dax_holder_notify_failure(dax_dev, 0, U64_MAX, MF_MEM_PRE_REMOVE)
>>>>>       -> xfs_dax_notify_failure()
>>>>>       `-> freeze_super()             // freeze (kernel call)
>>>>>       `-> do xfs rmap
>>>>>       ` -> mf_dax_kill_procs()
>>>>>       `  -> collect_procs_fsdax()    // all associated processes
>>>>>       `  -> unmap_and_kill()
>>>>>       ` -> invalidate_inode_pages2_range() // drop file's cache
>>>>>       `-> thaw_super()               // thaw (both kernel & user call)
>>>>>
>>>>> Introduce MF_MEM_PRE_REMOVE to let filesystem know this is a remove
>>>>> event.  Use the exclusive freeze/thaw[2] to lock the filesystem to prevent
>>>>> new dax mapping from being created.  Do not shutdown filesystem directly
>>>>> if configuration is not supported, or if failure range includes metadata
>>>>> area.  Make sure all files and processes (not only the current process)
>>>>> are handled correctly.  Also drop the cache of associated files before
>>>>> pmem is removed.
>>>>>
>>>>> [1]: https://lore.kernel.org/linux-mm/161604050314.1463742.14151665140035795571.stgit@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>>>>> [2]: https://lore.kernel.org/linux-xfs/169116275623.3187159.16862410128731457358.stg-ugh@frogsfrogsfrogs/
>>>>>
>>>>> Signed-off-by: Shiyang Ruan <ruansy.fnst@xxxxxxxxxxx>
>>>>> Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx>
>>>>> Acked-by: Dan Williams <dan.j.williams@xxxxxxxxx>
>>>>
>>>> Hi Andrew,
>>>>
>>>> Shiyang had indicated that this patch has been added to
>>>> akpm/mm-hotfixes-unstable branch. However, I don't see the patch listed in
>>>> that branch.
>>>>
>>>> I am about to start collecting XFS patches for v6.7 cycle. Please let me know
>>>> if you have any objections with me taking this patch via the XFS tree.
>>>
>>> V15 was dropped from his tree on 28 Sept., you might as well pull it
>>> into your own tree for 6.7.  It's been testing fine on my trees for the
>>> past 3 weeks.
>>>
>>> https://lore.kernel.org/mm-commits/20230928172815.EE6AFC433C8@xxxxxxxxxxxxxxx/
>>
>> Shiyang, this patch does not apply cleanly on v6.6-rc7. Can you please rebase
>> the patch on v6.6-rc7 and send it to the mailing list?
>
> Sure.  I have rebased it and sent a v15.1.  Please check it:
>
> https://lore.kernel.org/linux-xfs/20231023072046.1626474-1-ruansy.fnst@xxxxxxxxxxx/

Thank you. I have applied the patch to my local Git tree.

-- 
Chandan
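
For anyone reading the quoted call trace without the patch at hand, the ordering it describes (freeze, unmap and notify every associated process, drop the file's cache, thaw) can be sketched as a small userspace model. This is only an illustration under stated assumptions: `struct model_fs` and every `model_*` helper are hypothetical names made up for this sketch, not kernel APIs, and the real work is done by the functions named in the trace above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the pre-remove ordering.
 * None of these types or helpers exist in the kernel. */
struct model_fs {
    bool frozen;         /* stands in for the frozen superblock state */
    int  mapped_files;   /* files that still have live DAX mappings */
    int  cached_files;   /* files whose cache has not been dropped yet */
    int  notified_procs; /* processes notified so far */
};

/* Stand-ins for the freeze and thaw steps in the trace. */
static void model_freeze(struct model_fs *fs) { fs->frozen = true; }
static void model_thaw(struct model_fs *fs)   { fs->frozen = false; }

/* New mappings are refused while the filesystem is frozen. */
static bool model_create_mapping(struct model_fs *fs)
{
    if (fs->frozen)
        return false;
    fs->mapped_files++;
    fs->cached_files++;
    return true;
}

/* Model of the remove event: returns how many files were torn down. */
static int model_pre_remove(struct model_fs *fs)
{
    int handled = 0;

    model_freeze(fs);              /* block creation of new mappings */
    while (fs->mapped_files > 0) {
        assert(fs->frozen);        /* teardown only happens while frozen */
        fs->mapped_files--;        /* unmap the file */
        fs->notified_procs++;      /* notify the process using it */
        fs->cached_files--;        /* drop the file's cache */
        handled++;
    }
    model_thaw(fs);                /* let the filesystem continue */
    return handled;
}
```

Calling `model_create_mapping()` a few times and then `model_pre_remove()` leaves the model with no mappings and no cached files, and unfrozen again. The model assumes one process per mapped file purely to keep the counters simple.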