Gao Xiang <hsiangkao@xxxxxxxxxx> writes: > SWP_FS is used to make swap_{read,write}page() go through > the filesystem, and it's only used for swap files over > NFS. So, !SWP_FS means non NFS for now, it could be either > file backed or device backed. Something similar goes with > legacy SWP_FILE. > > So in order to achieve the goal of the original patch, > SWP_BLKDEV should be used instead. > > FS corruption can be observed with SSD device + XFS + > fragmented swapfile due to CONFIG_THP_SWAP=y. > > I reproduced the issue with the following details: > > Environment: > QEMU + upstream kernel + buildroot + NVMe (2 GB) > > Kernel config: > CONFIG_BLK_DEV_NVME=y > CONFIG_THP_SWAP=y > > Some reproducable steps: > mkfs.xfs -f /dev/nvme0n1 > mkdir /tmp/mnt > mount /dev/nvme0n1 /tmp/mnt > bs="32k" > sz="1024m" # doesn't matter too much, I also tried 16m > xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw > xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw > xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw > xfs_io -f -c "pwrite -F -S 0 -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw > xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fsync" /tmp/mnt/sw > > mkswap /tmp/mnt/sw > swapon /tmp/mnt/sw > > stress --vm 2 --vm-bytes 600M # doesn't matter too much as well > > Symptoms: > - FS corruption (e.g. checksum failure) > - memory corruption at: 0xd2808010 > - segfault > > Fixes: f0eea189e8e9 ("mm, THP, swap: Don't allocate huge cluster for file backed swap device") > Fixes: 38d8b4e6bdc8 ("mm, THP, swap: delay splitting THP during swap out") > Cc: "Huang, Ying" <ying.huang@xxxxxxxxx> > Cc: Yang Shi <yang.shi@xxxxxxxxxxxxxxxxx> > Cc: Rafael Aquini <aquini@xxxxxxxxxx> > Cc: Dave Chinner <david@xxxxxxxxxxxxx> > Cc: stable <stable@xxxxxxxxxxxxxxx> > Signed-off-by: Gao Xiang <hsiangkao@xxxxxxxxxx> Thanks! Reviewed-by: "Huang, Ying" <ying.huang@xxxxxxxxx> Best Regards, Huang, Ying