On 2025/2/10 12:16, Qu Wenruo wrote:
On 2025/2/10 14:32, Qi Zheng wrote:
Hi Zi,
On 2025/2/10 11:35, Zi Yan wrote:
On 7 Feb 2025, at 17:17, Matthew Wilcox wrote:
On Fri, Feb 07, 2025 at 04:29:36PM +0100, Christian Brauner wrote:
while true; do ./xfs.run.sh "generic/437"; done
allows me to reproduce this fairly quickly.
git bisect points to commit
4817f70c25b6 ("x86: select ARCH_SUPPORTS_PT_RECLAIM if X86_64").
Qi is cc'd.
After deselecting PT_RECLAIM on v6.14-rc1, the issue is gone.
At least, no splat after running for more than 300s,
whereas the splat is usually triggered after ~20s with
PT_RECLAIM set.
PT_RECLAIM mainly makes the following two changes:
1) try to reclaim page table pages during madvise(MADV_DONTNEED)
2) Unconditionally select MMU_GATHER_RCU_TABLE_FREE
Does ./xfs.run.sh "generic/437" call madvise(MADV_DONTNEED)?
Anyway, I will try to reproduce it locally and troubleshoot it.
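For reference, the userspace side of change 1) can be sketched as below. This is only a minimal illustration, not taken from generic/437 or from the PT_RECLAIM patches: the helper name dontneed_first_byte() is hypothetical, and the sketch assumes Linux semantics for private anonymous mappings. It dirties a mapping so that page tables get populated, drops it with madvise(MADV_DONTNEED), and reads it back. Whether the kernel also frees the now-empty page table pages depends on the kernel configuration and is not observable from this program.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Map `len` bytes of private anonymous memory, dirty it, drop it with
 * madvise(MADV_DONTNEED), and report what the first byte reads back as.
 * Returns the byte value, or -1 on any failure.
 */
int dontneed_first_byte(size_t len)
{
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* Touch every page so the kernel populates page tables for the range. */
    memset(p, 0xAB, len);

    /* Drop the pages; the mapping itself stays valid. */
    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    /* The next access refaults; private anonymous memory comes back zeroed. */
    int value = p[0];

    munmap(p, len);
    return value;
}
```

Calling dontneed_first_byte(4UL << 20) uses a 4 MiB range, which with 4K pages on x86_64 is large enough to fully cover at least one PTE table (2 MiB), assuming the range is not backed by transparent huge pages.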
BTW, btrfs is also able to reproduce the same problem on x86_64, with
all default mount options.
Normally fewer than 32 runs of generic/437 (done by "./check -I 32
generic/437" from fstests) are enough to trigger it.
In my case, I do 128 runs to be extra sure.
And it no longer reproduces after deselecting the CONFIG_PT_RECLAIM
option, thus it really looks like 4817f70c25b6 ("x86: select
ARCH_SUPPORTS_PT_RECLAIM if X86_64") is the cause.
Thank you for the information. I will try to reproduce it locally and
troubleshoot it.
And on aarch64 with 64K page size and 4K fs block size, it does not
reproduce at all.
Currently, PT_RECLAIM is only supported on x86_64.
Thanks,
Qi
Thanks,
Qu
Thanks!
--
Best Regards,
Yan, Zi