On 04.11.24 15:11, Petr Vaněk wrote:
I would like to report a regression in XFS introduced in kernel v6.6 in
commit 5d8edfb900d5 ("iomap: Copy larger chunks from userspace"). On a
system running under Xen, when a process creates a file on an XFS file
system and writes exactly 2MB or more in a single write syscall,
accessing memory through mmap on that file causes the process to hang,
while dmesg is flooded with page fault warnings:
[...]
[ 62.406493] </TASK>
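For reference, the trigger described in the report (one write syscall of 2MB or more, followed by access through mmap) can be sketched roughly as below. This is not the reporter's actual reproducer. The file name repro.bin, the 4 KiB touch stride, and the helper name write_then_mmap are assumptions made for illustration. According to the report, on an affected setup the process would be expected to hang while touching the mapping; elsewhere the script just prints the byte count and a checksum.

```python
import mmap
import os
import sys

TWO_MIB = 2 * 1024 * 1024  # size threshold named in the report


def write_then_mmap(path, size=TWO_MIB):
    """Write `size` bytes with one write() call, then read them back via mmap."""
    payload = b"x" * size
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = os.write(fd, payload)  # a single write syscall
        if written != size:
            raise OSError(f"short write: {written} of {size} bytes")
        with mmap.mmap(fd, size, access=mmap.ACCESS_READ) as mapped:
            # Touch one byte per assumed 4 KiB page to fault the mapping in.
            touched = sum(mapped[off] for off in range(0, size, 4096))
    finally:
        os.close(fd)
    return written, touched


if __name__ == "__main__":
    # Hypothetical default file name; pass a path on the XFS mount instead.
    target = sys.argv[1] if len(sys.argv) > 1 else "repro.bin"
    print(write_then_mmap(target))
```

To exercise the reported case, the path would have to point at a file on the XFS file system inside the Xen guest.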
As shown in the log above, the issue persists in kernel 6.6.59. However,
it was recently resolved in commit 2b0f922323cc ("mm: don't install PMD
mappings when THPs are disabled by the hw/process/vma"). The fix was
backported to 6.11. Would it make sense to backport it to 6.6 as well?
I was speculating about this in the patch description:
"Is it also a problem when the HW disabled THP using
TRANSPARENT_HUGEPAGE_UNSUPPORTED? At least on x86 this would be the
case without X86_FEATURE_PSE."
I assume we have HW where has_transparent_hugepage() == false, so
likely x86-64 without X86_FEATURE_PSE.
QEMU/KVM should be supporting X86_FEATURE_PSE, but maybe XEN does not
for its (PV?) guests? If I understood your setup correctly :)
At least years ago, this feature was not available in XEN PV guests [1].
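One way to check this guess from inside the guest is to look for the pse flag in /proc/cpuinfo and at the THP sysfs knob. The sketch below assumes an x86 Linux guest and assumes that the cpuinfo flags line reflects what the kernel sees as X86_FEATURE_PSE. The helper names cpu_has_flag and read_text are made up for illustration.

```python
def cpu_has_flag(cpuinfo_text, flag="pse"):
    """Return True if `flag` is listed on the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Compare whole tokens so that "pse36" does not count as "pse".
            return flag in value.split()
    return False


def read_text(path):
    """Read a small text file, returning an empty string if it is unavailable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


if __name__ == "__main__":
    print("pse flag:", cpu_has_flag(read_text("/proc/cpuinfo")))
    thp = read_text("/sys/kernel/mm/transparent_hugepage/enabled").strip()
    print("THP setting:", thp or "unavailable")
```

If the pse flag is missing, that would be consistent with has_transparent_hugepage() returning false on this system, though it does not prove it.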
Note that I already sent a backport [2], I should probably ping at this
point.
[1] https://lore.kernel.org/all/57188ED802000078000E431C@xxxxxxxxxxxxxxxxxxxxxxx/
[2] https://lkml.kernel.org/r/20241022090952.4101444-1-david@xxxxxxxxxx
I encountered this issue while updating a Gentoo VM with an XFS
filesystem running under Xen. During the final stage of a glibc update,
files were copied to the live system, but when locale-gen started, it
hung. I couldn't open a new shell, as it attempted to mmap an
LC_COLLATE-related file, resulting in the same page faults as reported
above.
Yes, looks like something is not happy about the PMD mapping that we
installed.
--
Cheers,
David / dhildenb