On 03.07.23 20:21, Suren Baghdasaryan wrote:
A memory corruption was reported in [1] with bisection pointing to the
patch [2] enabling per-VMA locks for x86.
Disable per-VMA locks config to prevent this issue while the problem is
being investigated. This is expected to be a temporary measure.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=217624
[2] https://lore.kernel.org/all/20230227173632.3292573-30-surenb@xxxxxxxxxx
Reported-by: Jiri Slaby <jirislaby@xxxxxxxxxx>
Reported-by: Jacob Young <jacobly.alt@xxxxxxxxx>
Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
---
mm/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/Kconfig b/mm/Kconfig
index 09130434e30d..de94b2497600 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1224,7 +1224,7 @@ config ARCH_SUPPORTS_PER_VMA_LOCK
def_bool n
config PER_VMA_LOCK
- def_bool y
+ bool "Enable per-vma locking during page fault handling."
depends on ARCH_SUPPORTS_PER_VMA_LOCK && MMU && SMP
help
Allow per-vma locking during page fault handling.
As raised at LSF/MM, I was "surprised" that we can now handle page faults
concurrent to fork() and was expecting something to be broken already.
What probably happens is that we wr-protected the page in the parent process and
COW-shared an anon page with the child using copy_present_pte().
But we only flush the parent MM tlb before we drop the parent MM lock in
dup_mmap().
If we get a write-fault before that TLB flush in the parent, and we end up
replacing that anon page in the parent process in do_wp_page() [because, COW-shared with the child],
this might be problematic: some stale writable TLB entries can target the wrong (old) page.
We had similar issues in the past with userfaultfd, see the comment at the beginning of do_wp_page():
if (likely(!unshare)) {
if (userfaultfd_pte_wp(vma, *vmf->pte)) {
pte_unmap_unlock(vmf->pte, vmf->ptl);
return handle_userfault(vmf, VM_UFFD_WP);
}
/*
* Userfaultfd write-protect can defer flushes. Ensure the TLB
* is flushed in this case before copying.
*/
if (unlikely(userfaultfd_wp(vmf->vma) &&
mm_tlb_flush_pending(vmf->vma->vm_mm)))
flush_tlb_page(vmf->vma, vmf->address);
}
We really should not allow page faults concurrent to fork() without further investigation.
--
Cheers,
David / dhildenb