On 26.09.24 15:39, Peter Xu wrote:
On Thu, Sep 26, 2024 at 12:48:19PM +0200, David Hildenbrand wrote:
On 25.09.24 18:59, Peter Xu wrote:
On Tue, Sep 24, 2024 at 04:45:00PM +0200, David Hildenbrand wrote:
On 23.09.24 14:18, syzbot wrote:
Hello,
syzbot found the following issue on:
HEAD commit: 88264981f208 Merge tag 'sched_ext-for-6.12' of git://git.k..
git tree: upstream
console+strace: https://syzkaller.appspot.com/x/log.txt?x=16c36c27980000
kernel config: https://syzkaller.appspot.com/x/.config?x=e851828834875d6f
dashboard link: https://syzkaller.appspot.com/bug?extid=bf2c35fa302ebe3c7471
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12773080580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16ed5e9f980000
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/0e011ac37c93/disk-88264981.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/f5c65577e19e/vmlinux-88264981.xz
kernel image: https://storage.googleapis.com/syzbot-assets/984d963c8ea1/bzImage-88264981.xz
The issue was bisected to:
commit 75182022a0439788415b2dd1db3086e07aa506f7
Author: Peter Xu <peterx@xxxxxxxxxx>
Date: Mon Aug 26 20:43:51 2024 +0000
mm/x86: support large pfn mappings
bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=17df9c27980000
final oops: https://syzkaller.appspot.com/x/report.txt?x=143f9c27980000
console output: https://syzkaller.appspot.com/x/log.txt?x=103f9c27980000
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+bf2c35fa302ebe3c7471@xxxxxxxxxxxxxxxxxxxxxxxxx
Fixes: 75182022a043 ("mm/x86: support large pfn mappings")
------------[ cut here ]------------
WARNING: CPU: 1 PID: 5508 at mm/huge_memory.c:1602 copy_huge_pmd+0x102c/0x1c60 mm/huge_memory.c:1602
This is the
VM_WARN_ON_ONCE(is_cow_mapping(src_vma->vm_flags) && pmd_write(pmd))
So we have a special-marked PMD in a COW mapping.
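For readers following along, here is a minimal userspace sketch of what that warning condition tests. The flag values mirror the kernel's VMA flag bits as I remember them from include/linux/mm.h, but treat them as illustrative; the names is_cow_mapping_sketch() and warn_condition() are made up for this example and are not kernel symbols.

```c
#include <stdbool.h>

/* Illustrative copies of the VMA flag bits (assumed values, for this sketch only). */
#define VM_WRITE    0x00000002UL
#define VM_SHARED   0x00000008UL
#define VM_MAYWRITE 0x00000020UL

/*
 * A mapping counts as COW when it is private (VM_SHARED clear) but may be
 * made writable (VM_MAYWRITE set).
 */
static bool is_cow_mapping_sketch(unsigned long vm_flags)
{
	return (vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/*
 * Hypothetical helper: the condition the VM_WARN_ON_ONCE above checks,
 * with the writable state of the PMD passed in as a plain bool.
 */
static bool warn_condition(unsigned long vm_flags, bool pmd_is_writable)
{
	return is_cow_mapping_sketch(vm_flags) && pmd_is_writable;
}
```

So a private writable mapping whose PMD is still writable at fork time trips the check, while a shared mapping never does.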
The reproducer seems to involve fuse, but not sure if that makes a
difference here.
That chunk of code seems to be there only to make sure the test won't get
blocked by a stuck FUSE-based fs, by writing to the "abort"
file:
snprintf(abort, sizeof(abort), "/sys/fs/fuse/connections/%s/abort",
ent->d_name);
int fd = open(abort, O_WRONLY);
if (fd == -1) {
continue;
}
if (write(fd, abort, 1) < 0) {
}
close(fd);
So far it indeed looks unrelated to this issue.
Unfortunately I cannot reproduce it even with the reproducer, so this one
is a bit tricky.
What still confuses me is how that special bit gets set, given that it's only
used with vfio-pci so far, and this test doesn't seem to involve it.
The test keeps spawning processes, then threads, doing concurrent accesses
with a few operations (madvise, mremap, migrate_pages, munmap, etc.) on the
pre-mapped areas, but none of them seem to create new memory that could
hint at how the special bit starts to appear.
I wonder if some of these operations can race in a way that mm wrongly
sets the special bit (along with the entry being writable). Then it could
be a historical bug, only caught by this patchset thanks to the newly added
WARN_ON_ONCE, which would mean that it's not the WRITE bit that
is unintended, but the SPECIAL bit altogether.
I assume you are missing a check for present/non-swap pmds. If you have
a migration entry that ends up using the special bit -- which is perfectly
fine -- your code would treat it as a present PMD with the special bit set.
Maybe for the time being something like:
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0580ac9e47b9..e55efcad1e6c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1586,7 +1586,7 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
int ret = -ENOMEM;
pmd = pmdp_get_lockless(src_pmd);
- if (unlikely(pmd_special(pmd))) {
+ if (unlikely(pmd_present(pmd) && pmd_special(pmd))) {
dst_ptl = pmd_lock(dst_mm, dst_pmd);
src_ptl = pmd_lockptr(src_mm, src_pmd);
spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
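To illustrate why the extra pmd_present() test matters, here is a toy userspace model, not kernel code. The bit positions, the toy_* names and the encoding of the non-present entry are all assumptions made up for this sketch; the only point is that in a non-present entry the same bit position can be part of the swap/migration payload.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for a toy PMD value (assumed for this sketch). */
#define TOY_PMD_PRESENT (UINT64_C(1) << 0)
#define TOY_PMD_SPECIAL (UINT64_C(1) << 9)

static bool toy_pmd_present(uint64_t pmd)
{
	return (pmd & TOY_PMD_PRESENT) != 0;
}

/* Plain bit test: only meaningful if the entry is present. */
static bool toy_pmd_special(uint64_t pmd)
{
	return (pmd & TOY_PMD_SPECIAL) != 0;
}

/*
 * Build a non-present entry: the present bit stays clear and the remaining
 * bits carry an opaque payload, which may overlap the "special" position.
 */
static uint64_t toy_make_nonpresent(uint64_t payload)
{
	return payload << 1;
}

/* Check without the present test: misreads payload bits as "special". */
static bool takes_special_path_unchecked(uint64_t pmd)
{
	return toy_pmd_special(pmd);
}

/* Check with the present test, in the spirit of the diff above. */
static bool takes_special_path_checked(uint64_t pmd)
{
	return toy_pmd_present(pmd) && toy_pmd_special(pmd);
}
```

With a payload of 0x100, the toy non-present entry has bit 9 set, so the unchecked variant takes the special path while the checked one does not.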
Good catch!
I definitely overlooked it, and I checked that the config indeed has
THP_MIGRATION set, so it's very possibly relevant.
Do you want to send a formal patch? You can also push a branch with "#syz
test"; it looks like syzbot can reproduce it consistently.
Yes, let me send out a patch real quick.
--
Cheers,
David / dhildenb