Hi Nhat, Thanks for the first stab on this bug. On Tue, Dec 26, 2023 at 1:08 PM Nhat Pham <nphamcs@xxxxxxxxx> wrote: > > On Tue, Dec 26, 2023 at 7:28 AM syzbot > <syzbot+3eff5e51bf1db122a16e@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote: > > > > Hello, > > > > syzbot found the following issue on: > > > > HEAD commit: 39676dfe5233 Add linux-next specific files for 20231222 > > git tree: linux-next > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=172080a1e80000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=f3761490b734dc96 > > dashboard link: https://syzkaller.appspot.com/bug?extid=3eff5e51bf1db122a16e > > compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=178f6e26e80000 > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15c399e9e80000 > > > > Downloadable assets: > > disk image: https://storage.googleapis.com/syzbot-assets/360542c2ca67/disk-39676dfe.raw.xz > > vmlinux: https://storage.googleapis.com/syzbot-assets/900dfb21ca8a/vmlinux-39676dfe.xz > > kernel image: https://storage.googleapis.com/syzbot-assets/c94a2a3ea0e0/bzImage-39676dfe.xz > > > > The issue was bisected to: > > > > commit 7bc134496bbbaacb0d4423b819da4eca850a839d > > Author: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx> > > Date: Mon Dec 18 11:50:31 2023 +0000 > > > > mm/zswap: change dstmem size to one page > > > > bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=15f60c36e80000 > > final oops: https://syzkaller.appspot.com/x/report.txt?x=17f60c36e80000 > > console output: https://syzkaller.appspot.com/x/log.txt?x=13f60c36e80000 > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > Reported-by: syzbot+3eff5e51bf1db122a16e@xxxxxxxxxxxxxxxxxxxxxxxxx > > Fixes: 7bc134496bbb ("mm/zswap: change dstmem size to one page") > > > > general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN > > KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] > > CPU: 0 PID: 5065 Comm: syz-executor140 Not tainted 6.7.0-rc6-next-20231222-syzkaller #0 > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023 > > RIP: 0010:scatterwalk_start include/crypto/scatterwalk.h:63 [inline] <========= This is the offending line. > > RIP: 0010:scatterwalk_pagedone include/crypto/scatterwalk.h:83 [inline] > > RIP: 0010:scatterwalk_pagedone include/crypto/scatterwalk.h:72 [inline] > > RIP: 0010:scatterwalk_copychunks+0x3e0/0x560 crypto/scatterwalk.c:50 This is the call stack with an inline function. Very nice annotations on the inline function expansions call stacks. > > Code: f0 48 c1 e8 03 80 3c 08 00 0f 85 81 01 00 00 49 8d 44 24 08 4d 89 26 48 bf 00 00 00 00 00 fc ff df 48 89 44 24 10 48 c1 e8 03 <0f> b6 04 38 84 c0 74 08 3c 03 0f 8e 47 01 00 00 48 8b 44 24 08 41 > > RSP: 0018:ffffc90003a8ecf0 EFLAGS: 00010202 > > RAX: 0000000000000001 RBX: 0000000000000000 RCX: dffffc0000000000 > > RDX: ffff88802785d940 RSI: ffffffff8465df74 RDI: dffffc0000000000 > > RBP: 0000000000001000 R08: 0000000000000005 R09: 0000000000000000 > > R10: 0000000000000002 R11: 82d8bd1b6060f805 R12: 0000000000000000 > > R13: 0000000000000014 R14: ffffc90003a8ed88 R15: 0000000000001000 > > FS: 00005555565c5380(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 0000000000d5e538 CR3: 0000000079f3a000 CR4: 00000000003506f0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > Call Trace: > > <TASK> > > scatterwalk_map_and_copy+0x151/0x1d0 crypto/scatterwalk.c:67 > > scomp_acomp_comp_decomp+0x3a3/0x780 crypto/scompress.c:149 > > Looks like it's this line: > > scatterwalk_map_and_copy(scratch->dst, req->dst, 0, req->dlen,1); Yes indeed. The offending instruction is actually this one: The inlined part of the call stack: RIP: 0010:scatterwalk_start include/crypto/scatterwalk.h:63 [inline] <========= This is the offending line. static inline void scatterwalk_start(struct scatter_walk *walk, struct scatterlist *sg) { walk->sg = sg; walk->offset = sg->offset; <============== RIP pointer } RIP: 0010:scatterwalk_pagedone include/crypto/scatterwalk.h:83 [inline] static inline void scatterwalk_pagedone(struct scatter_walk *walk, int out, unsigned int more) { if (out) { struct page *page; page = sg_page(walk->sg) + ((walk->offset - 1) >> PAGE_SHIFT); flush_dcache_page(page); } if (more && walk->offset >= walk->sg->offset + walk->sg->length) scatterwalk_start(walk, sg_next(walk->sg)); <========================== } RIP: 0010:scatterwalk_pagedone include/crypto/scatterwalk.h:72 [inline] RIP: 0010:scatterwalk_copychunks+0x3e0/0x560 crypto/scatterwalk.c:50 void scatterwalk_copychunks(void *buf, struct scatter_walk *walk, size_t nbytes, int out) { for (;;) { unsigned int len_this_page = scatterwalk_pagelen(walk); u8 *vaddr; if (len_this_page > nbytes) len_this_page = nbytes; if (out != 2) { vaddr = scatterwalk_map(walk); memcpy_dir(buf, vaddr, len_this_page, out); scatterwalk_unmap(vaddr); } scatterwalk_advance(walk, len_this_page); if (nbytes == len_this_page) break; buf += len_this_page; nbytes -= len_this_page; scatterwalk_pagedone(walk, out & 1, 1); <========================= } } then the unlined portion of the call stack: scatterwalk_map_and_copy+0x151/0x1d0 crypto/scatterwalk.c:67 void scatterwalk_map_and_copy(void *buf, struct scatterlist *sg, unsigned int start, unsigned int nbytes, int out) { struct scatter_walk walk; struct scatterlist tmp[2]; if (!nbytes) return; sg = scatterwalk_ffwd(tmp, sg, start); scatterwalk_start(&walk, sg); scatterwalk_copychunks(buf, &walk, nbytes, out); <=========================== scatterwalk_done(&walk, out, 0); } scomp_acomp_comp_decomp+0x3a3/0x780 crypto/scompress.c:149 if (dir) ret = crypto_scomp_compress(scomp, scratch->src, req->slen, scratch->dst, &req->dlen, *ctx); else ret = crypto_scomp_decompress(scomp, scratch->src, req->slen, scratch->dst, &req->dlen, *ctx); if (!ret) { if (!req->dst) { req->dst = sgl_alloc(req->dlen, GFP_ATOMIC, NULL); if (!req->dst) { ret = -ENOMEM; goto out; } } scatterwalk_map_and_copy(scratch->dst, req->dst, 0, req->dlen, <======================= 1); } crypto_acomp_compress include/crypto/acompress.h:302 [inline] zswap_store+0x98b/0x2430 mm/zswap.c:1666 swap_writepage+0x8e/0x220 mm/page_io.c:198 pageout+0x399/0x9e0 mm/vmscan.c:656 shrink_folio_list+0x2f47/0x3ea0 mm/vmscan.c:1319 reclaim_folio_list+0xe4/0x3a0 mm/vmscan.c:2104 reclaim_pages+0x483/0x6a0 mm/vmscan.c:2140 madvise_cold_or_pageout_pte_range+0x129e/0x1f70 mm/madvise.c:526 walk_pmd_range mm/pagewalk.c:143 [inline] > > My guess is dlen here exceeds 1 page - maybe the data is > incompressible, so the output exceeds the original input? Based on the > included kernel config, the algorithm used here is lzo, and it seems > that can happen for this particular compressor: > > https://github.com/nemequ/lzo/blob/master/doc/LZO.FAQ#L200C1-L203C1 > https://github.com/nemequ/lzo/blob/master/doc/LZO.TXT#L157C2-L157C62 The decompressed output can be bigger than input. However, here we specify the output size limit to be one page. The decompressor should not output more than the 1 page of the dst buffer. I did check the lzo1x_decompress_safe() function. It is supposed to use the NEED_OP(x) check before it stores the output buffer. However I do find one place that seems to be missing that check, at least it is not obvious to me how that check is effective. I will paste it here let other experts take a look as well: Line 228: if (op - m_pos >= 8) { unsigned char *oe = op + t; if (likely(HAVE_OP(t + 15))) { do { COPY8(op, m_pos); op += 8; m_pos += 8; COPY8(op, m_pos); op += 8; m_pos += 8; } while (op < oe); op = oe; if (HAVE_IP(6)) { state = next; COPY4(op, ip); <========================= This COPY4 does not have obvious NEED_OP() check. As far as I can tell, this output is not converted by the above HAVE_OP(t + 15)) check, which means to protect the unaligned two 8 byte stores inside the "do while" loops. op += next; ip += next; continue; } } else { NEED_OP(t); do { *op++ = *m_pos++; } while (op < oe); } } else > > Not 100% sure about linux kernel's implementation though. I'm no > compression expert :) Me neither. Anyway, if it is a decompression issue. For this bug, it is good to have some debug print assert to check that after decompression, the *dlen is not bigger than the original output length. If it does blow over the decompression buffer, it is a bug that needs to be fixed in the decompression function side, not the zswap code. Chris > > If this is the case though, perhaps we can't reduce the output buffer > size to 1 page after all, unless we contractually obligate the output > size to be <= input size (i.e new function in the compression API). > > > > crypto_acomp_compress include/crypto/acompress.h:302 [inline] > > zswap_store+0x98b/0x2430 mm/zswap.c:1666 > > swap_writepage+0x8e/0x220 mm/page_io.c:198 > > pageout+0x399/0x9e0 mm/vmscan.c:656 > > shrink_folio_list+0x2f47/0x3ea0 mm/vmscan.c:1319 > > reclaim_folio_list+0xe4/0x3a0 mm/vmscan.c:2104 > > reclaim_pages+0x483/0x6a0 mm/vmscan.c:2140 > > madvise_cold_or_pageout_pte_range+0x129e/0x1f70 mm/madvise.c:526 > > walk_pmd_range mm/pagewalk.c:143 [inline] > > walk_pud_range mm/pagewalk.c:221 [inline] > > walk_p4d_range mm/pagewalk.c:256 [inline] > > walk_pgd_range+0xa48/0x1870 mm/pagewalk.c:293 > > __walk_page_range+0x630/0x770 mm/pagewalk.c:395 > > walk_page_range+0x626/0xa80 mm/pagewalk.c:521 > > madvise_pageout_page_range mm/madvise.c:585 [inline] > > madvise_pageout+0x32c/0x820 mm/madvise.c:612 > > madvise_vma_behavior+0x1cc/0x1b50 mm/madvise.c:1031 > > madvise_walk_vmas+0x1cf/0x2c0 mm/madvise.c:1260 > > do_madvise+0x333/0x660 mm/madvise.c:1440 > > __do_sys_madvise mm/madvise.c:1453 [inline] > > __se_sys_madvise mm/madvise.c:1451 [inline] > > __x64_sys_madvise+0xa9/0x110 mm/madvise.c:1451 > > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > > do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83 > > entry_SYSCALL_64_after_hwframe+0x62/0x6a > > RIP: 0033:0x7f15a5e14b69 > > Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 > > RSP: 002b:00007ffde7b4a5c8 EFLAGS: 00000246 ORIG_RAX: 000000000000001c > > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f15a5e14b69 > > RDX: 0000000000000015 RSI: 0000000000c00304 RDI: 0000000020000000 > > RBP: 0000000000000000 R08: 00005555565c6610 R09: 00005555565c6610 > > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001 > > R13: 00007ffde7b4a808 R14: 0000000000000001 R15: 0000000000000001 > > </TASK> > > Modules linked in: > > ---[ end trace 0000000000000000 ]--- > > RIP: 0010:scatterwalk_start include/crypto/scatterwalk.h:63 [inline] > > RIP: 0010:scatterwalk_pagedone include/crypto/scatterwalk.h:83 [inline] > > RIP: 0010:scatterwalk_pagedone include/crypto/scatterwalk.h:72 [inline] > > RIP: 0010:scatterwalk_copychunks+0x3e0/0x560 crypto/scatterwalk.c:50 > > Code: f0 48 c1 e8 03 80 3c 08 00 0f 85 81 01 00 00 49 8d 44 24 08 4d 89 26 48 bf 00 00 00 00 00 fc ff df 48 89 44 24 10 48 c1 e8 03 <0f> b6 04 38 84 c0 74 08 3c 03 0f 8e 47 01 00 00 48 8b 44 24 08 41 > > RSP: 0018:ffffc90003a8ecf0 EFLAGS: 00010202 > > RAX: 0000000000000001 RBX: 0000000000000000 RCX: dffffc0000000000 > > RDX: ffff88802785d940 RSI: ffffffff8465df74 RDI: dffffc0000000000 > > RBP: 0000000000001000 R08: 0000000000000005 R09: 0000000000000000 > > R10: 0000000000000002 R11: 82d8bd1b6060f805 R12: 0000000000000000 > > R13: 0000000000000014 R14: ffffc90003a8ed88 R15: 0000000000001000 > > FS: 00005555565c5380(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 0000000000d5e538 CR3: 0000000079f3a000 CR4: 00000000003506f0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > ---------------- > > Code disassembly (best guess): > > 0: f0 48 c1 e8 03 lock shr $0x3,%rax > > 5: 80 3c 08 00 cmpb $0x0,(%rax,%rcx,1) > > 9: 0f 85 81 01 00 00 jne 0x190 > > f: 49 8d 44 24 08 lea 0x8(%r12),%rax > > 14: 4d 89 26 mov %r12,(%r14) > > 17: 48 bf 00 00 00 00 00 movabs $0xdffffc0000000000,%rdi > > 1e: fc ff df > > 21: 48 89 44 24 10 mov %rax,0x10(%rsp) > > 26: 48 c1 e8 03 shr $0x3,%rax > > * 2a: 0f b6 04 38 movzbl (%rax,%rdi,1),%eax <-- trapping instruction > > 2e: 84 c0 test %al,%al > > 30: 74 08 je 0x3a > > 32: 3c 03 cmp $0x3,%al > > 34: 0f 8e 47 01 00 00 jle 0x181 > > 3a: 48 8b 44 24 08 mov 0x8(%rsp),%rax > > 3f: 41 rex.B > > > > > > --- > > This report is generated by a bot. It may contain errors. > > See https://goo.gl/tpsmEJ for more information about syzbot. > > syzbot engineers can be reached at syzkaller@xxxxxxxxxxxxxxxx. > > > > syzbot will keep track of this issue. See: > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot. > > For information about bisection process see: https://goo.gl/tpsmEJ#bisection > > > > If the report is already addressed, let syzbot know by replying with: > > #syz fix: exact-commit-title > > > > If you want syzbot to run the reproducer, reply with: > > #syz test: git://repo/address.git branch-or-commit-hash > > If you attach or paste a git patch, syzbot will apply it before testing. > > > > If you want to overwrite report's subsystems, reply with: > > #syz set subsystems: new-subsystem > > (See the list of subsystem names on the web dashboard) > > > > If the report is a duplicate of another one, reply with: > > #syz dup: exact-subject-of-another-report > > > > If you want to undo deduplication, reply with: > > #syz undup >