Apologies in advance, but I cannot git bisect this, since the machine had been running for 10 days on 6.6.8 before this happened. Reporting in case it's useful (and not a hardware failure).

There is nothing interesting in the journal ahead of the crash; the previous entry, 2 minutes prior, is from a user-space dhcp server.

 - Root, efi is on nvme
 - Spare root,efi is on sdg
 - md raid6 on sda-sd with lvmcache from one partition on nvme drive.
 - all filesystems are ext4 (other than efi).
 - 32 GB mem.

regards

gene

details attached which show:

Dec 30 07:00:36 s6 kernel:  <TASK>
Dec 30 07:00:36 s6 kernel:  ? __folio_mark_dirty+0x21c/0x2a0
Dec 30 07:00:36 s6 kernel:  ? __warn+0x81/0x130
Dec 30 07:00:36 s6 kernel:  ? __folio_mark_dirty+0x21c/0x2a0
Dec 30 07:00:36 s6 kernel:  ? report_bug+0x171/0x1a0
Dec 30 07:00:36 s6 kernel:  ? handle_bug+0x3c/0x80
Dec 30 07:00:36 s6 kernel:  ? exc_invalid_op+0x17/0x70
Dec 30 07:00:36 s6 kernel:  ? asm_exc_invalid_op+0x1a/0x20
Dec 30 07:00:36 s6 kernel:  ? __folio_mark_dirty+0x21c/0x2a0
Dec 30 07:00:36 s6 kernel:  block_dirty_folio+0x8a/0xb0
Dec 30 07:00:36 s6 kernel:  unmap_page_range+0xd17/0x1120
Dec 30 07:00:36 s6 kernel:  unmap_vmas+0xb5/0x190
Dec 30 07:00:36 s6 kernel:  exit_mmap+0xec/0x340
Dec 30 07:00:36 s6 kernel:  __mmput+0x3e/0x130
Dec 30 07:00:36 s6 kernel:  do_exit+0x31c/0xb20
Dec 30 07:00:36 s6 kernel:  do_group_exit+0x31/0x80
Dec 30 07:00:36 s6 kernel:  __x64_sys_exit_group+0x18/0x20
Dec 30 07:00:36 s6 kernel:  do_syscall_64+0x5d/0x90
Dec 30 07:00:36 s6 kernel:  ? count_memcg_events.constprop.0+0x1a/0x30
Dec 30 07:00:36 s6 kernel:  ? handle_mm_fault+0xa2/0x360
Dec 30 07:00:36 s6 kernel:  ? do_user_addr_fault+0x30f/0x660
Dec 30 07:00:36 s6 kernel:  ? exc_page_fault+0x7f/0x180
Dec 30 07:00:36 s6 kernel:  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Dec 30 07:00:36 s6 kernel: RIP: 0033:0x7fb3c581ee2d
Dec 30 07:00:36 s6 kernel: Code: Unable to access opcode bytes at 0x7fb3c581ee03.
Dec 30 07:00:36 s6 kernel: RSP: 002b:00007fff620541e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
Dec 30 07:00:36 s6 kernel: RAX: ffffffffffffffda RBX: 00007fb3c591efa8 RCX: 00007fb3c581ee2d
Dec 30 07:00:36 s6 kernel: RDX: 00000000000000e7 RSI: ffffffffffffff88 RDI: 0000000000000000
Dec 30 07:00:36 s6 kernel: RBP: 0000000000000002 R08: 0000000000000000 R09: 00007fb3c5924920
Dec 30 07:00:36 s6 kernel: R10: 00005650f2e615f0 R11: 0000000000000206 R12: 0000000000000000
Dec 30 07:00:36 s6 kernel: R13: 0000000000000000 R14: 00007fb3c591d680 R15: 00007fb3c591efc0
Dec 30 07:00:36 s6 kernel:  </TASK>

Dec 30 07:00:36 s6 kernel: ------------[ cut here ]------------
Dec 30 07:00:36 s6 kernel: WARNING: CPU: 0 PID: 521524 at mm/page-writeback.c:2668 __folio_mark_dirty (??:?)
Dec 30 07:00:36 s6 kernel: Modules linked in: algif_hash af_alg rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs nft_nat nft_chain_nat nf_nat nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables rpcrdma rdma>
Dec 30 07:00:36 s6 kernel:  async_xor rapl joydev async_tx intel_cstate mei_me nls_iso8859_1 vfat i2c_i801 xor cec snd raid6_pq libcrc32c intel_uncore mxm_wmi pcspkr e1000e i2c_smbus intel_wmi_thunderbolt soundcore mei>
Dec 30 07:00:36 s6 kernel: CPU: 0 PID: 521524 Comm: rsync Not tainted 6.6.8-stable-1 #13 d238f5ab6a206cdb0cc5cd72f8688230f23d58df
Dec 30 07:00:36 s6 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z370 Extreme4, BIOS P4.20 10/31/2019
Dec 30 07:00:36 s6 kernel: RIP: 0010:__folio_mark_dirty (??:?)
Dec 30 07:00:36 s6 kernel: Code: 89 fe e8 57 22 14 00 65 ff 0d b8 ff f2 62 0f 84 8d 00 00 00 49 8b 3c 24 e9 47 fe ff ff 4c 89 ff e8 b9 18 08 00 48 89 c6 eb 85 <0f> 0b e9 27 fe ff ff 48 8b 52 10 e9 56 ff ff ff 48 c7 04 >
All code
========
   0:   89 fe                   mov    %edi,%esi
   2:   e8 57 22 14 00          call   0x14225e
   7:   65 ff 0d b8 ff f2 62    decl   %gs:0x62f2ffb8(%rip)        # 0x62f2ffc6
   e:   0f 84 8d 00 00 00       je     0xa1
  14:   49 8b 3c 24             mov    (%r12),%rdi
  18:   e9 47 fe ff ff          jmp    0xfffffffffffffe64
  1d:   4c 89 ff                mov    %r15,%rdi
  20:   e8 b9 18 08 00          call   0x818de
  25:   48 89 c6                mov    %rax,%rsi
  28:   eb 85                   jmp    0xffffffffffffffaf
  2a:*  0f 0b                   ud2             <-- trapping instruction
  2c:   e9 27 fe ff ff          jmp    0xfffffffffffffe58
  31:   48 8b 52 10             mov    0x10(%rdx),%rdx
  35:   e9 56 ff ff ff          jmp    0xffffffffffffff90
  3a:   48                      rex.W
  3b:   c7                      .byte 0xc7
  3c:   04 00                   add    $0x0,%al

Code starting with the faulting instruction
===========================================
   0:   0f 0b                   ud2
   2:   e9 27 fe ff ff          jmp    0xfffffffffffffe2e
   7:   48 8b 52 10             mov    0x10(%rdx),%rdx
   b:   e9 56 ff ff ff          jmp    0xffffffffffffff66
  10:   48                      rex.W
  11:   c7                      .byte 0xc7
  12:   04 00                   add    $0x0,%al
Dec 30 07:00:36 s6 kernel: RSP: 0018:ffffc9000c037b00 EFLAGS: 00010046
Dec 30 07:00:36 s6 kernel: RAX: 02ffff6000008030 RBX: 0000000000000286 RCX: ffff8885d44dff08
Dec 30 07:00:36 s6 kernel: RDX: 0000000000000001 RSI: ffff88810d015ca8 RDI: ffff88810d015cb0
Dec 30 07:00:36 s6 kernel: RBP: ffff88810d015cb0 R08: ffff8885208c1300 R09: 0000000000000000
Dec 30 07:00:36 s6 kernel: R10: 0000000000000200 R11: 0000000000000002 R12: ffff88810d015ca8
Dec 30 07:00:36 s6 kernel: R13: 0000000000000001 R14: ffff88851ec72fc0 R15: ffffea00105c5e00
Dec 30 07:00:36 s6 kernel: FS: 0000000000000000(0000) GS:ffff88889ee00000(0000) knlGS:0000000000000000
Dec 30 07:00:36 s6 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 30 07:00:36 s6 kernel: CR2: 00007fb3c593b020 CR3: 0000000690e20003 CR4: 00000000003706f0
Dec 30 07:00:36 s6 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 30 07:00:36 s6 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Dec 30 07:00:36 s6 kernel: Call Trace:
Dec 30 07:00:36 s6 kernel:  <TASK>
Dec 30 07:00:36 s6 kernel:  ? __folio_mark_dirty (??:?)
Dec 30 07:00:36 s6 kernel:  ? __warn (??:?)
Dec 30 07:00:36 s6 kernel:  ? __folio_mark_dirty (??:?)
Dec 30 07:00:36 s6 kernel:  ? report_bug (??:?)
Dec 30 07:00:36 s6 kernel:  ? handle_bug (??:?)
Dec 30 07:00:36 s6 kernel:  ? exc_invalid_op (??:?)
Dec 30 07:00:36 s6 kernel:  ? asm_exc_invalid_op (??:?)
Dec 30 07:00:36 s6 kernel:  ? __folio_mark_dirty (??:?)
Dec 30 07:00:36 s6 kernel:  block_dirty_folio (??:?)
Dec 30 07:00:36 s6 kernel:  unmap_page_range (??:?)
Dec 30 07:00:36 s6 kernel:  unmap_vmas (??:?)
Dec 30 07:00:36 s6 kernel:  exit_mmap (??:?)
Dec 30 07:00:36 s6 kernel:  __mmput (??:?)
Dec 30 07:00:36 s6 kernel:  do_exit (??:?)
Dec 30 07:00:36 s6 kernel:  do_group_exit (??:?)
Dec 30 07:00:36 s6 kernel:  __x64_sys_exit_group (??:?)
Dec 30 07:00:36 s6 kernel:  do_syscall_64 (??:?)
Dec 30 07:00:36 s6 kernel:  ? count_memcg_events.constprop.0 (??:?)
Dec 30 07:00:36 s6 kernel:  ? handle_mm_fault (??:?)
Dec 30 07:00:36 s6 kernel:  ? do_user_addr_fault (??:?)
Dec 30 07:00:36 s6 kernel:  ? exc_page_fault (??:?)
Dec 30 07:00:36 s6 kernel:  entry_SYSCALL_64_after_hwframe (??:?)
Dec 30 07:00:36 s6 kernel: RIP: 0033:0x7fb3c581ee2d
Dec 30 07:00:36 s6 kernel: Code: Unable to access opcode bytes at 0x7fb3c581ee03.

Code starting with the faulting instruction
===========================================
Dec 30 07:00:36 s6 kernel: RSP: 002b:00007fff620541e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
Dec 30 07:00:36 s6 kernel: RAX: ffffffffffffffda RBX: 00007fb3c591efa8 RCX: 00007fb3c581ee2d
Dec 30 07:00:36 s6 kernel: RDX: 00000000000000e7 RSI: ffffffffffffff88 RDI: 0000000000000000
Dec 30 07:00:36 s6 kernel: RBP: 0000000000000002 R08: 0000000000000000 R09: 00007fb3c5924920
Dec 30 07:00:36 s6 kernel: R10: 00005650f2e615f0 R11: 0000000000000206 R12: 0000000000000000
Dec 30 07:00:36 s6 kernel: R13: 0000000000000000 R14: 00007fb3c591d680 R15: 00007fb3c591efc0
Dec 30 07:00:36 s6 kernel:  </TASK>
Dec 30 07:00:36 s6 kernel: ---[ end trace 0000000000000000 ]---
Dec 30 07:00:36 s6 kernel: BUG: Bad rss-counter state mm:000000008e24d57a type:MM_FILEPAGES val:-1
Dec 30 07:00:36 s6 kernel: BUG: Bad rss-counter state mm:000000008e24d57a type:MM_ANONPAGES val:1
Dec 30 07:02:23 s6 kernel: general protection fault, probably for non-canonical address 0x6d65532d66697975: 0000 [#1] PREEMPT SMP PTI
Dec 30 07:02:23 s6 kernel: CPU: 7 PID: 521578 Comm: rsync Tainted: G W 6.6.8-stable-1 #13 d238f5ab6a206cdb0cc5cd72f8688230f23d58df
Dec 30 07:02:23 s6 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z370 Extreme4, BIOS P4.20 10/31/2019
Dec 30 07:02:23 s6 kernel: RIP: 0010:__mod_memcg_lruvec_state (??:?)
Dec 30 07:02:23 s6 kernel: Code: ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 48 8b 8f 40 0b 00 00 48 63 c2 89 f6 48 c1 e6 03 <48> 8b 91 10 07 00 00 48 01 f2 65 48 01 02 48 03 b7 28 06 >
All code
========
   0:   ff 90 90 90 90 90       call   *-0x6f6f6f70(%rax)
   6:   90                      nop
   7:   90                      nop
   8:   90                      nop
   9:   90                      nop
   a:   90                      nop
   b:   90                      nop
   c:   90                      nop
   d:   90                      nop
   e:   90                      nop
   f:   90                      nop
  10:   90                      nop
  11:   66 0f 1f 00             nopw   (%rax)
  15:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
  1a:   48 8b 8f 40 0b 00 00    mov    0xb40(%rdi),%rcx
  21:   48 63 c2                movslq %edx,%rax
  24:   89 f6                   mov    %esi,%esi
  26:   48 c1 e6 03             shl    $0x3,%rsi
  2a:*  48 8b 91 10 07 00 00    mov    0x710(%rcx),%rdx         <-- trapping instruction
  31:   48 01 f2                add    %rsi,%rdx
  34:   65 48 01 02             add    %rax,%gs:(%rdx)
  38:   48                      rex.W
  39:   03                      .byte 0x3
  3a:   b7 28                   mov    $0x28,%bh
  3c:   06                      (bad)
        ...

Code starting with the faulting instruction
===========================================
   0:   48 8b 91 10 07 00 00    mov    0x710(%rcx),%rdx
   7:   48 01 f2                add    %rsi,%rdx
   a:   65 48 01 02             add    %rax,%gs:(%rdx)
   e:   48                      rex.W
   f:   03                      .byte 0x3
  10:   b7 28                   mov    $0x28,%bh
  12:   06                      (bad)
        ...
Dec 30 07:02:23 s6 kernel: RSP: 0018:ffffc9000c12fb68 EFLAGS: 00010206