On Thu, 2023-02-09 at 21:50 +0000, Matthew Wilcox wrote:
On Fri, Feb 10, 2023 at 08:30:02AM +1100, Dave Chinner wrote:
[cc willy, linux-mm, as it crashed walking the page cache in the
generic fault code]

I've seen this one occasionally, and I'm not sure what's going on.
I've never been able to reproduce it myself, and it seems to disappear
for the people who have been able to reproduce it ;-(

It is 100% my fault and definitely caused by large folios. In the
XArray, large folios are represented by a folio pointer in the lowest
index occupied by that folio and sibling entries in every other index,
which redirect lookups to the canonical (ie lowest) entry. This 0x42
that you've managed to find in the XArray is a sibling entry. It
says that the entry we're actually looking for is at offset 0x10 of
the node we're in.

Something similar was fixed in commit 63b1898fffcd, but that was a
sibling entry that ended up pointing to a node. You've *presumably*
hit some kind of temporary situation where the original sibling entry
is no longer pointing to the folio entry that it should be. However,
there's another possibility, which is that this is not a temporary
RCU-induced state, but we have corruption in the tree. If we do have
corruption, then you'll see an infinite loop instead of a crash.

If it's a temporary situation, this will fix it.
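For readers following the arithmetic in the explanation above, here is a minimal userspace sketch of how 0x42 decodes to offset 0x10. It assumes the XArray's internal-entry encoding of `(value << 2) | 2` and a 64-slot node (the common configuration). The `toy_` names are hypothetical stand-ins for illustration, not the kernel's own helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 64 slots per node, i.e. a chunk shift of 6. */
#define TOY_CHUNK_SIZE 64UL

/* Internal entries keep their value in the upper bits, with 0b10 in the low two. */
static inline uintptr_t toy_mk_internal(unsigned long v)
{
	return ((uintptr_t)v << 2) | 2;
}

static inline bool toy_is_internal(uintptr_t entry)
{
	return (entry & 3) == 2;
}

/* Recover the value stored in an internal entry. */
static inline unsigned long toy_to_internal(uintptr_t entry)
{
	return entry >> 2;
}

/* A sibling entry is an internal entry whose value is a slot offset. */
static inline bool toy_is_sibling(uintptr_t entry)
{
	return toy_is_internal(entry) &&
	       entry < toy_mk_internal(TOY_CHUNK_SIZE - 1);
}
```

With these definitions, `toy_is_sibling(0x42)` holds and `toy_to_internal(0x42)` yields 0x10, which matches the faulting address in the oops and the offset named above.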
I'm unfortunately not in a position to test a fix.
diff --git a/lib/xarray.c b/lib/xarray.c
index ea9ce1f0b386..4237a9647a6a 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -207,7 +207,8 @@ static void *xas_descend(struct xa_state *xas, struct xa_node *node)
 	if (xa_is_sibling(entry)) {
 		offset = xa_to_sibling(entry);
 		entry = xa_entry(xas->xa, node, offset);
-		if (node->shift && xa_is_node(entry))
+		if (xa_is_sibling(entry) ||
+		    (node->shift && xa_is_node(entry)))
 			entry = XA_RETRY_ENTRY;
 	}

Please do let me know ... you say it's happened twice, but how many
machine-hours did it take to hit twice?
That's hard to say. There are ~5 machines doing this work, the kernel was installed in early February, so around 1000 machine-hours, but what part of the time they were busy and how much of that they were running the triggering workload, I can't say.
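To make the effect of the proposed check easier to see outside the kernel, the following is a rough userspace model of the patched branch. It is a sketch under stated assumptions, not the kernel implementation: `struct toy_node`, `toy_load_slot()` and the other `toy_`/`TOY_` names are hypothetical, the node is a flat 64-slot array, and the retry marker is assumed to be the internal entry with value 256.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_CHUNK_SIZE  64UL
#define TOY_INTERNAL(v) ((((uintptr_t)(v)) << 2) | 2)
#define TOY_RETRY_ENTRY TOY_INTERNAL(256)	/* assumed retry marker */

/* Hypothetical flat model of one tree node. */
struct toy_node {
	unsigned int shift;			/* 0 for a leaf node */
	uintptr_t slots[TOY_CHUNK_SIZE];
};

static bool toy_is_internal(uintptr_t e)
{
	return (e & 3) == 2;
}

static bool toy_is_sibling(uintptr_t e)
{
	return toy_is_internal(e) && e < TOY_INTERNAL(TOY_CHUNK_SIZE - 1);
}

/* Node pointers are modelled as internal entries above the reserved range. */
static bool toy_is_node(uintptr_t e)
{
	return toy_is_internal(e) && e > 4096;
}

/* Follow at most one sibling redirect; ask the caller to retry if the
 * target is itself a sibling (the case the patch adds) or a child node. */
static uintptr_t toy_load_slot(const struct toy_node *node, unsigned int offset)
{
	uintptr_t entry = node->slots[offset];

	if (toy_is_sibling(entry)) {
		offset = (unsigned int)(entry >> 2);
		entry = node->slots[offset];
		if (toy_is_sibling(entry) ||
		    (node->shift && toy_is_node(entry)))
			entry = TOY_RETRY_ENTRY;
	}
	return entry;
}
```

In this model, a slot holding `TOY_INTERNAL(0x10)` resolves to whatever sits at offset 0x10. If that slot has meanwhile been overwritten with another sibling entry, the function returns `TOY_RETRY_ENTRY` rather than handing a value like 0x42 back to a caller that would dereference it.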
On Thu, Feb 09, 2023 at 10:43:10AM +0200, Avi Kivity wrote:
Workload: compilation and running unit tests. The task that crashed is
a unit test.

Kernel: 6.1.8-200.fc37.x86_64

Previously known stable on 5.8.9-200.fc32.x86_64. Two crashes seen so
far.

Feb 7 17:19:33 localhost kernel: BUG: kernel NULL pointer dereference, address: 0000000000000042
Feb 7 17:19:33 localhost kernel: #PF: supervisor read access in kernel mode
Feb 7 17:19:33 localhost kernel: #PF: error_code(0x0000) - not-present page
Feb 7 17:19:33 localhost kernel: PGD 80000001cbb1f067 P4D 80000001cbb1f067 PUD 9cbb75067 PMD 0
Feb 7 17:19:33 localhost kernel: Oops: 0000 [#1] PREEMPT SMP PTI
Feb 7 17:19:33 localhost kernel: CPU: 24 PID: 3718328 Comm: transport_test Tainted: G S 6.1.8-200.fc37.x86_64 #1
Feb 7 17:19:33 localhost kernel: Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 2.9.1 12/04/2018
Feb 7 17:19:33 localhost kernel: RIP: 0010:next_uptodate_page+0x46/0x200
Feb 7 17:19:33 localhost kernel: Code: 0f 84 3f 01 00 00 48 81 ff 06 04 00 00 0f 84 b3 00 00 00 48 81 ff 02 04 00 00 0f 84 37 01 00 00 40 f6 c7 01 0f 85 9c 00 00 00 <48> 8b 07 a8 01 0f 85 91 00 00 00 8b 47 34 85 c0 0f 84 86 00 00 00
Feb 7 17:19:33 localhost kernel: RSP: 0000:ffffa83e4ed67cc8 EFLAGS: 00010246
Feb 7 17:19:33 localhost kernel: RAX: 0000000000000042 RBX: ffffa83e4ed67e00 RCX: 000000000000146e
Feb 7 17:19:33 localhost kernel: RDX: ffffa83e4ed67d20 RSI: ffff94a9046316b0 RDI: 0000000000000042
Feb 7 17:19:33 localhost kernel: RBP: ffffa83e4ed67d20 R08: 000000000000146e R09: 0000000000dfd000
Feb 7 17:19:33 localhost kernel: R10: 000000000000145f R11: ffff94978b85960c R12: ffff94a9046316b0
Feb 7 17:19:33 localhost kernel: R13: 000000000000146e R14: ffff94a9046316b0 R15: ffff948f8bb1f000
Feb 7 17:19:33 localhost kernel: FS: 00007fd68fcb9d40(0000) GS:ffff949dffd00000(0000) knlGS:0000000000000000
Feb 7 17:19:33 localhost kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 7 17:19:33 localhost kernel: CR2: 0000000000000042 CR3: 00000001dc1be005 CR4: 00000000001706e0
Feb 7 17:19:33 localhost kernel: Call Trace:
Feb 7 17:19:33 localhost kernel: <TASK>
Feb 7 17:19:33 localhost kernel: filemap_map_pages+0x9f/0x7b0
Feb 7 17:19:33 localhost kernel: xfs_filemap_map_pages+0x41/0x60 [xfs]
Feb 7 17:19:33 localhost kernel: do_fault+0x1bf/0x430
Feb 7 17:19:33 localhost kernel: __handle_mm_fault+0x63d/0xe40
Feb 7 17:19:33 localhost kernel: ? do_sigaction+0x11a/0x240
Feb 7 17:19:33 localhost kernel: handle_mm_fault+0xdb/0x2d0
Feb 7 17:19:33 localhost kernel: do_user_addr_fault+0x1cd/0x690
Feb 7 17:19:33 localhost kernel: exc_page_fault+0x70/0x170
Feb 7 17:19:33 localhost kernel: asm_exc_page_fault+0x22/0x30
Feb 7 17:19:33 localhost kernel: RIP: 0033:0x1666350
Feb 7 17:19:33 localhost kernel: Code: Unable to access opcode bytes at 0x1666326.
Feb 7 17:19:33 localhost kernel: RSP: 002b:00007ffde7fa86d8 EFLAGS: 00010212
Feb 7 17:19:33 localhost kernel: RAX: 0000000000000000 RBX: 00007ffde7fa8748 RCX: 0000000002ed4468
Feb 7 17:19:33 localhost kernel: RDX: 00006000000c4f50 RSI: 00007ffde7fa8748 RDI: 0000000000000012
Feb 7 17:19:33 localhost kernel: RBP: 0000000000000012 R08: 0000000000000001 R09: 0000000002f46860
Feb 7 17:19:33 localhost kernel: R10: 00007fd69219cac0 R11: 00007fd69224e670 R12: 0000000000000000
Feb 7 17:19:33 localhost kernel: R13: 00006000000c4f50 R14: 0000000002ed4470 R15: 00007fd693be0000
Feb 7 17:19:33 localhost kernel: </TASK>
Feb 7 17:19:33 localhost kernel: Modules linked in: xsk_diag veth tls xt_conntrack xt_MASQUERADE nf_conntrack_netlink xt_addrtype nft_compat br_netfilter bridge stp llc intel_rapl_msr dell_wmi iTCO_wdt dell_smbios intel_pmc_bxt iTCO_vendor_support dell_wmi_descriptor ledtrig_audio sparse_keymap video dcdbas intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm ipmi_ssif irqbypass rapl intel_cstate intel_uncore ipmi_si ipmi_devintf ipmi_msghandler nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill overlay ip_set nf_tables nfnetlink qrtr acpi_power_meter mxm_wmi mei_me mei lpc_ich auth_rpcgss ip6_tables ip_tables sunrpc zram xfs crct10dif_pclmul crc32_pclmul nvme crc32c_intel polyval_clmulni polyval_generic ixgbe ghash_clmulni_intel nvme_core sha512_ssse3 megaraid_sas tg3 mgag200 mdio nvme_common dca wmi scsi_dh_rdac scsi_dh_emc scsi_dh_alua
Feb 7 17:19:33 localhost kernel: dm_multipath fuse
Feb 7 17:19:33 localhost kernel: CR2: 0000000000000042
Feb 7 17:19:33 localhost kernel: ---[ end trace 0000000000000000 ]---
Feb 7 17:19:33 localhost kernel: RIP: 0010:next_uptodate_page+0x46/0x200
Feb 7 17:19:33 localhost kernel: Code: 0f 84 3f 01 00 00 48 81 ff 06 04 00 00 0f 84 b3 00 00 00 48 81 ff 02 04 00 00 0f 84 37 01 00 00 40 f6 c7 01 0f 85 9c 00 00 00 <48> 8b 07 a8 01 0f 85 91 00 00 00 8b 47 34 85 c0 0f 84 86 00 00 00
Feb 7 17:19:33 localhost kernel: RSP: 0000:ffffa83e4ed67cc8 EFLAGS: 00010246
Feb 7 17:19:33 localhost kernel: RAX: 0000000000000042 RBX: ffffa83e4ed67e00 RCX: 000000000000146e
Feb 7 17:19:33 localhost kernel: RDX: ffffa83e4ed67d20 RSI: ffff94a9046316b0 RDI: 0000000000000042
Feb 7 17:19:33 localhost kernel: RBP: ffffa83e4ed67d20 R08: 000000000000146e R09: 0000000000dfd000
Feb 7 17:19:33 localhost kernel: R10: 000000000000145f R11: ffff94978b85960c R12: ffff94a9046316b0
Feb 7 17:19:33 localhost kernel: R13: 000000000000146e R14: ffff94a9046316b0 R15: ffff948f8bb1f000
Feb 7 17:19:33 localhost kernel: FS: 00007fd68fcb9d40(0000) GS:ffff949dffd00000(0000) knlGS:0000000000000000

--
Dave Chinner