The problem with the recent netfs/folio series is easy to repro, and
doesn't show up if I remove the mempools patch:

Author: David Howells <dhowells@xxxxxxxxxx>
Date:   Fri Mar 15 18:03:30 2024 +0000

    cifs: Add mempools for cifs_io_request and cifs_io_subrequest structs

    Add mempools for the allocation of cifs_io_request and cifs_io_subrequest
    structs for netfslib to use so that it can guarantee eventual allocation in
    writeback.

Repro is just to do modprobe and then rmmod

[root@fedora29 xfstests-dev]# modprobe cifs
[root@fedora29 xfstests-dev]# dmesg -c
[ 589.547809] Key type cifs.spnego registered
[ 589.547857] Key type cifs.idmap registered
[root@fedora29 xfstests-dev]# rmmod cifs
Segmentation fault

[ 593.793058] RIP: 0010:free_large_kmalloc+0x78/0xb0
[ 593.793063] Code: 74 0a 5d 41 5c 41 5d c3 cc cc cc cc 48 89 ef 5d 41 5c 41 5d e9 99 06 f4 ff 48 c7 c6 50 cf 38 9d 48 89 ef e8 7a f4 f8 ff 0f 0b <0f> 0b 80 3d a6 3d 91 02 00 41 bc 00 f0 ff ff 75 a2 4c 89 ee 48 c7
[ 593.793068] RSP: 0018:ff1100011ceafe00 EFLAGS: 00010246
[ 593.793074] RAX: 0017ffffc0000000 RBX: 1fe22000239d5fc6 RCX: dffffc0000000000
[ 593.793078] RDX: ffd4000009265808 RSI: ffffffffc1960140 RDI: ffd4000009265800
[ 593.793082] RBP: ffd4000009265800 R08: ffffffff9b287a70 R09: 0000000000000001
[ 593.793086] R10: ffffffff9df472e7 R11: 0000000000000001 R12: ffffffffc195ff60
[ 593.793090] R13: ffffffffc1960140 R14: 0000000000000000 R15: 0000000000000000
[ 593.793093] FS: 00007fd5849cc280(0000) GS:ff110004cb200000(0000) knlGS:0000000000000000
[ 593.793098] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 593.793101] CR2: 000055c6c44c7d58 CR3: 000000010da2a004 CR4: 0000000000371ef0
[ 593.793110] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 593.793114] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 593.793118] Call Trace:
[ 593.793121] <TASK>
[ 593.793125] ? __warn+0xa4/0x220
[ 593.793133] ? free_large_kmalloc+0x78/0xb0
[ 593.793140] ? report_bug+0x1d4/0x1e0
[ 593.793151] ? handle_bug+0x42/0x80
[ 593.793158] ? exc_invalid_op+0x18/0x50
[ 593.793164] ? asm_exc_invalid_op+0x1a/0x20
[ 593.793178] ? rcu_is_watching+0x20/0x50
[ 593.793188] ? free_large_kmalloc+0x78/0xb0
[ 593.793197] exit_cifs+0x89/0x6a0 [cifs]
[ 593.793363] __do_sys_delete_module.constprop.0+0x23f/0x450
[ 593.793370] ? __pfx___do_sys_delete_module.constprop.0+0x10/0x10
[ 593.793375] ? mark_held_locks+0x24/0x90
[ 593.793383] ? __x64_sys_close+0x54/0xa0
[ 593.793388] ? lockdep_hardirqs_on_prepare+0x139/0x200
[ 593.793394] ? kasan_quarantine_put+0x97/0x1f0
[ 593.793404] ? mark_held_locks+0x24/0x90
[ 593.793414] do_syscall_64+0x78/0x180
[ 593.793421] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 593.793427] RIP: 0033:0x7fd584aecd4b
[ 593.793433] Code: 73 01 c3 48 8b 0d 3d 11 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0d 11 0c 00 f7 d8 64 89 01 48
[ 593.793437] RSP: 002b:00007ffe0a36ec18 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 593.793443] RAX: ffffffffffffffda RBX: 000055c6c44bd7a0 RCX: 00007fd584aecd4b
[ 593.793447] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055c6c44bd808
[ 593.793451] RBP: 0000000000000000 R08: 00007ffe0a36db91 R09: 0000000000000000
[ 593.793454] R10: 00007fd584b5eae0 R11: 0000000000000206 R12: 00007ffe0a36ee40
[ 593.793458] R13: 00007ffe0a3706d1 R14: 000055c6c44bd260 R15: 000055c6c44bd7a0
[ 593.793474] </TASK>
[ 593.793477] irq event stamp: 12729
[ 593.793480] hardirqs last enabled at (12735): [<ffffffff9b25d2eb>] console_unlock+0x15b/0x170
[ 593.793487] hardirqs last disabled at (12740): [<ffffffff9b25d2d0>] console_unlock+0x140/0x170
[ 593.793492] softirqs last enabled at (11910): [<ffffffff9b16499e>] __irq_exit_rcu+0xfe/0x120
[ 593.793498] softirqs last disabled at (11901): [<ffffffff9b16499e>] __irq_exit_rcu+0xfe/0x120
[ 593.793503] ---[ end trace 0000000000000000 ]---
[ 593.793546] object pointer: 0x00000000da6e868b
[ 593.793550] ==================================================================
[ 593.793553] BUG: KASAN: invalid-free in exit_cifs+0x89/0x6a0 [cifs]
[ 593.793698] Free of addr ffffffffc1960140 by task rmmod/1306
[ 593.793703] CPU: 4 PID: 1306 Comm: rmmod Tainted: G W 6.9.0 #1
[ 593.793707] Hardware name: Red Hat KVM, BIOS 1.16.1-1.el9 04/01/2014
[ 593.793709] Call Trace:
[ 593.793711] <TASK>
[ 593.793714] dump_stack_lvl+0x79/0xb0
[ 593.793718] print_report+0xcb/0x620
[ 593.793724] ? exit_cifs+0x89/0x6a0 [cifs]
[ 593.793861] ? exit_cifs+0x89/0x6a0 [cifs]
[ 593.794002] kasan_report_invalid_free+0x9a/0xc0
[ 593.794008] ? exit_cifs+0x89/0x6a0 [cifs]
[ 593.794173] free_large_kmalloc+0x38/0xb0
[ 593.794178] exit_cifs+0x89/0x6a0 [cifs]
[ 593.794327] __do_sys_delete_module.constprop.0+0x23f/0x450
[ 593.794331] ? __pfx___do_sys_delete_module.constprop.0+0x10/0x10
[ 593.794335] ? mark_held_locks+0x24/0x90
[ 593.794339] ? __x64_sys_close+0x54/0xa0
[ 593.794342] ? lockdep_hardirqs_on_prepare+0x139/0x200
[ 593.794347] ? kasan_quarantine_put+0x97/0x1f0
[ 593.794352] ? mark_held_locks+0x24/0x90
[ 593.794357] do_syscall_64+0x78/0x180
[ 593.794361] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 593.794367] RIP: 0033:0x7fd584aecd4b
[ 593.794370] Code: 73 01 c3 48 8b 0d 3d 11 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0d 11 0c 00 f7 d8 64 89 01 48
[ 593.794373] RSP: 002b:00007ffe0a36ec18 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 593.794377] RAX: ffffffffffffffda RBX: 000055c6c44bd7a0 RCX: 00007fd584aecd4b
[ 593.794380] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055c6c44bd808
[ 593.794382] RBP: 0000000000000000 R08: 00007ffe0a36db91 R09: 0000000000000000
[ 593.794385] R10: 00007fd584b5eae0 R11: 0000000000000206 R12: 00007ffe0a36ee40
[ 593.794387] R13: 00007ffe0a3706d1 R14: 000055c6c44bd260 R15: 000055c6c44bd7a0
[ 593.794394] </TASK>
[ 593.794398] The buggy address belongs to the variable:
[ 593.794399] cifs_io_subrequest_pool+0x0/0xfffffffffff3dec0 [cifs]
[ 593.794557] Memory state around the buggy address:
[ 593.794559]  ffffffffc1960000: 00 00 f9 f9 f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9
[ 593.794562]  ffffffffc1960080: 00 00 f9 f9 f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9
[ 593.794565] >ffffffffc1960100: 00 00 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
[ 593.794567]                                            ^
[ 593.794570]  ffffffffc1960180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f9
[ 593.794572]  ffffffffc1960200: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
[ 593.794575] ==================================================================

On Sat, May 11, 2024 at 12:59 PM Steve French <smfrench@xxxxxxxxx> wrote:
>
> This was running against linux-next as of about an hour ago
>
> On Sat, May 11, 2024 at 12:53 PM Steve French <smfrench@xxxxxxxxx> wrote:
> >
> > Tried running the regression tests against for-next and saw crash
> > early in the test run in
> >
> > # FS QA Test No. cifs/006
> > #
> > # check deferred closes on handles of deleted files
> > #
> > umount: /mnt/test: not mounted.
> > umount: /mnt/test: not mounted.
> > umount: /mnt/scratch: not mounted.
> > umount: /mnt/scratch: not mounted.
> > ./run-xfstests.sh: line 25: 4556 Segmentation fault rmmod cifs
> > modprobe: ERROR: could not insert 'cifs': Device or resource busy
> >
> > More information here:
> > http://smb311-linux-testing.southcentralus.cloudapp.azure.com/#/builders/5/builds/123/steps/14/logs/stdio
> >
> > Are you also seeing that? There are not many likely candidates for
> > what patch is causing the problem (could be related to the folios
> > changes) e.g.
> >
> > 7c1ac89480e8 cifs: Enable large folio support
> > 3ee1a1fc3981 cifs: Cut over to using netfslib
> > 69c3c023af25 cifs: Implement netfslib hooks
> > c20c0d7325ab cifs: Make add_credits_and_wake_if() clear deducted credits
> > edea94a69730 cifs: Add mempools for cifs_io_request and
> > cifs_io_subrequest structs
> > 3758c485f6c9 cifs: Set zero_point in the copy_file_range() and
> > remap_file_range()
> > 1a5b4edd97ce cifs: Move cifs_loose_read_iter() and
> > cifs_file_write_iter() to file.c
> > dc5939de82f1 cifs: Replace the writedata replay bool with a netfs sreq flag
> > 56257334e8e0 cifs: Make wait_mtu_credits take size_t args
> > ab58fbdeebc7 cifs: Use more fields from netfs_io_subrequest
> > a975a2f22cdc cifs: Replace cifs_writedata with a wrapper around
> > netfs_io_subrequest
> > 753b67eb630d cifs: Replace cifs_readdata with a wrapper around
> > netfs_io_subrequest
> > 0f7c0f3f5150 cifs: Use alternative invalidation to using launder_folio
> > 2e9d7e4b984a mm: Remove the PG_fscache alias for PG_private_2
> >
> > Any ideas?
> >
> > --
> > Thanks,
> >
> > Steve
>
>
>
> --
> Thanks,
>
> Steve

--
Thanks,

Steve