On 8/10/22 5:50 PM, Olga Kornievskaia wrote:
Hi folks (Chuck/Jeff specifically),

We've had this outstanding kernel oops that happens (infrequently) in copy_offload testing (stack trace at the end). I've been trying to debug it for a while, adding printks and such. I can give a hand-wavy explanation of what I'm seeing, but I need help (a) nailing down exactly what the problem is and (b) getting a hint on how to address it.

OK, so what happens. Test case: the source file is opened and locked (blah-blah destination file), copy_notify is sent to the source server, the copy is done (src to dest), the source file is unlocked (etc. for the dest file), and the files are closed. Copy/Copy_notify uses a locking stateid. When unlocking happens, it triggers LOCKU and FREE_STATEID. The copy_notify stateid is associated with the locking stateid on the server. When the last reference on the locking stateid goes away, nfs4_put_stateid() also calls nfs4_free_cpntf_statelist(), which deals with cleaning up the copy_notify stateid. In the laundry thread, there is a failsafe: if for some reason the copy_notify state was not cleaned up or expired, it is deleted there.

However, in the failing case, where everything should be cleaned up as it's supposed to be, I instead see a call to put_ol_stateid_locked() (before FREE_STATEID is processed), which cleans up the parent but doesn't touch the copy_notify stateids. So the laundry thread runs, walks the copy_notify list (since it's not empty), and tries to free the entries, but that leads to this oops (since part of the memory was freed by put_ol_stateid_locked() along with the parent stateid?).

Perhaps the fix is to add nfs4_free_cpntf_statelist() to put_ol_stateid_locked(), which I tried, and it seems to work. But I'm not confident about it.

Thoughts?
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index fa67ecd5fe63..b988d3c4e5fd 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1408,6 +1408,7 @@ static void put_ol_stateid_locked(struct nfs4_ol_stateid *stp,
 	}
 
 	idr_remove(&clp->cl_stateids, s->sc_stateid.si_opaque.so_id);
+	nfs4_free_cpntf_statelist(clp->net, s);
 	list_add(&stp->st_locks, reaplist);
 }
In the use-after-free scenario, the copy_state is inserted on the sc_cp_list of the lock state. So when put_ol_stateid_locked() is called from nfsd4_close_open_stateid(), the nfs4_free_cpntf_statelist() call in your proposed patch does not remove any copy_state, since 's' is the open state.

Also, put_ol_stateid_locked() can return without calling idr_remove() and nfs4_free_cpntf_statelist() (your proposed fix).

-Dai
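The two objections above can be illustrated with a small userspace model. This is only a sketch, not kernel code: model_stid, put_stid_patched(), detach_children() and the list helpers are hypothetical stand-ins, and the refcounts are chosen for illustration. It assumes the copy-notify entry hangs off the lock stateid, as described above, and shows that a cleanup call placed at the end of the put path is skipped both when the stateid passed in is the open stateid and when the refcount has not yet reached zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Tiny circular doubly linked list, loosely modeled on the kernel's list_head. */
struct node { struct node *prev, *next; };

static void node_init(struct node *n) { n->prev = n->next = n; }
static bool list_is_empty(const struct node *head) { return head->next == head; }

static void list_append(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_unlink(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

/* Hypothetical stand-in for a stateid: a refcount plus a list of children. */
struct model_stid {
	int refcount;
	struct node cp_list;	/* plays the role of sc_cp_list */
};

static void stid_init(struct model_stid *s, int refcount)
{
	s->refcount = refcount;
	node_init(&s->cp_list);
}

/* Unlink every child from this stateid's list; returns how many were removed. */
static int detach_children(struct model_stid *s)
{
	int n = 0;

	while (!list_is_empty(&s->cp_list)) {
		list_unlink(s->cp_list.next);
		n++;
	}
	return n;
}

/*
 * Model of the patched put path. Returns -1 when it bails out early
 * (references remain), otherwise the number of children it detached.
 */
static int put_stid_patched(struct model_stid *s)
{
	if (--s->refcount > 0)
		return -1;	/* early return: the cleanup below is never reached */
	return detach_children(s);
}

static void run_demo(void)
{
	struct model_stid open_st, lock_st;
	struct node cp;		/* the copy-notify entry */

	stid_init(&open_st, 1);
	stid_init(&lock_st, 2);	/* assumed: one extra reference still held */
	node_init(&cp);
	list_append(&cp, &lock_st.cp_list);

	/* Put on the open stateid: its own list is empty, nothing is removed. */
	printf("put(open) -> %d\n", put_stid_patched(&open_st));
	/* Put on the lock stateid while referenced: early return. */
	printf("put(lock) -> %d\n", put_stid_patched(&lock_st));
	printf("entry still linked: %d\n", !list_is_empty(&lock_st.cp_list));
	/* Only the final put on the lock stateid detaches the entry. */
	printf("put(lock) -> %d\n", put_stid_patched(&lock_st));
}
```

Calling run_demo() from a main() walks through the three puts. In this model the entry stays linked until the final put on the stateid that actually owns it, which is why cleanup tied to one put path alone may not cover every case.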
[ 338.681529] ------------[ cut here ]------------
[ 338.683090] kernel BUG at lib/list_debug.c:53!
[ 338.684372] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
[ 338.685977] CPU: 1 PID: 493 Comm: kworker/u256:27 Tainted: G B 5.19.0-rc6+ #104
[ 338.688266] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[ 338.691019] Workqueue: nfsd4 laundromat_main [nfsd]
[ 338.692224] RIP: 0010:__list_del_entry_valid.cold.3+0x3d/0x53
[ 338.693626] Code: 0b 4c 89 e1 4c 89 ee 48 c7 c7 e0 1a e3 8f e8 5b 60 fe ff 0f 0b 48 89 e9 4c 89 ea 48 89 de 48 c7 c7 60 1a e3 8f e8 44 60 fe ff <0f> 0b 48 89 ea 48 89 de 48 c7 c7 00 1a e3 8f e8 30 60 fe ff 0f 0b
[ 338.697651] RSP: 0018:ffff88800d03fc68 EFLAGS: 00010286
[ 338.698762] RAX: 000000000000006d RBX: ffff888028a14798 RCX: 0000000000000000
[ 338.700442] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffffffff917e9240
[ 338.702257] RBP: ffff88801bb0ae90 R08: ffffed100a795f0e R09: ffffed100a795f0e
[ 338.704016] R10: ffff888053caf86b R11: ffffed100a795f0d R12: ffff88801bb0ae90
[ 338.705703] R13: d9c0013300000a39 R14: 000000000000005a R15: ffff88801b9f5800
[ 338.707510] FS: 0000000000000000(0000) GS:ffff888053c80000(0000) knlGS:0000000000000000
[ 338.709319] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 338.710715] CR2: 00005640baab74d0 CR3: 0000000017574005 CR4: 00000000001706e0
[ 338.712282] Call Trace:
[ 338.712898] <TASK>
[ 338.713430] _free_cpntf_state_locked+0x6b/0x120 [nfsd]
[ 338.714806] nfs4_laundromat+0x8ef/0xf30 [nfsd]
[ 338.716013] ? dequeue_entity+0x18b/0x6c0
[ 338.716970] ? release_lock_stateid+0x60/0x60 [nfsd]
[ 338.718169] ? _raw_spin_unlock+0x15/0x30
[ 338.719064] ? __switch_to+0x2fa/0x690
[ 338.719879] ? __schedule+0x67d/0xf20
[ 338.720678] laundromat_main+0x15/0x40 [nfsd]
[ 338.721760] process_one_work+0x3b4/0x6b0
[ 338.722629] worker_thread+0x5a/0x640
[ 338.723425] ? process_one_work+0x6b0/0x6b0
[ 338.724324] kthread+0x162/0x190
[ 338.725030] ? kthread_complete_and_exit+0x20/0x20
[ 338.726074] ret_from_fork+0x22/0x30
[ 338.726856] </TASK>