On 05.12.2021 13:12, Vasily Averin wrote:
> In 2006 Trond Myklebust added support for the FL_ACCESS flag in
> commit 01c3b861cd77 ("NLM,NFSv4: Wait on local locks before we put RPC
> calls on the wire"), as a result of which _nfs4_proc_setlk() began
> to execute _nfs4_do_setlk() with a modified request->fl_flags in which
> the FL_ACCESS flag was set.
>
> This did not matter until 2015, when commit c69899a17ca4 ("NFSv4:
> Update of VFS byte range lock must be atomic with the stateid update")
> added a do_vfs_lock() call to nfs4_locku_done().
> In this case nfs4_locku_done() uses the calldata->fl of struct
> nfs4_unlockdata. It is copied from struct nfs4_lockdata, which in turn
> uses the fl_flags copied from the request->fl_flags provided by
> _nfs4_do_setlk(), i.e. with the FL_ACCESS flag set.
>
> The FL_ACCESS flag is removed in nfs4_lock_done() in the non-cancelled
> case; however, the rpc task can be cancelled earlier.
>
> As a result flock_lock_inode() can be called with request->fl_type
> F_UNLCK and the FL_ACCESS flag set in fl_flags.
> Such a request is processed incorrectly: instead of the expected search
> for and removal of existing flocks, it jumps to the "find_conflict"
> label and can call locks_insert_block().
>
> On kernels before 2018 (i.e. before commit 7b587e1a5a6c
> ("NFS: use locks_copy_lock() to copy locks.")) this caused a BUG in
> __locks_insert_block(), because the copied fl had an incorrectly
> linked fl_block.
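For reference, this is roughly what the problematic path in
flock_lock_inode() looks like (an abridged sketch, not the verbatim
fs/locks.c source; declarations, locking and error handling are elided):

static int flock_lock_inode(struct inode *inode, struct file_lock *request)
{
	struct file_lock *fl;
	int error = 0;
	...
	/* An FL_ACCESS request jumps straight to the conflict check,
	 * so the F_UNLCK handling below is never reached. */
	if (request->fl_flags & FL_ACCESS)
		goto find_conflict;
	...
	/* Normal path: look up and remove the matching flock. */
	if (request->fl_type == F_UNLCK) {
		...
		goto out;
	}
	...
find_conflict:
	list_for_each_entry(fl, &ctx->flc_flock, fl_list) {
		if (!flock_locks_conflict(request, fl))
			continue;
		error = -EAGAIN;
		if (!(request->fl_flags & FL_SLEEP))
			goto out;
		/* With fl_flags = FL_SLEEP|FL_ACCESS|FL_FLOCK (0x8a) an
		 * F_UNLCK request lands here and is queued as a waiter
		 * instead of removing the existing lock. */
		error = FILE_LOCK_DEFERRED;
		locks_insert_block(fl, request);
		goto out;
	}
	...
out:
	...
	return error;
}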
Originally this was found while processing real customer bug reports on a
RHEL7-based OpenVz7 kernel:

kernel BUG at fs/locks.c:612!
CPU: 7 PID: 1019852 Comm: kworker/u65:43 ve: 0 Kdump: loaded Tainted: G W O ------------ 3.10.0-1160.41.1.vz7.183.5 #1 183.5
Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.3 05/23/2018
Workqueue: rpciod rpc_async_schedule [sunrpc]
task: ffff9d50e5de0000 ti: ffff9d3c9ec10000 task.ti: ffff9d3c9ec10000
RIP: 0010:[<ffffffffbe0d590a>] [<ffffffffbe0d590a>] __locks_insert_block+0xea/0xf0
RSP: 0018:ffff9d3c9ec13c78 EFLAGS: 00010297
RAX: 0000000000000000 RBX: ffff9d529554e180 RCX: 0000000000000001
RDX: 0000000000000001 RSI: ffff9d51d2363a98 RDI: ffff9d51d2363ab0
RBP: ffff9d3c9ec13c88 R08: 0000000000000003 R09: ffff9d5f5b8dfcd0
R10: ffff9d5f5b8dfd08 R11: ffffbb21594b5a80 R12: ffff9d51d2363a98
R13: 0000000000000000 R14: ffff9d50e5de0000 R15: ffff9d3da03915f8
FS:  0000000000000000(0000) GS:ffff9d55bfbc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f93d65ee1e8 CR3: 00000029a04d6000 CR4: 00000000000607e0
Call Trace:
 [<ffffffffbe0d5939>] locks_insert_block+0x29/0x40
 [<ffffffffbe0d6d5b>] flock_lock_inode_wait+0x2bb/0x310
 [<ffffffffc01c7470>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
 [<ffffffffbe0d6dce>] locks_lock_inode_wait+0x1e/0x40
 [<ffffffffc0c9f5c0>] nfs4_locku_done+0x90/0x190 [nfsv4]
 [<ffffffffc01bb750>] ? call_decode+0x1f0/0x880 [sunrpc]
 [<ffffffffc01c7470>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
 [<ffffffffc01c74a1>] rpc_exit_task+0x31/0x90 [sunrpc]
 [<ffffffffc01c9654>] __rpc_execute+0xe4/0x470 [sunrpc]
 [<ffffffffc01c99f2>] rpc_async_schedule+0x12/0x20 [sunrpc]
 [<ffffffffbdec1b25>] process_one_work+0x185/0x440
 [<ffffffffbdec27e6>] worker_thread+0x126/0x3c0
 [<ffffffffbdec26c0>] ? manage_workers.isra.26+0x2a0/0x2a0
 [<ffffffffbdec9e31>] kthread+0xd1/0xe0
 [<ffffffffbdec9d60>] ? create_kthread+0x60/0x60
 [<ffffffffbe5d2eb7>] ret_from_fork_nospec_begin+0x21/0x21
 [<ffffffffbdec9d60>] ? create_kthread+0x60/0x60
Code: 48 85 d2 49 89 54 24 08 74 04 48 89 4a 08 48 89 0c c5 c0 ee 09 bf 49 89 74 24 10 5b 41 5c 5d c3 90 49 8b 44 24 28 e9 80 ff ff ff <0f> 0b 0f 1f 40 00 66 66 66 66 90 55 48 89 e5 41 54 49 89 f4 53
RIP  [<ffffffffbe0d590a>] __locks_insert_block+0xea/0xf0
 RSP <ffff9d3c9ec13c78>

In the crash dump I've found that the nfs4_unlockdata and the (already
freed but not yet reused) nfs4_lockdata both have fl->fl_flags = 0x8a,
i.e. FL_SLEEP, FL_ACCESS and FL_FLOCK set.

Thank you,
	Vasily Averin