On Tue, 10 Oct 2017 20:48:34 -0400, Changwei Ge wrote:
>
> Hi Andrew and Vitaly,
>
> I do agree that patch ee8f7fcbe638 ("ocfs2/dlm: continue to purge
> recovery lockres when recovery master goes down", 2016-08-02) introduced
> an issue. It means DLM recovery can't pick up a new master for an
> existing lock resource whose owner died seconds ago.
>
> But this patch truly solves another issue,
> so I think we can't just revert it; we should fix it instead.

I'm not saying we must revert the patch, I'm just stating that it broke
one of our use cases :)

Here's a tad more diagnosis:

The system is x86-64 Fedora 25 (I think) with kernel 4.14-rc4.

Cluster config:

# cat /etc/ocfs2/cluster.conf
cluster:
        node_count = 2
        name = cyccfs

node:
        number = 1
        cluster = cyccfs
        ip_port = 7777
        ip_address = 192.168.122.93
        name = node-A

node:
        number = 2
        cluster = cyccfs
        ip_port = 7777
        ip_address = 192.168.122.96
        name = node-B

Test case (run on both nodes):

# cat ocfs
#!/bin/bash
modprobe ocfs2_stack_o2cb
modprobe ocfs2_dlmfs
SYSTEMCTL_SKIP_REDIRECT=1 service o2cb start
SYSTEMCTL_SKIP_REDIRECT=1 service ocfs2 start
mount /dev/sdd /ocfs
touch /ocfs/file
while true; do
        dd if=/dev/urandom of=/ocfs/file bs=1 count=100 conv=notrunc oflag=append
        dd if=/ocfs/file of=/dev/null bs=1
done

Then send a hard reboot to either node - in this case it's A. Please
ignore the inconsistent dmesg timestamps - this is cut&paste from
multiple runs.

Node B keeps running, notices that A has passed away, loops with error
-107 until A reappears, and then the currently used lock becomes
unusable (as does the file behind the lock):

[   68.875788] ocfs2: Registered cluster interface o2cb
[   68.911161] OCFS2 User DLM kernel interface loaded
[   69.053471] o2hb: Heartbeat mode set to local
[   73.316460] o2net: Connected to node node-A (num 1) at 192.168.122.93:7777
[   77.750103] o2dlm: Joining domain 753F2064B4124242A46F1715538F8618
[   77.750103] (
[   77.753092] 1
[   77.754297] 2
[   77.755553] ) 2 nodes
[   77.811583] ocfs2: Mounting device (8,48) on (node 2, slot 1) with ordered data mode.
[  140.768968] o2net: Connection to node node-A (num 1) at 192.168.122.93:7777 shutdown, state 7
[  140.769968] o2net: No longer connected to node node-A (num 1) at 192.168.122.93:7777
[  140.777303] (dd,2323,1):dlm_send_remote_convert_request:420 ERROR: Error -112 when sending message 504 (key 0x5339c6e3) to node 1
[  140.782710] o2dlm: Waiting on the death of node 1 in domain 753F2064B4124242A46F1715538F8618
[  145.991182] (dd,2323,1):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  145.997284] o2dlm: Waiting on the death of node 1 in domain 753F2064B4124242A46F1715538F8618
[  151.111187] (dd,2323,1):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  151.120235] o2dlm: Waiting on the death of node 1 in domain 753F2064B4124242A46F1715538F8618
[  156.231307] (dd,2323,1):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  156.237260] o2dlm: Waiting on the death of node 1 in domain 753F2064B4124242A46F1715538F8618
[  161.351199] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  161.362485] o2dlm: Waiting on the death of node 1 in domain 753F2064B4124242A46F1715538F8618
[  166.983245] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  166.989088] o2dlm: Waiting on the death of node 1 in domain 753F2064B4124242A46F1715538F8618
[  170.975235] o2net: No connection established with node 1 after 30.0 seconds, check network and cluster configuration.
[  170.981640] o2net: Connection to node node-A (num 1) at 192.168.122.93:7777 shutdown, state 7
[  173.025099] o2net: Connection to node node-A (num 1) at 192.168.122.93:7777 shutdown, state 7
[  175.073443] o2net: Connection to node node-A (num 1) at 192.168.122.93:7777 shutdown, state 7
[  175.713830] o2cb: o2dlm has evicted node 1 from domain 753F2064B4124242A46F1715538F8618
[  175.713851] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  175.713851] o2dlm: Waiting on the death of node 1 in domain 753F2064B4124242A46F1715538F8618
[  175.847251] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
...
[  176.743289] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  176.799237] o2dlm: Waiting on the recovery of node 1 in domain 753F2064B4124242A46F1715538F8618
[  176.855271] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
...
[  180.023231] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  180.135277] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  180.191321] o2dlm: Begin recovery on domain 753F2064B4124242A46F1715538F8618 for node 1
[  180.197953] o2dlm: Node 2 (me) is the Recovery Master for the dead node 1 in domain 753F2064B4124242A46F1715538F8618
[  180.202519] o2dlm: End recovery on domain 753F2064B4124242A46F1715538F8618
[  180.247333] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  180.367312] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
...
[  187.815221] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  187.874086] ocfs2: Begin replay journal (node 1, slot 0) on device (8,48)
[  187.883851] ocfs2: End replay journal (node 1, slot 0) on device (8,48)
[  187.910708] ocfs2: Beginning quota recovery on device (8,48) for slot 0
[  187.927200] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  187.938099] ocfs2: Finishing quota recovery on device (8,48) for slot 0
[  188.039262] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
...
[  263.535262] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  263.647244] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  263.759228] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  263.780479] o2net: Connection to node node-A (num 1) at 192.168.122.93:7777 shutdown, state 7
[  265.826450] o2net: Connected to node node-A (num 1) at 192.168.122.93:7777
[  265.832021] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x5339c6e3) to node 1
...
[  269.121285] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x5339c6e3) to node 1
[  269.233402] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x5339c6e3) to node 1
[  269.345747] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x5339c6e3) to node 1
[  269.384183] o2dlm: Node 1 joins domain 753F2064B4124242A46F1715538F8618
[  269.384183] (
[  269.387089] 1
[  269.388762] 2
[  269.390137] ) 2 nodes
[  269.459165] (dd,2323,0):dlm_send_remote_convert_request:416 ERROR: dlm status = DLM_IVLOCKID
[  269.465496] (dd,2323,0):dlmconvert_remote:343 ERROR: dlm status = DLM_IVLOCKID
[  269.470268] (dd,2323,0):__ocfs2_cluster_lock:1549 ERROR: DLM error -22 while calling ocfs2_dlm_lock on resource W000000000000000000f0050de1f090
[  269.476655] (dd,2323,0):ocfs2_rw_lock:1740 ERROR: status = -22
[  269.479963] (dd,2323,0):ocfs2_file_write_iter:2244 ERROR: status = -22

Node A says it does not know such a lock:

[  308.107780] (kworker/u8:7,101,0):dlm_convert_lock_handler:513 ERROR: did not find lock to convert on grant queue! cookie=2:9
[  308.116505] lockres: W000000000000000000f0050de1f090, owner=1, state=0
[  308.121113] last used: 0, refcnt: 3, on purge list: no
[  308.123443] on dirty list: no, on reco list: no, migrating pending: no
[  308.129062] inflight locks: 0, asts reserved: 0
[  308.131417] refmap nodes: [
[  308.131417] ], inflight=0
[  308.139445] granted queue:
[  308.141671] type=5, conv=-1, node=1, cookie=1:10, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
[  308.149155] converting queue:
[  308.150387] blocked queue:

Trying to mount -o remount on B, it dies:

[  166.940431] (mount.ocfs2,2546,3):ocfs2_unlock_ast:1181 ERROR: Dlm passes error -22 for lock N0000000000000009, unlock_action 2
[  166.948393] (mount.ocfs2,2546,3):dlmunlock:688 ERROR: dlm status = DLM_IVLOCKID
[  166.952468] (mount.ocfs2,2546,3):ocfs2_drop_lock:3237 ERROR: DLM error -22 while calling ocfs2_dlm_unlock on resource N00000000000000090000f005
[  166.959220] (mount.ocfs2,2546,3):ocfs2_drop_lock:3238 ERROR: lockres flags: 194
[  166.963247] BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0
[  166.964247] IP: _raw_spin_lock+0xc/0x30
[  166.964247] PGD 0 P4D 0
[  166.964247] Oops: 0002 [#1] SMP
[  166.964247] Modules linked in: ocfs2 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue ghash_clmulni_intel i2c_piix4 crc32c_intel serio_raw 8139cp
[  166.964247] CPU: 3 PID: 2546 Comm: mount.ocfs2 Not tainted 4.14.0-rc4+ #57
[  166.964247] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[  166.964247] task: ffff880136073fc0 task.stack: ffffc90000c74000
[  166.964247] RIP: 0010:_raw_spin_lock+0xc/0x30
[  166.964247] RSP: 0018:ffffc90000c77c80 EFLAGS: 00010246
[  166.964247] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[  166.964247] RDX: 0000000000000001 RSI: ffff88013fd8cbd8 RDI: 00000000000000b0
[  166.964247] RBP: ffffc90000c77c90 R08: 0000000000000000 R09: 000000000000058d
[  166.964247] R10: ffffc90000c77cc0 R11: 0000000081ee5501 R12: ffffc90000c77cd8
[  166.964247] R13: ffff880136c9e660 R14: ffff880136c9e618 R15: 0000000000000246
[  166.964247] FS:  00007f2d8a3e3dc0(0000) GS:ffff88013fd80000(0000) knlGS:0000000000000000
[  166.964247] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  166.964247] CR2: 00000000000000b0 CR3: 0000000137a7e000 CR4: 00000000000006e0
[  166.964247] Call Trace:
[  166.964247]  ? dlm_print_one_lock+0x1a/0x30 [ocfs2_dlm]
[  166.964247]  o2cb_dump_lksb+0x12/0x20 [ocfs2_stack_o2cb]
[  166.964247]  ocfs2_dlm_dump_lksb+0x17/0x20 [ocfs2_stackglue]
[  166.964247]  ocfs2_drop_lock.isra.33.part.34+0x319/0x360 [ocfs2]
[  166.964247]  ocfs2_simple_drop_lockres+0x2d/0x40 [ocfs2]
[  166.964247]  ocfs2_dentry_lock_put+0x50/0x70 [ocfs2]
[  166.964247]  ocfs2_dentry_iput+0x3a/0xf0 [ocfs2]
[  166.964247]  dentry_unlink_inode+0xa7/0x150
[  166.964247]  __dentry_kill+0xc1/0x150
[  166.964247]  shrink_dentry_list+0x117/0x280
[  166.964247]  shrink_dcache_sb+0xc5/0x140
[  166.964247]  do_remount_sb+0x4d/0x1e0
[  166.964247]  do_mount+0x74c/0xcb0
[  166.964247]  ? copy_mount_options+0x2c/0x220
[  166.964247]  SyS_mount+0x83/0xd0
[  166.964247]  entry_SYSCALL_64_fastpath+0x13/0x94
[  166.964247] RIP: 0033:0x7f2d89b0e6fa
[  166.964247] RSP: 002b:00007ffe70f9ea38 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[  166.964247] RAX: ffffffffffffffda RBX: 00007ffe70f9da11 RCX: 00007f2d89b0e6fa
[  166.964247] RDX: 000055dd5b4d06ae RSI: 000055dd5b85e120 RDI: 000055dd5b85e3d0
[  166.964247] RBP: 0000000000000000 R08: 000055dd5b85e810 R09: 0000000000000001
[  166.964247] R10: 0000000000200020 R11: 0000000000000202 R12: 0000000000001001
[  166.964247] R13: 000055dd5b85e101 R14: 00007ffe70f9da10 R15: 0000000000000000
[  166.964247] Code: 7d 08 e8 88 4f a0 ff 31 c0 5d c3 ba 01 00 00 00 f0 0f b1 17 85 c0 75 dd b8 01 00 00 00 c3 90 66 66 66 66 90 31 c0 ba 01 00 00 00 <f0> 0f b1 17 85 c0 75 01 c3 55 89 c6 48 89 e5 e8 50 7d a4 ff 5d
[  166.964247] RIP: _raw_spin_lock+0xc/0x30 RSP: ffffc90000c77c80
[  166.964247] CR2: 00000000000000b0

A few minutes later A dies as well:

[   83.229138] (kworker/u8:8,102,0):dlm_unlock_lock_handler:514 ERROR: failed to find lock to unlock! cookie=1:9
[  113.631250] o2net: Connection to node node-A (num 1) at 192.168.122.93:7777 has been idle for 30.403 secs.
[  143.839155] o2net: Connection to node node-A (num 1) at 192.168.122.93:7777 has been idle for 30.193 secs.
[  146.466137] o2net: No longer connected to node node-A (num 1) at 192.168.122.93:7777
[  146.473923] o2cb: o2dlm has evicted node 1 from domain 753F2064B4124242A46F1715538F8618
[  147.935147] o2dlm: Begin recovery on domain 753F2064B4124242A46F1715538F8618 for node 1
[  147.941285] o2dlm: Node 2 (me) is the Recovery Master for the dead node 1 in domain 753F2064B4124242A46F1715538F8618
[  147.947347] o2dlm: End recovery on domain 753F2064B4124242A46F1715538F8618
[  158.592082] ------------[ cut here ]------------
[  158.594722] kernel BUG at fs/ocfs2/reservations.c:507!
[  158.597507] invalid opcode: 0000 [#1] SMP
[  158.598507] Modules linked in: ocfs2 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue ghash_clmulni_intel i2c_piix4 crc32c_intel serio_raw 8139cp
[  158.598507] CPU: 2 PID: 2485 Comm: dd Not tainted 4.14.0-rc4+ #57
[  158.598507] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[  158.598507] task: ffff880136f03300 task.stack: ffffc9000116c000
[  158.598507] RIP: 0010:__ocfs2_resv_find_window+0x3f9/0x440 [ocfs2]
[  158.598507] RSP: 0018:ffffc9000116f720 EFLAGS: 00010246
[  158.598507] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffffffffff
[  158.598507] RDX: 0000000000000000 RSI: ffffc9000116f748 RDI: 0000000000000000
[  158.598507] RBP: ffffc9000116f778 R08: 0000000000000000 R09: ffffc9000116f744
[  158.598507] R10: 0000000000000001 R11: ffff8801318a84c0 R12: 0000000000000000
[  158.598507] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88012da04400
[  158.598507] FS:  00007f869b3d8700(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000
[  158.598507] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  158.598507] CR2: 0000563858481000 CR3: 0000000135f0a000 CR4: 00000000000006e0
[  158.598507] Call Trace:
[  158.598507]  ocfs2_resmap_resv_bits+0xad/0x360 [ocfs2]
[  158.598507]  ocfs2_claim_local_alloc_bits+0xbe/0x490 [ocfs2]
[  158.598507]  __ocfs2_claim_clusters+0x136/0x250 [ocfs2]
[  158.598507]  ocfs2_add_clusters_in_btree+0xad/0x430 [ocfs2]
[  158.598507]  ? ocfs2_inode_cache_lock+0x12/0x20 [ocfs2]
[  158.598507]  ? ocfs2_metadata_cache_unlock+0x19/0x20 [ocfs2]
[  158.598507]  ? get_page_from_freelist+0x3ef/0xb90
[  158.598507]  ocfs2_add_inode_data+0x4b/0x60 [ocfs2]
[  158.598507]  ? ocfs2_allocate_extend_trans+0x160/0x160 [ocfs2]
[  158.598507]  ocfs2_write_cluster.isra.35+0x46f/0x760 [ocfs2]
[  158.598507]  ? __add_to_page_cache_locked+0x142/0x1d0
[  158.598507]  ? lru_cache_add+0x3a/0x80
[  158.598507]  ? add_to_page_cache_lru+0x6a/0xd0
[  158.598507]  ocfs2_write_begin_nolock+0x1381/0x17b0 [ocfs2]
[  158.598507]  ? ocfs2_inode_lock_full_nested+0x362/0x880 [ocfs2]
[  158.598507]  ? ocfs2_allocate_extend_trans+0x160/0x160 [ocfs2]
[  158.598507]  ocfs2_write_begin+0xd5/0x1b0 [ocfs2]
[  158.598507]  ? security_inode_need_killpriv+0x33/0x50
[  158.598507]  generic_perform_write+0xaa/0x180
[  158.598507]  __generic_file_write_iter+0x14f/0x1c0
[  158.598507]  ? ocfs2_inode_unlock+0xf4/0x110 [ocfs2]
[  158.598507]  ocfs2_file_write_iter+0x3dd/0xa40 [ocfs2]
[  158.598507]  ? memzero_explicit+0x12/0x20
[  158.598507]  __vfs_write+0xd2/0x130
[  158.598507]  vfs_write+0xb5/0x1a0
[  158.598507]  SyS_write+0x46/0xa0
[  158.598507]  entry_SYSCALL_64_fastpath+0x13/0x94
[  158.598507] RIP: 0033:0x7f869aef8c30
[  158.598507] RSP: 002b:00007ffc659ea698 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  158.752095] RAX: ffffffffffffffda RBX: 00007f869b1c1b38 RCX: 00007f869aef8c30
[  158.752095] RDX: 0000000000000001 RSI: 0000563858481000 RDI: 0000000000000001
[  158.752095] RBP: 00007f869b1c1ae0 R08: 001167e91233fc63 R09: 00007ffc659f3080
[  158.752095] R10: 0000563858480030 R11: 0000000000000246 R12: 00007f869b1c1b38
[  158.752095] R13: 0000000000002010 R14: 7fffffffffffffff R15: 0000000000002710
[  158.770296] Code: ff ff 45 8b 75 18 41 8b 55 1c e9 e1 fc ff ff 45 8b 75 18 41 8b 55 1c c7 45 b0 00 00 00 00 e9 cd fc ff ff 85 c0 0f 85 50 ff ff ff <0f> 0b 48 b8 00 00 00 00 00 00 00 10 48 8d 7d d0 45 89 c1 48 89
[  158.770296] RIP: __ocfs2_resv_find_window+0x3f9/0x440 [ocfs2] RSP: ffffc9000116f720

None of the above happens with the mentioned commit reverted.

Thanks!

--
wbr, Vitaly