Re: mmotm 2016-08-02-15-53 uploaded

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 10 Oct 2017 20:48:34 -0400,
Changwei Ge wrote:
> 
> Hi Andrew and Vitaly,
> 
> I do agree that patch ee8f7fcbe638 ("ocfs2/dlm: continue to purge 
> recovery lockres when recovery master goes down", 2016-08-02) introduced 
> an issue. It makes DLM recovery can't pick up a new master for an 
> existed lock resource whose owner died seconds ago.
> 
> But this patch truly solves another issue.
> So I think we can't just revert this patch but to give a fix to it.

I'm not saying we must revert the patch, I just stating it broke one of use cases :)

Here's a tad more diagnosis:

System is x86-64 Fedora 25 (I think) with 4.14-rc4.

Cluster config:

# cat /etc/ocfs2/cluster.conf 
cluster:
        node_count = 2
        name = cyccfs

node:
        number = 1
        cluster = cyccfs
        ip_port = 7777
        ip_address = 192.168.122.93
        name = node-A

node:
        number = 2
        cluster = cyccfs
        ip_port = 7777
        ip_address = 192.168.122.96
        name = node-B

Test case (run on both nodes):

# cat ocfs
#!/bin/bash

modprobe ocfs2_stack_o2cb
modprobe ocfs2_dlmfs
SYSTEMCTL_SKIP_REDIRECT=1 service o2cb start
SYSTEMCTL_SKIP_REDIRECT=1 service ocfs2 start
mount /dev/sdd /ocfs

touch /ocfs/file
while true; do
    dd if=/dev/urandom of=/ocfs/file bs=1 count=100 conv=notrunc oflag=append
    dd if=/ocfs/file of=/dev/null bs=1
done


Then send hard reboot to either node - in this case it's A.

Please ignore inconsistent dmesg timestamps - cut&paste from multiple runs.

Node B runs, notices A passed away, loops with error -107 until A reappears, then currently used lock becomes unusable (file behind the lock too):

[   68.875788] ocfs2: Registered cluster interface o2cb
[   68.911161] OCFS2 User DLM kernel interface loaded
[   69.053471] o2hb: Heartbeat mode set to local
[   73.316460] o2net: Connected to node node-A (num 1) at 192.168.122.93:7777
[   77.750103] o2dlm: Joining domain 753F2064B4124242A46F1715538F8618 
[   77.750103] ( 
[   77.753092] 1 
[   77.754297] 2 
[   77.755553] ) 2 nodes
[   77.811583] ocfs2: Mounting device (8,48) on (node 2, slot 1) with ordered data mode.
[  140.768968] o2net: Connection to node node-A (num 1) at 192.168.122.93:7777 shutdown, state 7
[  140.769968] o2net: No longer connected to node node-A (num 1) at 192.168.122.93:7777
[  140.777303] (dd,2323,1):dlm_send_remote_convert_request:420 ERROR: Error -112 when sending message 504 (key 0x5339c6e3) to node 1
[  140.782710] o2dlm: Waiting on the death of node 1 in domain 753F2064B4124242A46F1715538F8618
[  145.991182] (dd,2323,1):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  145.997284] o2dlm: Waiting on the death of node 1 in domain 753F2064B4124242A46F1715538F8618
[  151.111187] (dd,2323,1):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  151.120235] o2dlm: Waiting on the death of node 1 in domain 753F2064B4124242A46F1715538F8618
[  156.231307] (dd,2323,1):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  156.237260] o2dlm: Waiting on the death of node 1 in domain 753F2064B4124242A46F1715538F8618
[  161.351199] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  161.362485] o2dlm: Waiting on the death of node 1 in domain 753F2064B4124242A46F1715538F8618
[  166.983245] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  166.989088] o2dlm: Waiting on the death of node 1 in domain 753F2064B4124242A46F1715538F8618
[  170.975235] o2net: No connection established with node 1 after 30.0 seconds, check network and cluster configuration.
[  170.981640] o2net: Connection to node node-A (num 1) at 192.168.122.93:7777 shutdown, state 7
[  173.025099] o2net: Connection to node node-A (num 1) at 192.168.122.93:7777 shutdown, state 7
[  175.073443] o2net: Connection to node node-A (num 1) at 192.168.122.93:7777 shutdown, state 7
[  175.713830] o2cb: o2dlm has evicted node 1 from domain 753F2064B4124242A46F1715538F8618
[  175.713851] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  175.713851] o2dlm: Waiting on the death of node 1 in domain 753F2064B4124242A46F1715538F8618
[  175.847251] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
...
[  176.743289] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  176.799237] o2dlm: Waiting on the recovery of node 1 in domain 753F2064B4124242A46F1715538F8618
[  176.855271] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
...
[  180.023231] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  180.135277] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  180.191321] o2dlm: Begin recovery on domain 753F2064B4124242A46F1715538F8618 for node 1
[  180.197953] o2dlm: Node 2 (me) is the Recovery Master for the dead node 1 in domain 753F2064B4124242A46F1715538F8618
[  180.202519] o2dlm: End recovery on domain 753F2064B4124242A46F1715538F8618
[  180.247333] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  180.367312] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
...
[  187.815221] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  187.874086] ocfs2: Begin replay journal (node 1, slot 0) on device (8,48)
[  187.883851] ocfs2: End replay journal (node 1, slot 0) on device (8,48)
[  187.910708] ocfs2: Beginning quota recovery on device (8,48) for slot 0
[  187.927200] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  187.938099] ocfs2: Finishing quota recovery on device (8,48) for slot 0
[  188.039262] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
...
[  263.535262] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  263.647244] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  263.759228] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x5339c6e3) to node 1
[  263.780479] o2net: Connection to node node-A (num 1) at 192.168.122.93:7777 shutdown, state 7
[  265.826450] o2net: Connected to node node-A (num 1) at 192.168.122.93:7777
[  265.832021] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x5339c6e3) to node 1
...
[  269.121285] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x5339c6e3) to node 1
[  269.233402] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x5339c6e3) to node 1
[  269.345747] (dd,2323,0):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x5339c6e3) to node 1
[  269.384183] o2dlm: Node 1 joins domain 753F2064B4124242A46F1715538F8618 
[  269.384183] ( 
[  269.387089] 1 
[  269.388762] 2 
[  269.390137] ) 2 nodes
[  269.459165] (dd,2323,0):dlm_send_remote_convert_request:416 ERROR: dlm status = DLM_IVLOCKID
[  269.465496] (dd,2323,0):dlmconvert_remote:343 ERROR: dlm status = DLM_IVLOCKID
[  269.470268] (dd,2323,0):__ocfs2_cluster_lock:1549 ERROR: DLM error -22 while calling ocfs2_dlm_lock on resource W000000000000000000f0050de1f090
[  269.476655] (dd,2323,0):ocfs2_rw_lock:1740 ERROR: status = -22
[  269.479963] (dd,2323,0):ocfs2_file_write_iter:2244 ERROR: status = -22



node A says it does not know such lock:

[  308.107780] (kworker/u8:7,101,0):dlm_convert_lock_handler:513 ERROR: did not find lock to convert on grant queue! cookie=2:9
[  308.116505] lockres: W000000000000000000f0050de1f090, owner=1, state=0
[  308.121113]   last used: 0, refcnt: 3, on purge list: no
[  308.123443]   on dirty list: no, on reco list: no, migrating pending: no
[  308.129062]   inflight locks: 0, asts reserved: 0
[  308.131417]   refmap nodes: [ 
[  308.131417] ], inflight=0
[  308.139445]   granted queue:
[  308.141671]     type=5, conv=-1, node=1, cookie=1:10, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
[  308.149155]   converting queue:
[  308.150387]   blocked queue:

Trying to mount -o remount on B, it dies:

[  166.940431] (mount.ocfs2,2546,3):ocfs2_unlock_ast:1181 ERROR: Dlm passes error -22 for lock N0000000000000009, unlock_action 2
[  166.948393] (mount.ocfs2,2546,3):dlmunlock:688 ERROR: dlm status = DLM_IVLOCKID
[  166.952468] (mount.ocfs2,2546,3):ocfs2_drop_lock:3237 ERROR: DLM error -22 while calling ocfs2_dlm_unlock on resource N00000000000000090000f005
[  166.959220] (mount.ocfs2,2546,3):ocfs2_drop_lock:3238 ERROR: lockres flags: 194
[  166.963247] BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0
[  166.964247] IP: _raw_spin_lock+0xc/0x30
[  166.964247] PGD 0 P4D 0 
[  166.964247] Oops: 0002 [#1] SMP
[  166.964247] Modules linked in: ocfs2 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue ghash_clmulni_intel i2c_piix4 crc32c_intel serio_raw 8139cp
[  166.964247] CPU: 3 PID: 2546 Comm: mount.ocfs2 Not tainted 4.14.0-rc4+ #57
[  166.964247] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[  166.964247] task: ffff880136073fc0 task.stack: ffffc90000c74000
[  166.964247] RIP: 0010:_raw_spin_lock+0xc/0x30
[  166.964247] RSP: 0018:ffffc90000c77c80 EFLAGS: 00010246
[  166.964247] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[  166.964247] RDX: 0000000000000001 RSI: ffff88013fd8cbd8 RDI: 00000000000000b0
[  166.964247] RBP: ffffc90000c77c90 R08: 0000000000000000 R09: 000000000000058d
[  166.964247] R10: ffffc90000c77cc0 R11: 0000000081ee5501 R12: ffffc90000c77cd8
[  166.964247] R13: ffff880136c9e660 R14: ffff880136c9e618 R15: 0000000000000246
[  166.964247] FS:  00007f2d8a3e3dc0(0000) GS:ffff88013fd80000(0000) knlGS:0000000000000000
[  166.964247] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  166.964247] CR2: 00000000000000b0 CR3: 0000000137a7e000 CR4: 00000000000006e0
[  166.964247] Call Trace:
[  166.964247]  ? dlm_print_one_lock+0x1a/0x30 [ocfs2_dlm]
[  166.964247]  o2cb_dump_lksb+0x12/0x20 [ocfs2_stack_o2cb]
[  166.964247]  ocfs2_dlm_dump_lksb+0x17/0x20 [ocfs2_stackglue]
[  166.964247]  ocfs2_drop_lock.isra.33.part.34+0x319/0x360 [ocfs2]
[  166.964247]  ocfs2_simple_drop_lockres+0x2d/0x40 [ocfs2]
[  166.964247]  ocfs2_dentry_lock_put+0x50/0x70 [ocfs2]
[  166.964247]  ocfs2_dentry_iput+0x3a/0xf0 [ocfs2]
[  166.964247]  dentry_unlink_inode+0xa7/0x150
[  166.964247]  __dentry_kill+0xc1/0x150
[  166.964247]  shrink_dentry_list+0x117/0x280
[  166.964247]  shrink_dcache_sb+0xc5/0x140
[  166.964247]  do_remount_sb+0x4d/0x1e0
[  166.964247]  do_mount+0x74c/0xcb0
[  166.964247]  ? copy_mount_options+0x2c/0x220
[  166.964247]  SyS_mount+0x83/0xd0
[  166.964247]  entry_SYSCALL_64_fastpath+0x13/0x94
[  166.964247] RIP: 0033:0x7f2d89b0e6fa
[  166.964247] RSP: 002b:00007ffe70f9ea38 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[  166.964247] RAX: ffffffffffffffda RBX: 00007ffe70f9da11 RCX: 00007f2d89b0e6fa
[  166.964247] RDX: 000055dd5b4d06ae RSI: 000055dd5b85e120 RDI: 000055dd5b85e3d0
[  166.964247] RBP: 0000000000000000 R08: 000055dd5b85e810 R09: 0000000000000001
[  166.964247] R10: 0000000000200020 R11: 0000000000000202 R12: 0000000000001001
[  166.964247] R13: 000055dd5b85e101 R14: 00007ffe70f9da10 R15: 0000000000000000
[  166.964247] Code: 7d 08 e8 88 4f a0 ff 31 c0 5d c3 ba 01 00 00 00 f0 0f b1 17 85 c0 75 dd b8 01 00 00 00 c3 90 66 66 66 66 90 31 c0 ba 01 00 00 00 <f0> 0f b1 17 85 c0 75 01 c3 55 89 c6 48 89 e5 e8 50 7d a4 ff 5d 
[  166.964247] RIP: _raw_spin_lock+0xc/0x30 RSP: ffffc90000c77c80
[  166.964247] CR2: 00000000000000b0


Few minutes later A dies as well:

[   83.229138] (kworker/u8:8,102,0):dlm_unlock_lock_handler:514 ERROR: failed to find lock to unlock! cookie=1:9
[  113.631250] o2net: Connection to node node-A (num 1) at 192.168.122.93:7777 has been idle for 30.403 secs.
[  143.839155] o2net: Connection to node node-A (num 1) at 192.168.122.93:7777 has been idle for 30.193 secs.
[  146.466137] o2net: No longer connected to node node-A (num 1) at 192.168.122.93:7777
[  146.473923] o2cb: o2dlm has evicted node 1 from domain 753F2064B4124242A46F1715538F8618
[  147.935147] o2dlm: Begin recovery on domain 753F2064B4124242A46F1715538F8618 for node 1
[  147.941285] o2dlm: Node 2 (me) is the Recovery Master for the dead node 1 in domain 753F2064B4124242A46F1715538F8618
[  147.947347] o2dlm: End recovery on domain 753F2064B4124242A46F1715538F8618
[  158.592082] ------------[ cut here ]------------
[  158.594722] kernel BUG at fs/ocfs2/reservations.c:507!
[  158.597507] invalid opcode: 0000 [#1] SMP
[  158.598507] Modules linked in: ocfs2 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue ghash_clmulni_intel i2c_piix4 crc32c_intel serio_raw 8139cp
[  158.598507] CPU: 2 PID: 2485 Comm: dd Not tainted 4.14.0-rc4+ #57
[  158.598507] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[  158.598507] task: ffff880136f03300 task.stack: ffffc9000116c000
[  158.598507] RIP: 0010:__ocfs2_resv_find_window+0x3f9/0x440 [ocfs2]
[  158.598507] RSP: 0018:ffffc9000116f720 EFLAGS: 00010246
[  158.598507] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffffffffff
[  158.598507] RDX: 0000000000000000 RSI: ffffc9000116f748 RDI: 0000000000000000
[  158.598507] RBP: ffffc9000116f778 R08: 0000000000000000 R09: ffffc9000116f744
[  158.598507] R10: 0000000000000001 R11: ffff8801318a84c0 R12: 0000000000000000
[  158.598507] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88012da04400
[  158.598507] FS:  00007f869b3d8700(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000
[  158.598507] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  158.598507] CR2: 0000563858481000 CR3: 0000000135f0a000 CR4: 00000000000006e0
[  158.598507] Call Trace:
[  158.598507]  ocfs2_resmap_resv_bits+0xad/0x360 [ocfs2]
[  158.598507]  ocfs2_claim_local_alloc_bits+0xbe/0x490 [ocfs2]
[  158.598507]  __ocfs2_claim_clusters+0x136/0x250 [ocfs2]
[  158.598507]  ocfs2_add_clusters_in_btree+0xad/0x430 [ocfs2]
[  158.598507]  ? ocfs2_inode_cache_lock+0x12/0x20 [ocfs2]
[  158.598507]  ? ocfs2_metadata_cache_unlock+0x19/0x20 [ocfs2]
[  158.598507]  ? get_page_from_freelist+0x3ef/0xb90
[  158.598507]  ocfs2_add_inode_data+0x4b/0x60 [ocfs2]
[  158.598507]  ? ocfs2_allocate_extend_trans+0x160/0x160 [ocfs2]
[  158.598507]  ocfs2_write_cluster.isra.35+0x46f/0x760 [ocfs2]
[  158.598507]  ? __add_to_page_cache_locked+0x142/0x1d0
[  158.598507]  ? lru_cache_add+0x3a/0x80
[  158.598507]  ? add_to_page_cache_lru+0x6a/0xd0
[  158.598507]  ocfs2_write_begin_nolock+0x1381/0x17b0 [ocfs2]
[  158.598507]  ? ocfs2_inode_lock_full_nested+0x362/0x880 [ocfs2]
[  158.598507]  ? ocfs2_allocate_extend_trans+0x160/0x160 [ocfs2]
[  158.598507]  ocfs2_write_begin+0xd5/0x1b0 [ocfs2]
[  158.598507]  ? security_inode_need_killpriv+0x33/0x50
[  158.598507]  generic_perform_write+0xaa/0x180
[  158.598507]  __generic_file_write_iter+0x14f/0x1c0
[  158.598507]  ? ocfs2_inode_unlock+0xf4/0x110 [ocfs2]
[  158.598507]  ocfs2_file_write_iter+0x3dd/0xa40 [ocfs2]
[  158.598507]  ? memzero_explicit+0x12/0x20
[  158.598507]  __vfs_write+0xd2/0x130
[  158.598507]  vfs_write+0xb5/0x1a0
[  158.598507]  SyS_write+0x46/0xa0
[  158.598507]  entry_SYSCALL_64_fastpath+0x13/0x94
[  158.598507] RIP: 0033:0x7f869aef8c30
[  158.598507] RSP: 002b:00007ffc659ea698 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  158.752095] RAX: ffffffffffffffda RBX: 00007f869b1c1b38 RCX: 00007f869aef8c30
[  158.752095] RDX: 0000000000000001 RSI: 0000563858481000 RDI: 0000000000000001
[  158.752095] RBP: 00007f869b1c1ae0 R08: 001167e91233fc63 R09: 00007ffc659f3080
[  158.752095] R10: 0000563858480030 R11: 0000000000000246 R12: 00007f869b1c1b38
[  158.752095] R13: 0000000000002010 R14: 7fffffffffffffff R15: 0000000000002710
[  158.770296] Code: ff ff 45 8b 75 18 41 8b 55 1c e9 e1 fc ff ff 45 8b 75 18 41 8b 55 1c c7 45 b0 00 00 00 00 e9 cd fc ff ff 85 c0 0f 85 50 ff ff ff <0f> 0b 48 b8 00 00 00 00 00 00 00 10 48 8d 7d d0 45 89 c1 48 89 
[  158.770296] RIP: __ocfs2_resv_find_window+0x3f9/0x440 [ocfs2] RSP: ffffc9000116f720

All above does not happen with the mentioned commit reverted.

Thanks!
-- 
wbr, Vitaly



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux