On Mon, May 2, 2022 at 2:03 PM Marc Smith <msmith626@xxxxxxxxx> wrote: > > Hi, > > I'm using Linux 5.4.190 (kernel.org) with 'lvmcache' (LVM2 version > 2.03.14). I use dm-cache (in write-back mode) and dm-writecache, both > via lvmcache for several logical volumes. I was recently performing > power-loss tests, and noticed one of the dm-cache (in write-back mode) > LV's is encountering a problem when I attempt to activate that volume > group: > [root@fusion-dvt3a ~]# vgchange -ay 10072a_1_default > Segmentation fault > [root@fusion-dvt3a ~]# > > Then in the kernel log I see this: > ... > [ 4528.074501] device-mapper: cache: Origin device (dm-30) discard > unsupported: Disabling discard passdown. > [ 4528.074716] device-mapper: array: array_block_check failed: blocknr > 16818504669106333287 != wanted 10 > [ 4528.074717] device-mapper: block manager: array validator check > failed for block 10 > [ 4528.074717] device-mapper: array: get_ablock failed > [ 4528.074718] device-mapper: cache metadata: dm_array_cursor_next for > mapping failed > [ 4528.074727] ------------[ cut here ]------------ > [ 4528.074728] kernel BUG at drivers/md/dm-bufio.c:1155! > [ 4528.074732] invalid opcode: 0000 [#1] SMP NOPTI > [ 4528.074757] CPU: 9 PID: 53811 Comm: vgchange Kdump: loaded Tainted: > P OE 5.4.190-esos.prod #1 > [ 4528.074797] Hardware name: Quantum H2012/H12SSW-NT, BIOS > T20201009143356 10/09/2020 > [ 4528.074833] RIP: 0010:dm_bufio_release+0x19/0x75 > [ 4528.074855] Code: fb ff ff 48 89 ef be 01 00 00 00 5d e9 e3 fe ff > ff 41 54 55 4c 8b 67 78 48 89 fd 4c 89 e7 e8 5b cb 26 00 8b 45 50 85 > c0 75 02 <0f> 0b ff c8 89 45 50 75 48 31 c9 ba 01 00 00 00 be 03 00 00 > 00 49 > [ 4528.074928] RSP: 0018:ffffc90009f73b78 EFLAGS: 00010246 > [ 4528.074952] RAX: 0000000000000000 RBX: ffff88871f1fd340 RCX: 0000000000000007 > [ 4528.074982] RDX: ffff888826ed1600 RSI: ffffc90009f73b58 RDI: ffff88880da7e800 > [ 4528.075012] RBP: ffff8887482a0150 R08: 0000000000000001 R09: 0000000000037100 > [ 4528.075042] R10: 0000000000000000 R11: 0000000000000068 R12: ffff88880da7e800 > [ 4528.075073] R13: ffff88871f1fd478 R14: ffff88871f1fd340 R15: 00000000000001fc > [ 4528.075103] FS: 00007f48fab9b2c0(0000) GS:ffff88882b040000(0000) > knlGS:0000000000000000 > [ 4528.075137] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 4528.075161] CR2: 00007fae36386288 CR3: 0000000721d20000 CR4: 0000000000340ee0 > [ 4528.075191] Call Trace: > [ 4528.075207] dm_array_cursor_end+0x1c/0x27 > [ 4528.075255] dm_cache_load_mappings+0x2d0/0x322 > [ 4528.075304] ? clear_discard+0x34/0x34 > [ 4528.075348] cache_preresume+0x108/0x1e3 > [ 4528.075395] dm_table_resume_targets+0x38/0xa9 > [ 4528.075443] __dm_resume+0x33/0x45 > [ 4528.075486] dm_resume+0x8c/0xb1 > [ 4528.075529] dev_suspend+0x173/0x1d8 > [ 4528.075573] ctl_ioctl+0x2a6/0x39d > [ 4528.075617] ? retrieve_status+0x19e/0x19e > [ 4528.075663] dm_ctl_ioctl+0x5/0x8 > [ 4528.075708] vfs_ioctl+0x19/0x26 > [ 4528.075751] do_vfs_ioctl+0x52a/0x554 > [ 4528.075797] ? ksys_semctl.constprop.0+0xfe/0x15e > [ 4528.075847] ? rcu_is_watching+0x5/0x9 > [ 4528.075891] ksys_ioctl+0x37/0x56 > [ 4528.075935] __x64_sys_ioctl+0x11/0x14 > [ 4528.075980] do_syscall_64+0x52/0x60 > [ 4528.076027] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [ 4528.076077] RIP: 0033:0x7f48fac5f617 > [ 4528.076121] Code: c0 78 ed 4c 89 e0 5a 5b 41 5c c3 48 8b 05 71 88 > 0b 00 64 c7 00 26 00 00 00 48 83 c8 ff c3 0f 1f 44 00 00 b8 10 00 00 > 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 49 88 0b 00 f7 d8 64 89 > 01 48 > [ 4528.076276] RSP: 002b:00007ffe25fa3298 EFLAGS: 00000206 ORIG_RAX: > 0000000000000010 > [ 4528.076362] RAX: ffffffffffffffda RBX: 0000556c551c3391 RCX: 00007f48fac5f617 > [ 4528.076419] RDX: 0000556c564ff2c0 RSI: 00000000c138fd06 RDI: 0000000000000003 > [ 4528.076475] RBP: 0000556c564f3d30 R08: 0000556c552b7a5b R09: 00007ffe25fa30f8 > [ 4528.076532] R10: 0000556c552beb30 R11: 0000000000000206 R12: 0000556c564ff2f0 > [ 4528.076589] R13: 0000556c564ff2c0 R14: 0000556c564ff370 R15: 0000000000000001 > [ 4528.076646] Modules linked in: dm_cache_smq pci_stub nvmet_rdma(O) > bcache celerity16fc_cfg(O) nvmet_fc(O) nvmet(O) qla2xxx(O) nvme_fc(O) > nvme_fabrics(O) bonding ntb_transport ntb_hw_switchtec(O) > cls_switchtec(O) qede qed bna rdmavt(O) ib_umad(O) rdma_ucm(O) > ib_srp(O) rdma_cm(O) iw_cm(O) ib_cm(O) iw_cxgb4(O) iw_cxgb3(O) > ib_qib(O) mlx5_ib(O) mlx5_core(O) mlxdevm(O) pci_hyperv_intf mlxfw(O) > ib_uverbs(O) ib_core(O) auxiliary(O) mlx4_ib(O) ib_mthca(O) > bnxt_en(OE) nvme(O) nvme_core(O) mlx_compat(O) [last unloaded: scst] > [ 4528.077476] ---[ end trace 97c700627c684afd ]--- > [ 4528.083087] RIP: 0010:dm_bufio_release+0x19/0x75 > [ 4528.083189] Code: fb ff ff 48 89 ef be 01 00 00 00 5d e9 e3 fe ff > ff 41 54 55 4c 8b 67 78 48 89 fd 4c 89 e7 e8 5b cb 26 00 8b 45 50 85 > c0 75 02 <0f> 0b ff c8 89 45 50 75 48 31 c9 ba 01 00 00 00 be 03 00 00 > 00 49 > [ 4528.083397] RSP: 0018:ffffc90009f73b78 EFLAGS: 00010246 > [ 4528.083499] RAX: 0000000000000000 RBX: ffff88871f1fd340 RCX: 0000000000000007 > [ 4528.083608] RDX: ffff888826ed1600 RSI: ffffc90009f73b58 RDI: ffff88880da7e800 > [ 4528.083717] RBP: ffff8887482a0150 R08: 0000000000000001 R09: 0000000000037100 > [ 4528.083827] R10: 0000000000000000 R11: 0000000000000068 R12: ffff88880da7e800 > [ 4528.083936] R13: ffff88871f1fd478 R14: ffff88871f1fd340 R15: 00000000000001fc > [ 4528.084051] FS: 00007f48fab9b2c0(0000) GS:ffff88882b040000(0000) > knlGS:0000000000000000 > [ 4528.084194] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 4528.084298] CR2: 00007fae36386288 CR3: 0000000721d20000 CR4: 0000000000340ee0 > ... > > The "blocknr" value of "16818504669106333287" caught my eye, as that > number seems way out of bounds for the origin/cache devices in this > setup (the cache LV size in this case is 256 MiB, and the origin > device is less than 500 GiB). > > At first I suspected cache device corruption, so I used the PV data > offset + PE segment sizes and offset to extract an image of the cache > LV into a separate file so I can examine it with the cache_* user land > tools, and this is what I see: > [root@fusion-dvt3a ~]# cache_dump /mnt/data/md122_C02_cvol.img > <superblock uuid="" block_size="128" nr_cache_blocks="3968" > policy="mq" hint_width="4"> > <mappings> > <mapping cache_block="0" origin_block="7325368" dirty="false"/> > <mapping cache_block="1" origin_block="7325369" dirty="false"/> > <mapping cache_block="2" origin_block="7325370" dirty="false"/> > <mapping cache_block="3" origin_block="7325371" dirty="false"/> > <mapping cache_block="4" origin_block="7325372" dirty="false"/> > <mapping cache_block="5" origin_block="7325373" dirty="false"/> > <mapping cache_block="6" origin_block="7325374" dirty="false"/> > <mapping cache_block="7" origin_block="7325375" dirty="false"/> > <mapping cache_block="8" origin_block="0" dirty="false"/> > <mapping cache_block="9" origin_block="7325351" dirty="false"/> > <mapping cache_block="10" origin_block="1" dirty="false"/> > <mapping cache_block="11" origin_block="2" dirty="false"/> > <mapping cache_block="12" origin_block="4" dirty="false"/> > <mapping cache_block="13" origin_block="8" dirty="false"/> > <mapping cache_block="14" origin_block="16" dirty="false"/> > <mapping cache_block="15" origin_block="32" dirty="false"/> > <mapping cache_block="16" origin_block="64" dirty="false"/> > <mapping cache_block="17" origin_block="3" dirty="false"/> > <mapping cache_block="18" origin_block="5" dirty="false"/> > <mapping cache_block="19" origin_block="6" dirty="false"/> > <mapping cache_block="20" origin_block="7" dirty="false"/> > <mapping cache_block="21" origin_block="9" dirty="false"/> > <mapping cache_block="22" origin_block="10" dirty="false"/> > <mapping cache_block="23" origin_block="11" dirty="false"/> > <mapping cache_block="24" origin_block="12" dirty="false"/> > <mapping cache_block="25" origin_block="13" dirty="false"/> > <mapping cache_block="26" origin_block="14" dirty="false"/> > <mapping cache_block="27" origin_block="15" dirty="false"/> > <mapping cache_block="28" origin_block="7322946" dirty="false"/> > <mapping cache_block="29" origin_block="7324340" dirty="false"/> > <mapping cache_block="30" origin_block="7324341" dirty="false"/> > <mapping cache_block="31" origin_block="7324342" dirty="false"/> > <mapping cache_block="32" origin_block="7324343" dirty="false"/> > <mapping cache_block="33" origin_block="7324344" dirty="false"/> > <mapping cache_block="34" origin_block="7324345" dirty="false"/> > <mapping cache_block="35" origin_block="7324346" dirty="false"/> > <mapping cache_block="36" origin_block="7324347" dirty="false"/> > </mappings> > <hints> > <hint cache_block="0" data="PwAAAA=="/> > <hint cache_block="1" data="PwAAAA=="/> > <hint cache_block="2" data="PwAAAA=="/> > <hint cache_block="3" data="PwAAAA=="/> > <hint cache_block="4" data="PwAAAA=="/> > <hint cache_block="5" data="PwAAAA=="/> > <hint cache_block="6" data="PwAAAA=="/> > <hint cache_block="7" data="PwAAAA=="/> > <hint cache_block="8" data="PwAAAA=="/> > <hint cache_block="9" data="PwAAAA=="/> > <hint cache_block="10" data="PwAAAA=="/> > <hint cache_block="11" data="PwAAAA=="/> > <hint cache_block="12" data="PwAAAA=="/> > <hint cache_block="13" data="PwAAAA=="/> > <hint cache_block="14" data="PwAAAA=="/> > <hint cache_block="15" data="PwAAAA=="/> > <hint cache_block="16" data="PwAAAA=="/> > <hint cache_block="17" data="PwAAAA=="/> > <hint cache_block="18" data="PwAAAA=="/> > <hint cache_block="19" data="PwAAAA=="/> > <hint cache_block="20" data="PwAAAA=="/> > <hint cache_block="21" data="AAAAAA=="/> > <hint cache_block="22" data="AAAAAA=="/> > <hint cache_block="23" data="AAAAAA=="/> > <hint cache_block="24" data="AAAAAA=="/> > <hint cache_block="25" data="AAAAAA=="/> > <hint cache_block="26" data="AAAAAA=="/> > <hint cache_block="27" data="AAAAAA=="/> > <hint cache_block="28" data="AAAAAA=="/> > <hint cache_block="29" data="AAAAAA=="/> > <hint cache_block="30" data="AAAAAA=="/> > <hint cache_block="31" data="AAAAAA=="/> > <hint cache_block="32" data="AAAAAA=="/> > <hint cache_block="33" data="AAAAAA=="/> > <hint cache_block="34" data="AAAAAA=="/> > <hint cache_block="35" data="AAAAAA=="/> > <hint cache_block="36" data="AAAAAA=="/> > </hints> > </superblock> > [root@fusion-dvt3a ~]# cache_check /mnt/data/md122_C02_cvol.img > examining superblock > examining mapping array > examining hint array > examining discard bitset > [root@fusion-dvt3a ~]# echo $? > 0 > [root@fusion-dvt3a ~]# > > Nothing seems out of the ordinary to me, and 'cache_check' doesn't > report any errors/problems with the metadata. > > So I then used a "debug" Linux Kernel (same 5.4.190 code) that > contains additional debugging features (full preemption, lock > debugging, etc.). It also has the > CONFIG_DM_DEBUG_BLOCK_MANAGER_LOCKING feature enabled, which I suspect > might be related to the persistent-data stuff used by dm-cache. And > when I boot this kernel that has the additional debug features, I do > NOT hit the BUG_ON() in dm_bufio_release() and everything works > properly -- the VG/LV's activate just fine and are usable, no kernel > errors / call traces logged. > > I switched back and fourth between these two kernels several times, > just to make sure it wasn't a fluke. And indeed, I am unable to hit > this problem with the "debug" kernel. > > Based on the experiment between the two kernel configs and the user > land tools not reporting an error, it sounds like my cache metadata is > sound, but hitting some other issue that I don't understand yet. > > Wondering if anyone has any tips / guidance on additional debugging I > can do? I plan to circle back and take my original kernel config and > enable CONFIG_DM_DEBUG_BLOCK_MANAGER_LOCKING to see if that has any > effect. I enabled these two additional kernel config options and re-tested: CONFIG_DM_DEBUG_BLOCK_MANAGER_LOCKING=y CONFIG_DM_DEBUG_BLOCK_STACK_TRACING=y And then when activating the VG, the 'lvm' (vgchange) tool still seg faults, but the kernel log and call trace is a bit different with these options enabled: ... [ 7134.908886] device-mapper: cache: Origin device (dm-30) discard unsupported: Disabling discard passdown. [ 7134.909088] device-mapper: array: array_block_check failed: blocknr 16818504669106333287 != wanted 10 [ 7134.909090] device-mapper: block manager: array validator check failed for block 10 [ 7134.909090] device-mapper: array: get_ablock failed [ 7134.909091] device-mapper: cache metadata: dm_array_cursor_next for mapping failed [ 7134.909102] ------------[ cut here ]------------ [ 7134.909102] kernel BUG at drivers/md/persistent-data/dm-block-manager.c:246! [ 7134.909107] invalid opcode: 0000 [#1] SMP NOPTI [ 7134.909133] CPU: 13 PID: 75361 Comm: vgchange Kdump: loaded Tainted: P OE 5.4.190-esos.prod #1 [ 7134.909175] Hardware name: Quantum H2012/H12SSW-NT, BIOS T20201009143356 10/09/2020 [ 7134.909214] RIP: 0010:bl_up_read+0xf/0x43 [ 7134.909233] Code: 48 83 f8 04 75 f0 e8 69 fc ff ff 89 c0 48 c7 44 c2 18 00 00 00 00 e9 e1 fb ff ff 55 48 89 fd e8 ab ff 25 00 83 7d 04 00 7f 02 <0f> 0b 48 89 ef 65 48 8b 34 25 40 ad 01 00 e8 b1 ff ff ff 48 8b 45 [ 7134.909311] RSP: 0018:ffffc90009f1bb68 EFLAGS: 00010246 [ 7134.909335] RAX: 0000000000000000 RBX: ffff88878e2e6b40 RCX: 0000000000000007 [ 7134.909366] RDX: 0000000000000001 RSI: ffffc90009f1bb54 RDI: ffff888748f49858 [ 7134.909397] RBP: ffff888748f49858 R08: 0000000000000001 R09: 0000000000036d00 [ 7134.909428] R10: 0000000000000000 R11: 0000000000000068 R12: ffff888748f49858 [ 7134.909459] R13: ffff88878e2e6c78 R14: ffff88878e2e6b40 R15: 00000000000001fc [ 7134.909491] FS: 00007fdfa39b72c0(0000) GS:ffff88882b140000(0000) knlGS:0000000000000000 [ 7134.909526] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7134.909552] CR2: 0000559be7639038 CR3: 00000007a5a72000 CR4: 0000000000340ee0 [ 7134.909583] Call Trace: [ 7134.909597] dm_bm_unlock+0x2f/0x3a [ 7134.909642] dm_array_cursor_end+0x1c/0x27 [ 7134.909690] dm_cache_load_mappings+0x2d0/0x322 [ 7134.909739] ? clear_discard+0x34/0x34 [ 7134.909784] cache_preresume+0x108/0x1e3 [ 7134.909831] dm_table_resume_targets+0x38/0xa9 [ 7134.909880] __dm_resume+0x33/0x45 [ 7134.909923] dm_resume+0x8c/0xb1 [ 7134.909967] dev_suspend+0x173/0x1d8 [ 7134.910011] ctl_ioctl+0x2a6/0x39d [ 7134.910055] ? retrieve_status+0x19e/0x19e [ 7134.910103] dm_ctl_ioctl+0x5/0x8 [ 7134.910148] vfs_ioctl+0x19/0x26 [ 7134.910191] do_vfs_ioctl+0x52a/0x554 [ 7134.910237] ? ksys_semctl.constprop.0+0xfe/0x15e [ 7134.910287] ? rcu_is_watching+0x5/0x9 [ 7134.910333] ksys_ioctl+0x37/0x56 [ 7134.910376] __x64_sys_ioctl+0x11/0x14 [ 7134.910423] do_syscall_64+0x52/0x60 [ 7134.910469] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 7134.910520] RIP: 0033:0x7fdfa3a7b617 [ 7134.910564] Code: c0 78 ed 4c 89 e0 5a 5b 41 5c c3 48 8b 05 71 88 0b 00 64 c7 00 26 00 00 00 48 83 c8 ff c3 0f 1f 44 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 49 88 0b 00 f7 d8 64 89 01 48 [ 7134.910722] RSP: 002b:00007fffcfccb488 EFLAGS: 00000206 ORIG_RAX: 0000000000000010 [ 7134.910810] RAX: ffffffffffffffda RBX: 0000559be5dae391 RCX: 00007fdfa3a7b617 [ 7134.910868] RDX: 0000559be76589c0 RSI: 00000000c138fd06 RDI: 0000000000000003 [ 7134.910925] RBP: 0000559be764c830 R08: 0000559be5ea2a5b R09: 00007fffcfccb2e8 [ 7134.910983] R10: 0000559be5ea9b30 R11: 0000000000000206 R12: 0000559be76589f0 [ 7134.911041] R13: 0000559be76589c0 R14: 0000559be7658a70 R15: 0000000000000001 [ 7134.911100] Modules linked in: dm_cache_smq pci_stub nvmet_rdma(O) bcache celerity16fc_cfg(O) nvmet_fc(O) nvmet(O) qla2xxx(O) nvme_fc(O) nvme_fabrics(O) bonding ntb_transport ntb_hw_switchtec(O) cls_switchtec(O) qede qed bna rdmavt(O) ib_umad(O) rdma_ucm(O) ib_srp(O) rdma_cm(O) iw_cm(O) ib_cm(O) iw_cxgb4(O) iw_cxgb3(O) ib_qib(O) mlx5_ib(O) mlx5_core(O) mlxdevm(O) pci_hyperv_intf mlxfw(O) ib_uverbs(O) ib_core(O) auxiliary(O) mlx4_ib(O) ib_mthca(O) bnxt_en(OE) nvme(O) nvme_core(O) mlx_compat(O) [last unloaded: scst] [ 7134.911937] ---[ end trace bf6823c5ee51ae83 ]--- [ 7134.917732] RIP: 0010:bl_up_read+0xf/0x43 [ 7134.917831] Code: 48 83 f8 04 75 f0 e8 69 fc ff ff 89 c0 48 c7 44 c2 18 00 00 00 00 e9 e1 fb ff ff 55 48 89 fd e8 ab ff 25 00 83 7d 04 00 7f 02 <0f> 0b 48 89 ef 65 48 8b 34 25 40 ad 01 00 e8 b1 ff ff ff 48 8b 45 [ 7134.918042] RSP: 0018:ffffc90009f1bb68 EFLAGS: 00010246 [ 7134.918145] RAX: 0000000000000000 RBX: ffff88878e2e6b40 RCX: 0000000000000007 [ 7134.918255] RDX: 0000000000000001 RSI: ffffc90009f1bb54 RDI: ffff888748f49858 [ 7134.918365] RBP: ffff888748f49858 R08: 0000000000000001 R09: 0000000000036d00 [ 7134.918475] R10: 0000000000000000 R11: 0000000000000068 R12: ffff888748f49858 [ 7134.918585] R13: ffff88878e2e6c78 R14: ffff88878e2e6b40 R15: 00000000000001fc [ 7134.918699] FS: 00007fdfa39b72c0(0000) GS:ffff88882b140000(0000) knlGS:0000000000000000 [ 7134.918843] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7134.918950] CR2: 0000559be7639038 CR3: 00000007a5a72000 CR4: 0000000000340ee0 ... --Marc > > > Thanks for your time. > > --Marc -- dm-devel mailing list dm-devel@xxxxxxxxxx https://listman.redhat.com/mailman/listinfo/dm-devel