On Mon, Apr 27 2020 at 8:39pm -0400,
Gabriel Krisman Bertazi <krisman@xxxxxxxxxxxxx> wrote:

> When adding devices that don't have a scsi_dh on a BIO-based multipath,
> I was able to consistently hit the warning below and lock up the system.
>
> The problem is that __map_bio reads the flag before it is potentially
> modified by choose_pgpath, and ends up using the older value.
>
> The WARN_ON below is not trivially linked to the issue. It goes like
> this: The activate_path delayed_work is not initialized for non-scsi_dh
> devices, but we always set MPATHF_QUEUE_IO, asking for initialization.
> That is fine, since MPATHF_QUEUE_IO would be cleared in choose_pgpath.
> Nevertheless, only for BIO-based mpath, we cache the flag before calling
> choose_pgpath, and use the older version when deciding if we should
> initialize the path. Therefore, we end up trying to initialize the
> paths, and calling the non-initialized activate_path work.
>
> [ 82.437100] ------------[ cut here ]------------
> [ 82.437659] WARNING: CPU: 3 PID: 602 at kernel/workqueue.c:1624
> __queue_delayed_work+0x71/0x90
> [ 82.438436] Modules linked in:
> [ 82.438911] CPU: 3 PID: 602 Comm: systemd-udevd Not tainted 5.6.0-rc6+ #339
> [ 82.439680] RIP: 0010:__queue_delayed_work+0x71/0x90
> [ 82.440287] Code: c1 48 89 4a 50 81 ff 00 02 00 00 75 2a 4c 89 cf e9
> 94 d6 07 00 e9 7f e9 ff ff 0f 0b eb c7 0f 0b 48 81 7a 58 40 74 a8 94 74
> a7 <0f> 0b 48 83 7a 48 00 74 a5 0f 0b eb a1 89 fe 4c 89 cf e9 c8 c4 07
> [ 82.441719] RSP: 0018:ffffb738803977c0 EFLAGS: 00010007
> [ 82.442121] RAX: ffffa086389f9740 RBX: 0000000000000002 RCX: 0000000000000000
> [ 82.442718] RDX: ffffa086350dd930 RSI: ffffa0863d76f600 RDI: 0000000000000200
> [ 82.443484] RBP: 0000000000000200 R08: 0000000000000000 R09: ffffa086350dd970
> [ 82.444128] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa086350dd930
> [ 82.444773] R13: ffffa0863d76f600 R14: 0000000000000000 R15: ffffa08636738008
> [ 82.445427] FS: 00007f6abfe9dd40(0000) GS:ffffa0863dd80000(0000) knlGS:00000
> [ 82.446040] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 82.446478] CR2: 0000557d288db4e8 CR3: 0000000078b36000 CR4: 00000000000006e0
> [ 82.447104] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 82.447561] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 82.448012] Call Trace:
> [ 82.448164] queue_delayed_work_on+0x6d/0x80
> [ 82.448472] __pg_init_all_paths+0x7b/0xf0
> [ 82.448714] pg_init_all_paths+0x26/0x40
> [ 82.448980] __multipath_map_bio.isra.0+0x84/0x210
> [ 82.449267] __map_bio+0x3c/0x1f0
> [ 82.449468] __split_and_process_non_flush+0x14a/0x1b0
> [ 82.449775] __split_and_process_bio+0xde/0x340
> [ 82.450045] ? dm_get_live_table+0x5/0xb0
> [ 82.450278] dm_process_bio+0x98/0x290
> [ 82.450518] dm_make_request+0x54/0x120
> [ 82.450778] generic_make_request+0xd2/0x3e0
> [ 82.451038] ? submit_bio+0x3c/0x150
> [ 82.451278] submit_bio+0x3c/0x150
> [ 82.451492] mpage_readpages+0x129/0x160
> [ 82.451756] ? bdev_evict_inode+0x1d0/0x1d0
> [ 82.452033] read_pages+0x72/0x170
> [ 82.452260] __do_page_cache_readahead+0x1ba/0x1d0
> [ 82.452624] force_page_cache_readahead+0x96/0x110
> [ 82.452903] generic_file_read_iter+0x84f/0xae0
> [ 82.453192] ? __seccomp_filter+0x7c/0x670
> [ 82.453547] new_sync_read+0x10e/0x190
> [ 82.453883] vfs_read+0x9d/0x150
> [ 82.454172] ksys_read+0x65/0xe0
> [ 82.454466] do_syscall_64+0x4e/0x210
> [ 82.454828] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [...]
> [ 82.462501] ---[ end trace bb39975e9cf45daa ]---
>
> Signed-off-by: Gabriel Krisman Bertazi <krisman@xxxxxxxxxxxxx>

I'll get this queued for 5.7-rcX and stable@

Thanks,
Mike

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
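
For readers following the thread, the stale-flag ordering described in the quoted patch can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the dm-mpath code: `struct mpath_model`, `QUEUE_IO`, `choose_path`, `init_all_paths`, `map_bio_stale`, `map_bio_fresh`, and `run_model` are all hypothetical names, and the counter stands in for the WARN_ON hit when the uninitialized activate_path work is queued.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the MPATHF_QUEUE_IO flag bit. */
#define QUEUE_IO (1u << 0)

/* Hypothetical model of the multipath state discussed in the thread. */
struct mpath_model {
    unsigned int flags;     /* models the multipath flags word */
    bool has_scsi_dh;       /* false: activate_path work never initialized */
    int bogus_activations;  /* attempts to queue the uninitialized work */
};

/* Models choose_pgpath(): for a device without a scsi_dh, the request
 * for path initialization is cleared. */
static void choose_path(struct mpath_model *m)
{
    if (!m->has_scsi_dh)
        m->flags &= ~QUEUE_IO;
}

/* Models pg_init_all_paths(): queueing the activation work is only
 * valid when a scsi_dh initialized it; otherwise count the bad call. */
static void init_all_paths(struct mpath_model *m)
{
    if (!m->has_scsi_dh)
        m->bogus_activations++;
}

/* Buggy ordering: the flag is cached before choose_path() can clear it. */
static void map_bio_stale(struct mpath_model *m)
{
    bool queue_io = (m->flags & QUEUE_IO) != 0;  /* sampled too early */

    choose_path(m);
    if (queue_io)
        init_all_paths(m);
}

/* Fixed ordering: the flag is read only after choose_path() has run. */
static void map_bio_fresh(struct mpath_model *m)
{
    bool queue_io;

    choose_path(m);
    queue_io = (m->flags & QUEUE_IO) != 0;
    if (queue_io)
        init_all_paths(m);
}

/* Runs one mapping on a non-scsi_dh device with QUEUE_IO set and
 * returns how many times the uninitialized work would be queued. */
static int run_model(bool reread_flag)
{
    struct mpath_model m = { .flags = QUEUE_IO, .has_scsi_dh = false,
                             .bogus_activations = 0 };

    if (reread_flag)
        map_bio_fresh(&m);
    else
        map_bio_stale(&m);
    return m.bogus_activations;
}
```

In this model, `run_model(false)` reports one bogus activation (the cached value still says "queue I/O"), while `run_model(true)` reports none, which matches the reasoning in the patch description.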