On Thu 08-02-18 06:49:18, Andi Kleen wrote:
> > > It seems multiple processes deadlocked on the bd_mutex.
> > > Unfortunately there's no backtrace for the lock acquisitions,
> > > so it's hard to see the exact sequence.
> >
> > Well, all in the report points to a situation where some IO was submitted
> > to the block device and never completed (more exactly it took longer than
> > those 120s to complete that IO). It would need more digging into the
>
> Are you sure? I didn't think outstanding IO would take bd_mutex.

The stack trace is:

 schedule+0xf5/0x430 kernel/sched/core.c:3480
 io_schedule+0x1c/0x70 kernel/sched/core.c:5096
 wait_on_page_bit_common+0x4b3/0x770 mm/filemap.c:1099
 wait_on_page_bit mm/filemap.c:1132 [inline]
 wait_on_page_writeback include/linux/pagemap.h:546 [inline]
 __filemap_fdatawait_range+0x282/0x430 mm/filemap.c:533
 filemap_fdatawait_range mm/filemap.c:558 [inline]
 filemap_fdatawait include/linux/fs.h:2590 [inline]
 filemap_write_and_wait+0x7a/0xd0 mm/filemap.c:624
 __sync_blockdev fs/block_dev.c:448 [inline]
 sync_blockdev.part.29+0x50/0x70 fs/block_dev.c:457
 sync_blockdev fs/block_dev.c:444 [inline]
 __blkdev_put+0x18b/0x7f0 fs/block_dev.c:1763
 blkdev_put+0x85/0x4f0 fs/block_dev.c:1835
 blkdev_close+0x8b/0xb0 fs/block_dev.c:1842
 __fput+0x327/0x7e0 fs/file_table.c:209
 ____fput+0x15/0x20 fs/file_table.c:243

So we are waiting for PageWriteback on some page. And bd_mutex is grabbed by
this process in __blkdev_put() before calling sync_blockdev().

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
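
The sequence described above (one task takes bd_mutex in __blkdev_put() and then sleeps waiting for writeback, so every other task needing bd_mutex stalls behind it) can be sketched in userspace. This is only an illustrative model, not kernel code: a plain lock stands in for bd_mutex, an event stands in for the page's writeback bit being cleared, and the names run_demo, closer and try_open are hypothetical.

```python
import threading


def run_demo(timeout=0.2):
    """Model a task holding a mutex while it waits for outstanding IO.

    Returns (blocked_while_io_pending, acquired_after_io_done).
    """
    bd_mutex = threading.Lock()            # stand-in for bd_mutex
    writeback_done = threading.Event()     # stand-in for PageWriteback clearing
    closer_holds_lock = threading.Event()  # demo-only sequencing helper

    def closer():
        # Models __blkdev_put(): take the mutex, then wait for writeback
        # (the sync_blockdev() step) without dropping it.
        with bd_mutex:
            closer_holds_lock.set()
            writeback_done.wait()

    def try_open():
        # Models another task that needs the same mutex; gives up after
        # `timeout` seconds instead of hanging forever.
        got = bd_mutex.acquire(timeout=timeout)
        if got:
            bd_mutex.release()
        return got

    t = threading.Thread(target=closer)
    t.start()
    closer_holds_lock.wait()

    blocked_while_io_pending = not try_open()  # IO still outstanding
    writeback_done.set()                       # the IO finally completes
    t.join()
    acquired_after_io_done = try_open()
    return blocked_while_io_pending, acquired_after_io_done


if __name__ == "__main__":
    print(run_demo())
```

In this model the second task is not waiting on the IO directly; it is stuck only because the lock holder is. That matches the reading that the hung tasks on bd_mutex are a symptom of IO that did not complete in time, not a lock-ordering deadlock.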