On Mon, Jul 24, 2023 at 08:29:26AM -0700, Christoph Hellwig wrote: > Test the removal of the underlying device when the file system still > has dirty data. > > Signed-off-by: Christoph Hellwig <hch@xxxxxx> > --- After I merge this case, it always hit a kernel warning about a "possible circular locking dependency detected", no matter on ext4 [1] or xfs. This test isn't hang actually. But it always fail this test (got a .dmesg file). Any ideas about that? Thanks, Zorro [1] [58395.957612] run fstests generic/730 at 2023-08-05 10:48:20 [58400.113881] scsi_debug:sdebug_driver_probe: scsi_debug: trim poll_queues to 0. poll_q/nr_hw = (0/1) [58400.114840] scsi host1: scsi_debug: version 0191 [20210520] dev_size_mb=256, opts=0x0, submit_queues=1, statistics=0 [58400.123190] scsi 1:0:0:0: Direct-Access Linux scsi_debug 0191 PQ: 0 ANSI: 7 [58400.127198] scsi 1:0:0:0: Power-on or device reset occurred [58400.139241] sd 1:0:0:0: Attached scsi generic sg2 type 0 [58400.141245] sd 1:0:0:0: [sdb] 524288 512-byte logical blocks: (268 MB/256 MiB) [58400.143486] sd 1:0:0:0: [sdb] Write Protect is off [58400.144248] sd 1:0:0:0: [sdb] Mode Sense: 73 00 10 08 [58400.146650] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA [58400.151277] sd 1:0:0:0: [sdb] Preferred minimum I/O size 512 bytes [58400.152203] sd 1:0:0:0: [sdb] Optimal transfer size 524288 bytes [58400.189056] sd 1:0:0:0: [sdb] Attached SCSI disk [58401.481505] EXT4-fs (sdb): mounted filesystem c387e859-8bb8-4da9-80a3-7fce381c3b66 r/w with ordered data mode. Quota mode: none. [58401.685373] ====================================================== [58401.686080] WARNING: possible circular locking dependency detected [58401.686823] 6.5.0-rc4+ #1 Tainted: G W [58401.687111] ------------------------------------------------------ [58401.687927] 730/1784445 is trying to acquire lock: [58401.688583] ffff88815bd820e8 (&type->s_umount_key#66){++++}-{3:3}, at: get_super+0x104/0x1f0 [58401.689502] but task is already holding lock: [58401.690204] ffff888104b90528 (&disk->open_mutex){+.+.}-{3:3}, at: del_gendisk+0x166/0x940 [58401.690675] which lock already depends on the new lock. [58401.691079] the existing dependency chain (in reverse order) is: [58401.691868] -> #1 (&disk->open_mutex){+.+.}-{3:3}: [58401.692592] __lock_acquire+0xbd6/0x1c00 [58401.693222] lock_acquire+0x1da/0x5e0 [58401.693477] __mutex_lock+0x154/0x1490 [58401.694125] blkdev_put+0x99/0x830 [58401.694334] ext4_put_super+0x560/0xce0 [ext4] [58401.694839] generic_shutdown_super+0x13e/0x420 [58401.695125] kill_block_super+0x60/0xb0 [58401.695767] deactivate_locked_super+0x85/0x130 [58401.696055] cleanup_mnt+0x258/0x370 [58401.696271] task_work_run+0x132/0x220 [58401.696919] exit_to_user_mode_loop+0x14f/0x160 [58401.697211] exit_to_user_mode_prepare+0x142/0x150 [58401.697523] syscall_exit_to_user_mode+0x19/0x50 [58401.697807] do_syscall_64+0x69/0x90 [58401.698022] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [58401.698330] -> #0 (&type->s_umount_key#66){++++}-{3:3}: [58401.698697] check_prev_add+0x1bd/0x23a0 [58401.699341] validate_chain+0xb02/0xf30 [58401.699972] __lock_acquire+0xbd6/0x1c00 [58401.700595] lock_acquire+0x1da/0x5e0 [58401.700813] down_read+0xa1/0x4c0 [58401.701029] get_super+0x104/0x1f0 [58401.701270] fsync_bdev+0x11/0xc0 [58401.701594] del_gendisk+0x1f8/0x940 [58401.701815] sd_remove+0x85/0xf0 [sd_mod] [58401.702473] device_release_driver_internal+0x36d/0x530 [58401.703208] bus_remove_device+0x1ed/0x3f0 [58401.703883] device_del+0x342/0x910 [58401.704114] __scsi_remove_device+0x26b/0x330 [58401.705040] sdev_store_delete+0x83/0x120 [58401.705688] kernfs_fop_write_iter+0x358/0x530 [58401.705981] vfs_write+0x7c0/0xc40 [58401.706197] ksys_write+0xf1/0x1d0 [58401.706400] do_syscall_64+0x5c/0x90 [58401.706617] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [58401.706899] other info that might help us debug this: [58401.707320] Possible unsafe locking scenario: [58401.708016] CPU0 CPU1 [58401.708652] ---- ---- [58401.765786] lock(&disk->open_mutex); [58401.879949] lock(&type->s_umount_key#66); [58402.051096] lock(&disk->open_mutex); [58402.209171] rlock(&type->s_umount_key#66); [58402.209962] *** DEADLOCK *** [58402.210683] 5 locks held by 730/1784445: [58402.210912] #0: ffff88811827a430 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0xf1/0x1d0 [58402.211345] #1: ffff888267d52c90 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x21a/0x530 [58402.212222] #2: ffff8881ff2060f0 (&shost->scan_mutex){+.+.}-{3:3}, at: sdev_store_delete+0x7b/0x120 [58402.212818] #3: ffff8881f73d6398 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x8d/0x530 [58402.213385] #4: ffff888104b90528 (&disk->open_mutex){+.+.}-{3:3}, at: del_gendisk+0x166/0x940 [58402.214240] stack backtrace: [58402.214900] CPU: 7 PID: 1784445 Comm: 730 Kdump: loaded Tainted: G W 6.5.0-rc4+ #1 [58402.215768] Hardware name: HP ProLiant DL360 Gen9, BIOS P89 03/05/2015 [58402.216121] Call Trace: [58402.216266] <TASK> [58402.216780] dump_stack_lvl+0x60/0xb0 [58402.216998] check_noncircular+0x2f9/0x3e0 [58402.217219] ? __pfx_check_noncircular+0x10/0x10 [58402.274370] ? check_path.constprop.0+0x24/0x50 [58402.417227] ? check_path.constprop.0+0x24/0x50 [58402.560177] ? alloc_chain_hlocks+0x1de/0x520 [58402.702879] check_prev_add+0x1bd/0x23a0 [58402.717964] validate_chain+0xb02/0xf30 [58402.718193] ? __pfx_validate_chain+0x10/0x10 [58402.718881] ? validate_chain+0x7eb/0xf30 [58402.719109] __lock_acquire+0xbd6/0x1c00 [58402.719365] lock_acquire+0x1da/0x5e0 [58402.719588] ? get_super+0x104/0x1f0 [58402.719809] ? __pfx_lock_acquire+0x10/0x10 [58402.720038] down_read+0xa1/0x4c0 [58402.720603] ? get_super+0x104/0x1f0 [58402.720830] ? do_raw_spin_trylock+0xb5/0x180 [58402.721457] ? __pfx_down_read+0x10/0x10 [58402.721679] ? get_super+0xfc/0x1f0 [58402.722252] get_super+0x104/0x1f0 [58402.723046] fsync_bdev+0x11/0xc0 [58402.723633] del_gendisk+0x1f8/0x940 [58402.723861] ? __pfx_del_gendisk+0x10/0x10 [58402.724079] ? _raw_spin_unlock_irqrestore+0x42/0x70 [58402.724354] ? __pm_runtime_resume+0x83/0x120 [58402.724994] sd_remove+0x85/0xf0 [sd_mod] [58402.725243] device_release_driver_internal+0x36d/0x530 [58402.725564] ? klist_put+0xf7/0x170 [58402.726151] bus_remove_device+0x1ed/0x3f0 [58402.726412] device_del+0x342/0x910 [58402.726993] ? __pfx_device_del+0x10/0x10 [58402.812754] ? __pfx_sysfs_kf_write+0x10/0x10 [58402.955456] __scsi_remove_device+0x26b/0x330 [58402.988906] blk_print_req_error: 48 callbacks suppressed [58402.988917] device offline error, dev sdb, sector 262148 op 0x0:(READ) flags 0x83700 phys_seg 1 prio class 2 [58403.098495] sdev_store_delete+0x83/0x120 [58403.098510] kernfs_fop_write_iter+0x358/0x530 [58403.098523] vfs_write+0x7c0/0xc40 [58403.098532] ? __pfx_vfs_write+0x10/0x10 [58403.098549] ? __pfx___lock_release+0x10/0x10 [58403.098559] ? __fget_light+0x51/0x220 [58403.227415] device offline error, dev sdb, sector 262176 op 0x0:(READ) flags 0x83700 phys_seg 1 prio class 2 [58403.228292] ? count_memcg_event_mm.part.0+0xbd/0x1f0 [58403.228308] ksys_write+0xf1/0x1d0 [58403.228667] device offline error, dev sdb, sector 262176 op 0x0:(READ) flags 0x3000 phys_seg 1 prio class 2 [58403.229369] ? __pfx_ksys_write+0x10/0x10 [58403.229375] ? ktime_get_coarse_real_ts64+0x130/0x170 [58403.229391] do_syscall_64+0x5c/0x90 [58403.230022] EXT4-fs error (device sdb): ext4_wait_block_bitmap:574: comm ext4lazyinit: Cannot read block bitmap - block_group = 31, block_bitmap = 131088 [58403.230215] ? asm_exc_page_fault+0x22/0x30 [58403.231343] device offline error, dev sdb, sector 262146 op 0x0:(READ) flags 0x3000 phys_seg 1 prio class 2 [58403.232005] ? lockdep_hardirqs_on+0x79/0x100 [58403.232017] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [58403.232025] RIP: 0033:0x7fe57853eb97 [58403.232033] Code: 0b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24 [58403.232365] device offline error, dev sdb, sector 262148 op 0x0:(READ) flags 0x3000 phys_seg 1 prio class 2 [58403.232368] EXT4-fs error (device sdb): ext4_wait_block_bitmap:574: comm ext4lazyinit: Cannot read block bitmap - block_group = 16, block_bitmap = 131073 [58403.232910] RSP: 002b:00007ffd44273f88 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [58403.232920] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fe57853eb97 [58403.233502] EXT4-fs error (device sdb): ext4_wait_block_bitmap:574: comm ext4lazyinit: Cannot read block bitmap - block_group = 17, block_bitmap = 131074 [58403.233655] RDX: 0000000000000002 RSI: 000055e91f842090 RDI: 0000000000000001 [58403.234142] device offline error, dev sdb, sector 590 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2 [58403.234815] RBP: 000055e91f842090 R08: 0000000000000000 R09: 00007fe5785b14e0 [58403.234821] R10: 00007fe5785b13e0 R11: 0000000000000246 R12: 0000000000000002 [58403.234825] R13: 00007fe5785fb780 R14: 0000000000000002 R15: 00007fe5785f69e0 [58403.234842] </TASK> [58403.237507] device offline error, dev sdb, sector 491522 op 0x1:(WRITE) flags 0x4000 phys_seg 128 prio class 2 [58403.292737] device offline error, dev sdb, sector 590 op 0x1:(WRITE) flags 0x800 phys_seg 128 prio class 2 [58403.435617] device offline error, dev sdb, sector 492546 op 0x1:(WRITE) flags 0x0 phys_seg 128 prio class 2 [58404.245619] EXT4-fs warning (device sdb): ext4_end_bio:343: I/O error 17 writing to inode 12 starting block 246273) [58404.246595] Buffer I/O error on device sdb, logical block 245761 [58404.247309] Buffer I/O error on device sdb, logical block 245762 [58404.248033] Buffer I/O error on device sdb, logical block 245763 [58404.248737] Buffer I/O error on device sdb, logical block 245764 [58404.249663] Buffer I/O error on device sdb, logical block 245765 [58404.250419] Buffer I/O error on device sdb, logical block 245766 [58404.251133] Buffer I/O error on device sdb, logical block 245767 [58404.251880] Buffer I/O error on device sdb, logical block 245768 [58404.252614] Buffer I/O error on device sdb, logical block 245769 [58404.253389] Buffer I/O error on device sdb, logical block 245770 [58404.255576] device offline error, dev sdb, sector 229378 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 2 [58404.256152] buffer_io_error: 23 callbacks suppressed [58404.256157] Buffer I/O error on dev sdb, logical block 114689, lost sync page write [58404.570222] JBD2: I/O error when updating journal superblock for sdb-8. [58404.741452] Aborting journal on device sdb-8. [58404.757264] Buffer I/O error on dev sdb, logical block 114689, lost sync page write [58404.758189] JBD2: I/O error when updating journal superblock for sdb-8. [58404.758803] JBD2: Detected IO errors while flushing file data on sdb-8 [58404.761560] EXT4-fs (sdb): shut down requested (2) [58404.787528] sd 1:0:0:0: [sdb] Synchronizing SCSI cache [58404.889094] EXT4-fs (sdb): unmounting filesystem c387e859-8bb8-4da9-80a3-7fce381c3b66. [58405.895121] EXT4-fs (sda3): unmounting filesystem 1fff3774-c594-4880-b940-900d39bc18b9. [58409.403411] EXT4-fs (sda3): mounted filesystem 1fff3774-c594-4880-b940-900d39bc18b9 r/w with ordered data mode. Quota mode: none. > tests/generic/730 | 57 +++++++++++++++++++++++++++++++++++++++++++ > tests/generic/730.out | 2 ++ > 2 files changed, 59 insertions(+) > create mode 100755 tests/generic/730 > create mode 100644 tests/generic/730.out > > diff --git a/tests/generic/730 b/tests/generic/730 > new file mode 100755 > index 00000000..11308cda > --- /dev/null > +++ b/tests/generic/730 > @@ -0,0 +1,57 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (c) 2015 Red Hat Inc. All Rights Reserved. > +# Copyright (c) 2023 Christoph Hellwig > +# > +# Test proper file system shut down when the block device is removed underneath > +# and there is dirty data. > +# > +. ./common/preamble > +_begin_fstest auto quick > + > +_cleanup() > +{ > + cd / > + $UMOUNT_PROG $SCSI_DEBUG_MNT >>$seqres.full 2>&1 > + _put_scsi_debug_dev > + rm -f $tmp.* > +} > + > +. ./common/filter > +. ./common/scsi_debug > + > +_supported_fs generic > + > +# We don't actually use the test device, but we need a block based fs > +_require_test > +_require_block_device $TEST_DEV > +_require_scsi_debug > + > +SCSI_DEBUG_DEV=`_get_scsi_debug_dev 512 512 0 256` > +test -b "$SCSI_DEBUG_DEV" || _notrun "Failed to initialize scsi debug device" > +echo "SCSI debug device $SCSI_DEBUG_DEV" >>$seqres.full > + > +run_check _mkfs_dev $SCSI_DEBUG_DEV > + > +SCSI_DEBUG_MNT="$TEST_DIR/scsi_debug_$seq" > +rm -rf $SCSI_DEBUG_MNT > +mkdir $SCSI_DEBUG_MNT > +run_check _mount $SCSI_DEBUG_DEV $SCSI_DEBUG_MNT > + > +# create a test file > +$XFS_IO_PROG -f -c "pwrite 0 1M" $SCSI_DEBUG_MNT/testfile >>$seqres.full > + > +# open a file descriptor for reading the file > +exec 3< $SCSI_DEBUG_MNT/testfile > + > +# delete the scsi debug device while it still has dirty data > +echo 1 > /sys/block/$(_short_dev $SCSI_DEBUG_DEV)/device/delete > + > +# try to read from the file, which should give us -EIO > +cat <&3 > /dev/null > + > +# close the file descriptor to not block unmount > +exec 3<&- > + > +status=0 > +exit > diff --git a/tests/generic/730.out b/tests/generic/730.out > new file mode 100644 > index 00000000..79e96db8 > --- /dev/null > +++ b/tests/generic/730.out > @@ -0,0 +1,2 @@ > +QA output created by 730 > +cat: -: Input/output error > -- > 2.39.2 >