On 8/24/18 6:21 PM, Jens Axboe wrote:
> On 8/24/18 5:16 PM, Ming Lei wrote:
>> Hi,
>>
>> On Fri, Aug 24, 2018 at 04:20:41PM -0600, Jens Axboe wrote:
>>> Hi,
>>>
>>> Was testing other things today, but ended up with this:
>>>
>>> # echo "write through" > /sys/block/sde/device/scsi_disk/4:0:0:0/cache_type
>>>
>>> hanging. Looking closer, the request is successfully queued and the
>>> caller is waiting on rq execution and completion, but the request is
>>> sitting in the hctx->dispatch list and is continually being attempted
>>> issued, but gets a BLK_STS_RESOURCE return.
>>
>> Just run fio randwrite and 'dbench -s' on virtio-scsi/usb-storage
>> after setting 'write through', looks not see such issue.
>>
>> Also not see such kind of issue on blktests/xfstests against today's
>> next tree too.
>>
>> Could you share a bit more(disk, io sched, dmesg log, workload) about
>> how to reproduce it? Is it in normal IO path or EH?
>
> You're misunderstanding. The echo "write through" is the one that hangs,
> not subsequent IO. As written above, that first spawns a TUR and that
> request is being inserted, and the caller ends up waiting for it to
> complete off blk_execute_rq(). But the request itself sits on the
> dispatch list, gets dispatched, and gets BLK_STS_RESOURCE off
> ->queue_rq(). It goes back on the dispatch list, and the process repeats
> indefinitely since it always gets a BUSY return. On the SCSI side, what
> happens is that scsi_host_queue_ready() keeps returning false, which is
> why we keep returning BLK_STS_RESOURCE and not making any progress at
> all.

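To make the loop described above concrete, here is a toy userspace model
of it in C. This is only a sketch under stated assumptions: none of the
toy_* names exist in the kernel, the real blk-mq dispatch path is far more
involved, and the attempt cap is there purely so the model terminates
where the real request would spin indefinitely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for BLK_STS_OK / BLK_STS_RESOURCE. */
enum toy_status { TOY_STS_OK, TOY_STS_RESOURCE };

/* Hypothetical host: only tracks whether it is stuck in recovery. */
struct toy_host {
    bool in_recovery;
};

/* Models scsi_host_queue_ready(): refuses work while the host recovers. */
static bool toy_host_queue_ready(const struct toy_host *host)
{
    return !host->in_recovery;
}

/* Models ->queue_rq(): reports a resource shortage if the host is not ready. */
static enum toy_status toy_queue_rq(const struct toy_host *host)
{
    return toy_host_queue_ready(host) ? TOY_STS_OK : TOY_STS_RESOURCE;
}

/*
 * Re-dispatch a single request until it is accepted or max_attempts is
 * exhausted. Returns the attempt number on which the request was accepted,
 * or -1 if it never was.
 */
static int toy_dispatch(const struct toy_host *host, int max_attempts)
{
    assert(max_attempts > 0);

    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (toy_queue_rq(host) == TOY_STS_OK)
            return attempt;
        /* Rejected: the request goes back on the dispatch list. */
    }
    return -1;
}
```

With in_recovery set, toy_dispatch() uses up every attempt and returns -1,
however large the cap is. With it clear, the first attempt succeeds. Nothing
in the loop itself can change the host state, which is the whole problem.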
Task doing the echo:

[<0>] blk_execute_rq+0x77/0xa0
[<0>] __scsi_execute+0xd3/0x1f0
[<0>] sd_revalidate_disk+0xda/0x1cd0 [sd_mod]
[<0>] revalidate_disk+0x20/0x80
[<0>] cache_type_store+0x1f7/0x210 [sd_mod]
[<0>] kernfs_fop_write+0x106/0x190
[<0>] __vfs_write+0x23/0x150
[<0>] vfs_write+0xbe/0x1b0
[<0>] ksys_write+0x45/0xa0
[<0>] do_syscall_64+0x42/0x100
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] 0xffffffffffffffff

and the host is in perpetual recovery mode:

# cat /sys/bus/scsi/devices/host4/scsi_host/host4/state
recovery

This is a normal SATA drive, hanging off ahci, queue depth 32. As
mentioned earlier, scsi_host_queue_ready() keeps returning false.

-- 
Jens Axboe