On 1/11/24 6:57 AM, Christoph Hellwig wrote:
> q_usage_counter is the only thing preventing us from the limits changing
> under us in __bio_split_to_limits, but blk_mq_submit_bio doesn't hold it.
>
> Change __submit_bio to always acquire the q_usage_counter counter before
> branching out into bio vs request based helper, and let blk_mq_submit_bio
> tell it if it consumed the reference by handing it off to the request.

This causes hangs for me on shutdown/reset:

[   56.146054] sd 6:0:0:0: [sdb] Synchronizing SCSI cache
[   56.147739] sd 6:0:0:0: [sdb] Stopping disk
[   56.148976] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[   56.150803] sd 0:0:0:0: [sda] Stopping disk
[   75.549201] INFO: task systemd-shutdow:1 blocked for more than 15 seconds.
[   75.549636]       Not tainted 6.7.0-rc5-00173-g34d71db9cce2 #4540
[   75.549977] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[   75.550401] task:systemd-shutdow state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00000004
[   75.550900] Call trace:
[   75.551042]  __switch_to+0x114/0x150
[   75.551253]  __schedule+0x510/0x10d4
[   75.551451]  schedule+0x7c/0x1ac
[   75.551635]  schedule_timeout+0xe8/0x1a4
[   75.551857]  blk_mq_freeze_queue_wait_timeout+0xf4/0x18c
[   75.552157]  nvme_wait_freeze_timeout+0x68/0xa4
[   75.552503]  nvme_dev_disable+0x35c/0x374
[   75.552734]  nvme_shutdown+0x34/0x40
[   75.552956]  pci_device_shutdown+0x48/0x54
[   75.553184]  device_shutdown+0x1c4/0x314
[   75.553403]  kernel_power_off+0x40/0x88

which seems to indicate that a reference is being leaked. Haven't poked
any further at it, I'll drop these two for now.

-- 
Jens Axboe