Re: "blk-mq: fix tag_get wait task can't be awakened" causes hangs

Hi Alex,

1. Please dump this structure from the kernel dump:

blk_mq_tags <= request_queue->blk_mq_hw_ctx->blk_mq_tags

If there is no kernel dump, please check the value of

cat /sys/block/sda/mq/0/nr_tags
               __ <= Change it to the problem device

Also, please report how many block devices lsblk lists in total.

2. Please describe in detail how to reproduce the issue, and what type
of USB device it is.

3. Please try the attached patch and see whether the issue can still be
reproduced.

Thanks.

On 2022/1/25 0:24, Alex Xu (Hello71) wrote:
Hi,

Recently on torvalds master, I/O on USB flash drives started hanging
here:

task:systemd-udevd   state:D stack:    0 pid:  374 ppid:   347 flags:0x00004000
Call Trace:
  <TASK>
  ? __schedule+0x319/0x4a0
  ? schedule+0x77/0xa0
  ? io_schedule+0x43/0x60
  ? blk_mq_get_tag+0x175/0x2b0
  ? mempool_alloc+0x33/0x170
  ? init_wait_entry+0x30/0x30
  ? __blk_mq_alloc_requests+0x1b4/0x220
  ? blk_mq_submit_bio+0x213/0x490
  ? submit_bio_noacct+0x22c/0x270
  ? xa_load+0x48/0x80
  ? mpage_readahead+0x114/0x130
  ? blkdev_fallocate+0x170/0x170
  ? read_pages+0x48/0x1d0
  ? page_cache_ra_unbounded+0xee/0x1f0
  ? force_page_cache_ra+0x68/0xc0
  ? filemap_read+0x18c/0x9a0
  ? blkdev_read_iter+0x4e/0x120
  ? vfs_read+0x265/0x2d0
  ? ksys_read+0x50/0xa0
  ? do_syscall_64+0x62/0x90
  ? do_user_addr_fault+0x271/0x3c0
  ? asm_exc_page_fault+0x8/0x30
  ? exc_page_fault+0x58/0x80
  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
  </TASK>

mount(8) hangs with a similar backtrace, making the device effectively
unusable. It does not seem to affect NVMe or SATA attached drives. The
affected drive does not support UAS. I don't currently have UAS drives
to test with. The default I/O scheduler is set to noop.

I found that reverting 180dccb0dba4 ("blk-mq: fix tag_get wait
task can't be awakened") appears to resolve the issue.

Let me know what other information is needed.

Cheers,
Alex.


BR
Laibin
diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index 6220fa67fb7e..09d293c30fd2 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -488,9 +488,13 @@ void sbitmap_queue_recalculate_wake_batch(struct sbitmap_queue *sbq,
 					    unsigned int users)
 {
 	unsigned int wake_batch;
+	unsigned int min_batch;
+	unsigned int depth = (sbq->sb.depth + users - 1) / users;
 
-	wake_batch = clamp_val((sbq->sb.depth + users - 1) /
-			users, 4, SBQ_WAKE_BATCH);
+	min_batch = sbq->sb.depth >= (4 * SBQ_WAIT_QUEUES) ? 4 : 1;
+
+	wake_batch = clamp_val(depth / SBQ_WAIT_QUEUES,
+			min_batch, SBQ_WAKE_BATCH);
 	__sbitmap_queue_update_wake_batch(sbq, wake_batch);
 }
 EXPORT_SYMBOL_GPL(sbitmap_queue_recalculate_wake_batch);
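
To make the arithmetic in the patch easier to follow, here is a minimal
userspace sketch of the wake_batch calculation before and after the
change. It is only an illustration: clamp_uint(), wake_batch_before() and
wake_batch_after() are hypothetical helper names, not kernel functions,
and the two SBQ_* values are assumed to match include/linux/sbitmap.h
(both 8); please check them against your tree.

```c
#include <assert.h>

/* Assumed to match include/linux/sbitmap.h; verify against your tree. */
#define SBQ_WAIT_QUEUES 8
#define SBQ_WAKE_BATCH 8

/* Hypothetical stand-in for the kernel's clamp_val() on unsigned ints. */
static unsigned int clamp_uint(unsigned int v, unsigned int lo,
			       unsigned int hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/* Calculation removed by the patch: the lower bound is always 4. */
unsigned int wake_batch_before(unsigned int depth, unsigned int users)
{
	assert(users > 0);	/* avoid division by zero in this sketch */
	return clamp_uint((depth + users - 1) / users, 4, SBQ_WAKE_BATCH);
}

/*
 * Calculation added by the patch: the per-user depth is divided across
 * the wait queues, and the lower bound drops to 1 for shallow bitmaps.
 */
unsigned int wake_batch_after(unsigned int depth, unsigned int users)
{
	unsigned int per_user, min_batch;

	assert(users > 0);	/* avoid division by zero in this sketch */
	per_user = (depth + users - 1) / users;
	min_batch = depth >= 4 * SBQ_WAIT_QUEUES ? 4 : 1;
	return clamp_uint(per_user / SBQ_WAIT_QUEUES, min_batch,
			  SBQ_WAKE_BATCH);
}
```

With an illustrative depth of 1 and a single user, wake_batch_before()
gives 4 while wake_batch_after() gives 1. With a depth of 256 both give 8.
These depths are examples only, not values measured on the affected
device.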
