Re: [PATCH RFT] blk-mq: optimize queue tag busy iter for shared_tags

Hi Kashyap,

> This is for current/5.17. This patch is meaningful only on top of [1].
>
> [1] "blk-mq: Use shared tags for shared sbitmap support", commit
>     e155b0c238b20f0a866f4334d292656665836c8a


But your change seems effectively the same as the one in
https://lore.kernel.org/all/1638794990-137490-4-git-send-email-john.garry@xxxxxxxxxx/,
which is now merged in Jens' 5.17 queue:

https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-5.17/block&id=fea9f92f1748083cb82049ed503be30c3d3a9b69
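
For reference, this is roughly the shape of that merged change, paraphrased
from memory (simplified; exact helpers and signatures may differ from the
commit above). With a host-wide tag set, every hctx points at the same
struct blk_mq_tags, so instead of walking the same bitmap once per hctx,
blk_mq_queue_tag_busy_iter() walks the shared set exactly once:

	if (blk_mq_is_shared_tags(q->tag_set->flags)) {
		/* all hctxs share one tag set: iterate it once */
		struct blk_mq_tags *tags = q->tag_set->shared_tags;

		if (tags->nr_reserved_tags)
			bt_for_each(NULL, q, &tags->breserved_tags, fn, priv, true);
		bt_for_each(NULL, q, &tags->bitmap_tags, fn, priv, false);
	} else {
		struct blk_mq_hw_ctx *hctx;
		unsigned int i;

		queue_for_each_hw_ctx(q, hctx, i) {
			/* pre-existing path: walk hctx->tags per hctx */
		}
	}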

> While doing additional testing for [1], I noticed a performance issue.
> Along with the performance issue, I noticed a CPU lockup as well. Lockup
> trace:
>
>   _raw_spin_lock_irqsave+0x42/0x50
>   blk_mq_find_and_get_req+0x20/0xa0
>   bt_iter+0x2d/0x80
>   blk_mq_queue_tag_busy_iter+0x1aa/0x2f0
>   ? blk_mq_complete_request+0x30/0x30
>   ? blk_mq_complete_request+0x30/0x30
>   ? __schedule+0x360/0x850
>   blk_mq_timeout_work+0x5e/0x120
>   process_one_work+0x1a8/0x380
>   worker_thread+0x30/0x380
>   ? wq_calc_node_cpumask.isra.30+0x100/0x100
>   kthread+0x167/0x190
>   ? set_kthread_struct+0x40/0x40
>   ret_from_fork+0x22/0x30
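
If I read the trace right, the _raw_spin_lock_irqsave at the top is the
per-tag-set tags->lock taken in blk_mq_find_and_get_req(), which bt_iter()
calls for every set bit it visits. Roughly, from memory (simplified):

	static struct request *blk_mq_find_and_get_req(struct blk_mq_tags *tags,
						       unsigned int bitnr)
	{
		struct request *rq;
		unsigned long flags;

		/* taken once per tag visited during the iteration */
		spin_lock_irqsave(&tags->lock, flags);
		rq = tags->rqs[bitnr];
		if (!rq || rq->tag != bitnr || !refcount_inc_not_zero(&rq->ref))
			rq = NULL;
		spin_unlock_irqrestore(&tags->lock, flags);
		return rq;
	}

With host-wide tags and the unoptimized iterator, every hctx hammers that
same lock while walking the same shared bitmap.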

> It is a generic performance issue if a driver uses "shost->host_tagset = 1".
> In fact, I found that [1] is useful for fixing the performance issue, and I
> have provided this additional patch on top of it.
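
(For anyone following along: host_tagset is the bit an LLD sets on its
Scsi_Host to get a single host-wide tag space across all hardware queues.
An illustrative fragment only; "instance" and "msix_count" here are
hypothetical driver-specific names:)

	/* in the LLD's probe path, before scsi_add_host() */
	shost->nr_hw_queues = instance->msix_count;
	shost->host_tagset = 1;	/* scsi_mq_setup_tags() then sets
				 * BLK_MQ_F_TAG_HCTX_SHARED on the tag set */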

> I changed my setup to have 64 scsi_devices (earlier I kept only 16 or 24
> drives, so I did not notice this issue). The performance/CPU lockup issue
> is not due to [1].
> A higher number of SCSI devices, more hardware contexts per host, and a
> higher queue depth all increase the chances of a lockup and a performance
> drop.
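
That scaling is expected: with host_tagset the tag space is host-wide, so
each blk_mq_queue_tag_busy_iter() pass in the unoptimized code visits
roughly nr_hw_queues * shared-depth tags, taking tags->lock for each one,
and blk_mq_timeout_work() repeats that periodically. Quick illustrative
arithmetic as a standalone program (all numbers invented for the example,
not taken from Kashyap's setup):

	#include <stdio.h>

	int main(void)
	{
		/* invented example values */
		unsigned long nr_hw_queues = 64;	/* hw contexts per host */
		unsigned long shared_depth = 8192;	/* host-wide can_queue */

		/* unoptimized: every hctx walks the same shared bitmap */
		unsigned long per_hctx_walk = nr_hw_queues * shared_depth;
		/* merged fix: the shared set is walked exactly once */
		unsigned long single_pass = shared_depth;

		printf("tag visits (lock acquisitions) per pass:\n");
		printf("  per-hctx walk: %lu\n", per_hctx_walk);	/* 524288 */
		printf("  single pass:   %lu\n", single_pass);	/* 8192 */
		return 0;
	}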

> Do you think it is good to have these changes in 5.16 + stable?
> I don't know if this patch will create any side effects. Can you review it
> and let me know your feedback?


Can you test my merged change again for this scenario?

I will also note that I previously mentioned that blk_mq_queue_tag_busy_iter()
was not optimal for shared sbitmap, i.e. before shared tags, but at that time
nobody reported that performance was bad for shared sbitmap.

Thanks,
John


