Re: multipath_busy() stalls IO due to scsi_host_is_busy()

On 05/16/2012 05:27 PM, Mike Christie wrote:
> On 05/16/2012 09:29 AM, Bernd Schubert wrote:
>> On 05/16/2012 04:06 PM, James Bottomley wrote:
>>> On Wed, 2012-05-16 at 14:28 +0200, Bernd Schubert wrote:
>>>> shost->can_queue ->   62 here
>>>> shost->host_busy ->   62 when one of the multipath groups does IO,
>>>> further multipath groups then seem to get stalled.
>>>>
>>>> I'm not sure yet why multipath_busy() does not stall IO when there is a
>>>> passive path in the prio group.
>>>>
>>>> Any idea how to properly address this problem?
>>>
>>> shost->can_queue is supposed to represent the maximum number of possible
>>> outstanding commands per HBA (i.e. the HBA hardware limit).  Assuming
>>> the driver got it right, the only way of increasing this is to buy a
>>> better HBA.
>>
>> HBA is a mellanox IB adapter. I have not checked yet where the limit of
>
> What driver is this with? SRP or iSER or something else?


It's SRP. The command queue limit comes from SRP_RQ_SIZE. The value seems a bit low, IMHO, and it's definitely lower than needed for optimal performance. However, given that I get good performance when multipath_busy() is a noop, I think the multipath_busy() behavior is the primary issue here. It is always possible that a single LUN could use all command queues, but other LUNs still shouldn't be stalled completely.
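
To make the stall concrete, here is a minimal user-space model of the interaction described above. It is not the kernel code: the struct and function names (host_model, path_model, host_is_busy, group_is_busy) are hypothetical, and the logic only assumes that a host counts as busy once host_busy reaches can_queue and that a priority group counts as busy when every active path in it sits on a busy host.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-HBA counters. */
struct host_model {
    int can_queue;  /* maximum outstanding commands on the HBA */
    int host_busy;  /* commands currently outstanding on the HBA */
};

/* Hypothetical stand-in for one path of a multipath priority group. */
struct path_model {
    struct host_model *host;  /* HBA this path goes through */
    bool active;              /* false for a passive/failed path */
};

/* Assumed host-level check: busy once every command slot is in use. */
static bool host_is_busy(const struct host_model *h)
{
    return h->can_queue > 0 && h->host_busy >= h->can_queue;
}

/*
 * Assumed group-level check: the group is reported busy only if it has
 * at least one active path and all of its active paths are busy.
 */
static bool group_is_busy(const struct path_model *paths, int n)
{
    bool has_active = false;

    for (int i = 0; i < n; i++) {
        if (!paths[i].active)
            continue;
        has_active = true;
        if (!host_is_busy(paths[i].host))
            return false;  /* one usable path is enough */
    }
    return has_active;
}
```

In this model, if one LUN fills all 62 slots of a shared host, group_is_busy() returns true for every other LUN whose active paths use that host, even though those LUNs have nothing outstanding themselves. A group with no active path is never reported busy in this sketch.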

So in summary we actually have two issues:

1) Unfair queuing/waiting of dm-mpath, which stalls an entire path and brings down overall performance.

2) Small SRP command queues. Is there a reason why SRP_RQ_SHIFT/SRP_RQ_SIZE and the values that depend on them are so small?
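
For the second point, the observed can_queue of 62 would be consistent with a 64-entry queue minus two reserved entries. The sketch below only shows that arithmetic. The shift value of 6 and the two reserved slots (one for responses, one for task management) are assumptions, not taken from this thread, and the names RESERVED_RSP_SLOTS, RESERVED_TSK_MGMT_SLOTS and srp_cmd_slots are hypothetical; check ib_srp.h in your tree for the real definitions.

```c
/* Assumed values for illustration only; verify against the driver header. */
enum {
    SRP_RQ_SHIFT            = 6,                  /* assumption */
    SRP_RQ_SIZE             = 1 << SRP_RQ_SHIFT,  /* 64 with the shift above */
    RESERVED_RSP_SLOTS      = 1,                  /* hypothetical name, assumed value */
    RESERVED_TSK_MGMT_SLOTS = 1                   /* hypothetical name, assumed value */
};

/* Command slots left for normal IO for a given queue shift. */
static int srp_cmd_slots(int rq_shift)
{
    return (1 << rq_shift) - RESERVED_RSP_SLOTS - RESERVED_TSK_MGMT_SLOTS;
}
```

Under these assumptions srp_cmd_slots(SRP_RQ_SHIFT) gives 62, and raising the shift by one would roughly double the slots available to all LUNs behind the host.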


Thanks,
Bernd


--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel

