multipath_busy() stalls IO due to scsi_host_is_busy()

Bernd Schubert <bernd.schubert@xxxxxxxxxxxxxxxxxx> · Wed, 16 May 2012 14:28:40 +0200

Hello,

while I actually want to benchmark FhGFS on a NetApp system, I'm somehow 
running from one kernel problem to another.
Yesterday we had to recable, and while we are still using multipath, 
each priority group now has only one underlying device (we don't have 
sufficient IB SRP ports on our test systems, but still want to benchmark 
a system as close as possible to a production system).
So after recabling, all failover paths disappeared, which 
*shouldn't* have any influence on performance. However, unexpectedly, 
performance has dropped to less than 50% when I'm doing buffered IO. With 
direct IO it is still fine, and reducing nr_requests of the multipath 
device to 8 also 'fixes' the problem. I then guessed right and simply 
made multipath_busy() always return 0, which also fixes the issue.

- problem:
	- iostat -x -m 1 shows that the multipath devices alternate: one of 
them stalls IO for several minutes
	- the other multipath device then does IO during that time with about 
600 to 700 MB/s, until it in turn starts to stall IO
	- the active NetApp controller could serve both multipath devices with 
about 600 to 700 MB/s

problem solutions:
	- add another passive sdX device to the multipath group
	- use direct IO
	- reduce /sys/block/dm-X/queue/nr_requests to 8
		- /sys/block/sdX does not need to be updated
	- disable multipath_busy() by letting it return 0
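
For the nr_requests workaround, a minimal sketch of setting the value 
from a small C helper instead of the shell. The function name 
write_queue_attr() is made up for this example; the caller passes the 
path, since dm-X depends on the system, and writing to sysfs needs root.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write an integer to a sysfs-style attribute file.
 * Returns 0 on success and -1 if the file cannot be opened or written.
 */
int write_queue_attr(const char *path, int value)
{
	FILE *f;
	int ok;

	assert(path != NULL);

	f = fopen(path, "w");
	if (!f)
		return -1;

	/* sysfs attributes take the value as decimal text */
	ok = fprintf(f, "%d\n", value) > 0;

	/* the write may only be reported as failed when the stream is closed */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

It would be called as write_queue_attr("/sys/block/dm-X/queue/nr_requests", 8), 
with dm-X replaced by the actual multipath device.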

Looking through the call chain, I see the underlying problem seems to be 
in scsi_host_is_busy().

static inline int scsi_host_is_busy(struct Scsi_Host *shost)
{
	if ((shost->can_queue > 0 && shost->host_busy >= shost->can_queue) ||
	    shost->host_blocked || shost->host_self_blocked)
		return 1;

	return 0;
}

shost->can_queue -> 62 here
shost->host_busy -> 62 when one of the multipath groups does IO; further 
multipath groups then seem to get stalled.
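
To illustrate what I think is happening, here is a minimal userspace 
sketch, not kernel code. It assumes that both multipath devices sit 
behind the same Scsi_Host and therefore share one can_queue budget. 
struct mock_shost and the mock_* functions are made up for this example; 
only the busy condition itself is copied from scsi_host_is_busy() above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the Scsi_Host fields the busy check reads. */
struct mock_shost {
	int can_queue;		/* maximum commands the host accepts at once */
	int host_busy;		/* commands currently outstanding */
	int host_blocked;
	int host_self_blocked;
};

/* Same condition as scsi_host_is_busy(), applied to the mock struct. */
int mock_host_is_busy(const struct mock_shost *shost)
{
	assert(shost != NULL);

	if ((shost->can_queue > 0 && shost->host_busy >= shost->can_queue) ||
	    shost->host_blocked || shost->host_self_blocked)
		return 1;

	return 0;
}

/*
 * Try to issue one command through the shared host.
 * Returns 1 if the command was issued, 0 if the host reported busy.
 */
int mock_dispatch(struct mock_shost *shost)
{
	if (mock_host_is_busy(shost))
		return 0;

	shost->host_busy++;
	return 1;
}

/*
 * Device A queues commands until the host is full, then device B tries
 * to issue a single command. Returns 1 if B was refused after A issued
 * exactly can_queue commands.
 */
int demo_second_device_stalls(void)
{
	struct mock_shost shost = { .can_queue = 62 };
	int issued_a = 0;
	int issued_b;

	while (mock_dispatch(&shost))
		issued_a++;

	issued_b = mock_dispatch(&shost);

	printf("device A issued %d, device B issued %d\n", issued_a, issued_b);
	return issued_a == shost.can_queue && issued_b == 0;
}
```

In this model device A takes all 62 slots and device B gets nothing until 
A's commands complete. It does not explain why a passive path in the 
priority group avoids the stall.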

I'm not sure yet why multipath_busy() does not stall IO when there is a 
passive path in the prio group.

Any idea how to properly address this problem?

Thanks,
Bernd
