Hello,
while I actually want to benchmark FhGFS on a NetApp system, I'm somehow
running from one kernel problem to another.
Yesterday we had to recable, and while we are still using multipath,
each priority group now has only one underlying device (we don't have
sufficient IB SRP ports on our test systems, but still want to benchmark
a system as close as possible to a production system).
So after recabling, all failover paths disappeared, which
*shouldn't* have any influence on performance. However, unexpectedly,
buffered IO performance has now dropped to less than 50% of what it was
before. With direct IO it is still fine, and reducing nr_requests of the
multipath device to 8 also 'fixes' the problem. I then guessed right and
simply made multipath_busy() always return 0, which also fixes the issue.
- problem:
  - iostat -x -m 1 shows that the multipath devices alternate: one of
    them stalls IO for several minutes
  - the other multipath device does IO during that time at about
    600 to 700 MB/s, until it in turn starts to stall IO
  - the active NetApp controller could serve both multipath devices at
    about 600 to 700 MB/s
- problem solutions:
  - add another passive sdX device to the multipath group
  - use direct IO
  - reduce /sys/block/dm-X/queue/nr_requests to 8
    - /sys/block/sdX does not need to be updated
  - disable multipath_busy() by letting it return 0
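
For reference, the nr_requests workaround is just a single write to the
queue attribute of the dm device. A minimal user-space sketch, assuming
the attribute accepts a plain decimal value; write_queue_attr() is a
hypothetical helper name, and dm-X has to be replaced by the real
device:

```c
#include <stdio.h>

/*
 * Hypothetical helper (not an existing API): write an integer to a
 * sysfs-style attribute file. Returns 0 on success, -1 on failure.
 */
static int write_queue_attr(const char *path, int value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", value) > 0;

	/* fclose() flushes the buffer, so a rejected write shows up here */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

It would be called as
write_queue_attr("/sys/block/dm-X/queue/nr_requests", 8), as root.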
Looking through the call chain, I see the underlying problem seems to be
in scsi_host_is_busy().
static inline int scsi_host_is_busy(struct Scsi_Host *shost)
{
	if ((shost->can_queue > 0 && shost->host_busy >= shost->can_queue) ||
	    shost->host_blocked || shost->host_self_blocked)
		return 1;

	return 0;
}
shost->can_queue is 62 here.
shost->host_busy also reaches 62 when one of the multipath groups does
IO, and the other multipath groups then seem to get stalled.
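
To illustrate what I think is going on, here is a minimal user-space
sketch (not kernel code). struct toy_host, toy_host_is_busy() and
toy_submit() are made-up names that only mirror the check above, and the
sketch assumes that both multipath devices send their IO through the
same Scsi_Host:

```c
#include <stdbool.h>

/* Toy stand-in for the Scsi_Host fields used by the busy check. */
struct toy_host {
	int can_queue;          /* max commands the host accepts at once */
	int host_busy;          /* commands currently in flight */
	bool host_blocked;
	bool host_self_blocked;
};

/* Same condition as scsi_host_is_busy(), applied to the toy struct. */
static int toy_host_is_busy(const struct toy_host *h)
{
	if ((h->can_queue > 0 && h->host_busy >= h->can_queue) ||
	    h->host_blocked || h->host_self_blocked)
		return 1;

	return 0;
}

/* Try to queue 'want' commands; returns how many the host accepted. */
static int toy_submit(struct toy_host *h, int want)
{
	int accepted = 0;

	while (accepted < want && !toy_host_is_busy(h)) {
		h->host_busy++;
		accepted++;
	}

	return accepted;
}
```

With can_queue = 62, a first caller that wants to queue more than 62
commands gets exactly 62 accepted, and a second caller sharing the host
gets 0 until a command completes. That matches host_busy sitting at 62
here, but it does not explain why a passive path in the prio group
changes the picture.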
I'm not sure yet why multipath_busy() does not stall IO when there is a
passive path in the prio group.
Any idea how to properly address this problem?
Thanks,
Bernd