Mike Anderson wrote:
Bodo Stroesser [bstroesser@xxxxxxxxxxxxxxxxxxx] wrote:
Hi James,
disrupting a working FC connection makes my i386 SMP server
(2.6.12.2) freeze just one or two seconds after this.
I'm normally using lpfc_nodev_tmo = 1. When I change this to the
default value of 35, the system stalls about 36 seconds after
disruption. So I guess, the problem is caused by nodev_tmo
expiring.
I activated the nmi_watchdog, but got no output.
What can I do to analyze this problem?
Does changing the timeout for a scsi device also alter the problem? In the
past people have seen issues with the nodev_tmo expiring near the scsi
timeout. Those past cases led to devices being offlined, but maybe this
could be causing a different symptom on your system.
The amount of time between cutting the connection and the system freezing
is nearly the same as lpfc_nodev_tmo. Using the default nodev_tmo of 35 seconds
results in about 36 seconds, while setting nodev_tmo to 1 results in
2 seconds. As the devices on the Fibre Channel are tape drives, the scsi timeout
is 900 seconds.
There are 8 tests running that write 8 tape-LUNs at the same SCSI target.
If the connection is broken, some of the tests immediately receive a bad
result for write(), some keep waiting for a result.
Meanwhile I also did some tests with timeout set to 5 and nodev_tmo to 35
(the test I'm running doesn't fail with that small timeout). Those tests that
do not receive a bad result keep waiting for a result even after the 5 second
timeout has expired. In most cases, the system doesn't freeze after nodev_tmo
with this test. But about 5 seconds after plugging the FC cable in again, it freezes.
You can change the timeout for the device by echoing a higher value into
/sys/bus/scsi/devices/${nexus}/timeout.
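
The sysfs write described above can be scripted when several LUNs need the
same value. What follows is only a minimal sketch, assuming Python is
available on the test machine and the script runs as root. The function names,
the sysfs_root parameter and the example nexus 0:0:0:0 are illustrative
assumptions and do not come from the original messages.

```python
from pathlib import Path


def timeout_attr(nexus: str, sysfs_root: str = "/sys") -> Path:
    """Build the path to the per-device timeout attribute.

    nexus is the host:channel:target:lun string, e.g. "0:0:0:0"
    (a made-up example, not a value from this thread).
    """
    return Path(sysfs_root) / "bus" / "scsi" / "devices" / nexus / "timeout"


def get_scsi_timeout(nexus: str, sysfs_root: str = "/sys") -> int:
    """Read the current command timeout, in seconds."""
    return int(timeout_attr(nexus, sysfs_root).read_text().strip())


def set_scsi_timeout(nexus: str, seconds: int, sysfs_root: str = "/sys") -> None:
    """Write a new command timeout, the same as echoing the value into the file."""
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    timeout_attr(nexus, sysfs_root).write_text(f"{seconds}\n")


# Example call (needs root on a real system):
#   set_scsi_timeout("0:0:0:0", 900)
```

The sysfs_root argument exists only so the helper can be pointed at a scratch
directory for a dry run before touching a live system.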
Is this a full system freeze or only the controlling console?
Full freeze, no more replies via console or network.
-andmike
--
Michael Anderson
andmike@xxxxxxxxxx