Nick,

I tried raising the timeouts (to double the original values) and also
increasing and decreasing the queue depth. Neither seemed to make a
difference, although at this point the infrequency of the crashes has
made troubleshooting quite challenging.

I recently had an idea, though. I've always felt that it was the target
giving up, not the ESXi hosts, but I couldn't think of a good way to
prove it before. It occurred to me that I could run with a subset of
hosts powered on until the target became unavailable, then power those
hosts off and power on a different host, one that would be unaware that
the other hosts had decided the link was unavailable.

The result was that the host I powered on alone did NOT see the
storage! Once I rebooted the target server, that host could see the
storage.

If you can think of a way that the information about the target being
unusable could have been passed to the host that was only powered on
later (other than vCenter, which was powered off), I would love to hear
it. I honestly may be missing something here, but I can't think of
anything that would cause this other than the storage server failing.

Thanks,
Dan

On Thu, Mar 31, 2016 at 2:48 AM, Nicholas A. Bellinger <nab@xxxxxxxxxxxxxxx> wrote:
> On Wed, 2016-03-30 at 15:01 -0400, Dan Lane wrote:
>> Nicholas,
>> Can you please take another look at the fiber channel code and
>> see if you can find the cause of the problems with ESXi/VAAI?
>
> There are two options at this point to get to root cause of why your ESX
> hosts are continuously generating I/O timeouts and subsequent
> ABORT_TASKs, even when the host is idle.
>
> First, have you looked into changing the ESX host FC LLD side timeout
> and queuedepth that I asked about earlier here..?
>
> http://www.spinics.net/lists/target-devel/msg11844.html
>
> What effect does reducing the queue depth and increasing the timeout
> have on the rate in which ABORT_TASKs are being generated..?
>
> Beyond that, you'll need to engage directly the QLogic folks (CC'ed) to
> generate a firmware dump on both host and target sides for them to
> analyze to figure out what's going on.
>
> Are you using QLogic HBAs on the ESX host side as well..?