Hi Jared,

Adding Himanshu, Quinn + target-devel CC'd.

On Mon, 2017-05-22 at 07:45 +1200, Jared Watts wrote:
> Hi Nicholas,
>
> Apologies for the unsolicited email (I'm sure you get these on an
> ongoing basis). I'm trying to setup a homelab SAN and using Fedora 25
> to supply LUNs via FC to ESX 5.5 (using a Brocade 200E SAN switch and
> QLogic 2462 HBAs).
>
> I've successfully got it running with the targets appearing on each
> ESX host, successfully creating a datastore. QLogic card on the server
> in target mode, target created via targetcli and ACLs configured/LUN
> added and showing in ESX. However after a little while (during high
> IO) the target dies on each ESX host (with dead/error). Doing a
> systemctl restart target.service gets it working again.
>
> I did read the following:
> https://www.spinics.net/lists/target-devel/msg15173.html
>
> https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2113956
>
> And disabled ATS heartbeat - but the issue still occurs.
>

Thanks for doing some homework first. ;)

> How would you recommend debugging something like this? There are no
> messages of interest from the kernel. I know what I'm doing (15 years
> C programming) - all I need is a starting point to try and diagnose
> the source of the issue.
>

So I'd recommend collecting all logs, the hardware + switch setup, and
the driver and firmware information for both the ESX + target side, and
sending it along with the qla2xxx / Cavium folks CC'ed to have a look.

That will be a good starting point to understand if it's something
obvious that has already been fixed.

Most likely they will have you collect a qla2xxx firmware dump once the
bug triggers in order to understand what is going on.
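
As a rough illustration of the target-side part of that collection, here
is a minimal Python sketch that gathers command output and sysfs values
into one text report. The command list and the sysfs paths are
assumptions about a typical Fedora host with qla2xxx loaded, not
something specified in this thread, and the helper names (run_command,
read_sysfs, collect, write_report) are hypothetical. It does not cover
the ESX side, the switch, or the qla2xxx firmware dump.

```python
import glob
import subprocess

# Assumed set of commands for a Fedora target host; adjust to taste.
COMMANDS = {
    "kernel": ["uname", "-a"],
    "dmesg": ["dmesg"],
    "pci": ["lspci", "-nn"],
    "qla2xxx_module": ["modinfo", "qla2xxx"],
    "target_config": ["targetcli", "ls"],
}

# Assumed sysfs attributes; which ones exist depends on kernel and driver.
SYSFS_GLOBS = [
    "/sys/class/fc_host/host*/port_name",
    "/sys/class/fc_host/host*/port_state",
    "/sys/class/fc_host/host*/speed",
    "/sys/class/scsi_host/host*/fw_version",
    "/sys/class/scsi_host/host*/driver_version",
]


def run_command(argv, timeout=30):
    """Return combined stdout/stderr, or a bracketed note if it cannot run."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"[could not run {' '.join(argv)}: {exc}]"
    return proc.stdout + proc.stderr


def read_sysfs(patterns):
    """Read every file matching the glob patterns into a path -> value dict."""
    values = {}
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            try:
                with open(path) as fh:
                    values[path] = fh.read().strip()
            except OSError as exc:
                values[path] = f"[unreadable: {exc}]"
    return values


def collect(commands=COMMANDS, patterns=SYSFS_GLOBS):
    """Run all commands and read all sysfs values into one report dict."""
    report = {label: run_command(argv) for label, argv in commands.items()}
    report.update(read_sysfs(patterns))
    return report


def write_report(report, path):
    """Write the report as labelled plain-text sections."""
    with open(path, "w") as fh:
        for label, text in report.items():
            fh.write(f"===== {label} =====\n{text}\n\n")
```

Run as root on the target host, something like
write_report(collect(), "target-debug-report.txt") would produce a
single file to attach to the follow-up mail. Missing tools or unreadable
attributes are recorded in the report rather than aborting the run.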