Hi William,

On Thu, 2017-02-16 at 07:30 -0500, William Thompson wrote:
> Please keep me in CC, I am not subscribed.
>
> I've been using the iscsi target for a little while to provide extra storage
> to 2 of our VMware hosts. Every so often, both of these hosts hang (vms
> living on local storage continue to work) until the target is restarted.
> Once I restart the target the hosts are no longer hung.
>
> I installed debian testing (stretch) on the target host due to issues with
> getting targetcli installed from the stable release.
>
> Any suggestions?
>
> Kernel is:
> Linux it-iscsi 4.6.0-1-amd64 #1 SMP Debian 4.6.2-2 (2016-06-25) x86_64 GNU/Linux
>

Thanks for your bug-report.

Using a v4.6.x kernel, I think there are two issues to consider.

First, note v4.6.x is not a long term stable kernel, so it's missing a
number of regression bug-fixes from the past year that you'll likely
encounter against ESX hosts with VAAI enabled.

https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/target?h=linux-4.4.y&id=60ba156dda2c11ff7a44d78ec64abd21b9813115
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/target?h=linux-4.4.y&id=f318588b758514c35f0a9227195178a3b2b4b733
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/target?h=linux-4.4.y&id=56661d2b89b2a549be04f37dcf824c39d7aca9c6

These three are the minimum patches to v4.6.y that you'll need.

However, considering v4.6.y is not a long-term stable kernel, I'd very
much recommend instead you use the latest v4.1.y, v4.4.y or anything
>= v4.8.y in order to get the full set of bug-fixes from upstream.

Second, there is a well known ESX 5.5u2+ bug with VMFS5 that is
triggered by targets that support VAAI AtomicTestandSet (emulate_caw=1
in LIO backend attribute speak).

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2113956
http://cormachogan.com/2015/04/17/heads-up-ats-miscompare-detected-between-test-and-set-hb-images/
http://www.thevirtualist.org/alert-application-outages-using-vaai-ats-on-vsphere-5-5-update2-vsphere-6-0/
https://www-304.ibm.com/support/docview.wss?uid=ssg1S1005201
http://h20565.www2.hpe.com/hpsc/doc/public/display?sp4ts.oid=75953&docId=mmr_sf-EN_US000005979&lang=en-us&cc=us&docLocale=en_US

The result is that data stores will eventually go offline if the ATS
heartbeat logic for VMFS5 is not explicitly disabled on all ESX hosts.

There have been many, many users who have hit this, and based on the
recommendation of all the other vendors above, disabling it is a must
in order to get a stable working ESX 5.5u2+ + VAAI setup.
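
On the first point (kernel choice), here is a rough sketch of how you could
check a `uname -r` string against the series recommended above (v4.1.y,
v4.4.y, or anything >= v4.8). The function and constant names are made up
for illustration, and the check only looks at the major.minor series; it
cannot tell whether you are running the *latest* release within that
series, which is what actually carries the fixes.

```python
import re

# Series recommended in this message: the long-term v4.1.y and v4.4.y lines,
# or anything at or above v4.8. (Names here are illustrative only.)
LONGTERM_SERIES = {(4, 1), (4, 4)}
MINIMUM_NEWER_SERIES = (4, 8)


def kernel_series(release):
    """Return (major, minor) parsed from a `uname -r` style string."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_recommended_series(release):
    """True if the release belongs to one of the recommended series."""
    series = kernel_series(release)
    return series in LONGTERM_SERIES or series >= MINIMUM_NEWER_SERIES
```

For the kernel in the report, `is_recommended_series("4.6.0-1-amd64")`
evaluates to False, since 4.6 is neither one of the two long-term series
nor at or above 4.8.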