On Thu, Sep 3, 2015 at 3:20 AM, Nicholas A. Bellinger <nab@xxxxxxxxxxxxxxx> wrote:
> (RESENDING)
>
> On Wed, 2015-09-02 at 21:14 -0400, Alex Gorbachev wrote:
>> We have experienced a repeatable issue when performing the following:
>>
>> Ceph backend with no issues, we can repeat any time at will in lab and
>> production. Cloning an ESXi VM to another VM on the same datastore on
>> which the original VM resides. Practically instantly, the LIO machine
>> becomes unresponsive, Pacemaker fails over to another LIO machine and
>> that too becomes unresponsive.
>>
>> Both running Ubuntu 14.04, kernel 4.1 (4.1.0-040100-generic x86_64),
>> Ceph Hammer 0.94.2, and have been able to take quite a workload with no
>> issues.
>>
>> output of /var/log/syslog below. I also have a screen dump of a
>> frozen system - attached.
>>
>> Thank you,
>> Alex
>>
>
> The bug-fix patch to address this NULL pointer dereference with >= v4.1
> sbc_check_prot() sanity checks + EXTENDED_COPY I/O emulation has been
> sent-out with your Reported-by.
>
> Please verify with your v4.1 environment that it resolves the original
> ESX VAAI CLONE regression with a proper Tested-by tag.
>
> For now, it has also been queued to target-pending.git/for-next with a
> stable CC'.
>
> Thanks for reporting!

Thank you for the patch. I have compiled the kernel and tried the
cloning - it completed successfully this morning. I will now try to
build a package and deploy it on the larger systems where the failures
occurred. Once that is completed, I will learn about the Tested-by tag
(I have never done it before) and submit the results.

Best regards,
Alex

>
> --nab
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com