I guess timing is everything. :-) I have updated my test system with the 3.10.26 kernel (released yesterday) and the problem appears to be fixed.

Thanks to Jean-Tiare and Laurent for the feedback.

Steve

-----Original Message-----
From: Laurent Barbe [mailto:laurent@xxxxxxxxxxx]
Sent: Friday, January 10, 2014 1:42 AM
To: ceph-users@xxxxxxxxxxxxxx; Stephen Taylor
Cc: Jean-Tiare LE BIGOT
Subject: Re: RBD kernel driver deadlock with emperor

Hello,

I had exactly the same problem without applying this patch:
https://git.kernel.org/cgit/linux/kernel/git/sage/ceph-client.git/commit/?h=for-stable-3.10.24&id=bde57e9d9ec0d65dbeef27bff0c7a297d7ef784e

It fixes this bug:
http://tracker.ceph.com/issues/5760

Note that kernel 3.10.26 has just been released and fixes this particular problem.

Laurent Barbe

On 09/01/2014 17:52, Jean-Tiare LE BIGOT wrote:
> I had the same issue a couple of days ago. I fixed it by applying
> pending patches at
> https://git.kernel.org/cgit/linux/kernel/git/sage/ceph-client.git/?h=for-stable-3.10.24
>
> According to recent mails, they should be included in the next Linux
> maintenance release, 3.10.26.
>
> On 01/09/14 17:44, Stephen Taylor wrote:
>> I've recently run into an issue with the RBD kernel client in emperor
>> where I'm mapping and formatting an image, then repeatedly mounting
>> it, writing data to it, unmounting it, and snapshotting it. Nearly
>> every time (with only one exception so far), the driver appears to
>> deadlock after the eighth snapshot and the device becomes completely
>> unresponsive until I reboot. The one exception deadlocked after the
>> ninth snapshot. I have reproduced this with and without partitions on
>> the device, using NTFS, ext4, and xfs as the filesystem, and using a
>> variety of applications to write files to the device.
>>
>> I had been running similar tests previously on a dumpling system
>> without issues, so I'm wondering if anyone has seen anything like
>> this with emperor.
>> There are other variables, so I'm not 100% sure it's an emperor
>> issue, but that appears to be the case from what I have seen.
>>
>> I see there is an open issue #1769 where the kernel client can
>> deadlock, but in my case the kernel client is a server machine with
>> 8GB of memory, and memory utilization is not anywhere near capacity,
>> so I don't think it's the same issue.
>>
>> Performing the same set of operations via librbd (not using the
>> kernel client) doesn't seem to exhibit the deadlock.
>>
>> Any ideas?
>>
>> Steve
>>
>> Stephen Taylor | Senior Software Engineer | StorageCraft Technology
>> Corporation <http://www.storagecraft.com>
>> 11850 Election Road Suite 100 | Draper | Utah | 84020
>> Office: 801.871.2799 | Fax: 801.545.4705

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com