Since it sounds like the next hammer release is more or less ready, my vote would be that the fix can wait until the next release. Since it will only occur if the new exclusive lock feature is enabled (disabled on images by default) and the connection between librbd and the OSD is reset with writeback data waiting in the cache, it sounds like a rare enough issue. Final decision rests w/ Josh.

--

Jason Dillaman
Red Hat
dillaman@xxxxxxxxxx
http://www.redhat.com

----- Original Message -----
From: "Loic Dachary" <loic@xxxxxxxxxxx>
To: "Jason Dillaman" <dillaman@xxxxxxxxxx>
Cc: "Ceph Development" <ceph-devel@xxxxxxxxxxxxxxx>
Sent: Friday, May 8, 2015 10:34:24 AM
Subject: Re: long running workloads/rbd_fsx_cache_writeback.yaml on hammer

Hi,

On 08/05/2015 14:36, Jason Dillaman wrote:
> Nice catch. It is deadlocked in a code path between handling a watch/notify error from librados and flushing the internal cache. The issue already appears to exist in the hammer branch.

I'm glad it helped you figure out a bug :-)

From the point of view of the upcoming v0.94.2, do you think we should wait until this is fixed? Or is it rare enough and can wait until v0.94.3? I'm asking because this bug seems to be the only potential blocker for v0.94.2. It's ok either way, just let me know.

Cheers

--
Loïc Dachary, Artisan Logiciel Libre