How do you know you have a deadlock in ImageWatcher? I don't see that
in the provided log. Can you provide a backtrace for all threads?

On Sun, Feb 26, 2017 at 7:44 PM, Rajesh Kumar <rajeskr@xxxxxxxxxxx> wrote:
> Hi,
>
> We are using Ceph Jewel 10.2.5 stable release. We see deadlock with image
> watcher and VM becomes dead, at this point I can't ping the VM. Here are
> last few lines from qemu-rbd log. Has anyone seen this and is there a fix
> for it?
>
>
> 2017-02-26 11:05:54.647071 7faa927fc700 11 objectcacher flusher 34174976 /
> 33554432: 0 tx, 0 rx, 34174976 clean, 0 dirty (16777216 target, 25165824
> max)
> 2017-02-26 11:05:54.678375 7faa77fff700 11 objectcacher flusher 31911424 /
> 33554432: 0 tx, 0 rx, 31911424 clean, 0 dirty (16777216 target, 0 max)
> 2017-02-26 11:46:19.697590 7faaa898c700 20 librbd: flush 0x560f8cbbc450
> 2017-02-26 11:46:19.697604 7faaa898c700 10 librbd::ImageState:
> 0x560f8cbbb980 send_refresh_unlock
> 2017-02-26 11:46:19.697608 7faaa898c700 10 librbd::image::RefreshRequest:
> 0x7faaa4001030 send_v2_get_mutable_metadata
> 2017-02-26 11:46:19.700752 7faa93fff700 10 librbd::image::RefreshRequest:
> 0x7faaa4001030 handle_v2_get_mutable_metadata: r=0
> 2017-02-26 11:46:19.700775 7faa93fff700 10 librbd::image::RefreshRequest:
> 0x7faaa4001030 send_v2_get_flags
> 2017-02-26 11:46:19.702128 7faa93fff700 10 librbd::image::RefreshRequest:
> 0x7faaa4001030 handle_v2_get_flags: r=0
> 2017-02-26 11:46:19.702146 7faa93fff700 10 librbd::image::RefreshRequest:
> 0x7faaa4001030 send_v2_get_snapshots
> 2017-02-26 11:46:19.704675 7faa93fff700 10 librbd::image::RefreshRequest:
> 0x7faaa4001030 handle_v2_get_snapshots: r=0
> 2017-02-26 11:46:19.704704 7faa93fff700 20 librbd::ExclusiveLock:
> 0x7faa78014e10 is_lock_owner=0
> 2017-02-26 11:46:19.704709 7faa93fff700 10 librbd::image::RefreshRequest:
> 0x7faaa4001030 send_v2_apply
> 2017-02-26 11:46:19.704747 7faa937fe700 10 librbd::image::RefreshRequest:
> 0x7faaa4001030 handle_v2_apply
> 2017-02-26 11:46:19.704763 7faa937fe700 20 librbd::image::RefreshRequest:
> 0x7faaa4001030 apply
> 2017-02-26 11:46:19.704771 7faa937fe700 20 librbd::image::RefreshRequest:
> new snapshot id=2536 name=61 size=50029658112
> 2017-02-26 11:46:19.704801 7faa937fe700 10 librbd::image::RefreshRequest:
> 0x7faaa4001030 send_flush_aio
> 2017-02-26 11:46:19.704817 7faa937fe700 10 librbd::image::RefreshRequest:
> 0x7faaa4001030 handle_flush_aio: r=0
> 2017-02-26 11:46:19.704830 7faa937fe700 10 librbd::ImageState:
> 0x560f8cbbb980 handle_refresh: r=0
> 2017-02-26 11:48:01.389504 7faaa898c700 20 librbd: flush 0x560f8cbbc450
>
> Thanks,
>
> Rajesh
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

--
Jason
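
One possible way to collect the all-thread backtrace asked for above is to attach gdb to the hung qemu process in batch mode. This is only a minimal sketch: it assumes gdb is installed on the hypervisor host, that you already know the PID of the qemu process, and that debug symbols for qemu and librbd are available (without them the frames will be mostly addresses). The helper names gdb_backtrace_cmd and dump_backtraces are hypothetical, chosen for illustration.

```python
import subprocess


def gdb_backtrace_cmd(pid):
    """Build a gdb command line that prints a backtrace of every thread.

    Hypothetical helper: `pid` is assumed to be the PID of the hung
    qemu process that has the RBD image open.
    """
    return [
        "gdb",
        "-p", str(pid),                # attach to the running process
        "-batch",                      # run the -ex commands, then exit
        "-ex", "set pagination off",   # do not pause on long output
        "-ex", "thread apply all bt",  # backtrace for all threads
    ]


def dump_backtraces(pid):
    """Run gdb against `pid` and return its combined output as text."""
    result = subprocess.run(
        gdb_backtrace_cmd(pid),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout
```

Calling dump_backtraces(pid) with the qemu PID and saving the returned text to a file should give output that can be attached to a reply. Attaching usually needs root or the same user as the qemu process.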