On 2017-06-02 13:01, Peter Maloney wrote:
>> Is it easy for you to reproduce it? I had the same problem, and the same
>> solution. But it isn't easy to reproduce... Jason Dillaman asked me for
>> a gcore dump of a hung process but I wasn't able to get one. Can you do
>> that, and when you reply, CC Jason Dillaman <jdillama@xxxxxxxxxx> ?
> I mean a hung qemu process on the vm host (the one that uses librbd).
> And I guess that should be TO rather than CC.
>

Peter,

Can it be that my situation is different?

In my case the guest/qemu process itself does not hang. The guest root
filesystem resides in an rbd image w/o exclusive-lock enabled (the
pre-existing kind I described).

The problem surfaced when additional storage was attached to the guest,
through a new rbd image created with exclusive-lock, as it is the default
on Jewel.

The problem is that when parted/fdisk is run on that device, they hang as
reported. On the other hand,

    dd if=/dev/sdb of=/tmp/lala count=512

has no problem completing, while the reverse,

    dd if=/tmp/lala of=/dev/sdb count=512

hangs indefinitely.

While in this state, I can still ssh to the guest and work as long as I
don't touch the new device.

It appears that when a write to the device backed by the exclusive-lock
featured image hangs, a read from it will hang as well.

-K.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
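
For anyone trying to reproduce the read-vs-write behaviour described above without tying up a terminal on a hung dd, here is a minimal sketch of a helper that starts a command in the background and reports whether it finished within a time limit. The helper name `probe` and the 10-second limit are assumptions made for illustration and are not part of the original report; /dev/sdb and /tmp/lala are the names used in the post. The helper only polls. It does not try to kill the command, because a process blocked in I/O may not respond to signals.

```shell
#!/bin/sh
# probe LIMIT CMD [ARGS...]
# Hypothetical helper (not from the original post): runs CMD in the
# background and polls once per second, for up to LIMIT seconds, to see
# whether it has finished. It never signals CMD; a command that is still
# running when the limit is reached is left running.
probe() {
    limit=$1
    shift
    status_file=$(mktemp) || return 2

    # The subshell records the command's exit status once it returns.
    # Its output is discarded so a caller capturing probe's output does
    # not block on the background process.
    ( "$@"; echo "$?" >"$status_file" ) >/dev/null 2>&1 &

    i=0
    while [ "$i" -lt "$limit" ]; do
        if [ -s "$status_file" ]; then
            echo "completed (exit $(cat "$status_file"))"
            rm -f "$status_file"
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done

    # The status file is left behind; the command may still finish later.
    echo "still running after ${limit}s"
    return 1
}

# Usage, run inside the guest (the second command WRITES to /dev/sdb):
#   probe 10 dd if=/dev/sdb of=/tmp/lala count=512
#   probe 10 dd if=/tmp/lala of=/dev/sdb count=512
```

With the symptoms reported above, the first call would be expected to report completion and the second to report that dd is still running; "still running" only means the command did not return within the limit, not that it is permanently hung.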