Coming back to this, with Jason's insight it quickly became clear that
my problem was in reality a cephx authentication permissions issue.
Specifically, exclusive-lock requires a cephx user with class-write
access to the pool where the image resides. This wasn't clear in the
documentation, and the user I was using only had class-read access.

Once a cephx user with the proper permissions was used, the device
backed by the exclusive-lock enabled rbd image started to behave.

I'm really sorry for the red herring, and thank you all for helping me
expand my understanding of Ceph.

-K.

On 2017-06-02 14:07, Peter Maloney wrote:
> On 06/02/17 12:25, koukou73gr wrote:
>> On 2017-06-02 13:01, Peter Maloney wrote:
>>>> Is it easy for you to reproduce it? I had the same problem, and the
>>>> same solution. But it isn't easy to reproduce... Jason Dillaman
>>>> asked me for a gcore dump of a hung process but I wasn't able to get
>>>> one. Can you do that, and when you reply, CC Jason Dillaman
>>>> <jdillama@xxxxxxxxxx> ?
>>> I mean a hung qemu process on the vm host (the one that uses librbd).
>>> And I guess that should be TO rather than CC.
>>>
>> Peter,
>>
>> Can it be that my situation is different?
>>
>> In my case the guest/qemu process itself does not hang. The guest root
>> filesystem resides in an rbd image w/o exclusive-lock enabled (the
>> pre-existing kind I described).
> Of course it could be different, but it seems the same so far... same
> solution, and same warnings in the guest, it just takes some time
> before the guest totally hangs.
>
> Sometimes the OS seems ok but has those warnings...
> Worse is when the disk looks busy in iostat (like 100% util) but shows
> low activity (like 1 w/s)...
> Worst is when you can't get anything to run, or any screen output or
> keyboard input at all, and kill on the qemu process won't even work at
> that point, except with -9.
>
> And sometimes you can get the exact same symptoms with a curable
> problem... like if you stop too many osds and min_size is not reached
> for just one pg that the image uses, then it looks like it works until
> it hits that bad pg, and then the above symptoms happen. Most of the
> time the VM recovers when the osds are up again, but sometimes not.
> But since you mentioned exclusive lock, I still think it seems the
> same or highly related.
>
>> The problem surfaced when additional storage was attached to the
>> guest, through a new rbd image created with exclusive-lock enabled,
>> as is the default on Jewel.
>>
>> The problem being that when parted/fdisk is run on that device, they
>> hang as reported. On the other hand,
>>
>> dd if=/dev/sdb of=/tmp/lala count=512
>>
>> has no problem completing, while the reverse,
>>
>> dd if=/tmp/lala of=/dev/sdb count=512
>>
>> hangs indefinitely. While in this state, I can still ssh to the guest
>> and work as long as I don't touch the new device. It appears that
>> when a write to the device backed by the exclusive-lock enabled image
>> hangs, a read to it will hang as well.
>>
>> -K.
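
P.S. For the list archives: the fix boils down to having qemu/librbd
connect as a cephx user with class-write on the pool holding the image.
A rough sketch of what such a caps change looks like; client.libvirt
and the volumes pool below are placeholders for whatever names your own
setup actually uses:

# see what caps the client currently has
ceph auth get client.libvirt

# grant rwx on the pool (note this replaces the existing cap set);
# the 'x' bit is what allows class-read and class-write, which
# exclusive-lock needs for its object class calls
ceph auth caps client.libvirt mon 'allow r' osd 'allow rwx pool=volumes'

(In my case I simply switched the VM over to a user that already had
the proper caps.)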