Hi,

On 12/15/2011 04:07 PM, Guido Winkelmann wrote:
Hi, I've got a small ceph cluster with one mon, one mds and two osds (all on the same machine, for now) that I want to use as a block and file storage backend for qemu machine virtualisation. I found that read access to some of the rbd images, or to parts of some of them, sometimes blocks indefinitely, usually after the image has been sitting around untouched for a while, for example overnight. As a result, virtual machines that try to access their disks, as well as rbd commands like "rbd cp", just hang indefinitely. I found that these hangs can usually be "fixed" by restarting one of the osds. The last time this happened, ceph -s reported one of the osds to be in state "active+clean+scrubbing". (I'm afraid I don't have the complete output from ceph -s anymore.)
I've been seeing the exact same behaviour, but I haven't yet been able to dig into it any deeper.
As far as I know, when a PG gets scrubbed it becomes unavailable for a short period. But since this scrub blocks or loops, the PG never becomes available again, which in turn blocks the virtual machine.
I saw this behaviour with v0.37 and v0.38. I'm upgrading to v0.39 to see if it still exists.
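If it helps to narrow this down, here is a rough Python sketch of how one could flag PGs that stay in a scrubbing state for too long. Everything in it is an assumption for illustration: the function names, the `pg_states` mapping (PG id to state string such as "active+clean+scrubbing") and the one-hour threshold are made up, and filling `pg_states` (for example from `ceph pg dump`) is left out because the output format differs between versions.

```python
def scrubbing_pgs(pg_states):
    """Return the ids of PGs whose state string contains 'scrubbing'."""
    return sorted(
        pg for pg, state in pg_states.items()
        if "scrubbing" in state.split("+")
    )


def stuck_scrubs(first_seen, pg_states, now, max_age_s=3600):
    """Return PGs that have been scrubbing for at least max_age_s seconds.

    first_seen maps PG id -> time the PG was first observed scrubbing.
    It is updated in place on every call (hypothetical bookkeeping).
    """
    current = set(scrubbing_pgs(pg_states))
    # Forget PGs that have finished scrubbing since the last poll.
    for pg in list(first_seen):
        if pg not in current:
            del first_seen[pg]
    # Remember when each scrubbing PG was first seen.
    for pg in current:
        first_seen.setdefault(pg, now)
    return sorted(
        pg for pg, t in first_seen.items() if now - t >= max_age_s
    )
```

Calling `stuck_scrubs` periodically with a fresh snapshot would list the PGs that never leave the scrubbing state, which should be the ones the blocked reads are waiting on if my guess above is right.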
Wido
Does anybody have any idea what could be going wrong here?

Guido

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html