Hi all,
I have run into an issue in my environment: qemu-kvm guests hang on disk
writes when using rbd storage.
My environment:
ceph version: 0.80.7
ceph osds: 11 hosts * 10 osds per host = 110 osds
qemu version: 2.0+
My operating steps:
ceph osd crush add-bucket ssd root
ceph osd getcrushmap -o mycrushmap
crushtool -d mycrushmap -o mycrushmap_v1
# modify mycrushmap_v1: add 4 of the 11 hosts under the new root=ssd
# (all 11 hosts also stay under root=default);
# a rough sketch of the added fragment is shown after these steps
crushtool -c mycrushmap_v1 -o mycrushmap_input
ceph osd setcrushmap -i mycrushmap_input
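For reference, the fragment I added in mycrushmap_v1 looked roughly like
this (the bucket id, host names and weights below are placeholders, not my
real values):

root ssd {
        id -10          # placeholder bucket id
        alg straw
        hash 0          # rjenkins1
        item host-ssd-01 weight 10.000
        item host-ssd-02 weight 10.000
        item host-ssd-03 weight 10.000
        item host-ssd-04 weight 10.000
}

The four hosts remain items under root=default as well; I only gave them a
second parent under root=ssd.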
After I performed the steps above, all the qemu-kvm VMs in my environment
with ceph rbd storage attached hung. The kernel log shows:
kernel: INFO: task jbd2/sdb1-8:623 blocked for more than 120 seconds.
kernel: Not tainted 2.6.32-431.3.1.el6.x86_64 #1
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
kernel: jbd2/sdb1-8 D 0000000000000001 0 623 2 0x00000000
kernel: ffff88011c44dc20 0000000000000046 ffff8801ffffffff 00000000cc70801d
kernel: ffff88011c44db90 ffff880119466980 00000000d127ef64 ffffffffac2de373
kernel: ffff880119538638 ffff88011c44dfd8 000000000000fbc8 ffff880119538638
kernel: Call Trace:
In the meantime, ceph.log shows everything working fine and the ceph
health is OK. The other guest VMs, which do not use ceph rbd storage, are
fine.
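To be precise, by "the ceph health is OK" I mean the standard checks on a
monitor node, roughly:

        ceph health
        ceph -s

both report HEALTH_OK while the VMs are hung.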
I have tried the same steps many times in my testing environment, but I
cannot reproduce the hang there, so maybe the steps themselves are not the
problem.
Is there any known defect/bug related to this issue? Or any suggestions to
help me find the root cause?
Thanks very much.