Re: qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process

Oliver,

We've had a similar situation occur. For about three months, we've run several Windows 2008 R2 guests with virtio drivers that record video surveillance. We have long suffered an issue where the guest appears to hang indefinitely (or until we intervene). For the sake of this conversation, we call this state "wedged", because it appears something (rbd, qemu, virtio, etc.) gets stuck in a deadlock. When a guest gets wedged, we see the following:

- the guest will not respond to pings
- the qemu-system-x86_64 process drops to 0% cpu
- graphite graphs show the interface traffic dropping to 0bps
- the guest will stay wedged forever (or until we intervene)
- strace of qemu-system-x86_64 shows QEMU is making progress [1][2]

We can "un-wedge" the guest by opening a NoVNC session or running a 'virsh screenshot' command. After that, the guest resumes and runs as expected. At that point we can examine the guest. Each time we'll see:

- No Windows error logs whatsoever while the guest is wedged
- A time sync typically occurs right after the guest gets un-wedged
- Scheduled tasks do not run while wedged
- Windows error logs do not show any evidence of suspend, sleep, etc

We had so many issues with guests becoming wedged that we wrote a script to 'virsh screenshot' them via cron. Then we installed some updates and had a month or so of higher stability (wedging happened maybe 1/10th as often). Until today we couldn't figure out why.
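
For reference, the workaround amounts to something like the sketch below (a simplified illustration; the domain discovery via 'virsh list --name' and the /tmp output path are placeholders, not our exact script). Cron just runs it periodically against every running domain.

    #!/usr/bin/env python
    # Cron-driven workaround: take a 'virsh screenshot' of every running
    # domain, which is enough to kick a wedged guest back to life.
    import subprocess

    def running_domains():
        # 'virsh list --name' prints one running domain name per line
        out = subprocess.check_output(['virsh', 'list', '--name'])
        return [name for name in out.decode().splitlines() if name.strip()]

    if __name__ == '__main__':
        for dom in running_domains():
            # Output path is illustrative; the screenshot itself is discarded
            subprocess.call(['virsh', 'screenshot', dom,
                             '/tmp/%s-unwedge.ppm' % dom])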

Yesterday, I realized qemu was starting the instances without specifying cache=writeback. We corrected that and let them run overnight. With RBD writeback re-enabled, wedging came back as often as we had seen in the past: I've counted ~40 occurrences in the past 12-hour period. So I feel that writeback caching in RBD certainly makes the deadlock more likely to occur.
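
For context, if you're going through libvirt, the knob in question is the cache attribute on the disk's driver element in the domain XML; roughly like this (pool, image, and monitor names are placeholders, and auth is left out):

    <disk type='network' device='disk'>
      <!-- cache='writeback' is what enables librbd's client-side writeback cache -->
      <driver name='qemu' type='raw' cache='writeback'/>
      <source protocol='rbd' name='rbd/vm-disk-1'>
        <host name='mon1.example.com' port='6789'/>
      </source>
      <target dev='vda' bus='virtio'/>
    </disk>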

Joshd asked us to gather RBD client logs:

"joshd> it could very well be the writeback cache not doing a callback at some point - if you could gather logs of a vm getting stuck with debug rbd = 20, debug ms = 1, and debug objectcacher = 30 that would be great"

We'll do that over the weekend. If you could as well, we'd love the help!
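
For anyone following along, those settings go in the [client] section of ceph.conf on the hypervisor, roughly like this (the log file path is only an example), and the qemu process has to be restarted so librbd re-reads the config:

    [client]
        # debug levels as requested by joshd; the log file path is an example,
        # just make sure the qemu user can write to it
        debug rbd = 20
        debug ms = 1
        debug objectcacher = 30
        log file = /var/log/ceph/qemu-rbd.$pid.log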

[1] http://www.gammacode.com/kvm/wedged-with-timestamps.txt
[2] http://www.gammacode.com/kvm/not-wedged.txt

Thanks,

Mike Dawson
Co-Founder & Director of Cloud Architecture
Cloudapt LLC
6330 East 75th Street, Suite 170
Indianapolis, IN 46250

On 8/2/2013 6:22 AM, Oliver Francke wrote:
Well,

I believe I'm the winner of buzzword bingo for today.

But seriously speaking... since I don't see this particular problem with
qcow2 on kernel 3.2, nor with qemu-1.2.2, nor with newer kernels, I hope
I'm not alone here?
We have a rising number of tickets from people reinstalling from ISOs
with the 3.2 kernel.

A fast fallback is to start all VMs with qemu-1.2.2, but we then lose
some features like the latency-free RBD cache ;)

I just opened a bug for qemu, with all the dirty details:

https://bugs.launchpad.net/qemu/+bug/1207686

Installing a 3.9.x backport kernel or upgrading the Ubuntu kernel to 3.8.x
"fixes" it. So we have a bad combination for all distros with a 3.2 kernel
and rbd as the storage backend, I assume.

Any similar findings?
Any ideas for tracing/debugging (Josh? ;) ) are very welcome,

Oliver.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



