On 11/18/2013 05:32 PM, Dave Airlie wrote:
On Tue, Nov 19, 2013 at 7:04 AM, Nahum Shalman <nshalman@xxxxxxxx> wrote:
Context:
Host is running qemu-kvm 1.1.2 and spice 0.12.2.
Fedora 16 VMs ran rock solid on these same virtualization hosts.
The Fedora 19 and 20 (testing) VMs are running xf86-video-qxl compiled from
the master branch of the git repo.
We've been seeing a lot of X server crashes in Fedora 19 and 20, generally
after the VM has been running for at least 2-3 days.
The last gasp in the Xorg logs from these crashes generally looks something
like:
[1024592.839] Out of memory allocating 261140 bytes
[1024592.839] Out of mem - stats
[1024592.850] max system bytes = 243257344
[1024592.850] system bytes = 243257344
[1024592.850] in use bytes = 133245384
Someone here managed to get a stack trace out of one such crash:
(EE) [mi] EQ overflowing. Additional events will be discarded until existing
events are processed.
(EE)
(EE) Backtrace:
(EE) 0: /usr/bin/X (mieqEnqueue+0x22b) [0x57691b]
(EE) 1: /usr/bin/X (QueuePointerEvents+0x52) [0x44d862]
(EE) 2: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x2913)
[0x7ff0faeb17e3]
(EE) 3: /usr/bin/X (DPMSSupported+0xe8) [0x4861f8]
(EE) 4: /usr/bin/X (xf86SerialModemClearBits+0x230) [0x4ae7b0]
(EE) 5: /lib64/libpthread.so.0 (__restore_rt+0x0) [0x3b7de0ef9f]
(EE) 6: /lib64/libpthread.so.0 (__nanosleep_nocancel+0x24) [0x3b7de0e804]
(EE) 7: /usr/lib64/xorg/modules/drivers/qxl_drv.so (qxl_handle_oom+0x69)
[0x7ff10fceccb9]
(EE) 8: /usr/lib64/xorg/modules/drivers/qxl_drv.so (qxl_allocnf+0x48)
[0x7ff10fcecd08]
(EE) 9: /usr/lib64/xorg/modules/drivers/qxl_drv.so
(qxl_bo_alloc_internal+0x76) [0x7ff10fcece06]
(EE) 10: /usr/lib64/xorg/modules/drivers/qxl_drv.so (qxl_image_create+0xf2)
[0x7ff10fce9782]
(EE) 11: /usr/lib64/xorg/modules/drivers/qxl_drv.so
(qxl_surface_put_image+0xf5) [0x7ff10fceb045]
(EE) 12: /usr/lib64/xorg/modules/drivers/qxl_drv.so (uxa_copy_n_to_n+0x5e7)
[0x7ff10fcf7127]
(EE) 13: /usr/bin/X (miCopyRegion+0x1ad) [0x574d2d]
(EE) 14: /usr/bin/X (miDoCopy+0x456) [0x5752b6]
(EE) 15: /usr/lib64/xorg/modules/drivers/qxl_drv.so (uxa_copy_area+0xae)
[0x7ff10fcf5efe]
(EE) 16: /usr/bin/X (dixDestroyPixmap+0x711) [0x433a31]
(EE) 17: /usr/bin/X (SendErrorToClient+0x3f7) [0x436fa7]
(EE) 18: /usr/bin/X (_init+0x3aaa) [0x429b4a]
(EE) 19: /lib64/libc.so.6 (__libc_start_main+0xf5) [0x3b7d221b75]
(EE) 20: /usr/bin/X (_start+0x29) [0x4267b1]
(EE) 21: ? (?+0x29) [0x29]
(EE)
(EE) [mi] These backtraces from mieqEnqueue may point to a culprit higher up
the stack.
(EE) [mi] mieq is NOT the cause. It is a victim.
(EE) [mi] EQ overflow continuing. 100 events have been dropped.
His comment was:
Examining the stack trace more closely, the functions identified are
misleading. The offsets are sometimes larger than the named functions, and
point to different functions not listed in the stripped symbol table.
Looking at the source, it seems that:
(EE) 16: /usr/bin/X (dixDestroyPixmap+0x711) [0x433a31]
This is probably ProcCreatePixmap()
(EE) 17: /usr/bin/X (SendErrorToClient+0x3f7) [0x436fa7]
This is possibly init_screen() or AddScreen()
So, it appears the memory allocation fails while setting up a new screen
structure. This makes more sense, but still leaves open the question why
it's trying to create new screens long after startup.
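For anyone who wants to repeat the offset check from the comment above, here is a minimal sketch that splits a backtrace frame into symbol, offset, and address, and derives the address where the named symbol starts. That start address can then be compared against the symbol table of an unstripped build. The names `FRAME_RE` and `parse_frame` are made up for this illustration, and frames that the mailer wrapped onto two lines need to be joined back into one line first.

```python
import re

# Matches one Xorg backtrace frame such as:
#   (EE) 16: /usr/bin/X (dixDestroyPixmap+0x711) [0x433a31]
# FRAME_RE and parse_frame are hypothetical helper names for this sketch.
FRAME_RE = re.compile(
    r"\(EE\)\s+(\d+):\s+(\S+)\s+\((\S+?)\+0x([0-9a-fA-F]+)\)\s+\[0x([0-9a-fA-F]+)\]"
)


def parse_frame(line):
    """Return the fields of one backtrace frame, or None if the line is not a frame."""
    match = FRAME_RE.search(line)
    if match is None:
        return None
    index, module, symbol, offset, address = match.groups()
    offset = int(offset, 16)
    address = int(address, 16)
    return {
        "index": int(index),
        "module": module,
        "symbol": symbol,
        "offset": offset,
        "address": address,
        # Address at which the reported symbol begins.
        "symbol_start": address - offset,
    }


frame = parse_frame("(EE) 16: /usr/bin/X (dixDestroyPixmap+0x711) [0x433a31]")
print(frame["symbol"], hex(frame["offset"]), hex(frame["symbol_start"]))
```

If the offset turns out to be larger than the size of the named function in an unstripped binary, the frame most likely belongs to a later function that is missing from the stripped symbol table, which is the situation the comment describes.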
It's hard to recreate the crashes other than by simply booting and using a
VM for a few days. One theory we're tossing around is that the memory buffer
xf86-video-qxl has to work with is getting fragmented and when the
fragmentation gets bad enough an allocation can fail.
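As a rough sanity check on the fragmentation theory, the numbers in the log excerpt can be compared directly. The sketch below assumes that "system bytes" is the total size of the pool the driver allocates from and that "in use bytes" is the part currently handed out; those meanings are an assumption based on the field names, not something confirmed from the driver source. The helper names are invented for this example.

```python
import re

# Lines copied from the Xorg log excerpt earlier in this message.
LOG = """\
[1024592.839] Out of memory allocating 261140 bytes
[1024592.850] max system bytes = 243257344
[1024592.850] system bytes = 243257344
[1024592.850] in use bytes = 133245384
"""


def parse_oom_stats(text):
    """Extract the requested size and the allocator counters from the log text."""
    stats = {}
    requested = re.search(r"Out of memory allocating (\d+) bytes", text)
    if requested:
        stats["requested"] = int(requested.group(1))
    counters = re.findall(
        r"\]\s+(max system bytes|system bytes|in use bytes)\s*=\s*(\d+)", text
    )
    for key, value in counters:
        stats[key] = int(value)
    return stats


def free_bytes(stats):
    """Bytes nominally not in use, under the assumed meaning of the counters."""
    return stats["system bytes"] - stats["in use bytes"]


stats = parse_oom_stats(LOG)
print("requested:", stats["requested"])
print("nominally free:", free_bytes(stats))  # 110011960
```

Under that reading, roughly 110 MB of the pool was not in use when a request for 261140 bytes failed. That would be consistent with the free space being split into pieces that are each too small, though it does not rule out other explanations.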
Our best guess is that this is a bug in the xf86-video-qxl driver. Has
anyone else seen similar Xorg crashes?
Guidance on how to fix or at least troubleshoot this further would be
greatly appreciated.
Why aren't you running the Fedora packages?
When we were using the Fedora packages under F19 we were able to trigger
a crash much more easily:
Specifically, we had a script that would repeatedly launch Firefox,
Thunderbird, Google Chrome, and Gedit, wait 5 seconds, then kill all 4
applications.
That script could trigger an X server crash within a couple of hours.
When we switched to compiling the master branch, that script didn't
crash the X server at all, but normal use still causes the crashes
described in my previous email.
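For anyone who wants to try reproducing this, the launch-and-kill loop described above can be sketched roughly as follows. This is not the original script: the executable names in `APPS` are assumptions about how the four applications are invoked, and `stress_cycle` and `run` are names invented for this illustration.

```python
import subprocess
import time

# Assumed executable names; adjust to whatever is installed in the VM.
APPS = [["firefox"], ["thunderbird"], ["google-chrome"], ["gedit"]]


def stress_cycle(commands, delay=5.0):
    """Launch every command, wait `delay` seconds, then kill them all.

    Returns the number of processes that were actually started.
    """
    procs = []
    for cmd in commands:
        try:
            procs.append(
                subprocess.Popen(
                    cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
                )
            )
        except FileNotFoundError:
            # Skip applications that are not installed.
            continue
    time.sleep(delay)
    for proc in procs:
        proc.kill()
    for proc in procs:
        proc.wait()  # Reap the process so no zombies accumulate.
    return len(procs)


def run(commands, cycles, delay=5.0):
    """Repeat the launch/kill cycle `cycles` times."""
    for _ in range(cycles):
        stress_cycle(commands, delay)
```

Calling `run(APPS, cycles=1000)` inside the guest's X session approximates the behaviour described. Note that `kill()` only targets the process that was started directly, so applications that start through a wrapper or spawn helper processes may leave children behind.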
Thanks!
-Nahum
_______________________________________________
Spice-devel mailing list
Spice-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/spice-devel