On 05/07/2014 05:41 PM, David Mansfield wrote:
On 05/02/2014 09:05 AM, Marc-André Lureau wrote:
Hi
FYI: I have been running with the attached patch (not the inline above)
to spice-gtk for one week now, and so has my colleague. Dual monitor
works perfectly. There is one other crash scenario (xorg in guest
crashes and won't restart) which sometimes happens consistently upon
logging in. This can be "fixed" by removing the file
~/.config/monitors.xml (in the guest).
I recommend strongly that the attached fix be included. I understand it
is a "band-aid" but so is "fixme: goto whole" (see source code)
which is
broken by definition.
You are basically ignoring the surface_id from the monitor config
with this patch. This isn't helping much, most probably this point
out to invalid config, so I'd rather not do that and keep displaying
the whole surface instead. Did you manage to trace back to where that
surface id was generated? That's what I would do. It could be
corrupted memory...
I have done a bit of poking into this. I added this into qxl.ko's
qxl_object.c:
--- qxl_object.c~ 2014-03-30 23:40:15.000000000 -0400
+++ qxl_object.c 2014-05-07 17:05:24.989185760 -0400
@@ -311,6 +311,7 @@
ret = qxl_hw_surface_alloc(qdev, bo, NULL);
if (ret)
return ret;
+ DRM_ERROR("qxl_bo_check_id: %p %d %d", bo,
(int)bo->is_primary, (int)bo->surface_id);
}
return 0;
}
And lo-and-behold, the system keeps creating new "primary" surfaces,
and not one of the resulting surface_id is '0'.
[ 25.393840] [drm:qxl_bo_check_id] *ERROR* qxl_bo_check_id:
ffff880209670000 1 1
[ 86.104660] [drm:qxl_bo_check_id] *ERROR* qxl_bo_check_id:
ffff8801fb5dcc00 1 2 <- this is the one that
remote-viewer sees with surface_id #2
[ 118.560840] [drm:qxl_bo_check_id] *ERROR* qxl_bo_check_id:
ffff8801fb748800 1 3
I think that the entire concept of is_primary vs surface_id=0 is very
broken in this code, and that sometimes the "real" surface_id (which
is not 0) is "leaking out" into the protocol. These id's are allocated
by idr. Or perhaps the #2 surface is allocated as a primary before
the #1 has been freed, and somewhere a check on "only one primary
allowed" is hit, and so the primary gets toggled off. (surface #3 is
when I resized the second monitor and can be ignored).
I have an explanation for this, but not a fix. The fix needs to be made
by the owner of this code (Alon or Dave according to the header!)
The bug lies in qxl_display.c:qxl_crtc_mode_set. In this method, there
is a conditional termed "recreate_primary". The logic for this is based
on qcrtc->index. In other words, the "recreate_primary" branch is taken
only on the first head. In this case, the surface_id is FORCED to 0,
and for all other heads, the actual surface_id is used.
However, no _actual_ surface_id's are 0. See above. All surfaces
(valid for this context) have surface_id >0. So it is IMPOSSIBLE for
the monitor_config for the second monitor to have surface_id = 0.
So you may be asking yourself (or me) - so how is it working (in
gnome3)? Well, that is funny. It's just "working by coincidence" for
gnome3, but broken by design.
Here's how that works. When the bo is created, it _initially_ gets
surface_id = 0. The actual surface id (assigned by qxl_bo_check_id)
isn't assigned until the qxl_update_area_ioctl is called (by userspace).
In gnome3, this ioctl is called AFTER setting the mode on the 2nd
monitor, so the second branch above which returns the "actual"
surface_id still returns 0 because the surface hasn't been "checked"
(whatever that means). In MATE and other xrandr environments, the ioctl
is called right BEFORE setting the mode, so the actual surface_id is
assigned and we get the trap in spice-widget.c (which I have worked around).
Thoughts?
--
Thanks,
David Mansfield
Cobite, INC.
_______________________________________________
Spice-devel mailing list
Spice-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/spice-devel