Re: [RFC/POC PATCH 00/16] add output_id to monitors_config

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> 
> On Fri, 2018-06-15 at 15:12 +0200, Marc-André Lureau wrote:
> > Hi
> > 
> > On Fri, Jun 15, 2018 at 1:01 PM, Lukáš Hrázký <lhrazky@xxxxxxxxxx> wrote:
> > > On Fri, 2018-06-15 at 12:24 +0200, Marc-André Lureau wrote:
> > > > Hi
> > > > 
> > > > On Fri, Jun 15, 2018 at 12:16 PM, Lukáš Hrázký <lhrazky@xxxxxxxxxx>
> > > > wrote:
> > > > > On Thu, 2018-06-14 at 21:12 +0200, Marc-André Lureau wrote:
> > > > > > Hi
> > > > > > 
> > > > > > On Tue, Jun 5, 2018 at 5:30 PM, Lukáš Hrázký <lhrazky@xxxxxxxxxx>
> > > > > > wrote:
> > > > > > > Hi everyone,
> > > > > > > 
> > > > > > > following is my attempt at solving the ID issues with
> > > > > > > monitors_config
> > > > > > > and streaming. The concept is as follows:
> > > > > > > 
> > > > > > 
> > > > > > Before introducing a new solution, could you describe the "issues
> > > > > > with
> > > > > > monitors_config and streaming"? I am not following closely enough
> > > > > > the
> > > > > > mailing list probably, so a recap would be welcome. I'd like to
> > > > > > understand the shortcomings of the current messages, and see if we
> > > > > > are
> > > > > > on the same page about it. Thanks!
> > > > > 
> > > > > Right, sorry about that!
> > > > > 
> > > > > The issue is when having a plugin a for the streaming agent that is
> > > > > using a HW-accelerated encoder. The typical case is that you have a
> > > > > QXL
> > > > > monitor in the guest which shows the boot and then you have a
> > > > > physical
> > > > > HW device (either directly assigned to the VM or a vGPU) which is
> > > > > configured in the X server.
> > > > > 
> > > > > Once the X server starts, the QXL monitor goes blank and you get a
> > > > > second monitor on the client with the streamed content from the
> > > > > streaming agent plugin.
> > > > > 
> > > > > The problem is the code and the protocol assumes (amongst other
> > > > > assumptions) that all monitors are configured in X. The monitors on
> > > > > the
> > > > > client are put into an array and the index is used as the ID
> > > > > throughout
> > > > > SPICE. So for the case above, the QXL monitor is ID 0 and the
> > > > > streamed
> > > > > monitor is ID 1.
> > > > 
> > > > 
> > > > Since the "streamed monitor" replaces the QXL monitor, wouldn't it
> > > > make sense for them to share the same ID? This would be handled by the
> > > > server/guest transparently.
> > > 
> > > Well... for one, it does not entirely replace it. From a user
> > > experience PoV, yes, it makes sense to "replace" one monitor for the
> > > other. We've even considered switching the content of the display
> > > channel from QXL to the streamed one. But in the system, they are
> > > different monitors on different channels, so it's a bad idea to give
> > > them the same ID. And the ID we're talking about is actually (for the
> > > monitors config messages):
> > 
> > So if it's the same from use PoV, the client API shouldn't need to change.
> 
> I didn't say it's the same from the user PoV, not entirely. We can
> assume he want's to either be looking at the QXL monitor or the
> streamed one, i.e.:
> 
> 1. VM is booting, we are showing QXL monitor
> 2. X started, we switch to showing the streamed monitor

Note: this is also due to X limitation not supporting multiple devices,
on Windows probably you'll have both used at the same time and probably
user will manually disable QXL (as wants to use the other)

> 3. The user switches to a different TTY, we switch back to QXL
> etc...
> 
> This is for the simple case of one QXL monitor and one streamed
> monitor. But we are still "hiding" from the user that his VM actually
> has two graphics devices, the QXL one and a physical (or virtual) GPU.
> 
> I once advocated the approach to switch the monitors for the users, but
> now I think it would actually complicate things more and could
> potentially cause problems.
> 

I think the possibility to have the switch would be interesting and help
but I think the actual protocol is limiting and potentially complicating
stuff.

> > The client only cares about about having a specific set of monitors
> > configuration. The way the server/guest handled them is not its
> > business, it expects the best configuration.

True, the client should implement the protocol and server should use
the protocol as best as it can. This does not however mean that the
current protocol is perfect and using it as it is don't limit us
and make the code a bunch of spaghetti code in order to make protocol
happy.

> 
> That is true, but the monitors configuration needs to reflect the
> actual configuration of the guest. If we mix it up for the clients
> convenience, problems... :)
> 
> > So far, we have the current simplified model (ie other more complex
> > combinations are not supported):
> > 
> > qxl device, display channel 0,
> >   monitor 0:
> >   monitor 1:
> >   monitor x..
> > 
> > 
> > or
> > 
> > qxl device, display channel 0,
> >   monitor 0:
> > qxl device, display channel 1,
> >   monitor 0:
> > ..
> > 
> > Feel free to correct me, I haven't been looking into this for many years
> > now.
> 
> Correct.
> 

True. Note that this model however have physical (sockets and connections)
repercussions which could be hard to maintain.
If we decide to keep the current protocol and have one device we end up
with a huge surface 0 with all monitors in it having to mux and unmux streamings
and commands in both server and clients having to implement fair queueing and
synchronization just to maintain an old design.
Said that this approach would be prohibitive and looking at X implementation
requiring a single surface this means that if streaming devices are used we must
limit to one QXL with 1 monitor only and "map" all monitors to different channels
(with one monitor each).

> > > 
> > > server -> client: a pair (channel_id, monitor_id)
> > > 
> > > client -> server: channel_id + monitor_id, converted to an array index
> > > under the assumption I mentioned
> > > 
> > > So we can't really make it the same even if we wanted.
> > 
> > With multi-head/xrandr qxl:
> > 
> > qxl device, display channel 0,
> >   monitor 0:
> >   monitor 1:
> >   monitor x..
> > gpu device, stream channel 0,
> >   maps to display channel 0, monitor 0
> > gpu device, stream channel 1,
> >   maps to display channel 0, monitor 1
> > ...
> >
> > With multihead/xrandr qxl & gpu:
> > qxl device, display channel 0,
> >   monitor 0:
> >   monitor 1:
> >   monitor x..
> > gpu device, stream channel 0,
> >   maps to display channel 0, monitor 0
> >   maps to display channel 0, monitor 1
> > ...
> > 
> > With multiple QXL devices:
> > 
> > qxl device, display channel 0,
> >   monitor 0:
> > qxl device, display channel 1,
> >   monitor 0:
> > gpu device, stream channel 0,
> >   maps to display channel 0, monitor 0
> > gpu device, stream channel 1,
> >   maps to display channel 1, monitor 0
> 
> But, regardless of multihead/multiple devices, there is no guarantee
> that the number of QXL monitors is going to match the number of GPU
> device monitors. In fact, the most probable scenario is one QXL monitor
> and multiple GPU device monitors.
> 

This is not an issue, would be just a problem of allocating enough
channels to support all monitors we need.

> I don't think it makes sense to have more than one QXL monitor anyway.
> And if you did, it makes no sense to map them to GPU device monitors
> besides the first one, I guess...
> 
> Also, it is unclear what you mean by the "mapping", i.e. how exactly
> would you implement it?
> 

Simply to the client you present a channel_id, monitor_id which can
be different on the server.

> > If you want to support QXL devices that should not be associated with
> > a gpu/stream channel, then you should be able to flag them.

Not clear what do you want with this. A QXL device should be either be
seen by the client or not. You need to flag if seen or not.

> > 
> > If you need a manual association of stream channel with qxl/monitor,
> > this could be done with a manual configuration.

I don't see a case where I would need a manual configuration, I mean,
should be configurable within the guest disabling/enabling devices
and/or monitors.

> > 
> > Imho, we should avoid making things more complex. Testing is hard
> > enough. We should actually take the simplest approach possible (the
> > same decision we took before, it's already a mess to deal with, as you
> > noticed)

You are stating that the current situation is complex and also we need
to keep things simple. Looks like a bit contradictory.
Maybe the current situation is complex, or feel more complex than it
should be, for the choices done and current implementation.
>From my point of view is just that client need to tell "I want this
monitor this size" and "I'm moving the mouse here on this monitor".
The problem is defining "this monitor" at protocol level from client
to server, including messages to agent.

> 
> I fully agree with the notion :) However, we are already making things
> much more complex by adding the GPU device streaming feature. It breaks

I don't think the problem is related to streaming, more about devices
which are not QXL. With only QXL all old assumptions are easy to maintain,
the problem is the new devices. For instance having QXL 0 as Xrandr ID 0
and QXL 1 as Xrandr ID 1 is pretty easy, is the same driver and actually
we also manage (at code level!) the guest device driver. Just imagine if
the driver changes the order or physical device scanning having QXL 0 as
Xrandr 1 and QXL 1 as Xrandr 0! This would break the channel_id+monitor_id
formula. This does not happen as we maintain it, but we (SPICE) don't
maintain Intel/Nvidia/ATI drivers or Xorg/Wayland/Windows!

I'm relative new in this group (3 years) compared to these stuff,
some are possibly not in current git repos, but I think that more or
less the history is:
- before multimonitor. No much issues the monitor was the monitor;
- start adding multi monitor support, only one device was used. The
  problem was only sending the ID of the monitor which was 0, 1, ...
  Added a monitor_config message in the SPICE protocol from server
  to client. The client could resize the monitors sending a message
  to the agent through the server, server was not involved. As there
  was only a QXL device and no other devices there was no reason to send
  information for channel/qxl device and monitor IDs matched Xrandr IDs;
- for some reasons on Windows was not great, somebody decided to use
  multiple QXL devices each with one monitor. As a workaround not having
  the channel in the protocol to the agent the channel_id+monitor_id formula
  was introduced to create an ID fitting into the existing one (for the
  agent);
- somebody realized that on Linux was possible to handle resolution
  changes talking directly to the device making possible resize without
  using the agent. This was implemented with a change in device modes
  (rom/ram) and an interrupt to the card handled by our Linux kernel
  driver. As on Linux there is only one device, spice-server just
  "redirects" the message for the agent sent by the client to the QXL
  card. This works on Windows as Windows driver does not handle this
  interrupt/feature (or it would replicate monitor settings on all
  cards!).
Possibly some wrong assumptions, please correct me.
Note: is not clear to me the history of the mouse events, currently
spice-server sends messages to the agent for the mouse.

> previous assumptions and adds more stuff on top of it. And I think your
> suggestion adds even more complexity than necessary. Perhaps not in the
> client, but the "mapping" code, which sounds rather non-trivial, needs
> to go somewhere, presumably the server?
> 

I think too that bending to the old "design" will do stuff more
complex.

> It is actually my intention to add as little complexity as possible
> with my patches. The fact is, both the current code and the protocol is
> not robust enough and need to be fixed. And even if they were robust

I would not speak about robustness and fix, from my point of view looks
more a problem of flexibility. Simply we now have new use cases to
support.

> enough, the protocol would need to be extended either way. On top of
> that, the backwards compatibility turns the solution into a very
> complex one. But if we ever could deprecate and drop the old code
> paths...
> 

I have same sensation, not changing protocol will make thing much
more complicated and possibly limited.

> > > > > The more pressing issue with this is that a mouse motion event is
> > > > > sent
> > > > > with a display_id 1 from the client, but when it reaches the guest,
> > > > > the
> > > > > correct monitor is ID 0, as it is the only one.
> > > > > 
> > > > > The other thing this breaks is monitors_config messages that
> > > > > enable/disable displays and change the resolution. Besides the same
> > > > > "shift" happening here as well, another problem is the information
> > > > > that
> > > > > the monitors belong to different devices (or channels) is erased when
> > > > > they're all put into the single array (under the assumption there is
> > > > > either one channel with multiple monitors or multiple channels with
> > > > > one
> > > > > monitor each).
> > > > 
> > > > Correct and so far was a fine solution.
> > > 
> > > I respectfully disagree. Doing the channel_id + monitor_id I mentioned
> > > above is asking for trouble, which is exactly what we have now.
> > 
> > With the limitation we decided, we avoid the biggest troubles though.

We avoid to change the protocol but to implement the new use cases
we introduce potentially many assumptions and code complexity.
Not saying that is hard to discuss new code thinking about design
which nobody wrote down and that is bound to old code which nobody
fully remember.

> 
> I take it you were involved with the original implementation then...
> I'm sorry for being so critical, it is not my intention to be mean :)
> 
> However, I am not criticising the limitations you chose to impose, but
> the implementation. I find two problems here:
> 
> 1. Not keeping the IDs of the monitors_config messages as a
> (channel_id, monitor_id) pair and instead abusing the limitation you
> introduced to change it to a singe ID. I sort of consider it a hard
> rule you always keep your unique IDs intact and always keep them with
> the data. That way they're there when you need them and they work
> (because you didn't mess with them :)).
> 

I think this single ID came from the history I wrote above (that could
be partly wrong).
I personally would add channel_id and monitor_id to the new messages,
this to make possible to send the correct modes to the card in every
possible case. Saving few bytes for these messages (which are not
that frequent) is not worth it.

> 2. The change in the IDs and the actual creation of the monitors_config
> list leaks over the spice-gtk API to the client application (via the
> spice_main_update_display{,_enabled}() function). It is unnecessary,
> gives more opportunity to the client app to break it, and any extension
> of the monitors_config message requires a spice-gtk API change.
> 
> You could have imposed the limitation you did (I'm not entirely sure
> what was the reason) and still make the implementation more robust by
> not doing the above...
> 
> And for these reasons this patch series is more complicated that it
> could have been (though as I said, the protocol extension, i.e. adding
> the output_id) would still be necessary.
> 
> I'm not saying this to pick on you, I'm not even sure how you were
> involved (didn't go digging who wrote the code), but more as an
> explanation and so that we all can learn from it for the future :)
> 

Usually complex code is the result of many passages and possibly
shortcuts, usually involving different people and undocumented
stuff. Picking on people is often unfair.

> > > > Either you do multiple monitor over a single channel, or you use
> > > > multiple
> > > > channels, each with one monitor.
> > > 
> > > That is a limitation you can introduce, perhaps there was a reason for
> > > it. But the code is taking shortcuts and borderline hacks because of
> > > this, while doing it right wouldn't be much harder. I'm not looking to
> > > blame or judge anyone, I don't know how this came into place. But what
> > > it is now is (IMO) a bad design and I feel the need to say it after
> > > working on it for quite some time ;)
> > 
> > Doing it differently would lead to push the complexity higher up the
> > stack, perhaps even to the user. We wanted to avoid that.

I don't see how this could reach the user.

> 
> Yeah, as I said, not sure what the reason was, but IMO you could have
> had that and still make it much more robust.

I cannot say a car has a bad design because it does not have the load
of a big truck, just a different use case. We probably started using
the car (to continue the metaphor) to load some stuff and we are now
exceeding the limit. At the beginning we though a car was enough :-)

> 
> Oh and BTW, not sure how old this is, but I'm sure I'd have much worse
> things to say about the code _I_ wrote say like 8 years ago ;)
> 
> > > > > Hope this explains it well enough. It's all very complex and goes
> > > > > over
> > > > > multiple interfaces (the network protocols as well as the spice-gtk
> > > > > API), so the more brilliant ideas you'll have, the more welcome they
> > > > > will be :)
> > > > 
> > > > If we can avoid modifying all the layers and exposing the inner
> > > > details, I would encourage the exploration of other solution.
> > > 
> > > I don't see how we can do that... And the inner details are already
> > > quite exposed, unfortunately :(
> > > 
> > > As I said, ideas welcome.
> > > 
> > > > > > > 1. The streaming-agent sends a new StreamInfo message when it
> > > > > > > starts
> > > > > > > streaming. The message contains the output_id of the streamed
> > > > > > > monitor.
> > > > > > > The actual value is the index in the list of xrandr outputs.
> > > > > > > Basically,
> > > > > > > the number should uniquely identify a monitor in the guest
> > > > > > > context under
> > > > > > > X.
> > > > > > > 
> > > > > > > 2. The server copies the number into the
> > > > > > > SpiceMsgDisplayMonitorsConfig
> > > > > > > message and sends it to the client as part of the monitors_config
> > > > > > > exchange.
> > > > > > > 
> > > > > > > 3. The client stores the output_id in it's list of
> > > > > > > monitor_configs. Here
> > > > > > > I had to make significant changes, because the monitors_config
> > > > > > > code in
> > > > > > > spice-gtk is very loosely written and also exposes quite a few
> > > > > > > unnecessary details to the client app. The client sends the
> > > > > > > output_id
> > > > > > > back to the server as part of the monitors_config messages
> > > > > > > (VDAgentMonitorsConfigV2) and also uses it for the mouse motion
> > > > > > > event,
> > > > > > > which also contains a display ID interpreted by the vd_agent. In
> > > > > > > the
> > > > > > > end, the API/ABI towards the client app should remain unchanged.
> > > > > > > 
> > > > > > > 4. The server passes the output_id in above-mentioned messages to
> > > > > > > the
> > > > > > > vd_agent. The output_id is meaningless in the server context.
> > > > > > > (Currently, it doesn't pass the monitors_config messages if there
> > > > > > > is a
> > > > > > > QXL device that supports it, though. Needs more work.)
> > > > > > > 
> > > > > > > 5. vd_agent:
> > > > > > >   a) For mouse input, the output_id was passed in the original
> > > > > > >   message,
> > > > > > >   so no change needed here, it works.
> > > > > > > 
> > > > > > >   b) If the server sends monitors_config to the guest, the
> > > > > > >   vdagent will
> > > > > > >   prefer to use monitors_configs with the output_ids set, if such
> > > > > > >   are
> > > > > > >   present. In that case, it ignores monitors_configs with the
> > > > > > >   output_id
> > > > > > >   unset. If no output_ids are present, it should behave as it
> > > > > > >   used to.
> > > > > > > 
> > > > > > > A couple of things to note:
> > > > > > > - While I did copy the VDAgentMonitorsConfig to a different
> > > > > > > message for
> > > > > > > backwards compatibility, I didn't do the same for
> > > > > > > SpiceMsgDisplayMonitorsConfig, it remains to be done.
> > > > > > > 
> > > > > > > - I didn't introduce any capabilities to handle the
> > > > > > > compatibility, also
> > > > > > > remains to be done. Hopefully it is also clear it will be quite a
> > > > > > > non-trivial job to do that :( Ain't gonna make the code prettier
> > > > > > > either.
> > > > > > > 
> > > > > > > For your convenience, you can also pull the branches below:
> > > > > > > https://gitlab.freedesktop.org/lukash/spice-protocol/tree/monitors-config-poc
> > > > > > > https://gitlab.freedesktop.org/lukash/spice-common/tree/monitors-config-poc
> > > > > > > https://gitlab.com/lhrazky/spice-streaming-agent/tree/monitors-config-poc
> > > > > > > https://gitlab.freedesktop.org/lukash/spice/tree/monitors-config-poc
> > > > > > > https://gitlab.freedesktop.org/lukash/spice-gtk/tree/monitors-config-poc
> > > > > > > https://gitlab.freedesktop.org/lukash/vd_agent/tree/monitors-config-poc
> > > > > > > 
> > > > > > > All in all, it's big, complex and invasive. Note I did review the
> > > > > > > emergency instructional video [1] and am therefore ready for any
> > > > > > > bombardment you'll be sending my way :D (Don't hesitate to
> > > > > > > contact me
> > > > > > > with questions either)
> > > > > > > 
> > > > > > > Last minute note: I've noticed some of the patches are missing
> > > > > > > Signed-off-by line, since they are not for merging, should not be
> > > > > > > an
> > > > > > > issue...
> > > > > > > 
> > > > > > > 
> > > > > > > Lukáš Hrázký (16):
> > > > > > > spice-protocol
> > > > > > >   Add the StreamInfo message
> > > > > > >   Create a version 2 of the VDAgentMonitorsConfig message
> > > > > > > spice-common
> > > > > > >   add output_id to SpiceMsgDisplayMonitorsConfig
> > > > > > > spice-streaming-agent
> > > > > > >   Send a StreamInfo message when starting to stream
> > > > > > > spice-server
> > > > > > >   Handle the StreamInfo message from the streaming agent
> > > > > > >   Use VDAgentMonitorsConfigV2 that contains the output_id field
> > > > > > > spice-gtk
> > > > > > >   Rework the handling of monitors_config
> > > > > > >   Remove the n_display_channels check when sending
> > > > > > >   monitors_config
> > > > > > >   Use an 'enabled' flag instead of status enum in monitors_config
> > > > > > >   Use VDAgentMonitorsConfigV2 as the message for monitors_config
> > > > > > >   Add output_id to the monitors_config
> > > > > > >   Use the new output_id as display ID for the mouse motion event
> > > > > > > vd_agent
> > > > > > >   vdagent: Log the diddly doo X11 error
> > > > > > >   Improve/add some logging messages
> > > > > > >   Use VDAgentMonitorsConfigV2 instead of VDAgentMonitorsConfig
> > > > > > >   vdagent: Use output_id from VDAgentMonitorsConfigV2
> > > > > > > 
> > > > > > > [1] https://www.youtube.com/watch?v=IKqXu-5jw60
> > > > > > > 

Frediano
_______________________________________________
Spice-devel mailing list
Spice-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/spice-devel




[Index of Archives]     [Linux Virtualization]     [Linux Virtualization]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]     [Monitors]