Adding Markus since we're talking about new CLI argument and capability
reporting standards.

On Fri, Sep 14, 2018 at 05:52:30PM +0400, Marc-André Lureau wrote:
> As discussed during "[PATCH v4 00/29] vhost-user for input & GPU"
> review, let's define a common set of backend conventions to help with
> management layer implementation, and interoperability.
>
> v2:
> - drop --pidfile
> - add some notes about daemonizing & stdin/out/err
>
> Cc: libvir-list@xxxxxxxxxx
> Cc: Gerd Hoffmann <kraxel@xxxxxxxxxx>
> Cc: Daniel P. Berrangé <berrange@xxxxxxxxxx>
> Cc: Changpeng Liu <changpeng.liu@xxxxxxxxx>
> Cc: Dr. David Alan Gilbert <dgilbert@xxxxxxxxxx>
> Cc: Felipe Franciosi <felipe@xxxxxxxxxxx>
> Cc: Gonglei <arei.gonglei@xxxxxxxxxx>
> Cc: Maxime Coquelin <maxime.coquelin@xxxxxxxxxx>
> Cc: Michael S. Tsirkin <mst@xxxxxxxxxx>
> Cc: Victor Kaplansky <victork@xxxxxxxxxx>
> Signed-off-by: Marc-André Lureau <marcandre.lureau@xxxxxxxxxx>
> ---
> docs/interop/vhost-user.txt | 109 +++++++++++++++++++++++++++++++++++-
> 1 file changed, 107 insertions(+), 2 deletions(-)
>
> diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
> index ba5e37d714..339b335e9c 100644
> --- a/docs/interop/vhost-user.txt
> +++ b/docs/interop/vhost-user.txt
> @@ -17,8 +17,13 @@ The protocol defines 2 sides of the communication, master and slave. Master is
> the application that shares its virtqueues, in our case QEMU. Slave is the
> consumer of the virtqueues.
>
> -In the current implementation QEMU is the Master, and the Slave is intended to
> -be a software Ethernet switch running in user space, such as Snabbswitch.
> +In the current implementation QEMU is the Master, and the Slave is the
> +external process consuming the virtio queues, for example a software
> +Ethernet switch running in user space, such as Snabbswitch, or a block
> +device backend processing read & write to a virtual disk. In order to
> +facilitate interoperability between various backend implementations,
> +it is recommended to follow the "Backend program conventions"
> +described in this document.
>
> Master and slave can be either a client (i.e. connecting) or server (listening)
> in the socket communication.
> @@ -859,3 +864,103 @@ resilient for selective requests.
> For the message types that already solicit a reply from the client, the
> presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or need_reply bit being set brings
> no behavioural change. (See the 'Communication' section for details.)
> +
> +Backend program conventions
> +---------------------------
> +
> +vhost-user backends provide various services and they may need to be
> +configured manually depending on the use case. However, it is a good
> +idea to follow the conventions listed here when possible. Users, QEMU
> +or libvirt, can then rely on some common behaviour to avoid
> +heterogenous configuration and management of the backend program and
> +facilitate interoperability.
> +
> +In order to be discoverable, default vhost-user backends should be
> +located under "/usr/libexec", and be named "vhost-user-$device" where
> +"$device" is the device name in lower-case following the name listed
> +in the Linux virtio_ids.h header (ex: the VIRTIO_ID_RPROC_SERIAL
> +backend would be named "vhost-user-rproc-serial").
> +
> +Mechanisms to list, and to select among alternatives implementations
> +or modify the default backend are not described at this point (a
> +distribution may use update-alternatives, for example, to list and to
> +pick a different default backend).

I don't think that update-alternatives is a good thing as it presumes
that each host only needs a single preferred impl at a time. I think
we need to be able to discover all impls for a given device type.

This feels like the same problem we tackled recently with enumerating
and choosing between multiple firmware impls.
In $git/docs/interop/firmware.json we defined a way to drop config files
into a standard directory, providing info about the firmware in a
well-defined, QAPI-based data format.

Rather than requiring a special file naming convention, I think we just
need to register config files in a particular directory, letting the
mgmt app enumerate them, e.g.

  /etc/qemu/vhost-user/50-rproc-serial.json     (a default impl from QEMU)
  /etc/qemu/vhost-user/10-my-rproc-serial.json  (my replacement impl)

A file could be something pretty simple like

  {
    "name": "my-rproc-serial",
    "description": "My rproc serial impl doing foo, bar, wizz",
    "device": "rproc-serial",
    "binary": "/usr/libexec/my-awesome-rproc-serial"
  }

Mgmt apps can simply load all files in that directory to learn about
the possible impls. The file load order gives a prioritization if
multiple matches exist, or a specific impl can be requested by name
"my-rproc-serial".

This shouldn't provide full capabilities reporting though, just enough
to identify viable binaries. Capabilities should still be reported by
the binary itself so they can be dynamically tailored based on other
environmental factors.

> +
> +The backend program must not daemonize itself, but it may be
> +daemonized by the management layer. It may also have a restricted
> +access to the system.
> +
> +File descriptors 0, 1 and 2 will exist, and have regular
> +stdin/stdout/stderr usage (they may be redirected to /dev/null by the
> +management layer, or to a log handler).
> +
> +The backend program must end (as quickly and cleanly as possible) when
> +the SIGTERM signal is received. Eventually, it may be SIGKILL by the
> +management layer after a few seconds.
> +
> +The following command line options have an expected behaviour. They
> +are mandatory, unless explicitly said differently:
> +
> +* --socket-path=PATH
> +
> +This option specify the location of the vhost-user Unix domain socket.
> +It is incompatible with --fd.
> +
> +* --fd=FDNUM
> +
> +When this argument is given, the backend program is started with the
> +vhost-user socket as file descriptor FDNUM. It is incompatible with
> +--socket-path.
> +
> +* --print-capabilities
> +
> +Output to stdout a line-seperated list of backend capabilities, and
> +then exit successfully. Other options and arguments should be ignored,
> +and the backend program should not perform its normal function.

This is going to repeat the mistakes we've had with every other binary
in QEMU. A "simple" list of flags or args sounds appealing, but we've
always been burnt by it in the medium-long term, which is why we created
QAPI.

If we're going to have any capabilities reporting, we should model it
in a QAPI schema, so any '--print-capabilities' arg should print a JSON
doc following the documented schema.

While talking about QAPI, I think this is an opportunity to also avoid
the problems of CLI arg values becoming more complex than just scalars.
For example, --socket-path=PATH may inevitably grow more options, e.g.
to say whether to use it in listen or connect mode, or to indicate a
reconnect timeout, etc.

I know Markus wants to replace QemuOpts with something that is again
driven by QAPI, so that "-arg $VALUE" can handle $VALUE being complex
non-scalar data following a QAPI schema with well-defined semantics
for parsing. Since we are defining a new standard, I think we should
do something better than scalar values right from the start.

> +
> +At the time of writing, there are no common capabilities. Some
> +device-specific capabilities are listed in the respective sections. By
> +convention, device-specific capabilities are prefixed by their device
> +name.
> +
> +vhost-user-input program conventions
> +------------------------------------
> +
> +Capabilities:
> +
> +input-evdev-path
> +
> + The --evdev-path command line option is supported.
> +
> +input-no-grab
> +
> + The --no-grab command line option is supported.
> +
> +* --evdev-path=PATH (optional)
> +
> +Specify the linux input device.
> +
> +* --no-grab (optional)
> +
> +Do no request exclusive access to the input device.
> +
> +vhost-user-gpu program conventions
> +----------------------------------
> +
> +Capabilities:
> +
> +gpu-render-node
> +
> + The --render-node command line option is supported.
> +
> +gpu-virgl
> +
> + The --virgl command line option is supported.
> +
> +* --render-node=PATH (optional)
> +
> +Specify the GPU DRM render node.
> +
> +* --virgl (optional)
> +
> +Enable virgl rendering support.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
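
For illustration only, here is a minimal Python sketch of how a mgmt app
might enumerate the per-backend config files proposed in the reply above.
It assumes the files are loaded in lexical filename order and that the
first match for a device wins; neither detail is settled in this thread.
The function names load_backends and pick_backend are hypothetical, and
nothing here is an existing QEMU or libvirt API.

```python
import json
from pathlib import Path


def load_backends(config_dir):
    """Load every *.json file in config_dir, in lexical filename order."""
    backends = []
    for path in sorted(Path(config_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            backends.append(json.load(f))
    return backends


def pick_backend(backends, device, name=None):
    """Return the first backend for `device`, or the one called `name`.

    Assumption: earlier entries in the load order have higher priority.
    Returns None when nothing matches.
    """
    for backend in backends:
        if backend.get("device") != device:
            continue
        if name is None or backend.get("name") == name:
            return backend
    return None
```

With the two example files from the reply, and under the ordering
assumption stated above, pick_backend(load_backends(dir), "rproc-serial")
would return the "10-" entry, while passing a name selects one specific
impl regardless of its position.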