On 02/20/2013 03:17:27 PM, Marcelo Tosatti wrote:
On Mon, Feb 18, 2013 at 05:01:40PM -0600, Scott Wood wrote:
> On 02/18/2013 06:21:59 AM, Gleb Natapov wrote:
> >Copying Christoffer since ARM has in kernel irq chip too.
> >
> >On Wed, Feb 13, 2013 at 11:49:15PM -0600, Scott Wood wrote:
> >> Currently, devices that are emulated inside KVM are configured
in a
> >> hardcoded manner based on an assumption that any given
architecture
> >> only has one way to do it. If there's any need to access device
> >state,
> >> it is done through inflexible one-purpose-only IOCTLs (e.g.
> >> KVM_GET/SET_LAPIC). Defining new IOCTLs for every little thing
is
> >> cumbersome and depletes a limited numberspace.
Creation of ioctl has advantages. It makes explicit what the
data contains and how it should be interpreted.
This means that for example, policy control can be performed at ioctl
level (can group, by ioctl number, which ioctl calls are allowed after
VM creation, for example).
It also makes it explicit that its a userspace interface which
applications depend.
With a single 'set device attribute' ioctl its more difficult
to do that.
I don't see why creating ioctls rather than device attributes (or
whatever you want to call them) differs this way. Device attributes
are inherently userspace interfaces, just as ioctls are. What the data
contains and how it is interpreted are documented, albeit in a more
lightweight format than KVM usually uses for ioctls.
An ioctl is no guarantee of good documentation. KVM is far better than
most of the kernel in that regard, but even there some ioctls are
missing (e.g. KVM_IRQ_LINE_STATUS as mentioned elsewhere in this
thread), and some others are inadequately explained (e.g. IRQ routing).
By "policy control", do you just mean that some ioctls act on a
/dev/kvm fd, others on a vm fd, and others on a vcpu fd? This is
pretty similar to having a device fd, except for the mechanism used.
The main things that I dislike about adding new things at the ioctl
level are:
- limited numberspace, and more prone to merge conflicts than a
device-specific numberspace
- having to add a new pathway to get into the driver for each ioctl,
rather than having all operations on a particular device go to the
right place, and the device implementation interprets further (this
assumes that we're talking about vm ioctls, and not returning a
new fd for the device)
- awkward way of negotiating capabilities with userspace (have to
declare the capability id separately, and add code somewhere outside the
device implementation to respond to capability requests)
- api.txt formatting that imposes yet another document-internal
numberspace to conflict on :-)
Abstracting the details also makes it cumbersome to read
strace output :-)
You'd have to update strace one way or another if you want
pretty-printed output. Having a more restricted API than arbitrary
ioctls could actually improve the situation -- this could be a good
reason for including the attribute length as an explicit parameter
rather than being implicit in the attribute id. Then you'd just teach
strace about the device control API once, and not for each new
attribute or device that gets added.
> >> This API provides a mechanism to instantiate a device of a
certain
> >> type, returning an ID that can be used to set/get attributes of
the
> >> device. Attributes may include configuration parameters (e.g.
> >> register base address), device state, operational commands,
etc. It
> >> is similar to the ONE_REG API, except that it acts on devices
rather
> >> than vcpus.
> >You are not only provide different way to create in kernel irq
> >chip you
> >also use an alternate way to trigger interrupt lines. Before going
> >into
> >interface specifics lets think about whether it is really worth it?
>
> Which "it" do you mean here?
>
> The ability to set/get attributes is needed. Sorry, but "get or set
> one blob of data, up to 512 bytes, for the entire irqchip" is just
> not good enough -- assuming you don't want us to start sticking
> pointers and commands in *that* data. :-)
Why not? Is it necessary to constantly read/write attributes?
It's not about how often we do it, but how much state we have,
especially if we ever want to implement migration.
> If you mean the way to inject interrupts, it's simpler this way.
Are you injecting interrupts via this new SET_DEVICE_ATTRIBUTE ioctl?
Yes, but even if that gets shot down (the best objection IMHO is the
one nobody is raising -- how to hook into irqfd), we still need the
rest of it. I think we'd even want to keep attributes for IRQ line
status so that we have a way to read it, even if just for debugging.
-Scott
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html