On Thu, 13 Oct 2016 00:32:48 +0530
Kirti Wankhede <kwankhede@xxxxxxxxxx> wrote:
> On 10/12/2016 9:29 PM, Alex Williamson wrote:
> > On Wed, 12 Oct 2016 20:43:48 +0530
> > Kirti Wankhede <kwankhede@xxxxxxxxxx> wrote:
> >
> >> On 10/12/2016 7:22 AM, Tian, Kevin wrote:
> >>>> From: Kirti Wankhede [mailto:kwankhede@xxxxxxxxxx]
> >>>> Sent: Wednesday, October 12, 2016 4:45 AM
> >>>>>> +* mdev_supported_types:
> >>>>>> + List of current supported mediated device types and its details are added
> >>>>>> +in this directory in following format:
> >>>>>> +
> >>>>>> +|- <parent phy device>
> >>>>>> +|--- Vendor-specific-attributes [optional]
> >>>>>> +|--- mdev_supported_types
> >>>>>> +| |--- <type id>
> >>>>>> +| | |--- create
> >>>>>> +| | |--- name
> >>>>>> +| | |--- available_instances
> >>>>>> +| | |--- description /class
> >>>>>> +| | |--- [devices]
> >>>>>> +| |--- <type id>
> >>>>>> +| | |--- create
> >>>>>> +| | |--- name
> >>>>>> +| | |--- available_instances
> >>>>>> +| | |--- description /class
> >>>>>> +| | |--- [devices]
> >>>>>> +| |--- <type id>
> >>>>>> +| |--- create
> >>>>>> +| |--- name
> >>>>>> +| |--- available_instances
> >>>>>> +| |--- description /class
> >>>>>> +| |--- [devices]
> >>>>>> +
> >>>>>> +[TBD : description or class is yet to be decided. This will change.]
> >>>>>
> >>>>> I thought that in previous discussions we had agreed to drop
> >>>>> the <type id> concept and use the name as the unique identifier.
> >>>>> When reporting these types in libvirt we won't want to report
> >>>>> the type id values - we'll want the name strings to be unique.
> >>>>>
> >>>>
> >>>> The 'name' might not be unique but type_id will be. For example that Neo
> >>>> pointed out in earlier discussion, virtual devices can come from two
> >>>> different physical devices, end user would be presented with what they
> >>>> had selected but there will be internal implementation differences. In
> >>>> that case 'type_id' will be unique.
> >>>>
> >>>
> >>> Hi, Kirti, my understanding is that Neo agreed to use an unique type
> >>> string (if you still called it <type id>), and then no need of additional
> >>> 'name' field which can be put inside 'description' field. See below quote:
> >>>
> >>
> >> We had internal discussions about this within NVIDIA and found that
> >> 'name' might not be unique where as 'type_id' would be unique. I'm
> >> refering to Neo's mail after that, where Neo do pointed that out.
> >>
> >> https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg07714.html
> >
> > Everyone not privy to those internal discussions, including me, seems to
> > think we dropped type_id and that if a vendor does not have a stable
> > name, they can compose some sort of stable type description based on the
> > name+id, or even vendor+id, ex. NVIDIA-11. So please share why we
> > haven't managed to kill off type_id yet. No matter what internal
> > representation each vendor driver has of "type_id" it seems possible
> > for it to come up with stable string to define a given configuration.
>
>
> The 'type_id' is unique and the 'name' are not, the name is just a
> virtual device name/ human readable name. Because at this moment Intel
> can't define a proper GPU class, we have to add a 'description' field
> there as well to represent the features of this virtual device, once we
> have all agreed with the GPU class and its mandatory attributes, the
> 'description' field can be removed. Here is an example,
> type_id/type_name = NVIDIA_11,
> name=M60-M0Q,
> description=2560x1600, 2 displays, 512MB"
>
> Neo's previous comment only applies to the situation where we will have
> the GPU class or optional attributes defined and recognized by libvirt,
> since that is not going to happen any time soon, we will have to have
> the new 'description' field, and we don't want to have it mixed up with
> 'name' field.
>
> We can definitely have something like name+id as Alex recommended to
> remove the 'name' field, but it will just require libvirt to have more
> logic to parse that string.
Let's use the mtty example driver provided in patch 5 so we can all
more clearly see how the interfaces work. I'll start from the
beginning of my experience and work my way to the type/name thing.
(please add a modules_install target to the Makefile)
# modprobe mtty
Now what? It seems like I need to have prior knowledge that this
drivers supports mdev devices and I need to go hunt for them. We need
to create a class (ex. /sys/class/mdev/) where a user can find all the
devices that participate in this mediated device infrastructure. That
would point me to /sys/devices/mtty.
# tree /sys/devices/mtty
/sys/devices/mtty
|-- mdev_supported_types
| `-- mtty1
| |-- available_instances (1)
| |-- create
| |-- devices
| `-- name ("Dual-port-serial")
|-- mtty_dev
| `-- sample_mtty_dev ("This is phy device")
|-- power
| |-- async
| |-- autosuspend_delay_ms
| |-- control
| |-- runtime_active_kids
| |-- runtime_active_time
| |-- runtime_enabled
| |-- runtime_status
| |-- runtime_suspended_time
| `-- runtime_usage
`-- uevent
Ok, but that was boring, we really need to have at least 2 supported
types to validate the interface, so without changing the actual device
backing, I pretended to have a single port vs dual port:
/sys/devices/mtty
|-- mdev_supported_types
| |-- mtty1
| | |-- available_instances (24)
| | |-- create
| | |-- devices
| | `-- name (Single-port-serial)
| `-- mtty2
| |-- available_instances (12)
| |-- create
| |-- devices
| `-- name (Dual-port-serial)
[snip]
I arbitrarily decides I have 24 ports and each single port uses 1 port
and each dual port uses 2 ports.
Before I start creating devices, what are we going to key the libvirt
XML on? Can we do anything to prevent vendors from colliding or do we
have any way to suggest meaningful and unique type_ids? Presumably if
we had a PCI device hosting this, we would be rooted at that parent
device, ex. /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0. Maybe
the type_id should automatically be prefixed by the vendor module name,
ex. mtty-1, i915-foo, nvidia-bar. There's something missing for
deterministically creating a "XYZ" device and knowing exactly what that
means and finding a parent device that supports it.
Let's get to mdev creating...
# uuidgen > mdev_supported_types/mtty2/create
# tree /sys/devices/mtty
/sys/devices/mtty
|-- e68189be-700e-41f7-93a3-b5351e79c470
| |-- driver -> ../../../bus/mdev/drivers/vfio_mdev
| |-- iommu_group -> ../../../kernel/iommu_groups/63
| |-- mtty2 -> ../mdev_supported_types/mtty2
| |-- power
| | |-- async
| | |-- autosuspend_delay_ms
| | |-- control
| | |-- runtime_active_kids
| | |-- runtime_active_time
| | |-- runtime_enabled
| | |-- runtime_status
| | |-- runtime_suspended_time
| | `-- runtime_usage
| |-- remove
| |-- subsystem -> ../../../bus/mdev
| |-- uevent
| `-- vendor
| `-- sample_mdev_dev ("This is MDEV e68189be-700e-41f7-93a3-b5351e79c470")
|-- mdev_supported_types
| |-- mtty1
| | |-- available_instances (22)
| | |-- create
| | |-- devices
| | `-- name
| `-- mtty2
| |-- available_instances (11)
| |-- create
| |-- devices
| | `-- e68189be-700e-41f7-93a3-b5351e79c470 -> ../../../e68189be-700e-41f7-93a3-b5351e79c470
| `-- name
The mdev device was created directly under the parent, which seems like
it's going to get messy to me (ie. imagine dropping a bunch of uuids
into a PCI parent device's sysfs directory, how does a user know what
they are?).
Under the device we have "mtty2", shouldn't that be
"mdev_supported_type", which then links to mtty2? Otherwise a user
needs to decode from the link what this attribute is.
Also here's an example of those vendor sysfs entries per device. So
long as the vendor never expects a tool like libvirt to manipulate
attributes there, I can see how that could be pretty powerful.
Moving down to the mdev_supported_types, I've updated mtty so that it
actually adjusts available instance, and we can now see a link under
the devices for mtty2.
Also worth noting is that a link for the device appears
in /sys/bus/mdev/devices.
BTW, specifying this device for QEMU vfio-pci is where the sysfsdev
option comes into play:
-device
vfio-pci,sysfsdev=/sys/devices/mtty/e68189be-700e-41f7-93a3-b5351e79c470
Which raises another question, we can tell through the vfio interfaces
that this is exposes as a PCI device, by creating a container
(open(/dev/vfio/vfio)), setting an iommu (ioctl(VFIO_SET_IOMMU)),
adding the group to the container (ioctl(VFIO_GROUP_SET_CONTAINER)),
getting the device (ioctl(VFIO_GROUP_GET_DEVICE_FD)), and finally
getting the device info (ioctl(VFIO_DEVICE_GET_INFO)) and checking the
flag bit that says the API is PCI. That's a long path to go and has
stumbling blocks like the type of iommu that's available for the given
platform. How do we make that manageable? I don't think we want to
create some artificial relationship that the type of the parent
necessarily match the type of the child mdev, we've already broken that
with a simple mdev tty driver.
One more:
# uuidgen > mdev_supported_types/mtty1/create
# tree /sys/devices/mtty
/sys/devices/mtty
|-- a7ae17d1-2de4-44c2-ae58-20ae0a0befe8
| |-- driver -> ../../../bus/mdev/drivers/vfio_mdev
| |-- iommu_group -> ../../../kernel/iommu_groups/64
| |-- mtty1 -> ../mdev_supported_types/mtty1
| |-- power
[snip]
| |-- remove
| |-- subsystem -> ../../../bus/mdev
| |-- uevent
| `-- vendor
| `-- sample_mdev_dev ("This is MDEV a7ae17d1-2de4-44c2-ae58-20ae0a0befe8")
|-- e68189be-700e-41f7-93a3-b5351e79c470
| |-- driver -> ../../../bus/mdev/drivers/vfio_mdev
| |-- iommu_group -> ../../../kernel/iommu_groups/63
| |-- mtty2 -> ../mdev_supported_types/mtty2
| |-- power
[snip]
| |-- remove
| |-- subsystem -> ../../../bus/mdev
| |-- uevent
| `-- vendor
| `-- sample_mdev_dev ("This is MDEV e68189be-700e-41f7-93a3-b5351e79c470")
|-- mdev_supported_types
| |-- mtty1
| | |-- available_instances (21)
| | |-- create
| | |-- devices
| | | `-- a7ae17d1-2de4-44c2-ae58-20ae0a0befe8 -> ../../../a7ae17d1-2de4-44c2-ae58-20ae0a0befe8
| | `-- name
| `-- mtty2
| |-- available_instances (10)
| |-- create
| |-- devices
| | `-- e68189be-700e-41f7-93a3-b5351e79c470 -> ../../../e68189be-700e-41f7-93a3-b5351e79c470
| `-- name
Hopefully as expected with the caveats for the first example.
# echo 1 > a7ae17d1-2de4-44c2-ae58-20ae0a0befe8/remove
# echo 1 > e68189be-700e-41f7-93a3-b5351e79c470/remove
These do what they're supposed to, the devices are gone.
Ok, I've identified some issues, let's figure out how to resolve them.
Thanks,
Alex
(hack multi-port mtty patch attached)
diff --git a/Documentation/vfio-mdev/Makefile b/Documentation/vfio-mdev/Makefile
index ff6f8a3..721daf0 100644
--- a/Documentation/vfio-mdev/Makefile
+++ b/Documentation/vfio-mdev/Makefile
@@ -8,6 +8,9 @@ obj-m:=mtty.o
default:
$(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules
+modules_install:
+ $(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules_install
+
clean:
@rm -rf .*.cmd *.mod.c *.o *.ko .tmp*
@rm -rf Module.* Modules.* modules.* .tmp_versions
diff --git a/Documentation/vfio-mdev/mtty.c b/Documentation/vfio-mdev/mtty.c
index 497c90e..05ab40d 100644
--- a/Documentation/vfio-mdev/mtty.c
+++ b/Documentation/vfio-mdev/mtty.c
@@ -141,8 +141,11 @@ struct mdev_state {
struct serial_port s[2];
struct mutex rxtx_lock;
struct vfio_device_info dev_info;
+ int nr_ports;
};
+#define MAX_MTTYS 24
+
struct mutex mdev_list_lock;
struct list_head mdev_devices_list;
@@ -723,6 +726,11 @@ int mtty_create(struct kobject *kobj, struct mdev_device *mdev)
if (mdev_state == NULL)
return -ENOMEM;
+ if (!strcmp(kobj->name, "mtty1"))
+ mdev_state->nr_ports = 1;
+ else if (!strcmp(kobj->name, "mtty2"))
+ mdev_state->nr_ports = 2;
+
mdev_state->irq_index = -1;
mdev_state->s[0].max_fifo_size = MAX_FIFO_SIZE;
mdev_state->s[1].max_fifo_size = MAX_FIFO_SIZE;
@@ -1224,7 +1232,12 @@ const struct attribute_group *mdev_dev_groups[] = {
static ssize_t
name_show(struct kobject *kobj, struct device *dev, char *buf)
{
- return sprintf(buf, "Dual-port-serial\n");
+ if (!strcmp(kobj->name, "mtty1"))
+ return sprintf(buf, "Single-port-serial\n");
+ if (!strcmp(kobj->name, "mtty2"))
+ return sprintf(buf, "Dual-port-serial\n");
+
+ return -EINVAL;
}
MDEV_TYPE_ATTR_RO(name);
@@ -1232,7 +1245,20 @@ MDEV_TYPE_ATTR_RO(name);
static ssize_t
available_instances_show(struct kobject *kobj, struct device *dev, char *buf)
{
- return sprintf(buf, "1\n");
+ struct mdev_state *mds;
+ int ports, used = 0;
+
+ if (!strcmp(kobj->name, "mtty1"))
+ ports = 1;
+ else if (!strcmp(kobj->name, "mtty2"))
+ ports = 2;
+ else
+ return -EINVAL;
+
+ list_for_each_entry(mds, &mdev_devices_list, next) {
+ used += mds->nr_ports;
+ }
+ return sprintf(buf, "%d\n", (MAX_MTTYS - used)/ports);
}
MDEV_TYPE_ATTR_RO(available_instances);
@@ -1243,13 +1269,19 @@ static struct attribute *mdev_types_attrs[] = {
NULL,
};
-static struct attribute_group mdev_type_group = {
+static struct attribute_group mdev_type_group1 = {
.name = "mtty1",
.attrs = mdev_types_attrs,
};
+static struct attribute_group mdev_type_group2 = {
+ .name = "mtty2",
+ .attrs = mdev_types_attrs,
+};
+
struct attribute_group *mdev_type_groups[] = {
- &mdev_type_group,
+ &mdev_type_group1,
+ &mdev_type_group2,
NULL,
};