On Wed, Apr 03, 2013 at 11:53:59AM +0200, Paolo Bonzini wrote: > Il 03/04/2013 11:45, Michael S. Tsirkin ha scritto: > > On Wed, Apr 03, 2013 at 11:28:55AM +0200, Paolo Bonzini wrote: > >> Il 03/04/2013 10:59, Michael S. Tsirkin ha scritto: > >>> This device is used for kvm unit tests, > >>> currently it supports testing performance of ioeventfd. > >>> Using updated kvm unittest, here's an example output: > >>> mmio-no-eventfd:pci-mem 8796 > >>> mmio-wildcard-eventfd:pci-mem 3609 > >>> mmio-datamatch-eventfd:pci-mem 3685 > >>> portio-no-eventfd:pci-io 5287 > >>> portio-wildcard-eventfd:pci-io 1762 > >>> portio-datamatch-eventfd:pci-io 1777 > >> > >> There is too much magic in this device, that is shared between the > >> testcase and the device. > > > > Paolo, it looks like there's some misunderstanding. > > There's no magic. Each BAR has the structure specified > > in both test and qemu: > > > > struct pci_test_dev_hdr { > > uint8_t test; -- test number > > uint8_t width; - how much to write > > uint8_t pad0[2]; > > uint32_t offset; - where to write > > uint32_t data; - what to write > > uint32_t count; - incremented on write > > uint8_t name[]; -- test name for debugging > > }; > > > > What you propose below has a bit less features > > if you add these you will get pretty much same thing. > > I think it has the same features, except the guest is in charge of > enabling/disabling ioeventfd. > > >> I think this device should be structured > >> differently. For example, you could have a single EventNotifier and 3 BARs: > >> > >> 1) BAR0/BAR1 are as in your device. All they do is count writes and > >> report them into a register in BAR2. > >> > >> 2) BAR2 has a control register for BAR0 and one for BAR1, that > >> enables/disables ioeventfd (bit 0 = enable/disable, bit 1 = > >> wildcard/datamatch, bits 2-3 = log2(width)). It also has a zero-on-read > >> counter that is incremented for each write to BAR0 or BAR1, and clears > >> the EventNotifier. > > > > This is pretty close to what I have, only I put control > > at the beginning of each BAR, and I support arbitrary > > length writes guest side, and I allow testing both > > datamatch and non data match. > > Yes, all of this is possible with my design too, the differences are: > * "what to write" is a fixed value (could be 0, or all-ones, whatever). > * the offset is a fixed value too (could be 0, or could be written in > other bits of the control register) > * no test names for debugging, because the guest picks the tests Heh but host and guest must agree on how to trigger each test. This means each time I add a test I need to update guest and host side. > For example: > > mmio-no-eventfd:pci-mem 0000 to BAR2's first control register > mmio-wildcard-eventfd:pci-mem 1001 to BAR2's first control register > mmio-datamatch-eventfd:pci-mem 1011 to BAR2's first control register > portio-no-eventfd:pci-io 0000 to BAR2's second register > portio-wildcard-eventfd:pci-io 1001 to BAR2's second register > portio-datamatch-eventfd:pci-io 1011 to BAR2's second register Exactly. I prefer a flexible structure so I can change things host side. For example register multiple datamatches with same address and compare speed of access on first and last one. Similarly for address. But there's no less magic because you share constants. > > I also do not want guest to control things like ioeventfd > > assignment. These are host side things and while you could > > argue security does not matter for a test device, > > I think it will set a bad example. > > Test devices exist for the purpose of doing things that you won't do in > normal devices. :) You might not care about setting bad examples for ISA but I care about setting bad examples for PCI. In particular this one actually is a completely normal PCI device. And what it does it very much like what we are discussing for virtio pci new layout. > > So any new benchmark can be added pretty much on qemu > > side without changing test. > > You could say the same with the control registers (new benchmarks can be > added on the test side without changing QEMU, for example benchmarks > using different write sizes. And then we'll get back to a layout kind of like what we have here, except guest side. Again, I have in mind several more tests so this is structured in a way that will make adding them easier. > (BTW, and unrelated to this, please use default-configs/ and > hw/Makefile.objs instead of hw/i386/Makefile.objs to enable the device). > > Paolo Okay but why does pc-test device sit in hw/i386/Makefile.objs ? > > > > > >>> Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx> > >>> --- > >>> hw/i386/Makefile.objs | 2 +- > >>> hw/pci-testdev.c | 306 ++++++++++++++++++++++++++++++++++++++++++++++++++ > >>> hw/pci/pci.h | 1 + > >>> 3 files changed, 308 insertions(+), 1 deletion(-) > >>> create mode 100644 hw/pci-testdev.c > >>> > >>> diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs > >>> index a78c0b2..7497e7a 100644 > >>> --- a/hw/i386/Makefile.objs > >>> +++ b/hw/i386/Makefile.objs > >>> @@ -11,7 +11,7 @@ obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o > >>> obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o xen_pt_msi.o > >>> obj-y += kvm/ > >>> obj-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o > >>> -obj-y += pc-testdev.o > >>> +obj-y += pc-testdev.o pci-testdev.o > >>> > >>> obj-y := $(addprefix ../,$(obj-y)) > >>> > >>> diff --git a/hw/pci-testdev.c b/hw/pci-testdev.c > >>> new file mode 100644 > >>> index 0000000..9486624 > >>> --- /dev/null > >>> +++ b/hw/pci-testdev.c > >>> @@ -0,0 +1,306 @@ > >>> +#include "hw/hw.h" > >>> +#include "hw/pci/pci.h" > >>> +#include "qemu/event_notifier.h" > >>> +#include "qemu/osdep.h" > >>> + > >>> +typedef struct PCITestDevHdr { > >>> + uint8_t test; > >>> + uint8_t width; > >>> + uint8_t pad0[2]; > >>> + uint32_t offset; > >>> + uint8_t data; > >>> + uint8_t pad1[3]; > >>> + uint32_t count; > >>> + uint8_t name[]; > >>> +} PCITestDevHdr; > >>> + > >>> +typedef struct IOTest { > >>> + MemoryRegion *mr; > >>> + EventNotifier notifier; > >>> + bool hasnotifier; > >>> + unsigned size; > >>> + bool match_data; > >>> + PCITestDevHdr *hdr; > >>> + unsigned bufsize; > >>> +} IOTest; > >>> + > >>> +#define IOTEST_DATAMATCH 0xFA > >>> +#define IOTEST_NOMATCH 0xCE > >>> + > >>> +#define IOTEST_IOSIZE 128 > >>> +#define IOTEST_MEMSIZE 2048 > >>> + > >>> +static const char *iotest_test[] = { > >>> + "no-eventfd", > >>> + "wildcard-eventfd", > >>> + "datamatch-eventfd" > >>> +}; > >>> + > >>> +static const char *iotest_type[] = { > >>> + "mmio", > >>> + "portio" > >>> +}; > >>> + > >>> +#define IOTEST_TEST(i) (iotest_test[((i) % ARRAY_SIZE(iotest_test))]) > >>> +#define IOTEST_TYPE(i) (iotest_type[((i) / ARRAY_SIZE(iotest_test))]) > >>> +#define IOTEST_MAX_TEST (ARRAY_SIZE(iotest_test)) > >>> +#define IOTEST_MAX_TYPE (ARRAY_SIZE(iotest_type)) > >>> +#define IOTEST_MAX (IOTEST_MAX_TEST * IOTEST_MAX_TYPE) > >>> + > >>> +enum { > >>> + IOTEST_ACCESS_NAME, > >>> + IOTEST_ACCESS_DATA, > >>> + IOTEST_ACCESS_MAX, > >>> +}; > >>> + > >>> +#define IOTEST_ACCESS_TYPE uint8_t > >>> +#define IOTEST_ACCESS_WIDTH (sizeof(uint8_t)) > >>> + > >>> +typedef struct PCITestDevState { > >>> + PCIDevice dev; > >>> + MemoryRegion mmio; > >>> + MemoryRegion portio; > >>> + IOTest *tests; > >>> + int current; > >>> +} PCITestDevState; > >>> + > >>> +#define IOTEST_IS_MEM(i) (strcmp(IOTEST_TYPE(i), "portio")) > >>> +#define IOTEST_REGION(d, i) (IOTEST_IS_MEM(i) ? &(d)->mmio : &(d)->portio) > >>> +#define IOTEST_SIZE(i) (IOTEST_IS_MEM(i) ? IOTEST_MEMSIZE : IOTEST_IOSIZE) > >>> +#define IOTEST_PCI_BAR(i) (IOTEST_IS_MEM(i) ? PCI_BASE_ADDRESS_SPACE_MEMORY : \ > >>> + PCI_BASE_ADDRESS_SPACE_IO) > >>> + > >>> +static int pci_testdev_start(IOTest *test) > >>> +{ > >>> + test->hdr->count = 0; > >>> + if (!test->hasnotifier) { > >>> + return 0; > >>> + } > >>> + event_notifier_test_and_clear(&test->notifier); > >>> + memory_region_add_eventfd(test->mr, > >>> + le32_to_cpu(test->hdr->offset), > >>> + test->size, > >>> + test->match_data, > >>> + test->hdr->data, > >>> + &test->notifier); > >>> + return 0; > >>> +} > >>> + > >>> +static void pci_testdev_stop(IOTest *test) > >>> +{ > >>> + if (!test->hasnotifier) { > >>> + return; > >>> + } > >>> + memory_region_del_eventfd(test->mr, > >>> + le32_to_cpu(test->hdr->offset), > >>> + test->size, > >>> + test->match_data, > >>> + test->hdr->data, > >>> + &test->notifier); > >>> +} > >>> + > >>> +static void > >>> +pci_testdev_reset(PCITestDevState *d) > >>> +{ > >>> + if (d->current == -1) { > >>> + return; > >>> + } > >>> + pci_testdev_stop(&d->tests[d->current]); > >>> + d->current = -1; > >>> +} > >>> + > >>> +static void pci_testdev_inc(IOTest *test, unsigned inc) > >>> +{ > >>> + uint32_t c = le32_to_cpu(test->hdr->count); > >>> + test->hdr->count = cpu_to_le32(c + inc); > >>> +} > >>> + > >>> +static void > >>> +pci_testdev_write(void *opaque, hwaddr addr, uint64_t val, > >>> + unsigned size, int type) > >>> +{ > >>> + PCITestDevState *d = opaque; > >>> + IOTest *test; > >>> + int t, r; > >>> + > >>> + if (addr == offsetof(PCITestDevHdr, test)) { > >>> + pci_testdev_reset(d); > >>> + if (val >= IOTEST_MAX_TEST) { > >>> + return; > >>> + } > >>> + t = type * IOTEST_MAX_TEST + val; > >>> + r = pci_testdev_start(&d->tests[t]); > >>> + if (r < 0) { > >>> + return; > >>> + } > >>> + d->current = t; > >>> + return; > >>> + } > >>> + if (d->current < 0) { > >>> + return; > >>> + } > >>> + test = &d->tests[d->current]; > >>> + if (addr != le32_to_cpu(test->hdr->offset)) { > >>> + return; > >>> + } > >>> + if (test->match_data && test->size != size) { > >>> + return; > >>> + } > >>> + if (test->match_data && val != test->hdr->data) { > >>> + return; > >>> + } > >>> + pci_testdev_inc(test, 1); > >>> +} > >>> + > >>> +static uint64_t > >>> +pci_testdev_read(void *opaque, hwaddr addr, unsigned size) > >>> +{ > >>> + PCITestDevState *d = opaque; > >>> + const char *buf; > >>> + IOTest *test; > >>> + if (d->current < 0) { > >>> + return 0; > >>> + } > >>> + test = &d->tests[d->current]; > >>> + buf = (const char *)test->hdr; > >>> + if (addr + size >= test->bufsize) { > >>> + return 0; > >>> + } > >>> + if (test->hasnotifier) { > >>> + event_notifier_test_and_clear(&test->notifier); > >>> + } > >>> + return buf[addr]; > >>> +} > >>> + > >>> +static void > >>> +pci_testdev_mmio_write(void *opaque, hwaddr addr, uint64_t val, > >>> + unsigned size) > >>> +{ > >>> + pci_testdev_write(opaque, addr, val, size, 0); > >>> +} > >>> + > >>> +static void > >>> +pci_testdev_pio_write(void *opaque, hwaddr addr, uint64_t val, > >>> + unsigned size) > >>> +{ > >>> + pci_testdev_write(opaque, addr, val, size, 1); > >>> +} > >>> + > >>> +static const MemoryRegionOps pci_testdev_mmio_ops = { > >>> + .read = pci_testdev_read, > >>> + .write = pci_testdev_mmio_write, > >>> + .endianness = DEVICE_LITTLE_ENDIAN, > >>> + .impl = { > >>> + .min_access_size = 1, > >>> + .max_access_size = 1, > >>> + }, > >>> +}; > >>> + > >>> +static const MemoryRegionOps pci_testdev_pio_ops = { > >>> + .read = pci_testdev_read, > >>> + .write = pci_testdev_pio_write, > >>> + .endianness = DEVICE_LITTLE_ENDIAN, > >>> + .impl = { > >>> + .min_access_size = 1, > >>> + .max_access_size = 1, > >>> + }, > >>> +}; > >>> + > >>> +static int pci_testdev_init(PCIDevice *pci_dev) > >>> +{ > >>> + PCITestDevState *d = DO_UPCAST(PCITestDevState, dev, pci_dev); > >>> + uint8_t *pci_conf; > >>> + char *name; > >>> + int r, i; > >>> + > >>> + pci_conf = d->dev.config; > >>> + > >>> + pci_conf[PCI_INTERRUPT_PIN] = 0; /* no interrupt pin */ > >>> + > >>> + memory_region_init_io(&d->mmio, &pci_testdev_mmio_ops, d, > >>> + "pci-testdev-mmio", IOTEST_MEMSIZE * 2); > >>> + memory_region_init_io(&d->portio, &pci_testdev_pio_ops, d, > >>> + "pci-testdev-portio", IOTEST_IOSIZE * 2); > >>> + pci_register_bar(&d->dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &d->mmio); > >>> + pci_register_bar(&d->dev, 1, PCI_BASE_ADDRESS_SPACE_IO, &d->portio); > >>> + > >>> + d->current = -1; > >>> + d->tests = g_malloc0(IOTEST_MAX * sizeof *d->tests); > >>> + for (i = 0; i < IOTEST_MAX; ++i) { > >>> + IOTest *test = &d->tests[i]; > >>> + name = g_strdup_printf("%s-%s", IOTEST_TYPE(i), IOTEST_TEST(i)); > >>> + test->bufsize = sizeof(PCITestDevHdr) + strlen(name) + 1; > >>> + test->hdr = g_malloc0(test->bufsize); > >>> + memcpy(test->hdr->name, name, strlen(name) + 1); > >>> + g_free(name); > >>> + test->hdr->offset = cpu_to_le32(IOTEST_SIZE(i) + i * IOTEST_ACCESS_WIDTH); > >>> + test->size = IOTEST_ACCESS_WIDTH; > >>> + test->match_data = strcmp(IOTEST_TEST(i), "wildcard-eventfd"); > >>> + test->hdr->test = i; > >>> + test->hdr->data = test->match_data ? IOTEST_DATAMATCH : IOTEST_NOMATCH; > >>> + test->hdr->width = IOTEST_ACCESS_WIDTH; > >>> + test->mr = IOTEST_REGION(d, i); > >>> + if (!strcmp(IOTEST_TEST(i), "no-eventfd")) { > >>> + test->hasnotifier = false; > >>> + continue; > >>> + } > >>> + r = event_notifier_init(&test->notifier, 0); > >>> + assert(r >= 0); > >>> + test->hasnotifier = true; > >>> + } > >>> + > >>> + return 0; > >>> +} > >>> + > >>> +static void > >>> +pci_testdev_uninit(PCIDevice *dev) > >>> +{ > >>> + PCITestDevState *d = DO_UPCAST(PCITestDevState, dev, dev); > >>> + int i; > >>> + > >>> + pci_testdev_reset(d); > >>> + for (i = 0; i < IOTEST_MAX; ++i) { > >>> + if (d->tests[i].hasnotifier) { > >>> + event_notifier_cleanup(&d->tests[i].notifier); > >>> + } > >>> + g_free(d->tests[i].hdr); > >>> + } > >>> + g_free(d->tests); > >>> + memory_region_destroy(&d->mmio); > >>> + memory_region_destroy(&d->portio); > >>> +} > >>> + > >>> +static void qdev_pci_testdev_reset(DeviceState *dev) > >>> +{ > >>> + PCITestDevState *d = DO_UPCAST(PCITestDevState, dev.qdev, dev); > >>> + pci_testdev_reset(d); > >>> +} > >>> + > >>> +static void pci_testdev_class_init(ObjectClass *klass, void *data) > >>> +{ > >>> + DeviceClass *dc = DEVICE_CLASS(klass); > >>> + PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); > >>> + > >>> + k->init = pci_testdev_init; > >>> + k->exit = pci_testdev_uninit; > >>> + k->vendor_id = PCI_VENDOR_ID_REDHAT; > >>> + k->device_id = PCI_DEVICE_ID_REDHAT_TEST; > >>> + k->revision = 0x00; > >>> + k->class_id = PCI_CLASS_OTHERS; > >>> + dc->desc = "PCI Test Device"; > >>> + dc->reset = qdev_pci_testdev_reset; > >>> +} > >>> + > >>> +static const TypeInfo pci_testdev_info = { > >>> + .name = "pci-testdev", > >>> + .parent = TYPE_PCI_DEVICE, > >>> + .instance_size = sizeof(PCITestDevState), > >>> + .class_init = pci_testdev_class_init, > >>> +}; > >>> + > >>> +static void pci_testdev_register_types(void) > >>> +{ > >>> + type_register_static(&pci_testdev_info); > >>> +} > >>> + > >>> +type_init(pci_testdev_register_types) > >>> diff --git a/hw/pci/pci.h b/hw/pci/pci.h > >>> index 774369c..d81198c 100644 > >>> --- a/hw/pci/pci.h > >>> +++ b/hw/pci/pci.h > >>> @@ -84,6 +84,7 @@ > >>> #define PCI_DEVICE_ID_REDHAT_SERIAL 0x0002 > >>> #define PCI_DEVICE_ID_REDHAT_SERIAL2 0x0003 > >>> #define PCI_DEVICE_ID_REDHAT_SERIAL4 0x0004 > >>> +#define PCI_DEVICE_ID_REDHAT_TEST 0x0005 > >>> #define PCI_DEVICE_ID_REDHAT_QXL 0x0100 > >>> > >>> #define FMT_PCIBUS PRIx64 > >>> -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html