On Thu, 11 Feb 2021 16:10:16 +0000 Alexandru Elisei <alexandru.elisei@xxxxxxx> wrote: Hi, > On 12/10/20 2:28 PM, Andre Przywara wrote: > > In their core functionality MMIO and I/O port traps are not really > > different, yet we still have two totally separate code paths for > > handling them. Devices need to decide on one conduit or need to provide > > different handler functions for each of them. > > > > Extend the existing MMIO emulation to also cover ioport handlers. > > This just adds another RB tree root for holding the I/O port handlers, > > but otherwise uses the same tree population and lookup code. > > Maybe I'm missing something, but why two trees? Is it valid to have an overlap > between IO port and MMIO emulation? Or was it done to make the removal of ioport > emulation easier? So I thought about it as well, but figured it's easier this way: - x86 allows overlap, PIO is a totally separate address space from memory/MMIO. Early x86 CPUs had pins to indicate a PIO bus cycle, but using the same address and data pins otherwise. In practise there might be no overlap when it comes to *MMIO* traps vs PIO on x86 (there is DRAM only at the lowest 64K of the IBM PC memory map), but not sure we should rely on this. - For non-x86 this would indeed be non-overlapping, but this would need to be translated at init time then? And then we can't move those anymore, I guess? So I found it cleaner to keep this separate, and do the translation at trap time. - As a consequence we would need to have a bit indicating the address space. I haven't actually tried this, but my understanding is that this would spoil the whole rb_tree functions, since they rely on a linear addressing scheme, and adding another bit there would be at least cumbersome? At the end I decided to go for separate trees, as also this was less change. I agree that it would be nice to have one tree, from a design point of view, but I am afraid that would require more changes. If need be, I think we can always unify them later on, on top of this series? > If it's not valid to have that overlap, then I think having one tree for both > would better. Struct mmio_mapping would have to be augmented with a flags field > that holds the same flags given to kvm__register_iotrap to differentiate between > the two slightly different emulations. Saving the IOTRAP_COALESCE flag would also > make it trivial to call KVM_UNREGISTER_COALESCED_MMIO in kvm__deregister_iotrap, > which we currently don't do. > > > "ioport" or "mmio" just become a flag in the registration function. > > Provide wrappers to not break existing users, and allow an easy > > transition for the existing ioport handlers. > > > > This also means that ioport handlers now can use the same emulation > > callback prototype as MMIO handlers, which means we have to migrate them > > over. To allow a smooth transition, we hook up the new I/O emulate > > function to the end of the existing ioport emulation code. > > I'm sorry, but I don't understand that last sentence. Do you mean that the ioport > emulation code has been modified to use kvm__emulate_pio() as a fallback for when > the port is not found in the ioport_tree? I meant that for the transition period we have all of traditional MMIO, traditional PIO, *and* just transformed PIO. That means there are still PIO devices registered through ioport.c's ioport__register(), *and* PIO devices registered through mmio.c's kvm__register_pio(). Which means they end up in two separate PIO trees. And only the traditional kvm__emulate_io() from ioport.c is called upon a trap, so it needs to check both trees, which it does by calling into kvm__emulate_pio(), shall a search in the local tree fail. Or did you mean something else? > > > > Signed-off-by: Andre Przywara <andre.przywara@xxxxxxx> > > --- > > include/kvm/kvm.h | 42 +++++++++++++++++++++++++++++---- > > ioport.c | 4 ++-- > > mmio.c | 59 +++++++++++++++++++++++++++++++++++++++-------- > > 3 files changed, 89 insertions(+), 16 deletions(-) > > > > diff --git a/include/kvm/kvm.h b/include/kvm/kvm.h > > index ee99c28e..14f9d58b 100644 > > --- a/include/kvm/kvm.h > > +++ b/include/kvm/kvm.h > > @@ -27,10 +27,16 @@ > > #define PAGE_SIZE (sysconf(_SC_PAGE_SIZE)) > > #endif > > > > +#define IOTRAP_BUS_MASK 0xf > > It's not immediately obvious what this mask does. It turns out it's used to mask > the enum flags defined in the header devices.h, header which is not included in > this file. > > The flag names we pass to kvm__register_iotrap() are slightly inconsistent > (DEVICE_BUS_PCI, DEVICE_BUS_MMIO and IOTRAP_COALESCE), where DEVICE_BUS_{PCI, > MMIO} come from devices.h as an enum. I was wondering if I'm missing something and > there is a particular reason why we don't define our own flags for that here > (something like IOTRAP_PIO and IOTRAP_MMIO). I am not sure why this would be needed? We already define and use DEVICE_BUS_x elsewhere, so why not re-use it? > If we do decide to keep the flags from devices.h, I think it would be worth it to > have a compile time check (with BUILD_BUG_ON) that IOTRAP_BUS_MASK is >= > DEVICES_BUS_MAX, which would also be a good indication of where those flags are > coming from. Well, if that makes you happy, I am not sure we gain another 13 bus types anytime soon, though ;-) > > > +#define IOTRAP_COALESCE (1U << 4) > > + > > #define DEFINE_KVM_EXT(ext) \ > > .name = #ext, \ > > .code = ext > > > > +struct kvm_cpu; > > +typedef void (*mmio_handler_fn)(struct kvm_cpu *vcpu, u64 addr, u8 *data, > > + u32 len, u8 is_write, void *ptr); > > typedef void (*fdt_irq_fn)(void *fdt, u8 irq, enum irq_type); > > > > enum { > > @@ -113,6 +119,8 @@ void kvm__irq_line(struct kvm *kvm, int irq, int level); > > void kvm__irq_trigger(struct kvm *kvm, int irq); > > bool kvm__emulate_io(struct kvm_cpu *vcpu, u16 port, void *data, int direction, int size, u32 count); > > bool kvm__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8 *data, u32 len, u8 is_write); > > +bool kvm__emulate_pio(struct kvm_cpu *vcpu, u16 port, void *data, > > + int direction, int size, u32 count); > > int kvm__destroy_mem(struct kvm *kvm, u64 guest_phys, u64 size, void *userspace_addr); > > int kvm__register_mem(struct kvm *kvm, u64 guest_phys, u64 size, void *userspace_addr, > > enum kvm_mem_type type); > > @@ -136,10 +144,36 @@ static inline int kvm__reserve_mem(struct kvm *kvm, u64 guest_phys, u64 size) > > KVM_MEM_TYPE_RESERVED); > > } > > > > -int __must_check kvm__register_mmio(struct kvm *kvm, u64 phys_addr, u64 phys_addr_len, bool coalesce, > > - void (*mmio_fn)(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len, u8 is_write, void *ptr), > > - void *ptr); > > -bool kvm__deregister_mmio(struct kvm *kvm, u64 phys_addr); > > +int __must_check kvm__register_iotrap(struct kvm *kvm, u64 phys_addr, u64 len, > > + mmio_handler_fn mmio_fn, void *ptr, > > + unsigned int flags); > > + > > +static inline > > +int __must_check kvm__register_mmio(struct kvm *kvm, u64 phys_addr, > > + u64 phys_addr_len, bool coalesce, > > + mmio_handler_fn mmio_fn, void *ptr) > > +{ > > + return kvm__register_iotrap(kvm, phys_addr, phys_addr_len, mmio_fn, ptr, > > + DEVICE_BUS_MMIO | (coalesce ? IOTRAP_COALESCE : 0)); > > +} > > +static inline > > +int __must_check kvm__register_pio(struct kvm *kvm, u16 port, u16 len, > > + mmio_handler_fn mmio_fn, void *ptr) > > +{ > > + return kvm__register_iotrap(kvm, port, len, mmio_fn, ptr, > > + DEVICE_BUS_IOPORT); > > +} > > + > > +bool kvm__deregister_iotrap(struct kvm *kvm, u64 phys_addr, unsigned int flags); > > +static inline bool kvm__deregister_mmio(struct kvm *kvm, u64 phys_addr) > > +{ > > + return kvm__deregister_iotrap(kvm, phys_addr, DEVICE_BUS_MMIO); > > +} > > +static inline bool kvm__deregister_pio(struct kvm *kvm, u16 port) > > +{ > > + return kvm__deregister_iotrap(kvm, port, DEVICE_BUS_IOPORT); > > +} > > + > > void kvm__reboot(struct kvm *kvm); > > void kvm__pause(struct kvm *kvm); > > void kvm__continue(struct kvm *kvm); > > diff --git a/ioport.c b/ioport.c > > index b98836d3..204d8103 100644 > > --- a/ioport.c > > +++ b/ioport.c > > @@ -147,7 +147,8 @@ bool kvm__emulate_io(struct kvm_cpu *vcpu, u16 port, void *data, int direction, > > > > entry = ioport_get(&ioport_tree, port); > > if (!entry) > > - goto out; > > + return kvm__emulate_pio(vcpu, port, data, direction, > > + size, count); > > I have to admit this gave me pause because this patch doesn't add any users for > kvm__register_pio() (although with this change the behaviour of kvm__emulate_io() > remains exactly the same). Do you think this change would fit better in patch #7, > where the first user for kvm__register_pio() is added, or do you prefer it here? I think it logically belongs here, as we introduce the kvm__emulate_pio() function here as well. Otherwise this function would have no caller. As it is now, it's just a "coincidence" that no one actually called kvm__register_pio() so far. This also makes the other patches movable and replaceable: this patch prepares the stage, the follow-up patches just fill it. > > > > ops = entry->ops; > > > > @@ -162,7 +163,6 @@ bool kvm__emulate_io(struct kvm_cpu *vcpu, u16 > > port, void *data, int direction, > > ioport_put(&ioport_tree, entry); > > > > -out: > > if (ret) > > return true; > > > > diff --git a/mmio.c b/mmio.c > > index cd141cd3..4cce1901 100644 > > --- a/mmio.c > > +++ b/mmio.c > > @@ -19,13 +19,14 @@ static DEFINE_MUTEX(mmio_lock); > > > > struct mmio_mapping { > > struct rb_int_node node; > > - void (*mmio_fn)(struct kvm_cpu > > *vcpu, u64 addr, u8 *data, u32 len, u8 is_write, void *ptr); > > + mmio_handler_fn mmio_fn; > > void *ptr; > > u32 refcount; > > bool remove; > > }; > > > > static struct rb_root mmio_tree = RB_ROOT; > > +static struct rb_root pio_tree = RB_ROOT; > > > > static struct mmio_mapping *mmio_search(struct rb_root *root, u64 > > addr, u64 len) { > > @@ -103,9 +104,9 @@ static void mmio_put(struct kvm *kvm, struct > > rb_root *root, struct mmio_mapping mutex_unlock(&mmio_lock); > > } > > > > -int kvm__register_mmio(struct kvm *kvm, u64 phys_addr, u64 > > phys_addr_len, bool coalesce, > > - void (*mmio_fn)(struct kvm_cpu *vcpu, u64 > > addr, u8 *data, u32 len, u8 is_write, void *ptr), > > - void *ptr) > > +int kvm__register_iotrap(struct kvm *kvm, u64 phys_addr, u64 > > phys_addr_len, > > + mmio_handler_fn mmio_fn, void *ptr, > > + unsigned int flags) > > { > > struct mmio_mapping *mmio; > > struct kvm_coalesced_mmio_zone zone; > > @@ -127,7 +128,7 @@ int kvm__register_mmio(struct kvm *kvm, u64 > > phys_addr, u64 phys_addr_len, bool c .remove = false, > > }; > > > > - if (coalesce) { > > + if (flags & IOTRAP_COALESCE) { > > There is no such flag being used in ioport.c, is it valid to have the > flags DEVICE_BUS_IOPORT and IOTRAP_COALESCE set at the same time? Well, yes and no: Yes, as this maps to MMIO on non-x86, so theoretically could use the flag. No, as no one registering a trap handler through kvm__register_pio() would ever have the chance to set this flag. I can check for the registration being for the MMIO bus before entering the "if" branch, if that is what you mean? > > > zone = (struct kvm_coalesced_mmio_zone) { > > .addr = phys_addr, > > .size = phys_addr_len, > > @@ -139,18 +140,27 @@ int kvm__register_mmio(struct kvm *kvm, u64 > > phys_addr, u64 phys_addr_len, bool c } > > } > > mutex_lock(&mmio_lock); > > - ret = mmio_insert(&mmio_tree, mmio); > > + if ((flags & IOTRAP_BUS_MASK) == DEVICE_BUS_IOPORT) > > + ret = mmio_insert(&pio_tree, mmio); > > + else > > + ret = mmio_insert(&mmio_tree, mmio); > > mutex_unlock(&mmio_lock); > > > > return ret; > > } > > > > -bool kvm__deregister_mmio(struct kvm *kvm, u64 phys_addr) > > +bool kvm__deregister_iotrap(struct kvm *kvm, u64 phys_addr, > > unsigned int flags) { > > struct mmio_mapping *mmio; > > + struct rb_root *tree; > > + > > + if ((flags & IOTRAP_BUS_MASK) == DEVICE_BUS_IOPORT) > > + tree = &pio_tree; > > + else > > + tree = &mmio_tree; > > > > mutex_lock(&mmio_lock); > > - mmio = mmio_search_single(&mmio_tree, phys_addr); > > + mmio = mmio_search_single(tree, phys_addr); > > if (mmio == NULL) { > > mutex_unlock(&mmio_lock); > > return false; > > @@ -167,7 +177,7 @@ bool kvm__deregister_mmio(struct kvm *kvm, u64 > > phys_addr) > > * called mmio_put(). This will trigger use-after-free > > errors on VCPU0. */ > > if (mmio->refcount == 0) > > - mmio_deregister(kvm, &mmio_tree, mmio); > > + mmio_deregister(kvm, tree, mmio); > > else > > mmio->remove = true; > > mutex_unlock(&mmio_lock); > > @@ -175,7 +185,8 @@ bool kvm__deregister_mmio(struct kvm *kvm, u64 > > phys_addr) return true; > > } > > > > -bool kvm__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8 > > *data, u32 len, u8 is_write) +bool kvm__emulate_mmio(struct kvm_cpu > > *vcpu, u64 phys_addr, u8 *data, > > + u32 len, u8 is_write) > > I don't think style changes should be part of this patch, the patch > is large enough as it is. I see, I just figured it's not worth a separate patch either. Cheers, Andre > > { > > struct mmio_mapping *mmio; > > > > @@ -194,3 +205,31 @@ bool kvm__emulate_mmio(struct kvm_cpu *vcpu, > > u64 phys_addr, u8 *data, u32 len, u out: > > return true; > > } > > + > > +bool kvm__emulate_pio(struct kvm_cpu *vcpu, u16 port, void *data, > > + int direction, int size, u32 count) > > +{ > > + struct mmio_mapping *mmio; > > + bool is_write = direction == KVM_EXIT_IO_OUT; > > + > > + mmio = mmio_get(&pio_tree, port, size); > > + if (!mmio) { > > + if (vcpu->kvm->cfg.ioport_debug) { > > + fprintf(stderr, "IO error: %s port=%x, > > size=%d, count=%u\n", > > + to_direction(direction), port, > > size, count); + > > + return false; > > + } > > + return true; > > + } > > + > > + while (count--) { > > + mmio->mmio_fn(vcpu, port, data, size, is_write, > > mmio->ptr); + > > + data += size; > > + } > > + > > + mmio_put(vcpu->kvm, &pio_tree, mmio); > > + > > + return true; > > +}