* Auger Eric (eric.auger@xxxxxxxxxx) wrote: > Hi, > > On 07/10/2018 12:48 AM, Christoffer Dall wrote: > > On Mon, Jul 09, 2018 at 09:42:40AM +0100, Marc Zyngier wrote: > >> On 04/07/18 10:38, Christoffer Dall wrote: > >>> Implement the required MMIO accessors for GICv2 and GICv3 for the > >>> IGROUPR distributor and redistributor registers. > >>> > >>> This can allow guests to change behavior compared to running on previous > >>> versions of KVM, but only to align with the architecture and hardware > >>> implementations. > >>> > >>> This also allows userspace to configure the groups for interrupts. Note > >>> that this potentially results in GICv2 guests not receiving interrupts > >>> after migration if migrating from an older kernel that exposes GICv2 > >>> interrupts as group 1. > >>> > >>> Cc: Andrew Jones <drjones@xxxxxxxxxx> > >>> Signed-off-by: Christoffer Dall <christoffer.dall@xxxxxxx> > >>> --- > >>> I implemented (but stashed) a version of this which predicated the > >>> behavior based on the value of GICD_IIDR revision field, falling back to > >>> ignoring writes and resetting GICv2 groups to 0 if the guest wrote a > >>> revision less than 2. However, current QEMU implementations simply > >>> don't write the GICD_IIDR, so this doesn't help at all without changing > >>> QEMU anyhow. > >>> > >>> The only actual fix I can see here to work around the problem in the > >>> kernel is to require an opt-in to allow restoring groups from userspace, > >>> but that's a lot of logic to support cross-kernel version migration. > >>> > >>> Question: Do we expect that cross-kernel version migration is a critical > >>> feature that people really expect to work, and do we actually have > >>> examples of catering to this in the kernel elsewhere? (Also, how would > >>> then that relate to the whole 'adding a new sysreg breaks migration' > >>> situation?) > >> > >> I don't really get why QEMU doesn't try to restore GICD_IIDR, while it > >> is definitely trying to restore RO sysregs (and that's how we detect > >> incompatibilities). > >> > >> I think we should at least give userspace a chance to do the right > >> thing. If it doesn't, well, too bad. > > > > This series should give userspace an option to adjust its behavior. > > > > My main concern is that this version of the series results in the worst > > kind of migration failures, where the guest simply doesn't run on the > > destination side with no warnings or error messages returned to the > > user.. > > Adding Dave in the loop to comment about general user perspective. Without understanding the details of the GIC, but I have to agree that a failure without error where the guest is hung is one of the worst cases - when a user comes to you saying that their VM hangs after migration with no diagnostics, then you know you're in for some nasty debug! On other architectures we definitely provide this level of compatibility between kernels, libraries, qemu and everything in between - it's not easy, and we do screwup from time to time; but it's still what we try and get right. So ideally you'd make this switchable and wire it into a versioned machine type in QEMU, so that only virt-3.0 VMs would use this (and they'd somehow know they needed the new kernel to do it). If that's not possible then you could add a subsection to the GIC migration data if you can detect at migration time that this feature is being used, and make the destination check for the feature/kernel. Migrating to an older qemu would fail since it wouldn't know about the new subsection. This should at least get a clear failure. For a user, this still gets messy if they do something like start upgrading some of the hosts in an openstack cluster, which they often do incrementally; so you'll suddenly get the situation of a VM that was started on a host with a newer kernel being migrated to an older one and stuff breaks. Dave > > > > We could add logic to return an error code if trying to write a > > different revision than what the kernel has (similar to the invariant > > sysregs), so that a simple fix to QEMU to save restore the GICD_IIDR > > register at least results in an error being returned to userspace. > > > > However, as QEMU doesn't do anything useful here today (not blaming > > anyone, I wrote the apparently broken GIC save/restore code for QEMU > > myself), we could also argue that QEMU might as well just fix things up > > if it detects a different IIDR. > > > >> > >> How bad is that "writable GICD_IIDR" patch? Because at this stage, and > >> in the absence of any comment, I'm close to just pick that series and > >> merge it for 4.19. > >> > > > > My guess is that a patch to "save/restore GICD_IIDR" is simple enough, > > but requires additional logic in the kernel that returns an error if the > > GICD_IIDR don't match on write. > > > > A patch which changes the groups and bumps the IIDR in userspace is > > probably more complex. > > > > Sounds like I should add the GICD_IIDR checking patch. Thoughts? > > > > What I would really like to know is whether this is really an issue or > > not. Do people who run products based on KVM, such as RHEV, have any > > expectations about cross-kernel version migration? > > Thanks > > Eric > > > > > > Thanks, > > -Christoffer > > > > _______________________________________________ > > linux-arm-kernel mailing list > > linux-arm-kernel@xxxxxxxxxxxxxxxxxxx > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > > -- Dr. David Alan Gilbert / dgilbert@xxxxxxxxxx / Manchester, UK