Hi Wanghaibin,

On 12/09/2017 13:15, wanghaibin wrote:
> On 2017/9/11 2:46, Auger Eric wrote:
>
>> Hi Wanghaibin,
>>
>> On 07/09/2017 13:28, Auger Eric wrote:
>>> Hi Wanghaibin,
>>>
>>> On 07/09/2017 03:32, wanghaibin wrote:
>>>> On 2017/9/7 0:20, Auger Eric wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> On 06/09/2017 15:05, wanghaibin wrote:
>>>>>> This patch fixes the migration save-tables failure.
>>>>>>
>>>>>> When the virtual machine is booting and the devices haven't been initialized,
>>>>>> all the virtual dte/ite may be invalid. If we migrate at this moment, the save
>>>>>> tables interface traverses the device list and checks whether each dte is valid.
>>>>>> If not, it returns -EINVAL.
>>>>>
>>>>> The issue on save is less clear to me. We are not checking that the "dte" are
>>>>> valid, as is said above. We are scrolling the ITS lists - which may be
>>>>> empty - and dumping them in guest memory.
>>>>>
>>>>> On save() there are quite a few checks that can cause a failure.
>>>>> vgic_its_check_id() can be among them. This typically requires the
>>>>> GITS_BASER to have been properly set. Failing on save looks OK to me in
>>>>> such a situation.
>>>>>
>>>>> Sorry but I don't get the purpose of this patch. Does it fix a save failure?
>>>>
>>>>
>>>> Yes, for save, vgic_its_check_id() checks whether the L1 DTE is valid through
>>>> code like this:
>>>>
>>>>         /* Each 1st level entry is represented by a 64-bit value. */
>>>>         if (kvm_read_guest(its->dev->kvm,
>>>>                            BASER_ADDRESS(baser) + index * sizeof(indirect_ptr),
>>>>                            &indirect_ptr, sizeof(indirect_ptr)))
>>>>                 return false;
>>>>
>>>>         indirect_ptr = le64_to_cpu(indirect_ptr);
>>>>
>>>>         /* check the valid bit of the first level entry */
>>>>         if (!(indirect_ptr & BIT_ULL(63)))
>>>>                 return false;
>>>>
>>>> If it is invalid, the save returns -EINVAL because vgic_its_check_id() returns false.
>>>>
>>>> And from the cover letter, the problem happened when no PCI device has been probed (the guest driver hasn't issued any
>>>> mapd, mapti), so the L1 DTEs are all invalid at that point. Just like you said, if we migrate at this moment, we are scrolling
>>>> the ITS lists, then check_id fails and the save interface fails.
>>>>
>>>> I think the root cause is the device list free problem: at reset/reboot, the ITS dev/col/itt lists are not freed
>>>> and set to NULL, so the save interface fails.
>>>> This patch tries to free the resources when the VM reboots/resets.
>>> OK understood. Indeed none of the device/collection lists should be non
>>> empty at that stage, i.e. when GITS_BASERn have not been written yet and
>>> are marked invalid.
>>>
>>> For solving the specific save() issue here, I think the best is to check
>>> the validity bit of the GITS_BASER (col, device) and if invalid do nothing.
>>
>> Actually the above proposal does not work as GITS_BASERn is not properly
>> reset. Maybe the best way is to introduce an ITS KVM device reset IOCTL
>> in the control group. Upon this command we could properly reset the
>> requested registers and the lists.
>
>
> Yes, it should free these lists when the vits is reset.
>
> This patch relies on has_run_once and vcpu_init to detect that a vcpu reset happened,
> and scrolls through all kvm devices to find the vits device and free its lists.
> I think it's a little odd too.
>
> If we can add the reset IOCTL, I think it must be the best way.

I looked at the QEMU reset of the GICv3, and my understanding is that the
QEMU reset function sends reset values for each individual register. So I
should align the QEMU ITS reset code with that. Then we should free the
internal lists in the relevant register write handler when we detect that
the valid bit is not set.
>
> Thanks.
>
>>
>> Thanks
>>
>> Eric
>>>
>>> Then we need to have a more global discussion about whether, when and
>>> where the device and collection lists need to be freed.
>>>
>>> If you want I can respin with the above suggestion and add the valid pointer
>>> to the entry_fn_t to handle the restore path. Up to you.
>
>
> All along, I have wanted to contribute code to the community, but so far it has not been achieved.
> So I would like to collect the solutions for this problem and try to fix it first. Can I?

OK sure. So I will send a QEMU patch today and I will review your kernel
fixes then.

Thanks

Eric
>
> Thanks.
>
>>>
>>> Thanks
>>>
>>> Eric
>>>
>>>
>>>> BTW: these lists will be rebuilt when the rebooted VM runs the PCI device probe step.
>>>
>>>>
>>>> Thanks
>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> Eric
>>>>>
>>>>>
>>>>>>
>>>>>> This patch tries to free the ITS list resources when the VM reboots or resets to avoid this.
>>>>>>
>>>>>> Signed-off-by: wanghaibin <wanghaibin.wang@xxxxxxxxxx>
>>>>>> ---
>>>>>>  virt/kvm/arm/arm.c           |  5 ++++-
>>>>>>  virt/kvm/arm/vgic/vgic-its.c | 10 ++++++++++
>>>>>>  virt/kvm/arm/vgic/vgic.h     |  1 +
>>>>>>  3 files changed, 15 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>>>>>> index a39a1e1..db7632d 100644
>>>>>> --- a/virt/kvm/arm/arm.c
>>>>>> +++ b/virt/kvm/arm/arm.c
>>>>>> @@ -46,6 +46,7 @@
>>>>>>  #include <asm/kvm_coproc.h>
>>>>>>  #include <asm/kvm_psci.h>
>>>>>>  #include <asm/sections.h>
>>>>>> +#include "vgic.h"
>>>>>>
>>>>>>  #ifdef REQUIRES_VIRT
>>>>>>  __asm__(".arch_extension virt");
>>>>>> @@ -901,8 +902,10 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
>>>>>>          * Ensure a rebooted VM will fault in RAM pages and detect if the
>>>>>>          * guest MMU is turned off and flush the caches as needed.
>>>>>>          */
>>>>>> -        if (vcpu->arch.has_run_once)
>>>>>> +        if (vcpu->arch.has_run_once) {
>>>>>>                  stage2_unmap_vm(vcpu->kvm);
>>>>>> +                vgic_its_free_resource(vcpu->kvm);
>>>>>> +        }
>>>>>>
>>>>>>          vcpu_reset_hcr(vcpu);
>>>>>>
>>>>>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
>>>>>> index 25d614f..5c20352 100644
>>>>>> --- a/virt/kvm/arm/vgic/vgic-its.c
>>>>>> +++ b/virt/kvm/arm/vgic/vgic-its.c
>>>>>> @@ -2467,6 +2467,16 @@ static int vgic_its_get_attr(struct kvm_device *dev,
>>>>>>          .has_attr = vgic_its_has_attr,
>>>>>>  };
>>>>>>
>>>>>> +void vgic_its_free_resource(struct kvm *kvm)
>>>>>> +{
>>>>>> +        struct kvm_device *dev, *tmp;
>>>>>> +
>>>>>> +        list_for_each_entry_safe(dev, tmp, &kvm->devices, vm_node) {
>>>>>> +                if(dev->ops == &kvm_arm_vgic_its_ops)
>>>>>> +                        vgic_its_free_list(kvm, dev->private);
>>>>>> +        }
>>>>>> +}
>>>>>> +
>>>>>>  int kvm_vgic_register_its_device(void)
>>>>>>  {
>>>>>>          return kvm_register_device_ops(&kvm_arm_vgic_its_ops,
>>>>>> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
>>>>>> index c2be5b7..fbcbdfd 100644
>>>>>> --- a/virt/kvm/arm/vgic/vgic.h
>>>>>> +++ b/virt/kvm/arm/vgic/vgic.h
>>>>>> @@ -222,5 +222,6 @@ int vgic_v3_line_level_info_uaccess(struct kvm_vcpu *vcpu, bool is_write,
>>>>>>
>>>>>>  bool lock_all_vcpus(struct kvm *kvm);
>>>>>>  void unlock_all_vcpus(struct kvm *kvm);
>>>>>> +void vgic_its_free_resource(struct kvm *kvm);
>>>>>>
>>>>>>  #endif
>>>>>>

_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
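
The failure mode discussed in this thread (a stale device list meeting an
all-invalid first-level table after reboot) and the idea of dropping the
lists when a GITS_BASER write clears the valid bit can be modelled in user
space. What follows is only a minimal sketch under stated assumptions, not
the kernel code and not necessarily the fix that was applied: toy_its,
toy_l1_entry_valid(), toy_write_baser() and toy_save() are hypothetical
names, the device index is used directly as the first-level index, and
entries are assumed to be in host byte order (the kernel converts with
le64_to_cpu()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 63 is the Valid bit both in GITS_BASER<n> and in a 1st level entry. */
#define VALID_BIT (UINT64_C(1) << 63)

/* Hypothetical stand-in for the per-ITS state discussed in the thread. */
struct toy_its {
	uint64_t baser_device;	/* models GITS_BASER for the device table */
	size_t nr_devices;	/* models the length of the cached device list */
};

/* Models the first-level check quoted from vgic_its_check_id(). */
static bool toy_l1_entry_valid(const uint64_t *l1_table, size_t nr_entries,
			       size_t index)
{
	assert(l1_table != NULL);
	if (index >= nr_entries)
		return false;
	return (l1_table[index] & VALID_BIT) != 0;
}

/* Models the reset idea: a write with Valid clear drops the cached list. */
static void toy_write_baser(struct toy_its *its, uint64_t value)
{
	its->baser_device = value;
	if (!(value & VALID_BIT))
		its->nr_devices = 0;
}

/* Models save(): 0 on success, -1 if a listed device fails the L1 check. */
static int toy_save(const struct toy_its *its, const uint64_t *l1_table,
		    size_t nr_entries)
{
	for (size_t i = 0; i < its->nr_devices; i++)
		if (!toy_l1_entry_valid(l1_table, nr_entries, i))
			return -1;
	return 0;
}
```

With one stale device and a zeroed first-level table, toy_save() fails;
after toy_write_baser(&its, 0) the list is empty and the same save
succeeds, which is the behaviour the reset discussion is aiming for.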