Hi Marc, On 4/1/21 3:42 PM, Marc Zyngier wrote: > Hi Eric, > > On Thu, 01 Apr 2021 09:52:37 +0100, > Eric Auger <eric.auger@xxxxxxxxxx> wrote: >> >> Commit 23bde34771f1 ("KVM: arm64: vgic-v3: Drop the >> reporting of GICR_TYPER.Last for userspace") temporarily fixed >> a bug identified when attempting to access the GICR_TYPER >> register before the redistributor region setting, but dropped >> the support of the LAST bit. >> >> Emulating the GICR_TYPER.Last bit still makes sense for >> architecture compliance though. This patch restores its support >> (if the redistributor region was set) while keeping the code safe. >> >> We introduce a new helper, vgic_mmio_vcpu_rdist_is_last() which >> computes whether a redistributor is the highest one of a series >> of redistributor contributor pages. >> >> The spec says "Indicates whether this Redistributor is the >> highest-numbered Redistributor in a series of contiguous >> Redistributor pages." >> >> The code is a bit convulated since there is no guarantee > > nit: convoluted > >> redistributors are added in a given reditributor region in >> ascending order. In that case the current implementation was >> wrong. Also redistributor regions can be contiguous >> and registered in non increasing base address order. >> >> So the index of redistributors are stored in an array within >> the redistributor region structure. >> >> With this new implementation we do not need to have a uaccess >> read accessor anymore. >> >> Signed-off-by: Eric Auger <eric.auger@xxxxxxxxxx> > > This patch also hurt my head, a lot more than the first one. See > below. > >> --- >> arch/arm64/kvm/vgic/vgic-init.c | 7 +-- >> arch/arm64/kvm/vgic/vgic-mmio-v3.c | 97 ++++++++++++++++++++---------- >> arch/arm64/kvm/vgic/vgic.h | 1 + >> include/kvm/arm_vgic.h | 3 + >> 4 files changed, 73 insertions(+), 35 deletions(-) >> >> diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c >> index cf6faa0aeddb2..61150c34c268c 100644 >> --- a/arch/arm64/kvm/vgic/vgic-init.c >> +++ b/arch/arm64/kvm/vgic/vgic-init.c >> @@ -190,6 +190,7 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu) >> int i; >> >> vgic_cpu->rd_iodev.base_addr = VGIC_ADDR_UNDEF; >> + vgic_cpu->index = vcpu->vcpu_id; > > Is it so that vgic_cpu->index is always equal to vcpu_id? If so, why > do we need another field? We can always get to the vcpu using a > container_of(). > >> >> INIT_LIST_HEAD(&vgic_cpu->ap_list_head); >> raw_spin_lock_init(&vgic_cpu->ap_list_lock); >> @@ -338,10 +339,8 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm) >> dist->vgic_dist_base = VGIC_ADDR_UNDEF; >> >> if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) { >> - list_for_each_entry_safe(rdreg, next, &dist->rd_regions, list) { >> - list_del(&rdreg->list); >> - kfree(rdreg); >> - } >> + list_for_each_entry_safe(rdreg, next, &dist->rd_regions, list) >> + vgic_v3_free_redist_region(rdreg); > > Consider moving the introduction of vgic_v3_free_redist_region() into > a separate patch. On its own, that's a good readability improvement. > >> INIT_LIST_HEAD(&dist->rd_regions); >> } else { >> dist->vgic_cpu_base = VGIC_ADDR_UNDEF; >> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c >> index 987e366c80008..f6a7eed1d6adb 100644 >> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c >> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c >> @@ -251,45 +251,57 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu, >> vgic_enable_lpis(vcpu); >> } >> >> +static bool vgic_mmio_vcpu_rdist_is_last(struct kvm_vcpu *vcpu) >> +{ >> + struct vgic_dist *vgic = &vcpu->kvm->arch.vgic; >> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; >> + struct vgic_redist_region *rdreg = vgic_cpu->rdreg; >> + >> + if (!rdreg) >> + return false; >> + >> + if (rdreg->count && vgic_cpu->rdreg_index == (rdreg->count - 1)) { >> + /* check whether there is no other contiguous rdist region */ >> + struct list_head *rd_regions = &vgic->rd_regions; >> + struct vgic_redist_region *iter; >> + >> + list_for_each_entry(iter, rd_regions, list) { >> + if (iter->base == rdreg->base + rdreg->count * KVM_VGIC_V3_REDIST_SIZE && >> + iter->free_index > 0) { >> + /* check the first rdist index of this region, if any */ >> + if (vgic_cpu->index < iter->rdist_indices[0]) >> + return false; > > rdist_indices[] contains the vcpu_id of the vcpu associated with a > given RD in the region. At this stage, you have established that there > is another region that is contiguous with the one associated with our > vcpu. You also know that this adjacent region has a vcpu mapped in > (free_index isn't 0). Isn't that enough to declare that our vcpu isn't > last? I definitely don't understand what the index comparison does > here. Assume the following case: 2 RDIST region region #0 contains rdist 1, 2, 4 region #1, adjacent to #0 contains rdist 3 Spec days: "Indicates whether this Redistributor is the highest-numbered Redistributor in a series of contiguous Redistributor pages." To me 4 is last and 3 is last too. > > It also seem to me that some of the complexity could be eliminated if > the regions were kept ordered at list insertion time. yes > >> + } >> + } >> + } else if (vgic_cpu->rdreg_index < rdreg->free_index - 1) { >> + /* look at the index of next rdist */ >> + int next_rdist_index = rdreg->rdist_indices[vgic_cpu->rdreg_index + 1]; >> + >> + if (vgic_cpu->index < next_rdist_index) >> + return false; > > Same thing here. We are in the middle of the allocated part of a > region, which means we cannot be last. I still don't get the index > check. Because within a region, nothing hinders rdist from being allocated in non ascending order. I exercise those cases in the kvmselftests one single RDIST region with the following rdists allocated there: 1, 3, 2 3 and 2 are "last", right? Or did I miss something. Yes that's totally not natural to do that kind of allocation but the API allows to do that. > >> + } >> + return true; >> +} >> + >> static unsigned long vgic_mmio_read_v3r_typer(struct kvm_vcpu *vcpu, >> gpa_t addr, unsigned int len) >> { >> unsigned long mpidr = kvm_vcpu_get_mpidr_aff(vcpu); >> - struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; >> - struct vgic_redist_region *rdreg = vgic_cpu->rdreg; >> int target_vcpu_id = vcpu->vcpu_id; >> - gpa_t last_rdist_typer = rdreg->base + GICR_TYPER + >> - (rdreg->free_index - 1) * KVM_VGIC_V3_REDIST_SIZE; >> u64 value; >> >> value = (u64)(mpidr & GENMASK(23, 0)) << 32; >> value |= ((target_vcpu_id & 0xffff) << 8); >> >> - if (addr == last_rdist_typer) >> + if (vgic_has_its(vcpu->kvm)) >> + value |= GICR_TYPER_PLPIS; >> + >> + if (vgic_mmio_vcpu_rdist_is_last(vcpu)) >> value |= GICR_TYPER_LAST; >> - if (vgic_has_its(vcpu->kvm)) >> - value |= GICR_TYPER_PLPIS; >> >> return extract_bytes(value, addr & 7, len); >> } >> >> -static unsigned long vgic_uaccess_read_v3r_typer(struct kvm_vcpu *vcpu, >> - gpa_t addr, unsigned int len) >> -{ >> - unsigned long mpidr = kvm_vcpu_get_mpidr_aff(vcpu); >> - int target_vcpu_id = vcpu->vcpu_id; >> - u64 value; >> - >> - value = (u64)(mpidr & GENMASK(23, 0)) << 32; >> - value |= ((target_vcpu_id & 0xffff) << 8); >> - >> - if (vgic_has_its(vcpu->kvm)) >> - value |= GICR_TYPER_PLPIS; >> - >> - /* reporting of the Last bit is not supported for userspace */ >> - return extract_bytes(value, addr & 7, len); >> -} >> - >> static unsigned long vgic_mmio_read_v3r_iidr(struct kvm_vcpu *vcpu, >> gpa_t addr, unsigned int len) >> { >> @@ -612,7 +624,7 @@ static const struct vgic_register_region vgic_v3_rd_registers[] = { >> VGIC_ACCESS_32bit), >> REGISTER_DESC_WITH_LENGTH_UACCESS(GICR_TYPER, >> vgic_mmio_read_v3r_typer, vgic_mmio_write_wi, >> - vgic_uaccess_read_v3r_typer, vgic_mmio_uaccess_write_wi, 8, >> + NULL, vgic_mmio_uaccess_write_wi, 8, >> VGIC_ACCESS_64bit | VGIC_ACCESS_32bit), >> REGISTER_DESC_WITH_LENGTH(GICR_WAKER, >> vgic_mmio_read_raz, vgic_mmio_write_wi, 4, >> @@ -714,6 +726,16 @@ int vgic_register_redist_iodev(struct kvm_vcpu *vcpu) >> return -EINVAL; >> >> vgic_cpu->rdreg = rdreg; >> + vgic_cpu->rdreg_index = rdreg->free_index; >> + if (!rdreg->count) { >> + void *p = krealloc(rdreg->rdist_indices, >> + (vgic_cpu->rdreg_index + 1) * sizeof(u32), >> + GFP_KERNEL); >> + if (!p) >> + return -ENOMEM; >> + rdreg->rdist_indices = p; >> + } >> + rdreg->rdist_indices[vgic_cpu->rdreg_index] = vgic_cpu->index; > > I think I really have a problem with this array, which comes from me > not understanding the two checks I previously commented on. I hope the above clarified the array need. > > If we stick to the definition of 'Last', all that matters is the > position of the RD in a region (rdreg_index) and potentially the > presence of another contiguous region with allocated RDs in it. > > IIUC, the checks should read like this: > > if (vcpu->rdreg_index < (vcpu->rdreg->free_index - 1)) > last = false; > else if (vcpu->rdreg_index == (vcpu->rdreg->free_index - 1) && > adjacent_region(vcpu->rdreg)->free_index > 0) > last = false; > else > last = true; > > So why do we need to track the vcpu_id associated to a region? because the redistributors within a region can be in random order. That's why I need to store their number. Does that make more sense? Thanks Eric > >> >> rd_base = rdreg->base + rdreg->free_index * KVM_VGIC_V3_REDIST_SIZE; >> >> @@ -768,7 +790,7 @@ static int vgic_register_all_redist_iodevs(struct kvm *kvm) >> } >> >> /** >> - * vgic_v3_insert_redist_region - Insert a new redistributor region >> + * vgic_v3_alloc_redist_region - Allocate a new redistributor region >> * >> * Performs various checks before inserting the rdist region in the list. >> * Those tests depend on whether the size of the rdist region is known >> @@ -782,8 +804,8 @@ static int vgic_register_all_redist_iodevs(struct kvm *kvm) >> * >> * Return 0 on success, < 0 otherwise >> */ >> -static int vgic_v3_insert_redist_region(struct kvm *kvm, uint32_t index, >> - gpa_t base, uint32_t count) >> +static int vgic_v3_alloc_redist_region(struct kvm *kvm, uint32_t index, >> + gpa_t base, uint32_t count) >> { >> struct vgic_dist *d = &kvm->arch.vgic; >> struct vgic_redist_region *rdreg; >> @@ -839,6 +861,13 @@ static int vgic_v3_insert_redist_region(struct kvm *kvm, uint32_t index, >> rdreg->count = count; >> rdreg->free_index = 0; >> rdreg->index = index; >> + if (count) { >> + rdreg->rdist_indices = kcalloc(count, sizeof(u32), GFP_KERNEL); >> + if (!rdreg->rdist_indices) { >> + ret = -ENOMEM; >> + goto free; >> + } >> + } >> >> list_add_tail(&rdreg->list, rd_regions); >> return 0; >> @@ -847,11 +876,18 @@ static int vgic_v3_insert_redist_region(struct kvm *kvm, uint32_t index, >> return ret; >> } >> >> +void vgic_v3_free_redist_region(struct vgic_redist_region *rdreg) >> +{ >> + list_del(&rdreg->list); >> + kfree(rdreg->rdist_indices); >> + kfree(rdreg); >> +} >> + >> int vgic_v3_set_redist_base(struct kvm *kvm, u32 index, u64 addr, u32 count) >> { >> int ret; >> >> - ret = vgic_v3_insert_redist_region(kvm, index, addr, count); >> + ret = vgic_v3_alloc_redist_region(kvm, index, addr, count); >> if (ret) >> return ret; >> >> @@ -864,8 +900,7 @@ int vgic_v3_set_redist_base(struct kvm *kvm, u32 index, u64 addr, u32 count) >> struct vgic_redist_region *rdreg; >> >> rdreg = vgic_v3_rdist_region_from_index(kvm, index); >> - list_del(&rdreg->list); >> - kfree(rdreg); >> + vgic_v3_free_redist_region(rdreg); >> return ret; >> } >> >> diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h >> index 64fcd75111108..bc418c2c12141 100644 >> --- a/arch/arm64/kvm/vgic/vgic.h >> +++ b/arch/arm64/kvm/vgic/vgic.h >> @@ -293,6 +293,7 @@ vgic_v3_rd_region_size(struct kvm *kvm, struct vgic_redist_region *rdreg) >> >> struct vgic_redist_region *vgic_v3_rdist_region_from_index(struct kvm *kvm, >> u32 index); >> +void vgic_v3_free_redist_region(struct vgic_redist_region *rdreg); >> >> bool vgic_v3_rdist_overlap(struct kvm *kvm, gpa_t base, size_t size); >> >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h >> index 3d74f1060bd18..9a3f060ac3547 100644 >> --- a/include/kvm/arm_vgic.h >> +++ b/include/kvm/arm_vgic.h >> @@ -197,6 +197,7 @@ struct vgic_redist_region { >> gpa_t base; >> u32 count; /* number of redistributors or 0 if single region */ >> u32 free_index; /* index of the next free redistributor */ >> + int *rdist_indices; /* indices of the redistributors */ > > You are treating it as an array of u32 when allocating it. Please > choose one type or the other. > >> struct list_head list; >> }; >> >> @@ -322,6 +323,8 @@ struct vgic_cpu { >> */ >> struct vgic_io_device rd_iodev; >> struct vgic_redist_region *rdreg; >> + u32 rdreg_index; >> + int index; /* vcpu index */ >> >> /* Contains the attributes and gpa of the LPI pending tables. */ >> u64 pendbaser; > > Thanks, > > M. >