On Tue, Sep 11, 2012 at 04:26:17PM +0300, Avi Kivity wrote: > On 09/11/2012 04:02 PM, Gleb Natapov wrote: > > Most interrupt are delivered to only one vcpu. Use pre-build tables to > > find interrupt destination instead of looping through all vcpus. In case > > of logical mode loop only through vcpus in a logical cluster irq is sent > > to. > > > > * fix rcu issues pointed to by MST. All but one. Still use > > call_rcu(). Do not think this is serious issue. If it is should be > > solved by RCU subsystem. > > Agree. > > Patch looks good but some minor comments follow. > > > struct kvm_arch { > > unsigned int n_used_mmu_pages; > > unsigned int n_requested_mmu_pages; > > @@ -528,6 +536,8 @@ struct kvm_arch { > > struct kvm_ioapic *vioapic; > > struct kvm_pit *vpit; > > int vapics_in_nmi_mode; > > + struct kvm_apic_map *apic_map; > > + struct mutex apic_map_lock; > > Reversing the order will make it clearer what the lock protects. > Hmm, OK. I thought names make it clear. > > > > +static void kvm_apic_get_logical_id(u32 ldr, bool flat, u8 ldr_bits, > > + u16 *cid, u16 *lid) > > +{ > > + if (ldr_bits == 32) { > > + *cid = ldr >> 16; > > + *lid = ldr & 0xffff; > > + } else { > > + ldr = GET_APIC_LOGICAL_ID(ldr); > > + > > + if (flat) { > > + *cid = 0; > > + *lid = ldr; > > + } else { > > + *cid = ldr >> 4; > > + *lid = ldr & 0xf; > > + } > > + } > > +} > > You could precaclulate lid_shift/lid_mask/cid_shift/cid_mask and have > just one version here. In fact you could drop the function. > You mean precalculate them in recalculate_apic_map() and store in kvm_apic_map? > > + > > +static inline void recalculate_apic_map(struct kvm *kvm) > > +{ > > + struct kvm_apic_map *new, *old = NULL; > > + struct kvm_vcpu *vcpu; > > + int i; > > + > > + new = kzalloc(sizeof(struct kvm_apic_map), GFP_KERNEL); > > + > > + mutex_lock(&kvm->arch.apic_map_lock); > > + > > + if (!new) > > + goto out; > > + > > + new->ldr_bits = 8; > > + new->flat = true; > > + kvm_for_each_vcpu(i, vcpu, kvm) { > > + u16 cid, lid; > > + struct kvm_lapic *apic = vcpu->arch.apic; > > + > > + if (!kvm_apic_present(vcpu)) > > + continue; > > + > > + if (apic_x2apic_mode(apic)) { > > + new->ldr_bits = 32; > > + new->flat = false; > > + } else if (kvm_apic_sw_enabled(apic) && new->flat && > > + kvm_apic_get_reg(apic, APIC_DFR) == APIC_DFR_CLUSTER) > > + new->flat = false; > > While a vcpu is being hotplugged in it will be in flat mode. The code > correctly gives precedence to x2apic and cluster modes over flat mode, > so it is correct in that respect, but the comment describing this is too > short. > Almost non existent. > > + > > + new->phys_map[kvm_apic_id(apic)] = apic; > > + kvm_apic_get_logical_id(kvm_apic_get_reg(apic, APIC_LDR), > > + new->flat, new->ldr_bits, &cid, &lid); > > + > > + if (lid) > > + new->logical_map[cid][ffs(lid) - 1] = apic; > > + } > > +out: > > + old = kvm->arch.apic_map; > > rcu_dereference(), just for kicks. > MST says rcu_dereference_protected() but honestly I look at it and rcu_dereference_check(...., 1) and condition they check are so obviously correct in the code that using them is just a clutter. In more complex cases, when dereference happens far away from locking it have its point. If you insist on it here should we add it too irq routing code too? > > + rcu_assign_pointer(kvm->arch.apic_map, new); > > + mutex_unlock(&kvm->arch.apic_map_lock); > > + > > + if (old) > > + kfree_rcu(old, rcu); > > Nice, removes the need for rcu_barrier(). > > > +} > > + > > > > +bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src, > > + struct kvm_lapic_irq *irq, int *r) > > +{ > > + struct kvm_apic_map *map; > > + unsigned long bitmap = 1; > > + struct kvm_lapic **dst; > > + int i; > > + > > + *r = -1; > > + > > + if (irq->shorthand == APIC_DEST_SELF) { > > + *r = kvm_apic_set_irq(src->vcpu, irq); > > + return true; > > + } > > + > > + if (irq->shorthand) > > + return false; > > + > > + rcu_read_lock(); > > + map = rcu_dereference(kvm->arch.apic_map); > > + > > + if (!map) { > > + rcu_read_unlock(); > > + return false; > > + } > > + > > + if (irq->dest_mode == 0) { /* physical mode */ > > + if (irq->delivery_mode == APIC_DM_LOWEST || > > + irq->dest_id == 0xff) { > > + rcu_read_unlock(); > > + return false; > > + } > > Two error paths with rcu_read_unlock(). Cleaner to have a bool ret = > false; in the beginning and 'goto out_unlock' here, IMO. > That's too against one :( > > > + dst = &map->phys_map[irq->dest_id & 0xff]; > > + } else { > > + u16 cid, lid; > > + u32 mda = irq->dest_id; > > + > > + if (map->ldr_bits == 8) > > + mda <<= 24; > > mda <<= 32 - map->ldr_bits; > Nice. > > + > > + kvm_apic_get_logical_id(mda, map->flat, map->ldr_bits, > > + &cid, &lid); > > + dst = map->logical_map[cid]; > > + > > + bitmap = lid; > > + if (irq->delivery_mode == APIC_DM_LOWEST && > > + hweight_long(bitmap) > 1) { > > + int l = -1; > > + for_each_set_bit(i, &bitmap, 16) { > > + if (!dst[i]) > > + continue; > > + if (l < 0) > > + l = i; > > + else if (kvm_apic_compare_prio(dst[i]->vcpu, dst[l]->vcpu) < 0) > > + l = i; > > + } > > + > > + bitmap = (l >= 0) ? 1 << l : 0; > > + } > > + } > > + > > + for_each_set_bit(i, &bitmap, 16) { > > + if (!dst[i]) > > + continue; > > + if (*r < 0) > > + *r = 0; > > + *r += kvm_apic_set_irq(dst[i]->vcpu, irq); > > + } > > + > > + rcu_read_unlock(); > > + return true; > > +} > > + > > /* > > @@ -6319,6 +6320,7 @@ void kvm_arch_destroy_vm(struct kvm *kvm) > > put_page(kvm->arch.apic_access_page); > > if (kvm->arch.ept_identity_pagetable) > > put_page(kvm->arch.ept_identity_pagetable); > > + kfree(kvm->arch.apic_map); > > rcu_dereference(), even though it cannot be needed here, to shut down > static code checkers. > How to run those code checkers? Do they complain about irq routing code? Just curious. -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html