On Fri, Nov 30, 2012 at 06:37:04AM +0000, Christoffer Dall wrote:
> On Mon, Nov 19, 2012 at 9:57 AM, Will Deacon <will.deacon@xxxxxxx> wrote:
> >
> > I must be missing something here: how do you ensure that a guest running
> > on multiple CPUs continues to have the same VMID across them after a
> > rollover?
> >
> when a roll-over occurs, there's no problem until someone comes along
> that doesn't have a valid vmid (need_new_vmid_gen will return true).

Well, a roll-over is triggered by somebody not having a valid VMID and us
then failing to allocate a new one from the current generation.

> In this case, to assign a vmid, we need to start a new generation of
> IDs to assign one, and must ensure that all old VMIDs are no longer
> used.

So how do we ensure that?

> Well, we increment the kvm_vmid_gen, causing all other CPUs that try to
> run a VM to hit the spinlock when they exit their VMs. We reserve
> vmid 1 for the new CPU, and we call on_each_cpu, which sends an IPI
> to all other physical CPUs and waits until the other physical CPUs
> actually complete reset_vm_context.
>
> At this point, once on_each_cpu(reset_vm_context) returns, all other
> physical CPUs have cleared their data structures of occurrences of old
> vmids, and the kvm_vmid_gen has been incremented, so no other vcpus
> can come and claim new vmids until we unlock the spinlock, and
> everything starts over.
>
> Makes sense?

Yes, but I still don't understand how you ensure VMID consistency across
the different vcpus of the same VM. Imagine the following scenario.

We have two VMs:

	VM0: VCPU0 on physical CPU0, VCPU1 on physical CPU1
	VM1: VCPU0 on physical CPU2

Also assume that VM0 is happily running and that we want to schedule VM1
for the first time. Finally, assume that kvm_next_vmid is zero (that is,
the next allocation will trigger a roll-over).

Now, we want to schedule VM1:

	kvm_arch_init_vm
		kvm->arch.vmid_gen = 0;
	kvm_arch_vcpu_ioctl_run
		update_vttbr
			need_new_vmid_gen == true
			lock(kvm_vmid_lock)
			kvm_vmid_gen++;
			kvm_next_vmid = 1;
			on_each_cpu(reset_vm_context);

At this point, the other two (physical) CPUs exit the guest:

	kvm_guest_exit	// Received IRQ from cross-call
	local_irq_enable
	kvm_call_hyp(__kvm_flush_vm_context);	// Invalidate TLB (overkill: should be broadcast)
	cond_resched;
	update_vttbr
		need_new_vmid_gen == true
		/* spin on kvm_vmid_lock */

I think the __kvm_flush_vm_context is overkill -- you should check
tlb_ops_need_broadcast (which is currently only true for 11MPCore).
However, continuing with this, CPU2 (which still holds the lock) assigns
VM1 its VMID and releases the lock:

			/* receives vmid 1, kvm_next_vmid = 2 */
			unlock(kvm_vmid_lock)
	/* Back to the guest */

Now, let's say that CPU0 takes an interrupt (which it goes off to handle)
and CPU1 grabs the lock:

	lock(kvm_vmid_lock)
	/* CPU1 receives vmid 2 for VM0, bumps vmid counter */
	unlock(kvm_vmid_lock)
	/* Back to the guest */

At this point, VM1 is running and VM0:VCPU1 is running. VM0:VCPU0 is not
running, because physical CPU0 is handling an interrupt. The problem is
that when VCPU0 *is* resumed, it will update the VMID of VM0 and could be
scheduled in parallel with VCPU1, but with a different VMID.

How do you avoid this in the current code?

Will
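
P.S. To make sure we're looking at the same thing, here is the allocation
path as I've pieced it together from your description and the traces
above. This is only a sketch -- the identifiers are the ones quoted in
this thread, but the exact structure may well not match your current tree:

	/*
	 * Global generation counter; kvm->arch.vmid_gen records the
	 * generation in which this VM's VMID was allocated.
	 */
	static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
	static u8 kvm_next_vmid;
	static DEFINE_SPINLOCK(kvm_vmid_lock);

	/* Per-CPU flush of stale VMID state; defined elsewhere. */
	static void reset_vm_context(void *info);

	static bool need_new_vmid_gen(struct kvm *kvm)
	{
		return kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen);
	}

	static void update_vttbr(struct kvm *kvm)
	{
		if (!need_new_vmid_gen(kvm))
			return;

		spin_lock(&kvm_vmid_lock);

		/*
		 * Re-check under the lock: without this, a vcpu racing
		 * with a sibling vcpu of the same VM could allocate a
		 * second, different VMID for the VM -- the scenario
		 * described above.
		 */
		if (!need_new_vmid_gen(kvm)) {
			spin_unlock(&kvm_vmid_lock);
			return;
		}

		if (kvm_next_vmid == 0) {
			/*
			 * Roll-over: start a new generation, reserve
			 * vmid 1 for the caller, and make every physical
			 * CPU forget stale VMIDs before anyone runs.
			 */
			atomic64_inc(&kvm_vmid_gen);
			kvm_next_vmid = 1;
			on_each_cpu(reset_vm_context, NULL, 1);
		}

		kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
		kvm->arch.vmid = kvm_next_vmid++;

		spin_unlock(&kvm_vmid_lock);
	}

If the code does re-check need_new_vmid_gen after taking the lock, then
when VM0:VCPU0 is finally resumed it will see that VCPU1 already refreshed
kvm->arch.vmid_gen (vmid_gen and vmid live in the shared struct kvm, not
per-vcpu) and will reuse VM0's new VMID rather than allocating another
one. If that re-check isn't there, the race above looks real to me.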