Re: [PATCH 08/31] nVMX: Fix local_vcpus_link handling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 05/22/2011 11:57 AM, Nadav Har'El wrote:
Hi Avi and Marcelo, here is a the new first patch to the nvmx patch set,
which overhauls the handling of vmcss on cpus, as you asked.

As you guessed, the nested entry and exit code becomes much simpler and
cleaner, with the whole VMCS switching code on entry, for example, reduced
to:
	cpu = get_cpu();
	vmx->loaded_vmcs = vmcs02;
	vmx_vcpu_put(vcpu);
	vmx_vcpu_load(vcpu, cpu);
	vcpu->cpu = cpu;
	put_cpu();

That's wonderful, it indicates the code is much better integrated. Perhaps later we can refine it to have separate _load and _put for host-related and guest-related parts (I think they already exist in the code, except they are always called together), but that is an optimization, and not the most important one by far.

You can apply this patch separately from the rest of the patch set, if you
wish. I'm sending just this one, like you asked - and can send the rest of
the patches when you ask me to.


Subject: [PATCH 01/31] nVMX: Keep list of loaded VMCSs, instead of vcpus.

In VMX, before we bring down a CPU we must VMCLEAR all VMCSs loaded on it
because (at least in theory) the processor might not have written all of its
content back to memory. Since a patch from June 26, 2008, this is done using
a per-cpu "vcpus_on_cpu" linked list of vcpus loaded on each CPU.

The problem is that with nested VMX, we no longer have the concept of a
vcpu being loaded on a cpu: A vcpu has multiple VMCSs (one for L1, a pool for
L2s), and each of those may be have been last loaded on a different cpu.

So instead of linking the vcpus, we link the VMCSs, using a new structure
loaded_vmcs. This structure contains the VMCS, and the information pertaining
to its loading on a specific cpu (namely, the cpu number, and whether it
was already launched on this cpu once). In nested we will also use the same
structure to hold L2 VMCSs, and vmx->loaded_vmcs is a pointer to the
currently active VMCS.

--- .before/arch/x86/kvm/x86.c	2011-05-22 11:41:57.000000000 +0300
+++ .after/arch/x86/kvm/x86.c	2011-05-22 11:41:57.000000000 +0300
@@ -2119,7 +2119,8 @@ void kvm_arch_vcpu_load(struct kvm_vcpu
  	if (need_emulate_wbinvd(vcpu)) {
  		if (kvm_x86_ops->has_wbinvd_exit())
  			cpumask_set_cpu(cpu, vcpu->arch.wbinvd_dirty_mask);
-		else if (vcpu->cpu != -1&&  vcpu->cpu != cpu)
+		else if (vcpu->cpu != -1&&  vcpu->cpu != cpu
+				&&  cpu_online(vcpu->cpu))
  			smp_call_function_single(vcpu->cpu,
  					wbinvd_ipi, NULL, 1);
  	}

Is this a necessary part of this patch?  Or an semi-related bugfix?

I think that it can't actually trigger before this patch due to luck. svm doesn't clear vcpu->cpu on cpu offline, but on the other hand it ->has_wbinvd_exit().

Joerg, is

    if (unlikely(cpu != vcpu->cpu)) {
        svm->asid_generation = 0;
        mark_all_dirty(svm->vmcb);
    }

susceptible to cpu offline/online?

@@ -971,22 +992,22 @@ static void vmx_vcpu_load(struct kvm_vcp

  	if (!vmm_exclusive)
  		kvm_cpu_vmxon(phys_addr);
-	else if (vcpu->cpu != cpu)
-		vcpu_clear(vmx);
+	else if (vmx->loaded_vmcs->cpu != cpu)
+		loaded_vmcs_clear(vmx->loaded_vmcs);

-	if (per_cpu(current_vmcs, cpu) != vmx->vmcs) {
-		per_cpu(current_vmcs, cpu) = vmx->vmcs;
-		vmcs_load(vmx->vmcs);
+	if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
+		per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
+		vmcs_load(vmx->loaded_vmcs->vmcs);
  	}

-	if (vcpu->cpu != cpu) {
+	if (vmx->loaded_vmcs->cpu != cpu) {
  		struct desc_ptr *gdt =&__get_cpu_var(host_gdt);
  		unsigned long sysenter_esp;

  		kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
  		local_irq_disable();
-		list_add(&vmx->local_vcpus_link,
-			&per_cpu(vcpus_on_cpu, cpu));
+		list_add(&vmx->loaded_vmcs->loaded_vmcss_on_cpu_link,
+			&per_cpu(loaded_vmcss_on_cpu, cpu));
  		local_irq_enable();

  		/*
@@ -999,13 +1020,15 @@ static void vmx_vcpu_load(struct kvm_vcp
  		rdmsrl(MSR_IA32_SYSENTER_ESP, sysenter_esp);
  		vmcs_writel(HOST_IA32_SYSENTER_ESP, sysenter_esp); /* 22.2.3 */
  	}
+	vmx->loaded_vmcs->cpu = cpu;

This should be within the if () block.

@@ -4344,11 +4369,13 @@ static struct kvm_vcpu *vmx_create_vcpu(
  		goto uninit_vcpu;
  	}

-	vmx->vmcs = alloc_vmcs();
-	if (!vmx->vmcs)
+	vmx->loaded_vmcs =&vmx->vmcs01;
+	vmx->loaded_vmcs->vmcs = alloc_vmcs();
+	if (!vmx->loaded_vmcs->vmcs)
  		goto free_msrs;
-
-	vmcs_init(vmx->vmcs);
+	vmcs_init(vmx->loaded_vmcs->vmcs);
+	vmx->loaded_vmcs->cpu = -1;
+	vmx->loaded_vmcs->launched = 0;

Perhaps a loaded_vmcs_init() to encapsulate initialization of these three fields, you'll probably reuse it later.

Please repost separately after the fix, I'd like to apply it before the rest of the series.

(regarding interrupts, I think we can do that work post-merge. But I'd like to see Kevin's comments addressed)

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux