RE: [PATCH 08/31] nVMX: Fix local_vcpus_link handling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> From: Nadav Har'El [mailto:nyh@xxxxxxxxxxxxxxxxxxx]
> Sent: Tuesday, May 24, 2011 3:57 PM
> 
> On Tue, May 24, 2011, Tian, Kevin wrote about "RE: [PATCH 08/31] nVMX: Fix
> local_vcpus_link handling":
> > > +static inline void loaded_vmcs_init(struct loaded_vmcs *loaded_vmcs)
> > > +{
> > > +	vmcs_clear(loaded_vmcs->vmcs);
> > > +	loaded_vmcs->cpu = -1;
> > > +	loaded_vmcs->launched = 0;
> > > +}
> > > +
> >
> > call it vmcs_init instead since you now remove original vmcs_init invocation,
> > which more reflect the necessity of adding VMCLEAR for a new vmcs?
> 
> The best name for this function, I think, would have been loaded_vmcs_clear,
> because this function isn't necessarily used to "init" - it's also called to
> VMCLEAR an old vmcs (and flush its content back to memory) - in that sense
> it is definitely not a "vmcs_init".
> 
> Unfortunately, I already have a whole chain of functions with this name :(
> the existing loaded_vmcs_clear() does an IPI to the CPU which has this VMCS
> loaded, and causes it to run __loaded_vmcs_clear(), which in turn calls
> the above loaded_vmcs_init(). I wish I could call all three functions
> loaded_vmcs_clear(), but of course I can't. If anyone reading this has a good
> suggestion on how to name these three functions, please let me know.

how about loaded_vmcs_reset?

> 
> > > +static void __loaded_vmcs_clear(void *arg)
> > >  {
> > > -	struct vcpu_vmx *vmx = arg;
> > > +	struct loaded_vmcs *loaded_vmcs = arg;
> > >  	int cpu = raw_smp_processor_id();
> > >
> > > -	if (vmx->vcpu.cpu == cpu)
> > > -		vmcs_clear(vmx->vmcs);
> > > -	if (per_cpu(current_vmcs, cpu) == vmx->vmcs)
> > > +	if (loaded_vmcs->cpu != cpu)
> > > +		return; /* cpu migration can race with cpu offline */
> >
> > what do you mean by "cpu migration" here? why does 'cpu offline'
> > matter here regarding to the cpu change?
> 
> __loaded_vmcs_clear() is typically called in one of two cases: "cpu migration"
> means that a guest that used to run on one CPU, and had its VMCS loaded
> there,
> suddenly needs to run on a different CPU, so we need to clear the VMCS on
> the old CPU. "cpu offline" means that we want to take a certain CPU offline,
> and before we do that we should VMCLEAR all VMCSs which were loaded on it.

So here you need explicitly differentiate a vcpu and a real cpu. For the 1st case
it's just 'vcpu migration", and the latter it's the real 'cpu offline'. 'cpu migration' 
is generally a RAS feature in mission critical world. :-) 

> 
> The (vmx->cpu.cpu != cpu) case in __loaded_vmcs_clear should ideally never
> happen: In the cpu offline path, we only call it for the loaded_vmcss which
> we know for sure are loaded on the current cpu. In the cpu migration path,
> loaded_vmcs_clear runs __loaded_vmcs_clear on the right CPU, which ensures
> that
> equality.
> 
> But, there can be a race condition (this was actually explained to me a while
> back by Avi - I never seen this happening in practice): Imagine that cpu
> migration calls loaded_vmcs_clear, which tells the old cpu (via IPI) to
> VMCLEAR this vmcs. But before that old CPU gets a chance to act on that IPI,
> a decision is made to take it offline, and all loaded_vmcs loaded on it
> (including the one in question) are cleared. When that CPU acts on this IPI,
> it notices that vmx->cpu.cpu==-1, i.e., != cpu, so it doesn't need to do
> anything (in the new version of the code, I made this more explicit, by
> returning immediately in this case).

the reverse also holds true. Right between the point where cpu_offline hits
a loaded_vmcs and the point where it calls __loaded_vmcs_clear, it's possible
that the vcpu is migrated to another cpu, and it's likely that migration path
(vmx_vcpu_load) has invoked loaded_vmcs_clear but hasn't delete this vmcs
from old cpu's linked list. This way later when __loaded_vmcs_clear is
invoked on the offlined cpu, there's still chance to observe cpu as -1.

> 
> At least this is the theory. As I said, I didn't see this problem in practice
> (unsurprising, since I never offlined any CPU). Maybe Avi or someone else can
> comment more about this (vmx->cpu.cpu == cpu) check, which existed before
> my patch - in __vcpu_clear().

I agree this check is necessary, but just want you to make the comment clear
with right term.

> 
> > > +static void free_loaded_vmcs(struct loaded_vmcs *loaded_vmcs)
> > > +{
> > > +	if (!loaded_vmcs->vmcs)
> > > +		return;
> > > +	loaded_vmcs_clear(loaded_vmcs);
> > > +	free_vmcs(loaded_vmcs->vmcs);
> > > +	loaded_vmcs->vmcs = NULL;
> > > +}
> >
> > prefer to not do cleanup work through loaded_vmcs since it's just a pointer
> > to a loaded vmcs structure. Though you can carefully arrange the nested
> > vmcs cleanup happening before it, it's not very clean and a bit error prone
> > simply from this function itself. It's clearer to directly cleanup vmcs01, and
> > if you want an assertion could be added to make sure loaded_vmcs doesn't
> > point to any stale vmcs02 structure after nested cleanup step.
> 
> I'm afraid I didn't understand what you meant here... Basically, this
> free_loaded_vmcs() is just a shortcut for loaded_vmcs_clear() and free_vmcs(),
> as doing both is needed in 3 places: nested_free_vmcs02,
> nested_free_all_saved_vmcss, vmx_free_vcpu. The same function is needed
> for both vmcs01 and vmcs02 VMCSs - in both cases when we don't need them
> any
> more we need to VMCLEAR them and then free the VMCS memory. Note that
> this
> function does *not* free the loaded_vmcs structure itself.
> 
> What's wrong with this?
> Would you prefer that I remove this function and explictly call
> loaded_vmcs_clear() and then free_vmcs() in all three places?
> 

Forgot about it. I originally thought this was only used to free vmcs01, and thus
wanted to make that purpose obvious.

Thanks
Kevin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux