Re: [PATCH v3 3/4] kvm: vmx: Add last_cpu to struct vcpu_vmx

Jim Mattson <jmattson@xxxxxxxxxx> · Thu, 4 Jun 2020 13:54:08 -0700

On Thu, Jun 4, 2020 at 12:26 PM Sean Christopherson
<sean.j.christopherson@xxxxxxxxx> wrote:
>
> On Thu, Jun 04, 2020 at 12:00:33PM -0700, Jim Mattson wrote:
> > On Thu, Jun 4, 2020 at 11:47 AM Sean Christopherson
> > <sean.j.christopherson@xxxxxxxxx> wrote:
> > >
> > > On Wed, Jun 03, 2020 at 01:18:31PM -0700, Jim Mattson wrote:
> > > > On Tue, Jun 2, 2020 at 7:24 PM Sean Christopherson
> > > > <sean.j.christopherson@xxxxxxxxx> wrote:
> > > > > As an alternative to storing the last run/attempted CPU, what about moving
> > > > > the "bad VM-Exit" detection into handle_exit_irqoff, or maybe a new hook
> > > > > that is called after IRQs are enabled but before preemption is enabled, e.g.
> > > > > detect_bad_exit or something?  All of the paths in patch 4/4 can easily be
> > > > > moved out of handle_exit.  VMX would require a little bit of refactoring for
> > > > > its "no handler" check, but that should be minor.
> > > >
> > > > Given the alternatives, I'm willing to compromise my principles wrt
> > > > emulation_required. :-) I'll send out v4 soon.
> > >
> > > What do you dislike about the alternative approach?
> >
> > Mainly, I wanted to stash this in a common location so that I could
> > print it out in our local version of dump_vmcs(). Ideally, we'd like
> > to be able to identify the bad part(s) just from the kernel logs.
>
> But this would also move dump_vmcs() to before preemption is enabled, i.e.
> your version could read the CPU directly.

If it backports easily. The bigger the change, the less likely that is.

> And actually, if we're talking about ferreting out hardware issues, you
> really do want this happening before preemption is enabled so that the VMCS
> dump comes from the failing CPU.  If the vCPU is migrated, the VMCS will be
> dumped after a VMCLEAR->VMPTRLD, i.e. will be written to memory and pulled
> back into the VMCS cache on a different CPU, and will also have been written
> to by the new CPU to update host state.  Odds are that wouldn't affect the
> dump in a meaningful way, but never say never.

True.

> Tangentially related, what about adding an option to do VMCLEAR at the end
> of dump_vmcs(), followed by a dump of raw memory?  It'd be useless for
> debugging software issues, but might be potentially useful/interesting for
> triaging hardware problems.

Our dump_vmcs() dumps all vmreadable fields, which should be pretty
close to what we can get from a raw memory dump. We do have additional
instrumentation to aid in determining the layout of the VMCS in
memory, but it is too stupid to figure out how access rights are
stored. Maybe it could be beefed up a little, and we could at least
verify that VMCLEAR dumps the same thing to physical memory that we
get from the individual VMREADs.
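
A rough userspace sketch of that cross-check, not taken from any existing
patch. The layout table, struct vmcs_layout_entry, raw_field() and
compare_vmcs_dump() are all hypothetical names; the byte offsets are
assumed to come from the layout instrumentation, since the in-memory VMCS
format is implementation-specific. It only handles fields stored verbatim,
so anything with a packed in-memory form (access rights, for instance)
would need a decode step that is not shown here.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical layout entry: where one field lives in the raw VMCS region. */
struct vmcs_layout_entry {
	uint32_t encoding;	/* field encoding, as passed to VMREAD */
	uint32_t offset;	/* byte offset in the raw dump (assumed known) */
	uint32_t width;		/* field width in bytes: 2, 4 or 8 */
	uint64_t vmread_val;	/* value captured with VMREAD before VMCLEAR */
};

/* Fetch a field from the raw dump; assumes a little-endian host (x86). */
static uint64_t raw_field(const uint8_t *raw, uint32_t offset, uint32_t width)
{
	uint64_t v = 0;

	assert(width <= sizeof(v));
	memcpy(&v, raw + offset, width);
	return v;
}

/* Return the number of fields whose raw-memory value differs from VMREAD. */
static int compare_vmcs_dump(const uint8_t *raw, size_t raw_len,
			     const struct vmcs_layout_entry *tbl, size_t n)
{
	int mismatches = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t mem;

		/* Reject entries that would read past the end of the dump. */
		if (tbl[i].width > 8 ||
		    (size_t)tbl[i].offset + tbl[i].width > raw_len) {
			fprintf(stderr, "field 0x%04" PRIx32 ": bad layout entry\n",
				tbl[i].encoding);
			mismatches++;
			continue;
		}

		mem = raw_field(raw, tbl[i].offset, tbl[i].width);
		if (mem != tbl[i].vmread_val) {
			fprintf(stderr,
				"field 0x%04" PRIx32 ": VMREAD 0x%" PRIx64
				" != memory 0x%" PRIx64 "\n",
				tbl[i].encoding, tbl[i].vmread_val, mem);
			mismatches++;
		}
	}

	return mismatches;
}
```

In the kernel this would run after the VMCLEAR, against the page backing
the VMCS, with the vmread_val column filled in beforehand; a non-zero
return would flag a disagreement between the two views.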

> > That, and I wouldn't have been as comfortable with the refactoring
> > without a lot more testing.