RE: [PATCH v12 3/4] i386: Enable TOPOEXT feature on AMD EPYC CPU

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> -----Original Message-----
> From: Eduardo Habkost [mailto:ehabkost@xxxxxxxxxx]
> Sent: Friday, June 8, 2018 2:23 PM
> To: Moger, Babu <Babu.Moger@xxxxxxx>
> Cc: mst@xxxxxxxxxx; marcel.apfelbaum@xxxxxxxxx; pbonzini@xxxxxxxxxx;
> rth@xxxxxxxxxxx; mtosatti@xxxxxxxxxx; qemu-devel@xxxxxxxxxx;
> kvm@xxxxxxxxxxxxxxx; kash@xxxxxxxxxxxxxx; geoff@xxxxxxxxxxxxxxx; Juan
> Quintela <quintela@xxxxxxxxxx>; xiaoguangrong@xxxxxxxxxxx
> Subject: Re: [PATCH v12 3/4] i386: Enable TOPOEXT feature on AMD EPYC
> CPU
> 
> On Fri, Jun 08, 2018 at 06:40:16PM +0000, Moger, Babu wrote:
> > Hi Eduardo,
> > Sorry for the late response. Got pulled into something else.
> >
> > > -----Original Message-----
> > > From: Eduardo Habkost [mailto:ehabkost@xxxxxxxxxx]
> > > Sent: Wednesday, June 6, 2018 5:40 PM
> > > To: Moger, Babu <Babu.Moger@xxxxxxx>
> > > Cc: mst@xxxxxxxxxx; marcel.apfelbaum@xxxxxxxxx;
> pbonzini@xxxxxxxxxx;
> > > rth@xxxxxxxxxxx; mtosatti@xxxxxxxxxx; qemu-devel@xxxxxxxxxx;
> > > kvm@xxxxxxxxxxxxxxx; kash@xxxxxxxxxxxxxx; geoff@xxxxxxxxxxxxxxx
> > > Subject: Re: [PATCH v12 3/4] i386: Enable TOPOEXT feature on AMD EPYC
> > > CPU
> > >
> > > On Wed, Jun 06, 2018 at 10:36:45AM -0400, Babu Moger wrote:
> > > > Enable TOPOEXT feature on EPYC CPU. This is required to support
> > > > hyperthreading on VM guests. Also extend xlevel to 0x8000001E.
> > > >
> > > > Disable TOPOEXT feature for legacy machines.
> > > >
> > > > Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
> > >
> > > Now, I just noticed we have a problem here:
> > >
> > > "-machine pc -cpu EPYC -smp 64" works today
> > >
> > > This patch makes it stop working, but it shouldn't.
> >
> > No. It works fine. I have tested it.
> 
> This doesn't sound right.  The code in this series will error out
> of TOPOEXT is enabled and you have more than 64 VCPUs.
> 
> But I just noticed we have a bug introduced by:

Oh.. Ok..  Let me retry again with the new patch.

> 
> commit f548222c24342ca74689de7794f9006b43f86a54
> Author: Xiao Guangrong <xiaoguangrong@xxxxxxxxxxx>
> Date:   Thu May 3 16:06:11 2018 +0800
> 
>     migration: introduce decompress-error-check
> 
>     QEMU 3.0 enables strict check for compression & decompression to
>     make the migration more robust, that depends on the source to fix
>     the internal design which triggers the unexpected error conditions
> 
>     To make it work for migrating old version QEMU to 2.13 QEMU, we
>     introduce this parameter to disable the error check on the
>     destination which is the default behavior of the machine type
>     which is older than 2.13, alternately, the strict check can be
>     enabled explicitly as followings:
>           -M pc-q35-2.11 -global migration.decompress-error-check=true
> 
>     Signed-off-by: Xiao Guangrong <xiaoguangrong@xxxxxxxxxxx>
>     Reviewed-by: Juan Quintela <quintela@xxxxxxxxxx>
>     Signed-off-by: Juan Quintela <quintela@xxxxxxxxxx>
> 
> This commits added PC_COMPAT_2_12 to the 3.0 machine-types.
> Because of this bug, TOPOEXT is being unconditionally disabled on
> all machine-types, unless I apply the fix below:
> 
> 
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index 3d81136065..b4c5b03274 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -430,7 +430,6 @@ static void
> pc_i440fx_3_0_machine_options(MachineClass *m)
>      pc_i440fx_machine_options(m);
>      m->alias = "pc";
>      m->is_default = 1;
> -    SET_MACHINE_COMPAT(m, PC_COMPAT_2_12);
>  }
> 
>  DEFINE_I440FX_MACHINE(v3_0, "pc-i440fx-3.0", NULL,
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index b60cbb9266..83d6d75efa 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -312,7 +312,6 @@ static void
> pc_q35_3_0_machine_options(MachineClass *m)
>  {
>      pc_q35_machine_options(m);
>      m->alias = "q35";
> -    SET_MACHINE_COMPAT(m, PC_COMPAT_2_12);
>  }
> 
>  DEFINE_Q35_MACHINE(v3_0, "pc-q35-3.0", NULL,
> 
> >
> > >
> > > On the other hand, I believe you expect:
> > > * "-machine pc -cpu EPYC -smp 8" to automatically enable topoext.
> > Yes. Only on new machines-types
> > > * "-machine pc -cpu Opteron_G1 -smp 8" to not enable topoext.
> > Yes.
> > > * What about "-machine -cpu Opteron_G1 -smp 8,threads=2"?
> > No. This should not enable topoext.  Topoext is not supported by
> Opteron_G1.
> > This should warn about hyperthreading and continue.
> 
> OK, makes sense to me.
> 
> > >
> > >
> > > We also have other requirements, I will try to enumerate all of
> > > them below:
> > >
> > > 0) "-topoext" explicitly configured (any machine-type):
> > > * Must never enable topoext.
> > Yes.
> > >
> > > 1) "+topoext" explicitly configured (any machine-type):
> > > * Must validate topology and refuse to start if unsupported.
> >
> > Yes.
> >
> > >
> > > 2) Older machine-types:
> > > * Must never enable topoext automatically, even if using "EPYC"
> > >   or "threads=2"
> > >
> > Yes.
> >
> > > 3) "EPYC" CPU model (on new machine-types):
> > > * Should enable topoext automatically, but only if topology is
> > >   supported.
> > > * Must not error out if topology is not supported.
> > In new machine types we will enable topoext for "EPYC" CPU model.
> > Right now(old machine type) we can disable for all the CPU models.
> > So, we don't need two bits(topoext and auto-topoext)
> 
> Right, so you agree that in this case we must _not_ error out if
> topology is unsupported, correct?  Otherwise we will break this
> existing use case:
>   "-machine pc -cpu EPYC -smp 64".

Ok. I will test this with new fix patch.  
> 
> >
> > I thought we should error out if topology cannot be supported. But we can
> warn(disable topoext) and continue that is another option.
> >
> > > * Should this enable topoext automatically even if threads=1?
> >
> > Yes. We should enable even with threads=1.
> >
> > >
> > > 4) Other AMD CPU models with "threads=2" (on new machine-types):
> > > * We might want to make this enable topoext automatically, too.
> > >   What do you think?
> >
> >   No. We should not enable topoext here. We should depend on CPU
> model table here.
> >
> > >
> > > Is the above description accurate?  Do you agree with these
> > > requirements?
> >
> > With these requirements in mind, I will send that patches. We can start our
> discussion.
> > We don't need one more bits. That is my opinion.
> 
> Thanks for confirming the requirements above.
> 
> But it doesn't seem to be possible represent these requirements
> with just one bit.  Otherwise you can't differentiate explicit
> "+topoext" (1 above) from topoext being implicitly enabled by
> "-cpu EPYC" (3 above).
> 
> Another problem is query-cpu-model-expansion QMP command: this
> patch makes "topoext" appear on the output of
> "query-cpu-model-expansion model=EPYC", meaning that management
> software will assume everybody using the "EPYC" CPU model will
> require +topoext.  A separate "auto-topoext" property would avoid
> this issue.

Sure.  Will work on it.  Yes, I know all these combinations make it very tricky. 
> 
> (Yeah, this is tricky.  I want to eventually encode these subtle
> rules in automated test cases, so these issues could be detected
> by software instead of code inspection.)
> 
> --
> Eduardo




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux