Re: [Qemu-devel] [PATCHv4 3/4] cpuid: disable pv eoi for 1.1 and older compat types

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 29, 2012 at 01:21:13AM +0300, Michael S. Tsirkin wrote:
> On Tue, Aug 28, 2012 at 07:02:42PM -0300, Eduardo Habkost wrote:
> > On Wed, Aug 29, 2012 at 12:35:28AM +0300, Michael S. Tsirkin wrote:
> > > On Tue, Aug 28, 2012 at 04:13:38PM -0300, Eduardo Habkost wrote:
> > > > On Tue, Aug 28, 2012 at 08:43:52PM +0300, Michael S. Tsirkin wrote:
> > > > > In preparation for adding PV EOI support, disable PV EOI by default for
> > > > > 1.1 and older machine types, to avoid CPUID changing during migration.
> > > > > 
> > > > > PV EOI can still be enabled/disabled by specifying it explicitly.
> > > > >     Enable for 1.1
> > > > >     -M pc-1.1 -cpu kvm64,+kvm_pv_eoi
> > > > >     Disable for 1.2
> > > > >     -M pc-1.2 -cpu kvm64,-kvm_pv_eoi
> > > > > 
> > > > 
> > > > What about users that are already running "qemu-1.1 -M pc-1.1" on a host
> > > > kernel that supports PV EOI already? They would get PV EOI disabled when
> > > > migrating to a destination running "qemu-1.2 -M pc-1.1".
> > > > 
> > > > (On the other hand, people running "qemu-1.1 -M pc-1.1" on a host kernel
> > > > supporting PV EOI already have migration broken, so there's not much we
> > > > can do for them)
> > > 
> > > Exactly.
> > > 
> > > Talked to Gleb, long term I think we should rework code to make
> > > it forward-compatible wrt adding new MSRs:
> > > - source gets list of MSRs to be migrated from KVM and simply sends them all
> > > - send all MSRS in key/value format
> > > - destination gets list of MSRs to be migrated from KVM and
> > >   only restores the supported ones
> > 
> > As far as I understand the migration code requirements/expectations, if
> > the origin is sending some data, it is because it is part of the
> > guest-visible machine state that must be kept while migrating. Because
> > of that, the destination is not allowed to drop anything it doesn't know
> > about.
> 
> 
> We have a ton of code that reads in values then just
> ignores them, for compat with old qemu.
> This will be exactly such a case:
> we don't drop anything - protocol does not
> support this. We read and simply do not tell kvm
> about it. We also have tons of code that sends
> useless values again for compatibility.
> 
> > At the same time, if it's not part of guest-visible machine
> > state, it doesn't have to be sent by the migration origin.
> > 
> 
> False, we often send internal device state which is not
> directly guest visible.

Besides, all MSRs in KVM's MSR list are actually guest visible, even if
guests normally do not invoke these MSRs - they can.

> > On the other hand, a mode of operation that doesn't require updating
> > QEMU every time there's a new bit of guest-visible state to be migrated
> > would be nice (just like the "-cpu host" mode, that doesn't require
> > updating QEMU for every new CPU feature, is nice for some use cases). I
> > just don't know how to make work with the current migration protocol.
> > 
> 
> I don't understand. What is the problem with the proposal?
> What will not work with our protocol?
> Can you give an example please?
> 
> 
> > > Too late for 1.2?
> > 
> > Absolutely (in my opinion).
> > 
> > > 
> > > > While we don't make the KVM feature-bit handling sane (with defaults
> > > > that are not blindly derived from the host kernel capabilities), maybe
> > > > the safest bet is to expect users to not migrate between hosts running
> > > > kernels with different KVM capabilities? (I am not sure which option is
> > > > better)
> > > 
> > > Sorry not sure what you talk about here. What has KVM feature-bit
> > > handling to do with this patchset?
> > 
> > Everything? The whole point of this patch is to filter out the PV_EOI
> > KVM feature bit.
> > 
> > 
> > This part of the current code, specifically, is wrong:
> > 
> > > > >      plus_kvm_features = ~0; /* not supported bits will be filtered out later */
> > 
> > The QEMU-side list of KVM features should be whitelist-based, not
> > blacklist-based (unless the user doesn't need migration, in that case he
> > can use "-cpu host" and get every feature blindly enabled), because QEMU
> > can't know if a new feature involves guest-visible state that has to be
> > migrated.
> > 
> > 
> > > 
> > > > 
> > > > > Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
> > > > > ---
> > > > >  hw/Makefile.objs  |  2 +-
> > > > >  hw/cpu_flags.c    | 32 ++++++++++++++++++++++++++++++++
> > > > >  hw/cpu_flags.h    |  9 +++++++++
> > > > >  hw/pc_piix.c      |  2 ++
> > > > >  target-i386/cpu.c |  8 ++++++++
> > > > >  5 files changed, 52 insertions(+), 1 deletion(-)
> > > > >  create mode 100644 hw/cpu_flags.c
> > > > >  create mode 100644 hw/cpu_flags.h
> > > > > 
> > > > > diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> > > > > index 850b87b..3f2532a 100644
> > > > > --- a/hw/Makefile.objs
> > > > > +++ b/hw/Makefile.objs
> > > > > @@ -1,5 +1,5 @@
> > > > >  hw-obj-y = usb/ ide/
> > > > > -hw-obj-y += loader.o
> > > > > +hw-obj-y += loader.o cpu_flags.o
> > > > >  hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
> > > > >  hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
> > > > >  hw-obj-y += fw_cfg.o
> > > > > diff --git a/hw/cpu_flags.c b/hw/cpu_flags.c
> > > > > new file mode 100644
> > > > > index 0000000..7a633c0
> > > > > --- /dev/null
> > > > > +++ b/hw/cpu_flags.c
> > > > > @@ -0,0 +1,32 @@
> > > > > +/*
> > > > > + * CPU compatibility flags.
> > > > > + *
> > > > > + * Copyright (c) 2012 Red Hat Inc.
> > > > > + * Author: Michael S. Tsirkin.
> > > > > + *
> > > > > + * This program is free software; you can redistribute it and/or modify
> > > > > + * it under the terms of the GNU General Public License as published by
> > > > > + * the Free Software Foundation; either version 2 of the License, or
> > > > > + * (at your option) any later version.
> > > > > + *
> > > > > + * This program is distributed in the hope that it will be useful,
> > > > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > > > + * GNU General Public License for more details.
> > > > > + *
> > > > > + * You should have received a copy of the GNU General Public License along
> > > > > + * with this program; if not, see <http://www.gnu.org/licenses/>.
> > > > > + */
> > > > > +#include "hw/cpu_flags.h"
> > > > > +
> > > > > +static bool kvm_pv_eoi_disabled_state;
> > > > > +
> > > > > +void disable_kvm_pv_eoi(void)
> > > > > +{
> > > > > +       kvm_pv_eoi_disabled_state = true;
> > > > > +}
> > > > > +
> > > > > +bool kvm_pv_eoi_disabled(void)
> > > > > +{
> > > > > +       return kvm_pv_eoi_disabled_state;
> > > > > +}
> > > > > diff --git a/hw/cpu_flags.h b/hw/cpu_flags.h
> > > > > new file mode 100644
> > > > > index 0000000..05777b6
> > > > > --- /dev/null
> > > > > +++ b/hw/cpu_flags.h
> > > > > @@ -0,0 +1,9 @@
> > > > > +#ifndef HW_CPU_FLAGS_H
> > > > > +#define HW_CPU_FLAGS_H
> > > > > +
> > > > > +#include <stdbool.h>
> > > > > +
> > > > > +void disable_kvm_pv_eoi(void);
> > > > > +bool kvm_pv_eoi_disabled(void);
> > > > > +
> > > > > +#endif
> > > > > diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> > > > > index 008d42f..bdbceda 100644
> > > > > --- a/hw/pc_piix.c
> > > > > +++ b/hw/pc_piix.c
> > > > > @@ -46,6 +46,7 @@
> > > > >  #ifdef CONFIG_XEN
> > > > >  #  include <xen/hvm/hvm_info_table.h>
> > > > >  #endif
> > > > > +#include "cpu_flags.h"
> > > > >  
> > > > >  #define MAX_IDE_BUS 2
> > > > >  
> > > > > @@ -371,6 +372,7 @@ static QEMUMachine pc_machine_v1_2 = {
> > > > >  
> > > > >  static void pc_machine_v1_1_compat(void)
> > > > >  {
> > > > > +    disable_kvm_pv_eoi();
> > > > >  }
> > > > >  
> > > > >  static void pc_init_pci_v1_1(ram_addr_t ram_size,
> > > > > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > > > > index 120a2e3..0d02fd1 100644
> > > > > --- a/target-i386/cpu.c
> > > > > +++ b/target-i386/cpu.c
> > > > > @@ -23,6 +23,7 @@
> > > > >  
> > > > >  #include "cpu.h"
> > > > >  #include "kvm.h"
> > > > > +#include "asm/kvm_para.h"
> > > > >  
> > > > >  #include "qemu-option.h"
> > > > >  #include "qemu-config.h"
> > > > > @@ -33,6 +34,7 @@
> > > > >  #include "hyperv.h"
> > > > >  
> > > > >  #include "hw/hw.h"
> > > > > +#include "hw/cpu_flags.h"
> > > > >  
> > > > >  /* feature flags taken from "Intel Processor Identification and the CPUID
> > > > >   * Instruction" and AMD's "CPUID Specification".  In cases of disagreement
> > > > > @@ -889,6 +891,12 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model)
> > > > >  
> > > > >      plus_kvm_features = ~0; /* not supported bits will be filtered out later */
> > > > >  
> > > > > +    /* Disable PV EOI for old machine types.
> > > > > +     * Feature flags can still override. */
> > > > > +    if (kvm_pv_eoi_disabled()) {
> > > > > +        plus_kvm_features &= ~(0x1 << KVM_FEATURE_PV_EOI);
> > > > > +    }
> > > > > +
> > > > >      add_flagname_to_bitmaps("hypervisor", &plus_features,
> > > > >          &plus_ext_features, &plus_ext2_features, &plus_ext3_features,
> > > > >          &plus_kvm_features, &plus_svm_features);
> > > > > -- 
> > > > > MST
> > > > > 
> > > > > 
> > > > 
> > > > -- 
> > > > Eduardo
> > 
> > -- 
> > Eduardo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux