On Tue, Aug 28, 2012 at 07:02:42PM -0300, Eduardo Habkost wrote:
> On Wed, Aug 29, 2012 at 12:35:28AM +0300, Michael S. Tsirkin wrote:
> > On Tue, Aug 28, 2012 at 04:13:38PM -0300, Eduardo Habkost wrote:
> > > On Tue, Aug 28, 2012 at 08:43:52PM +0300, Michael S. Tsirkin wrote:
> > > > In preparation for adding PV EOI support, disable PV EOI by default for
> > > > 1.1 and older machine types, to avoid CPUID changing during migration.
> > > >
> > > > PV EOI can still be enabled/disabled by specifying it explicitly.
> > > >     Enable for 1.1
> > > >         -M pc-1.1 -cpu kvm64,+kvm_pv_eoi
> > > >     Disable for 1.2
> > > >         -M pc-1.2 -cpu kvm64,-kvm_pv_eoi
> > > >
> > > >
> > > What about users that are already running "qemu-1.1 -M pc-1.1" on a host
> > > kernel that supports PV EOI already? They would get PV EOI disabled when
> > > migrating to a destination running "qemu-1.2 -M pc-1.1".
> > >
> > > (On the other hand, people running "qemu-1.1 -M pc-1.1" on a host kernel
> > > supporting PV EOI already have migration broken, so there's not much we
> > > can do for them)
> >
> > Exactly.
> >
> > Talked to Gleb, long term I think we should rework code to make
> > it forward-compatible wrt adding new MSRs:
> > - source gets list of MSRs to be migrated from KVM and simply sends them all
> > - send all MSRS in key/value format
> > - destination gets list of MSRs to be migrated from KVM and
> >   only restores the supported ones
>
> As far as I understand the migration code requirements/expectations, if
> the origin is sending some data, it is because it is part of the
> guest-visible machine state that must be kept while migrating. Because
> of that, the destination is not allowed to drop anything it doesn't know
> about.

We have a ton of code that reads in values then just ignores them,
for compat with old qemu. This will be exactly such a case:
we don't drop anything - protocol does not support this.
We read and simply do not tell kvm about it.
We also have tons of code that sends useless values again for compatibility.

> At the same time, if it's not part of guest-visible machine
> state, it doesn't have to be sent by the migration origin.
>

False, we often send internal device state which is not directly guest
visible.

> On the other hand, a mode of operation that doesn't require updating
> QEMU every time there's a new bit of guest-visible state to be migrated
> would be nice (just like the "-cpu host" mode, that doesn't require
> updating QEMU for every new CPU feature, is nice for some use cases). I
> just don't know how to make work with the current migration protocol.
>

I don't understand. What is the problem with the proposal?
What will not work with our protocol? Can you give an example please?

> > Too late for 1.2?
>
> Absolutely (in my opinion).
>
> >
> > > While we don't make the KVM feature-bit handling sane (with defaults
> > > that are not blindly derived from the host kernel capabilities), maybe
> > > the safest bet is to expect users to not migrate between hosts running
> > > kernels with different KVM capabilities? (I am not sure which option is
> > > better)
> >
> > Sorry not sure what you talk about here. What has KVM feature-bit
> > handling to do with this patchset?
>
> Everything? The whole point of this patch is to filter out the PV_EOI
> KVM feature bit.
>
>
> This part of the current code, specifically, is wrong:
>
> > > > >     plus_kvm_features = ~0; /* not supported bits will be filtered out later */
>
> The QEMU-side list of KVM features should be whitelist-based, not
> blacklist-based (unless the user doesn't need migration, in that case he
> can use "-cpu host" and get every feature blindly enabled), because QEMU
> can't know if a new feature involves guest-visible state that has to be
> migrated.
>
> >
> > > >
> > > > Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
> > > > ---
> > > >  hw/Makefile.objs  |  2 +-
> > > >  hw/cpu_flags.c    | 32 ++++++++++++++++++++++++++++++++
> > > >  hw/cpu_flags.h    |  9 +++++++++
> > > >  hw/pc_piix.c      |  2 ++
> > > >  target-i386/cpu.c |  8 ++++++++
> > > >  5 files changed, 52 insertions(+), 1 deletion(-)
> > > >  create mode 100644 hw/cpu_flags.c
> > > >  create mode 100644 hw/cpu_flags.h
> > > >
> > > > diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> > > > index 850b87b..3f2532a 100644
> > > > --- a/hw/Makefile.objs
> > > > +++ b/hw/Makefile.objs
> > > > @@ -1,5 +1,5 @@
> > > >  hw-obj-y = usb/ ide/
> > > > -hw-obj-y += loader.o
> > > > +hw-obj-y += loader.o cpu_flags.o
> > > >  hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
> > > >  hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
> > > >  hw-obj-y += fw_cfg.o
> > > > diff --git a/hw/cpu_flags.c b/hw/cpu_flags.c
> > > > new file mode 100644
> > > > index 0000000..7a633c0
> > > > --- /dev/null
> > > > +++ b/hw/cpu_flags.c
> > > > @@ -0,0 +1,32 @@
> > > > +/*
> > > > + * CPU compatibility flags.
> > > > + *
> > > > + * Copyright (c) 2012 Red Hat Inc.
> > > > + * Author: Michael S. Tsirkin.
> > > > + *
> > > > + * This program is free software; you can redistribute it and/or modify
> > > > + * it under the terms of the GNU General Public License as published by
> > > > + * the Free Software Foundation; either version 2 of the License, or
> > > > + * (at your option) any later version.
> > > > + *
> > > > + * This program is distributed in the hope that it will be useful,
> > > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > > + * GNU General Public License for more details.
> > > > + *
> > > > + * You should have received a copy of the GNU General Public License along
> > > > + * with this program; if not, see <http://www.gnu.org/licenses/>.
> > > > + */
> > > > +#include "hw/cpu_flags.h"
> > > > +
> > > > +static bool kvm_pv_eoi_disabled_state;
> > > > +
> > > > +void disable_kvm_pv_eoi(void)
> > > > +{
> > > > +    kvm_pv_eoi_disabled_state = true;
> > > > +}
> > > > +
> > > > +bool kvm_pv_eoi_disabled(void)
> > > > +{
> > > > +    return kvm_pv_eoi_disabled_state;
> > > > +}
> > > > diff --git a/hw/cpu_flags.h b/hw/cpu_flags.h
> > > > new file mode 100644
> > > > index 0000000..05777b6
> > > > --- /dev/null
> > > > +++ b/hw/cpu_flags.h
> > > > @@ -0,0 +1,9 @@
> > > > +#ifndef HW_CPU_FLAGS_H
> > > > +#define HW_CPU_FLAGS_H
> > > > +
> > > > +#include <stdbool.h>
> > > > +
> > > > +void disable_kvm_pv_eoi(void);
> > > > +bool kvm_pv_eoi_disabled(void);
> > > > +
> > > > +#endif
> > > > diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> > > > index 008d42f..bdbceda 100644
> > > > --- a/hw/pc_piix.c
> > > > +++ b/hw/pc_piix.c
> > > > @@ -46,6 +46,7 @@
> > > >  #ifdef CONFIG_XEN
> > > >  #  include <xen/hvm/hvm_info_table.h>
> > > >  #endif
> > > > +#include "cpu_flags.h"
> > > >
> > > >  #define MAX_IDE_BUS 2
> > > >
> > > > @@ -371,6 +372,7 @@ static QEMUMachine pc_machine_v1_2 = {
> > > >
> > > >  static void pc_machine_v1_1_compat(void)
> > > >  {
> > > > +    disable_kvm_pv_eoi();
> > > >  }
> > > >
> > > >  static void pc_init_pci_v1_1(ram_addr_t ram_size,
> > > > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > > > index 120a2e3..0d02fd1 100644
> > > > --- a/target-i386/cpu.c
> > > > +++ b/target-i386/cpu.c
> > > > @@ -23,6 +23,7 @@
> > > >
> > > >  #include "cpu.h"
> > > >  #include "kvm.h"
> > > > +#include "asm/kvm_para.h"
> > > >
> > > >  #include "qemu-option.h"
> > > >  #include "qemu-config.h"
> > > > @@ -33,6 +34,7 @@
> > > >  #include "hyperv.h"
> > > >
> > > >  #include "hw/hw.h"
> > > > +#include "hw/cpu_flags.h"
> > > >
> > > >  /* feature flags taken from "Intel Processor Identification and the CPUID
> > > >   * Instruction" and AMD's "CPUID Specification".  In cases of disagreement
> > > > @@ -889,6 +891,12 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model)
> > > >
> > > >      plus_kvm_features = ~0; /* not supported bits will be filtered out later */
> > > >
> > > > +    /* Disable PV EOI for old machine types.
> > > > +     * Feature flags can still override. */
> > > > +    if (kvm_pv_eoi_disabled()) {
> > > > +        plus_kvm_features &= ~(0x1 << KVM_FEATURE_PV_EOI);
> > > > +    }
> > > > +
> > > >      add_flagname_to_bitmaps("hypervisor", &plus_features,
> > > >              &plus_ext_features, &plus_ext2_features, &plus_ext3_features,
> > > >              &plus_kvm_features, &plus_svm_features);
> > > > --
> > > > MST
> > > >
> > > >
> > >
> > > --
> > > Eduardo
>
> --
> Eduardo
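
For reference, a rough sketch of the destination side of the key/value MSR
idea discussed above: read every (index, value) pair the source sent, and
hand on only those whose index is in the destination's supported list. This
is not QEMU code. The names msr_entry, msr_supported and msr_filter_incoming
are hypothetical, and the supported list is assumed to have been obtained
from KVM elsewhere. It shows only the "read and ignore" mechanics; whether
ignoring an unknown MSR is safe for guest-visible state is exactly the open
question in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire format: one (index, value) pair per migrated MSR. */
struct msr_entry {
    uint32_t index;
    uint64_t value;
};

/* Return 1 if `index` appears in the destination's supported list, else 0. */
int msr_supported(const uint32_t *supported, size_t n_supported,
                  uint32_t index)
{
    for (size_t i = 0; i < n_supported; i++) {
        if (supported[i] == index) {
            return 1;
        }
    }
    return 0;
}

/*
 * Destination side: walk every incoming pair, copying into `out` only the
 * entries whose index is supported. Unsupported entries are still consumed
 * from the input, they are just not passed on.
 *
 * `out` must have room for n_incoming entries.
 * Returns the number of entries stored in `out`.
 */
size_t msr_filter_incoming(const struct msr_entry *incoming, size_t n_incoming,
                           const uint32_t *supported, size_t n_supported,
                           struct msr_entry *out)
{
    size_t kept = 0;

    assert(out != NULL);

    for (size_t i = 0; i < n_incoming; i++) {
        if (msr_supported(supported, n_supported, incoming[i].index)) {
            out[kept++] = incoming[i];
        }
    }
    return kept;
}
```

The linear lookup keeps the sketch short; with a long MSR list a sorted array
or bitmap would be the obvious replacement.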