>>> I've now changed my L2 guest's CPU configuration so that libvirt (in
>>> L1) starts the L2 guest with the following settings:
>>>
>>>   <cpu>
>>>     <model fallback='forbid'>Haswell-noTSX</model>
>>>     <vendor>Intel</vendor>
>>>     <feature policy='disable' name='vme'/>
>>>     <feature policy='disable' name='ss'/>
>>>     <feature policy='disable' name='f16c'/>
>>>     <feature policy='disable' name='rdrand'/>
>>>     <feature policy='disable' name='hypervisor'/>
>>>     <feature policy='disable' name='arat'/>
>>>     <feature policy='disable' name='tsc_adjust'/>
>>>     <feature policy='disable' name='xsaveopt'/>
>>>     <feature policy='disable' name='abm'/>
>>>     <feature policy='disable' name='aes'/>
>>>     <feature policy='disable' name='invpcid'/>
>>>   </cpu>
>>
>> Maybe one of these features is the root cause of the "messed up" state
>> in KVM. So disabling it also makes the L1 state "less broken".
>
> Would you try a guess as to which of the above features is a likely culprit?
>

Hmm, actually no idea, but you can bisect :)

(But watch out: it could also just be coincidence. Especially if you
migrate while none of L1's VCPUs is currently executing L2, the chances
of L1 surviving the migration might be better. L2 will still fail hard,
and L1 certainly will too, once it tries to run L2 again.)

-- 
Thanks,

David / dhildenb
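
[Editor's note: a minimal sketch of the bisection David suggests, in
Python. It assumes a single culprit feature; the survives() oracle here
is a hypothetical stand-in for a real harness that would rewrite the L2
<cpu> XML to disable only the given features, restart L2, live-migrate
L1, and report whether L1 stayed healthy.]

#!/usr/bin/env python3
# The feature names come from the <cpu> definition quoted above.
FEATURES = [
    "vme", "ss", "f16c", "rdrand", "hypervisor", "arat",
    "tsc_adjust", "xsaveopt", "abm", "aes", "invpcid",
]

def bisect_culprit(features, survives):
    # Preconditions: survives(features) is True (disabling everything
    # lets L1 survive migration) and survives([]) is False. Assumes a
    # single culprit; per David's caveat, repeat each trial several
    # times, since a "lucky" migration can mislead the search.
    while len(features) > 1:
        half = features[: len(features) // 2]
        if survives(half):
            features = half                        # culprit in first half
        else:
            features = features[len(features) // 2 :]  # culprit in rest
    return features[0]

if __name__ == "__main__":
    # Demo with a fake oracle that pretends 'tsc_adjust' is the culprit;
    # replace it with the real rebuild-XML-and-migrate test.
    fake = lambda disabled: "tsc_adjust" in disabled
    print("likely culprit:", bisect_culprit(FEATURES, fake))

Each round halves the list of disabled features, so the eleven features
above need at most four migration trials (more if each trial is repeated
to guard against coincidence).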