On Thu, Feb 8, 2018 at 2:47 PM, David Hildenbrand <david@xxxxxxxxxx> wrote:
>> Again, I'm somewhat struggling to understand this vs. live migration —
>> but it's entirely possible that I'm sorely lacking in my knowledge of
>> kernel and CPU internals.
>
> (savevm/loadvm is also called "migration to file")
>
> When we migrate to a file, it really is the same migration stream. You
> "dump" the VM state into a file, instead of sending it over to another
> (running) target.
>
> Once you load your VM state from that file, it is a completely fresh
> VM/KVM environment. So you have to restore all the state. Now, as nVMX
> state is not contained in the migration stream, you cannot restore that
> state. The L1 state is therefore "damaged" or incomplete.

*lightbulb* Thanks a lot, that's a perfectly logical explanation. :)

>> Now, here's a bit more information on my continued testing. As I
>> mentioned on IRC, one of the things that struck me as odd was that if
>> I ran into the issue previously described, the L1 guest would enter a
>> reboot loop if configured with kernel.panic_on_oops=1. In other words,
>> I would savevm the L1 guest (with a running L2), then loadvm it, and
>> then the L1 would stack-trace, reboot, and then keep doing that
>> indefinitely. I found that weird because on the second reboot, I would
>> expect the system to come up cleanly.
>
> I guess the L1 state (in the kernel) is broken so badly that even a
> reset cannot fix it.

... which would also explain why, in contrast, a virsh destroy/virsh
start cycle does fix things.

>> I've now changed my L2 guest's CPU configuration so that libvirt (in
>> L1) starts the L2 guest with the following settings:
>>
>> <cpu>
>>   <model fallback='forbid'>Haswell-noTSX</model>
>>   <vendor>Intel</vendor>
>>   <feature policy='disable' name='vme'/>
>>   <feature policy='disable' name='ss'/>
>>   <feature policy='disable' name='f16c'/>
>>   <feature policy='disable' name='rdrand'/>
>>   <feature policy='disable' name='hypervisor'/>
>>   <feature policy='disable' name='arat'/>
>>   <feature policy='disable' name='tsc_adjust'/>
>>   <feature policy='disable' name='xsaveopt'/>
>>   <feature policy='disable' name='abm'/>
>>   <feature policy='disable' name='aes'/>
>>   <feature policy='disable' name='invpcid'/>
>> </cpu>
>
> Maybe one of these features is the root cause of the "messed up" state
> in KVM. So disabling it also makes the L1 state "less broken".

Would you care to hazard a guess as to which of the above features is a
likely culprit?

>> Basically, I am disabling every single feature that my L1's "virsh
>> capabilities" reports. Now this does not make my L1 come up happily
>> from loadvm. But it does seem to initiate a clean reboot after loadvm,
>> and after that clean reboot it lives happily.
>>
>> If this is as good as it gets (for now), then I can totally live with
>> that. It certainly beats running the L2 guest with Qemu (without KVM
>> acceleration). But I would still love to understand the issue a little
>> bit better.
>
> I mean the real solution to the problem is of course restoring the L1
> state correctly (migrating nVMX state, which is what people are working
> on right now). So what you are seeing is a bad "side effect" of that.
>
> For now, nested=true should never be used along with savevm/loadvm/live
> migration.

Yes, I gathered as much. :)

Thanks again!

Cheers,
Florian
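
P.S., two quick command sketches for anyone who finds this thread in the
archives later. The domain name "l1" and the qcow2-backed internal
snapshot are just assumptions about my setup, so adjust as needed.

The savevm/loadvm cycle discussed above is roughly what libvirt ends up
issuing for a full system checkpoint of a running domain, i.e.:

  virsh snapshot-create-as l1 snap1   # savevm: checkpoint RAM + disk state
  virsh snapshot-revert l1 snap1      # loadvm: this is where L1 falls over

And to check whether a host has nested virt enabled in the first place
(kvm_amd instead of kvm_intel on AMD, accordingly), before trusting
savevm/loadvm or live migration of its guests:

  cat /sys/module/kvm_intel/parameters/nested   # Y or 1 = nested enabled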