On Thu, Feb 21, 2019 at 05:20:32PM +0100, Greg Kroah-Hartman wrote:
> On Thu, Feb 21, 2019 at 03:47:01PM +0100, Joerg Roedel wrote:
> > On Thu, Feb 21, 2019 at 03:15:30PM +0100, Greg Kroah-Hartman wrote:
> > > Ugh, good catch!
> > >
> > > Any hint as to what type of testing that you did that caught this? I
> > > keep asking people to run some kvm tests, but so far no one is :(
> >
> > We caught this at SUSE while testing candidate kernel updates for one of
> > our service packs using a 4.4-based kernel and debugging turned
> > out that this issue came in via stable-updates. We also build a
> > vanilla-flavour of the kernel which is nearly identical to the upstream
> > stable tree, but what usually ends up in testing is the full tree with
> > other backports.
> >
> > This particular issue was found by updating some openstack machines with
> > the candidate kernel, which then triggered the problem in some guests.
> > It is also a very special one, since I was only able to trigger the
> > problem on Westmere-based machines with a specific guest-config.
>
> Nice work. Any chance that "test" could be added to the kvm testing
> scripts that I think are being worked on somewhere? Ideally we would
> have caught this before it ever hit the stable tree. Due to the lack of
> good KVM testing, that's one of the areas I am always most worried about

This bug exists only in the 4.4.y backport; upstream, 4.9.y and 4.14.y all
had the correct code from the get-go.

And there is already a KVM unit test that *should* hit this, albeit
somewhat indirectly. I'll verify that the tests that touch the TPR actually
run with x2APIC enabled.

Assuming the KVM unit test actually works, it's not a stretch for the bug
to escape, e.g. if the tests weren't run on 4.4.y at all, or were only run
on hardware with x2APIC.