On Tue, Jan 22, 2019 at 02:51:11PM +0000, Marc Zyngier wrote:
> On Tue, 22 Jan 2019 13:56:34 +0000,
> Dave Martin <Dave.Martin@xxxxxxx> wrote:
> >
> > On Tue, Jan 22, 2019 at 11:11:09AM +0000, Marc Zyngier wrote:
> > > On Tue, 22 Jan 2019 10:17:00 +0000,
> > > Dave Martin <Dave.Martin@xxxxxxx> wrote:
> > > >
> > > > On Mon, Jan 07, 2019 at 12:05:35PM +0000, Andre Przywara wrote:
> > > > > Workarounds for Spectre variant 2 or 4 vulnerabilities require some help
> > > > > from the firmware, so KVM implements an interface to provide that for
> > > > > guests. When such a guest is migrated, we want to make sure we don't
> > > > > lose the protection the guest relies on.
> > > > >
> > > > > This introduces two new firmware registers in KVM's GET/SET_ONE_REG
> > > > > interface, so userland can save the level of protection implemented by
> > > > > the hypervisor and used by the guest. Upon restoring these registers,
> > > > > we make sure we don't downgrade and reject any values that would mean
> > > > > weaker protection.
> > > >
> > > > Just trolling here, but could we treat these as immutable, like the ID
> > > > registers?
> > > >
> > > > We don't support migration between nodes that are "too different" in any
> > > > case, so I wonder if adding complex logic to compare vulnerabilities and
> > > > workarounds is liable to create more problems than it solves...
> > >
> > > And that's exactly the case we're trying to avoid. Two instances of
> > > the same HW. One with firmware mitigations, one without. Migrating in
> > > one direction is perfectly safe, migrating in the other isn't.
> > >
> > > It is not about migrating to different HW at all.
> >
> > So this is a realistic scenario when deploying a firmware update across
> > a cluster that has homogeneous hardware -- there will temporarily be
> > different firmware versions running on different nodes?
>
> Case in point: I have on my desk two AMD Seattle systems. One with an
> ancient firmware that doesn't mitigate anything, and one that has all
> the mitigations applied (and correctly advertised). I can migrate
> stuff back and forth, and that's really bad.

Agreed.

> What people do in their data centre is none of my business,
> really. What concerns me is that there is a potential for something
> bad to happen without people noticing. And it is KVM's job to do the
> right thing in this case.

Fair enough.

> > My concern is really "will the checking be too buggy / untested in
> > practice to be justified by the use case".
>
> Not doing anything is not going to make the current situation "less
> buggy". We have all the stuff we need to test this. We can even
> artificially create the various scenarios on a model.

Agreed. My concern is about how this will scale if future
vulnerabilities are added to the mix. We might ultimately end up in a
worse mess, but I may be being paranoid.

> > I'll take a closer look at the checking logic.

See the other thread. I have an idea there for exposing the
information in a different way that may simplify things (or be totally
misguided...)

Cheers
---Dave
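
For readers following the thread, the "don't downgrade" rule from the
quoted patch description might be sketched roughly as below. This is
only an illustration under stated assumptions, not the code from the
series: the enum, its three levels and their ordering, and the name
check_restore() are all hypothetical, and the real register encoding
may differ.

```c
#include <errno.h>

/*
 * Hypothetical protection levels for a single workaround, ordered so
 * that a larger value means stronger protection. The names and the
 * ordering are assumptions made for this sketch only.
 */
enum wa_level {
	WA_NOT_AVAIL  = 0,	/* firmware offers no mitigation */
	WA_AVAIL      = 1,	/* mitigation is available via firmware */
	WA_UNAFFECTED = 2,	/* hardware does not need the mitigation */
};

/*
 * Decide whether a level saved on the source host may be restored on
 * the destination host.
 *
 * Returns 0 if the destination is at least as protected as the source,
 * or -EINVAL if restoring would silently weaken the guest's protection.
 */
static int check_restore(enum wa_level saved, enum wa_level host)
{
	if (saved > host)
		return -EINVAL;	/* would be a downgrade: reject */

	return 0;		/* same or stronger protection: accept */
}
```

With the two Seattle systems mentioned above, moving a guest from the
unmitigated box to the mitigated one would pass this check, while the
reverse direction would be rejected.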