On Tue, 12 Mar 2019 02:48:39 +0000
"Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:

> > From: Alex Williamson [mailto:alex.williamson@xxxxxxxxxx]
> > Sent: Tuesday, March 12, 2019 4:19 AM
> >
> > On Mon, 11 Mar 2019 02:33:11 +0000
> > "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> > > [...]
> >
> > I think I've fully conceded any notion of security by obscurity towards
> > opaque data already, but segregating types of device data still seems
> > to unnecessarily impose a usage model on the vendor driver that I think
> > we should try to avoid.  The migration interface should define the
> > protocol through which userspace can save and restore the device state,
> > not impose how the vendor driver exposes or manages that state.  Also, I
> > got the impression (perhaps incorrectly) that you were trying to mmap
> > live data to userspace, which would allow not only saving the state,
> > but also unchecked state modification by userspace.  I think we want
> > more of a producer/consumer model of the state, where consuming state
> > also involves at least some degree of sanity or consistency checking.
> > Let's not forget too that we're obviously dealing with non-open source
> > drivers in the mdev universe as well.
>
> OK. I think for this part we are in agreement - as long as the goal of
> this interface is clearly defined that way. :-)
>
> [...]
>
> > > But I still didn't get your exact concern about the security part. For
> > > version, yes, we still haven't worked out a sane way to represent
> > > vendor-specific compatibility requirements. But allowing user
> > > space to modify data through this interface is really no different
> > > from allowing the guest to modify data through the trapped MMIO
> > > interface. The mdev driver should guarantee that operations through
> > > both interfaces can modify only the state associated with the said
> > > mdev instance, w/o breaking the isolation boundary. Then the former
> > > becomes just a batch of operations to be verified in the same way as
> > > if they were done individually through the latter interface.
> >
> > It seems like you're assuming a working model for the vendor driver and
> > the data entering and exiting through this interface.  The vendor
> > drivers can expose data any way that they want.  All we need to do is
> > imagine that the migration data stream includes an array index count
> > somewhere which the user could modify to trigger the in-kernel vendor
> > driver to allocate an absurd array size and DoS the target.  This is
> > probably the most simplistic attack; knowing the state machine of the
> > vendor driver, a malicious user could possibly trick it into providing
> > host kernel data.  We're not necessarily only conveying state that the
> > user already has access to via this interface; the vendor driver may
> > include non-visible internal state as well.  Even the state that is
> > user accessible is being pushed into the vendor driver via an alternate
> > path from the user mediation we have on the existing paths.
>
> Then I don't know how this concern can be effectively addressed,
> since you assume vendor drivers are not trusted here. And why do
> you trust vendor drivers to mediate the existing path but not this
> alternative one? Non-visible internal state just means more stuff
> to be carefully scrutinized; it doesn't fundamentally change the
> trust level.
>
> Or can this concern be partially mitigated if we create some
> test cases which poke random data through the new interface,
> and mark vendor drivers which pass such tests as trusted? Then
> there is also the open question of who should be in charge of
> such a certification process...

The vendor driver is necessarily trusted; it lives in the kernel, it
works in the kernel address space.  Unfortunately that's also the risk
with passing data from userspace into the vendor driver: the vendor
driver needs to take every precaution in sanitizing and validating
that data.  I wish we had a common way to perform that checking, but
it seems that each vendor driver is going to need to define its own
protocol and battle its own bugs and exploits in the code implementing
that protocol.  For open source drivers we can continue to rely on
review and openness; for closed drivers... the user has already
accepted the risk in trusting the driver themselves.  Perhaps all I
can do is raise the visibility that there are potential security
issues here and that vendor drivers need to own that risk.  A fuzzing
test would be great; we could at least validate whether a vendor
driver implements some sort of CRC check, but I don't think we can
build a certification process around that.  Thanks,

Alex
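
To make the array-count DoS and the sanitizing/validating discussed
above concrete, here is a minimal sketch of the kind of restore-path
checking a vendor driver would need before trusting a user-supplied
migration stream.  The layout, the MDEV_MIG_* names, and the entry cap
are all invented for illustration and come from no real driver; only
check_mul_overflow() and crc32() are actual kernel helpers.

/*
 * Hypothetical restore-path validation in a vendor driver.  The
 * structure and limits are illustrative; the point is the checks.
 */
#include <linux/types.h>
#include <linux/errno.h>
#include <linux/overflow.h>
#include <linux/crc32.h>

#define MDEV_MIG_MAGIC		0x4d444556	/* "MDEV", made up */
#define MDEV_MIG_VERSION	1
#define MDEV_MIG_MAX_ENTRIES	1024		/* device-specific hard cap */

struct mdev_mig_entry {
	u64 reg;
	u64 val;
};

struct mdev_mig_header {
	u32 magic;
	u32 version;
	u32 nr_entries;	/* untrusted: count of records that follow */
	u32 crc;	/* CRC32 of the payload after this header */
};

static int mdev_mig_validate(const struct mdev_mig_header *hdr,
			     const void *payload, size_t payload_len)
{
	size_t need;

	if (hdr->magic != MDEV_MIG_MAGIC || hdr->version != MDEV_MIG_VERSION)
		return -EINVAL;

	/* Reject absurd counts before any allocation is sized from them. */
	if (hdr->nr_entries > MDEV_MIG_MAX_ENTRIES)
		return -EINVAL;

	/* Overflow-safe check that the payload really holds nr_entries. */
	if (check_mul_overflow((size_t)hdr->nr_entries,
			       sizeof(struct mdev_mig_entry), &need) ||
	    need > payload_len)
		return -EINVAL;

	/* Catch corrupted (or fuzzed) streams before consuming them. */
	if (crc32(0, payload, payload_len) != hdr->crc)
		return -EINVAL;

	return 0;
}

Any later allocation sized from hdr->nr_entries, e.g.
kcalloc(hdr->nr_entries, sizeof(struct mdev_mig_entry), GFP_KERNEL),
is then bounded by the cap above, which is exactly the check whose
absence enables the absurd-allocation DoS described in the thread; the
CRC comparison is the sort of thing the fuzzing test mentioned above
could probe for.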