* Alex Williamson (alex.williamson@xxxxxxxxxx) wrote:
> On Thu, 7 Mar 2019 23:20:36 +0000
> "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
>
> > > From: Alex Williamson [mailto:alex.williamson@xxxxxxxxxx]
> > > Sent: Friday, March 8, 2019 1:44 AM
> > >
> > > > > > This kind of data needs to be saved / loaded in the pre-copy
> > > > > > and stop-and-copy phases.
> > > > > > The data of device memory is held in the device memory region.
> > > > > > The size of device memory is usually larger than that of the
> > > > > > device memory region, so qemu needs to save/load it in chunks
> > > > > > of the size of the device memory region.
> > > > > > Not all devices have device memory. IGD, for example, only
> > > > > > uses system memory.
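[Aside: to make that chunking concrete, I'd expect the save path to
boil down to a loop like the sketch below. Everything in it is
invented for illustration, and it assumes the vendor driver advances
an internal cursor on each read of the region, which is only one
possible contract, not something taken from the actual patches.]

  #include <errno.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/types.h>
  #include <unistd.h>

  /* Sketch only: stream a large device memory blob out through a
   * smaller, fixed-size migration region. */
  static int save_device_memory(int device_fd, off_t region_offset,
                                size_t region_size, uint64_t total_size,
                                FILE *stream)
  {
      void *buf = malloc(region_size);
      uint64_t saved = 0;

      if (!buf)
          return -ENOMEM;

      while (saved < total_size) {
          size_t chunk = (total_size - saved < region_size) ?
                         (size_t)(total_size - saved) : region_size;

          /* Each read is assumed to return the next chunk of device
           * memory through the region window. */
          if (pread(device_fd, buf, chunk, region_offset) != (ssize_t)chunk ||
              fwrite(buf, 1, chunk, stream) != chunk) {
              free(buf);
              return -EIO;
          }
          saved += chunk;
      }

      free(buf);
      return 0;
  }

[The load side would presumably be the mirror image, pwrite()ing each
chunk back into the region.]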
> > > > > It seems a little gratuitous to me that this is a separate
> > > > > region or that this data is handled separately. All of this
> > > > > data is opaque to QEMU, so why do we need to separate it?
> > > >
> > > > hi Alex,
> > > > As the device state interfaces are provided by the kernel, they
> > > > are expected to meet needs as general as possible. So, do you
> > > > think there are use cases where user space knows the device well
> > > > and wants the kernel to return specific desired data back to it?
> > > > E.g. it just wants to get the whole device config data, including
> > > > all mmios, page tables, pci config data...
> > > > Or, it just wants to get a current device memory snapshot, not
> > > > including any dirty data.
> > > > Or, it just needs the dirty pages in device memory or system
> > > > memory.
> > > > With this kind of accurate query, quite a lot of useful features
> > > > can be developed in user space.
> > > >
> > > > If all of this data is opaque to the user app, it seems the only
> > > > use case is live migration.
> > >
> > > I can certainly appreciate a more versatile interface, but I think
> > > we're also trying to create the simplest interface we can, with
> > > the primary target being live migration. As soon as we start
> > > defining this type of device memory and that type of device
> > > memory, we're going to have another device come along that needs
> > > yet another because it has a slightly different requirement. Even
> > > without that, we're going to have vendor drivers implement it
> > > differently, so what works as a more targeted approach for one
> > > device may not work for all devices. Can you enumerate some
> > > specific examples of the use cases you imagine your design to
> > > enable?
> >
> > Do we want to consider a use case where user space would like to
> > selectively introspect a portion of the device state (including
> > implicit state which is not available through PCI regions), and may
> > ask for the capability to directly map selected portions (e.g.
> > device memory) for scanning, instead of always turning on dirty
> > logging for all device state?
>
> I don't see that a migration interface necessarily lends itself to
> this use case. A migration data stream has no requirement to be user
> consumable as anything other than opaque data, and there's also no
> requirement that it expose state in a form that directly represents
> the internal state of the device. In fact I'm not sure we want to
> encourage introspection via this data stream. If a user knows how to
> interpret the data, what prevents them from modifying the data
> in-flight? I've raised the question previously regarding how the
> vendor driver can validate the integrity of the migration data
> stream. Using the migration interface to introspect the device
> certainly suggests an interface ripe for exploiting any potential
> weakness in the vendor driver reassembling that migration stream. If
> the user has an mmap to the actual live working state of the vendor
> driver, protection in the hardware seems like the only way you could
> protect against a malicious user. Please be defensive in what is
> directly exposed to the user and what safeguards are in place within
> the vendor driver for validating incoming data. Thanks,
>
> Alex

Hmm; that sounds like a security-by-obscurity answer!

The scripts/analyze-migration.py script will actually dump the
migration stream data in an almost readable format. So if you properly
define the VMState definitions it should be almost readable; it's
occasionally been useful.

I agree that you should be very, very careful to validate the incoming
migration stream against:
  a) Corruption
  b) Wrong driver versions
  c) Malicious intent
     c.1) Especially by the guest
     c.2) Or by someone trying to feed you a duff stream
  d) Someone trying to load the VFIO stream into completely the wrong
     device.
(A sketch of the kind of header check I mean is at the bottom of this
mail.)

Whether the migration interface is the right thing to use for that
inspection, hmm; well, it might be: if you're trying to debug your
device and need a dump of its state, then why not? (I guess you end up
with something not dissimilar to what things like intel_reg_snapshot
in intel-gpu-tools does.)

Dave

--
Dr. David Alan Gilbert / dgilbert@xxxxxxxxxx / Manchester, UK
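[The header check referenced above, roughly. The layout, magic value
and all names here are invented for the sketch; nothing comes from the
proposed VFIO interface, and verifying a checksum over the payload
body is left out.]

  #include <stdint.h>

  #define MIG_MAGIC    0x5646494fU   /* arbitrary made-up tag */
  #define MIG_VERSION  1

  struct mig_header {
      uint32_t magic;        /* catches garbage streams (a)           */
      uint32_t version;      /* catches wrong driver versions (b)     */
      uint32_t vendor_id;    /* catches streams meant for a different */
      uint32_t device_id;    /*   device entirely (d)                 */
      uint64_t payload_len;  /* bounds every later read (a, c)        */
  };

  static int check_header(const struct mig_header *h,
                          uint32_t my_vendor, uint32_t my_device,
                          uint64_t max_len)
  {
      if (h->magic != MIG_MAGIC || h->version != MIG_VERSION)
          return -1;         /* (a)/(b): not our format, or wrong rev */
      if (h->vendor_id != my_vendor || h->device_id != my_device)
          return -1;         /* (d): stream from the wrong device     */
      if (h->payload_len > max_len)
          return -1;         /* (c): never trust a length read from
                                the stream; clamp before allocating   */
      return 0;
  }

[Even with a well-formed header, each payload field still needs its
own bounds check before it touches hardware state, since the guest or
a man-in-the-middle can forge the header just as easily.]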