> From: Tian, Kevin
> Sent: Wednesday, January 5, 2022 9:59 AM
>
> > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > Sent: Wednesday, January 5, 2022 12:10 AM
> >
> > On Tue, Jan 04, 2022 at 03:49:07AM +0000, Tian, Kevin wrote:
> >
> > > btw can you elaborate the DOS concern? The device is assigned
> > > to an user application, which has one thread (migration thread)
> > > blocked on another thread (vcpu thread) when transiting the
> > > device to NDMA state. What service outside of this application
> > > is denied here?
> >
> > The problem is the VM controls when the vPRI is responded and
> > migration cannot proceed until this is done.
> >
> > So the basic DOS is for a hostile VM to trigger a vPRI and then never
> > answer it. Even trivially done from userspace with a vSVA and
> > userfaultfd, for instance.
> >
> > This will block the hypervisor from ever migrating the VM in a very
> > poor way - it will just hang in the middle of a migration request.
> it's poor but 'hang' won't happen. PCI spec defines completion timeout
> for ATS translation request. If timeout the device will abort the in-fly
> request and report error back to software.
>
> >
> > Regardless of the complaints of the IP designers, this is a very poor
> > direction.
> >
> > Progress in the hypervisor should never be contingent on a guest VM.
> >
>
> Whether the said DOS is a real concern and how severe it is are usage
> specific things. Why would we want to hardcode such restriction on
> an uAPI? Just give the choice to the admin (as long as this restriction is
> clearly communicated to userspace clearly)...
>
> IMHO encouraging IP designers to work out better hardware shouldn't
> block supporting current hardware which has limitations but also values
> in scenarios where those limitations are tolerable.
>

btw although the uapi is named 'migration', it's really about device
state management.
Whether the managed device state is further migrated, and whether a
failure to migrate is severe, are really not the kernel's business.
It's simply that changing device state can fail, and vPRI here is just
one failure reason: no response from the user within a certain timeout
(for a user-managed page table).

Then it's Qemu which should document the restriction and provide
options for the admin to decide whether to expose vPRI vs. migration
based on the specific usage requirements. The choices could be
vPRI-off/migration-on, vPRI-on/migration-off, or enabling both (when
migration failure is tolerable or there is no 'hostile' VM in the
setup)...

Thanks
Kevin
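
As a rough illustration of the three choices above, a minimal C model
follows. Every name in it (vpri_mig_policy, model_dev, model_set_ndma)
is hypothetical and is not part of the VFIO uAPI or of Qemu. It only
models a device state change that can fail, with an unanswered vPRI as
one failure reason, and leaves the policy decision to whoever sets the
option.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical admin-selected policy; not a real Qemu or VFIO option. */
enum vpri_mig_policy {
    POLICY_VPRI_OFF_MIG_ON,  /* vPRI hidden, state change cannot stall on guest */
    POLICY_VPRI_ON_MIG_OFF,  /* vPRI exposed, migration refused up front */
    POLICY_BOTH_ON,          /* both exposed, a migration attempt may fail */
};

/* Hypothetical model of an assigned device as seen by the VMM. */
struct model_dev {
    enum vpri_mig_policy policy;
    bool vpri_pending;       /* guest still owes a page request response */
};

static bool policy_exposes_vpri(enum vpri_mig_policy p)
{
    return p != POLICY_VPRI_OFF_MIG_ON;
}

/*
 * Model of a request to move the device to the NDMA state.
 * Returns 0 on success, -EPERM when the policy disables migration, and
 * -ETIMEDOUT when an exposed vPRI was never answered. The timeout itself
 * is assumed to be enforced elsewhere; this only reports the outcome.
 */
static int model_set_ndma(const struct model_dev *dev)
{
    assert(dev != NULL);

    if (dev->policy == POLICY_VPRI_ON_MIG_OFF)
        return -EPERM;
    if (policy_exposes_vpri(dev->policy) && dev->vpri_pending)
        return -ETIMEDOUT;
    return 0;
}
```

With POLICY_BOTH_ON the caller has to be ready for -ETIMEDOUT and treat
it as a failed (not hung) state change; the other two policies remove
that case by construction.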