RE: [PATCH RFC v2] vfio: Documentation for the migration region

Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx> · Wed, 1 Dec 2021 09:54:27 +0000

> -----Original Message-----
> From: Jason Gunthorpe [mailto:jgg@xxxxxxxxxx]
> Sent: 01 December 2021 03:14
> To: Alex Williamson <alex.williamson@xxxxxxxxxx>
> Cc: Jonathan Corbet <corbet@xxxxxxx>; linux-doc@xxxxxxxxxxxxxxx; Cornelia
> Huck <cohuck@xxxxxxxxxx>; kvm@xxxxxxxxxxxxxxx; Kirti Wankhede
> <kwankhede@xxxxxxxxxx>; Max Gurtovoy <mgurtovoy@xxxxxxxxxx>;
> Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx>; Yishai
> Hadas <yishaih@xxxxxxxxxx>
> Subject: Re: [PATCH RFC v2] vfio: Documentation for the migration region
> 
> On Tue, Nov 30, 2021 at 03:35:41PM -0700, Alex Williamson wrote:
> 
> > > From what HNS said the device driver would have to trap every MMIO to
> > > implement NDMA as it must prevent touches to the physical HW MMIO to
> > > maintain the NDMA state.
> > >
> > > The issue is that the HW migration registers can stop processing the
> > > queue and thus enter NDMA but a MMIO touch can resume queue
> > > processing, so NDMA cannot be sustained.
> > >
> > > Trapping every MMIO would have a huge negative performance impact. So
> > > it doesn't make sense to do so for a device that is not intended to be
> > > used in any situation where NDMA is required.
> >
> > But migration is a cooperative activity with userspace.  If necessary
> > we can impose a requirement that mmap access to regions (other than the
> > migration region itself) are dropped when we're in the NDMA or !RUNNING
> > device_state.
> 
> It is always NDMA|RUNNING, so we can't fully drop access to
> MMIO. Userspace would have to transfer from direct MMIO to
> trapping. With enough new kernel infrastructure and qemu support it
> could be done.

As far as our devices are concerned, we put the dev queue into a PAUSE state
in the !RUNNING state. And since we don't have any P2P support, is it OK
to put the onus on userspace here not to access the MMIO during the
!RUNNING state?

So, just to make it clear: if a device declares that it doesn't support NDMA
and P2P, is the v1 version of the spec good enough, or do we still need to
handle the case where a malicious user might try MMIO access in the !RUNNING
state, and have kernel infrastructure in place to safeguard against that?

> 
> Even so, we can't trap accesses through the IOMMU so such a scheme
> would still require removing IOMMU access to the device. Given that the
> basic qemu mitigation for no NDMA support is to eliminate P2P cases by
> removing the IOMMU mappings this doesn't seem to advance anything and
> only creates complexity.
> 
> At least I'm not going to insist that hns do all kinds of work like
> this for an edge case they don't care about as a precondition to get a
> migration driver.

Yes. That's our concern too.

(Just a note to clarify that these are not HNS devices per se. HNS actually
stands for HiSilicon Network Subsystem and doesn't currently have live
migration capability. The devices capable of live migration are HiSilicon
Accelerator devices.)

Thanks,
Shameer