Re: [PATCH V2 mlx5-next 12/14] vfio/mlx5: Implement vfio_pci driver for mlx5 devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Oct 27, 2021 at 01:05:20PM -0600, Alex Williamson wrote:

> > As far as the actual issue, if you hadn't just discovered it now
> > nobody would have known we have this gap - much like how the very
> > similar reset issue was present in VFIO for so many years until you
> > plugged it.
> 
> But the fact that we did discover it is hugely important.  We've
> identified that the potential use case is significantly limited and
> that userspace doesn't have a good mechanism to determine when to
> expose that limitation to the user.  

Huh?

We've identified that, depending on device behavior, the kernel may
need to revoke MMIO access to protect itself from hostile userspace
triggering TLP Errors or something.

Well behaved userspace must already stop touching the MMIO on the
device when !RUNNING - I see no compelling argument against that
position.

We've been investigating how the mlx5 HW will behave in corner cases,
and currently it looks like mlx5 vfio will not generate error TLPs, or
corrupt the device itself due to MMIO operations when !RUNNING. So the
driver itself, as written, probably does not currently have a bug
here, or need changes.

> We're tossing around solutions that involve extensions, if not
> changes to the uAPI.  It's Wednesday of rc7.

The P2P issue is seperate, and as I keep saying, unless you want to
block support for any HW that does not have freeze&queice userspace
must be aware of this ability and it is logical to design it as an
extension from where we are now.

> I feel like we've already been burned by making one of these
> "reasonable quanta of progress" to accept and mark experimental
> decisions with where we stand between defining the uAPI in the kernel
> and accepting an experimental implementation in QEMU.  

I won't argue there..

> Now we have multiple closed driver implementations (none of which
> are contributing to this discussion), but thankfully we're not
> committed to supporting them because we have no open
> implementations.  I think we could get away with ripping up the uAPI
> if we really needed to.

Do we need to?

> > > Deciding at some point in the future to forcefully block device MMIO
> > > access from userspace when the device stops running is clearly a user
> > > visible change and therefore subject to the don't-break-userspace
> > > clause.    
> > 
> > I don't think so, this was done for reset retroactively after
> > all. Well behaved qmeu should have silenced all MMIO touches as part
> > of the ABI contract anyhow.
> 
> That's not obvious to me and I think it conflates access to the device
> and execution of the device.  If it's QEMU's responsibility to quiesce
> access to the device anyway, why does the kernel need to impose this
> restriction.  I'd think we'd generally only impose such a restriction
> if the alternative allows the user to invoke bad behavior outside the
> scope of their use of the device or consistency of the migration data.
> It appears that any such behavior would be implementation specific here.

I think if an implementation has a problem, like error TLPs, then yes,
it must fence. The conservative specification of the uAPI is that
userspace should not allow MMIO when !RUNNING.

If we ever get any implementation that needs this to fence then we
should do it for all implementations just out of consistency.

> > The "don't-break-userspace" is not an absolute prohibition, Linus has
> > been very clear this limitation is about direct, ideally demonstrable,
> > breakage to actually deployed software.
> 
> And if we introduce an open driver that unblocks QEMU support to become
> non-experimental, I think that's where we stand.

Yes, if qemu becomes deployed, but our testing shows qemu support
needs a lot of work before it is deployable, so that doesn't seem to
be an immediate risk.

> > > That might also indicate that "freeze" is only an implementation
> > > specific requirement.  Thanks,  
> > 
> > It doesn't matter if a theoretical device can exist that doesn't need
> > "freeze" - this device does, and so it is the lowest common
> > denominator for the uAPI contract and userspace must abide by the
> > restriction.
> 
> Sorry, "to the victor go the spoils" is not really how I strictly want
> to define a uAPI contract with userspace.  

This is not the "victor go the spoils" this is meeting the least
common denominator of HW we have today.

If some fictional HW can be more advanced and can snapshot not freeze,
that is great, but it doesn't change one bit that mlx5 cannot and will
not work that way. Since mlx5 must be supported, there is no choice
but to define the uAPI around its limitations.

snapshot devices are strictly a superset of freeze devices, they can
emulate freeze by doing snapshot at the freeze operation.

In all cases userspace should not touch the device when !RUNNING to
preserve generality to all implementations.

> If we're claiming that userspace is responsible for quiescing
> devices and we're providing a means for that to occur, and userspace
> is already responsible for managing MMIO access, then the only
> reason the kernel would forcefully impose such a restriction itself
> would be to protect the host and the implementation of that would
> depend on whether this is expected to be a universal or device
> specific limitation.  

I think the best way forward is to allow for revoke to happen if we
ever need it (by specification), and not implement it right now.

So, I am not left with a clear idea what is still open that you see as
blocking. Can you summarize?

Jason



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux