On Fri, Mar 26, 2021 at 09:00:50AM -0700, Alexander Duyck wrote:
> On Thu, Mar 25, 2021 at 11:44 PM Leon Romanovsky <leon@xxxxxxxxxx> wrote:
> >
> > On Thu, Mar 25, 2021 at 03:28:36PM -0300, Jason Gunthorpe wrote:
> > > On Thu, Mar 25, 2021 at 01:20:21PM -0500, Bjorn Helgaas wrote:
> > > > On Thu, Mar 25, 2021 at 02:36:46PM -0300, Jason Gunthorpe wrote:
> > > > > On Thu, Mar 25, 2021 at 12:21:44PM -0500, Bjorn Helgaas wrote:
> > > > >
> > > > > > NVMe and mlx5 have basically identical functionality in this respect.
> > > > > > Other devices and vendors will likely implement similar functionality.
> > > > > > It would be ideal if we had an interface generic enough to support
> > > > > > them all.
> > > > > >
> > > > > > Is the mlx5 interface proposed here sufficient to support the NVMe
> > > > > > model? I think it's close, but not quite, because the NVMe
> > > > > > "offline" state isn't explicitly visible in the mlx5 model.
> > > > >
> > > > > I thought Keith basically said "offline" wasn't really useful as a
> > > > > distinct idea. It is an artifact of nvme being a standards body
> > > > > divorced from the operating system.
> > > > >
> > > > > In linux offline and no driver attached are the same thing, you'd
> > > > > never want an API to make a nvme device with a driver attached offline
> > > > > because it would break the driver.
> > > >
> > > > I think the sticky part is that Linux driver attach is not visible to
> > > > the hardware device, while the NVMe "offline" state *is*. An NVMe PF
> > > > can only assign resources to a VF when the VF is offline, and the VF
> > > > is only usable when it is online.
> > > >
> > > > For NVMe, software must ask the PF to make those online/offline
> > > > transitions via Secondary Controller Offline and Secondary Controller
> > > > Online commands [1]. How would this be integrated into this sysfs
> > > > interface?
> > > >
> > > Either the NVMe PF driver tracks the driver attach state using a bus
> > > notifier and mirrors it to the offline state, or it simply
> > > offline/onlines as part of the sequence to program the MSI change.
> > >
> > > I don't see why we need any additional modeling of this behavior.
> > >
> > > What would be the point of onlining a device without a driver?
> >
> > Agree, we should remember that we are talking about Linux kernel model
> > and implementation, where _no_driver_ means _offline_.
>
> The only means you have of guaranteeing the driver is "offline" is by
> holding on the device lock and checking it. So it is only really
> useful for one operation and then you have to release the lock. The
> idea behind having an "offline" state would be to allow you to
> aggregate multiple potential operations into a single change.

What we really want is a solution where the SRIOV device exists for the
HW but isn't registered yet as a pci_device. We have endless problems
with needing to configure SRIOV instances at the PF before they get
plugged into the kernel and the no driver autoprobe business is such a
hack.

But that is a huge problem and not this series.

Jason