Jonathan Cameron wrote: > On Tue, 2 Apr 2024 16:45:32 -0700 > Dave Jiang <dave.jiang@xxxxxxxxx> wrote: > > > SBR is equivalent to a device been hot removed and inserted again. Doing a > > SBR on a CXL type 3 device is problematic if the exported device memory is > > part of system memory that cannot be offlined. The event is equivalent to > > violently ripping out that range of memory from the kernel. While the > > hardware requires the "Unmask SBR" bit set in the Port Control Extensions > > register and the kernel currently does not unmask it, user can unmask > > this bit via setpci or similar tool. > > > > The driver does not have a way to detect whether a reset coming from the > > PCI subsystem is a Function Level Reset (FLR) or SBR. The only way to > > detect is to note if a decoder is marked as enabled in software but the > > decoder control register indicates it's not committed. > > > > A helper function is added to find discrepancy between the decoder > > software state versus the hardware register state. > > > > Suggested-by: Dan Williams <dan.j.williams@xxxxxxxxx> > > Signed-off-by: Dave Jiang <dave.jiang@xxxxxxxxx> > > As I said way back on v1, this smells hacky. > > Why not pass the info on what reset was done down from the PCI core? > I see Bjorn commented it would be *possible* to do it in the PCI core > but raised other concerns that needed addressing first (I think you've > dealt with thosenow). Doesn't look that hard to me (I've not coded it > up yet though). > > The core code knows how far it got down the list reset_methods before > it succeeded in resetting. So... > > Modify __pci_reset_function_locked() to return the index of the reset > method that succeeded. Then pass that to pci_dev_restore(). > Finally push it into a reset_done2() that takes that as an extra > parameter and the driver can see if it is FLR or SBR. > The extended reset_done is to avoid modifying lots of drivers. > However a quick grep suggests it's not that heavily used (15ish?) > so maybe just add the parameter. > > There are a few other paths, but non look that problematic at > first glance... > > So Bjorn, now the rest of this is hopefully close to what you'll be > happey with, which way do you prefer? I will defer to Bjorn, but I am not fan of this reset_done2() proposal. "Revalidate after reset" is a common driver pattern and all that plumbing the effective-reset-type does is make cxl_reset_done() more precise for no discernible value.