On Tue, Feb 1, 2022 at 7:24 AM Ben Widawsky <ben.widawsky@xxxxxxxxx> wrote:
>
> On 22-01-31 18:19:24, Jonathan Cameron wrote:
> > On Sun, 23 Jan 2022 16:31:02 -0800
> > Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> >
> > > From: Ben Widawsky <ben.widawsky@xxxxxxxxx>
> > >
> > > The PCIe device DVSEC, defined in the CXL 2.0 spec, 8.1.3 is required to
> > > be implemented by CXL 2.0 endpoint devices. Since the information
> > > contained within this DVSEC will be critically important, it makes sense
> > > to find the value early, and error out if it cannot be found.
> > >
> > > Signed-off-by: Ben Widawsky <ben.widawsky@xxxxxxxxx>
> > > Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> > Guess the logic makes sense about checking this early though my cynical
> > mind says, that if someone is putting in devices that claim to be
> > CXL ones and this isn't there it is there own problem if they
> > kernel wastes effort bringing the driver up only to find later
> > it can't finish doing so...
>
> I don't remember if Dan and I discussed actually failing to bind this early if
> the DVSEC isn't there.

On second look, the error message does not make sense because there is
"no functionality" not "limited functionality" as a result of this
failure because the cxl_pci driver just gives up. This failure should
be limited to cxl_mem, not cxl_pci as there might still be value in
accessing the mailbox on this device.

> I think the concern is less about wasted effort and more
> about the inability to determine if the device is actively decoding something
> and then having the kernel driver tear that out when it takes over the decoder
> resources. This was specifically targeted toward the DVSEC range registers
> (obviously things would fail later if we couldn't find the MMIO).

If there is no CXL DVSEC then cxl_mem should fail, that's it.

> I agree with your cynical mind though that it might not be our job to prevent
> devices which aren't spec compliant. I'd say if we start seeing bug reports
> around this we can revisit.

What would the bug report be, "driver fails to attach to device that
does not implement the spec"?