On Mon, Nov 08, 2021 at 03:02:36PM +0000, Jonathan Cameron wrote: > On Fri, 5 Nov 2021 16:50:55 -0700 > <ira.weiny@xxxxxxxxx> wrote: > > > From: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> > > [snip] > > A few more things came to mind whilst reading the rest of the series. In particular > lifetime management for the doe structures. Thanks for the review I'm just working through all the comments and so I'm somewhat working backwards. > > > * DOC: cxl pci > > @@ -575,17 +576,106 @@ static int cxl_setup_doe_devices(struct cxl_dev_state *cxlds) > > if (rc) > > return rc; > > > > - if (device_attach(&adev->dev) != 1) > > + if (device_attach(&adev->dev) != 1) { > > dev_err(&adev->dev, > > "Failed to attach a driver to DOE device %d\n", > > adev->id); > > + goto next; > > + } > > + > > + if (pci_doe_supports_prot(new_dev, PCI_DVSEC_VENDOR_ID_CXL, > > + CXL_DOE_PROTOCOL_TABLE_ACCESS)) > > + cxlds->cdat_doe = new_dev; > > I'm probably missing something, but what prevents new_dev from going away after > this assignment? > Perhaps a force unbind or driver removal. Should we get a > reference? I had a get_device() here at one point but took it out... Because I was thinking that new_dev's lifetime was equal to cxlds because cxlds 'owned' the DOE devices. However this is totally not true. And there is a race between the device going away and cxlds going away which could be a problem. > > Also it's possible we'll have multiple CDAT supporting DOEs so > I'd suggest checking if cxlds->cdata_doe is already set before setting it. Sure. > > We could break out of the loop early, but I want to bolt the CMA doe detection > in there so I'd rather we didn't. This is all subject to whether we attempt > to generalize this support and move it over to the PCI side of things. I'm not 100% sure about moving it to the PCI side but it does make some sense because really the auxiliary devices are only bounded by the PCI device being available. None of the CXL stuff needs to exist for the DOE driver to talk to the device but the pdev does need to be there... :-/ This is all part of what drove the cxl_mem rename because that structure was really confusing me. Dan got me straightened out but I did not revisit this series after that. Now off the top of my head I'm not sure that cxlds needs to be involved in the auxiliary device creation. OTOH I was making it a central place for in kernel users to know where/how to get information from DOE mailboxes. Hence caching which of these devices had CDAT capability.[1] Since you seem to have arrived at this conclusion before me where in the PCI code do you think is appropriate for this? Ira [1] I'm not really sure what is going to happen if multiple DOE boxes have CDAT capability. This seems like a recipe for confusion. > > > > > +next: > > pos = pci_find_next_ext_capability(pdev, pos, PCI_EXT_CAP_ID_DOE); > > } > > > > return 0; > > } > >