On Fri, Apr 30, 2021 at 02:31:40PM +0200, Cornelia Huck wrote:
> On Thu, 29 Apr 2021 15:13:47 -0300
> Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
>
> > On Thu, Apr 29, 2021 at 01:58:55PM +0200, Cornelia Huck wrote:
> >
> > > > This seems like one of these cases where using the mdev GUID API
> > > > was not a great fit. The ccs_driver should have just directly
> > > > created a vfio_device and not gone into the mdev guid lifecycle
> > > > world.
> > >
> > > I don't remember much of the discussion back then, but I don't think
> > > the explicit generation of devices was the part we needed, but rather
> > > some other kind of mediation -- probably iommu related, as subchannels
> > > don't have that concept on their own. Anyway, too late to change now.
> >
> > The mdev part does three significant things:
> >  - Provide a lifecycle model based on sysfs and the GUIDs
> >  - Hackily inject itself into the VFIO IOMMU code as a special case
> >  - Force the creation of a unique iommu group as the group FD is
> >    mandatory to get the device FD.
> >
> > This is why PASID is such a mess for mdev because it requires even
> > more special hacky stuff to link up the dummy IOMMU but still operate
> > within the iommu group of the parent device.
> >
> > I can see an alternative arrangement using the /dev/ioasid idea that
> > is a lot less hacky and does not force the mdev guid lifecycle on
> > everyone that wants to create vfio_device.
>
> I have not followed that discussion -- do you have a summary or a
> pointer?

I think it is still evolving, I'm hoping Intel can draft some RFC soonish.

Basically, I'd imagine to put the mdev driver itself directly in
charge of how the iommu is operated. When the driver is commanded to
connect to an ioasid (which is sort of like a VFIO container) it can
tell drivers/iommu exactly what it wants, be it a PASID in a physical
iommu device or a simple SW "page table" like the current mdevs use.

This would replace all the roundabout stuff to try and get other
components to set up things the way they hope the mdev driver needs.

> > All the checks for !private need some kind of locking. The driver core
> > model is that the 'struct device_driver' callbacks are all called
> > under the device_lock (this prevents the driver unbinding during the
> > callback). I didn't check if ccs does this or not..
>
> probe/remove/shutdown are basically a forward of the callbacks at the
> bus level.

These are all covered by device_lock

> The css bus should make sure that we serialize
> irq/sch_event/chp_event with probe/remove.

Hum it doesn't look OK, like here:

css_process_crw()
 css_evaluate_subchannel()
  sch = bus_find_device()
     -- So we have a refcount on the struct device
  css_evaluate_known_subchannel() {
	if (sch->driver) {
		if (sch->driver->sch_event)
			ret = sch->driver->sch_event(sch, slow);
  }

But the above call and touches to sch->driver (which is really just
sch->dev.driver) are unlocked and racy.

I would hold the device_lock() over all touches to sch->driver outside
of a driver core callback.

Jason
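
For illustration only, here is a minimal user-space sketch of the
locking pattern suggested above: every read of the driver pointer
outside a bind/unbind path happens under the same lock that bind/unbind
takes. This is not kernel code. The struct layouts, sch_bind(),
sch_evaluate() and count_event() are hypothetical stand-ins, and a
pthread mutex plays the role of device_lock().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins, not the kernel's structures. */
struct sch_driver {
	int (*sch_event)(void *sch, int slow);
};

struct subchannel {
	pthread_mutex_t lock;      /* plays the role of device_lock() */
	struct sch_driver *driver; /* plays the role of sch->dev.driver */
	int events_seen;
};

/* Example event callback: runs with the subchannel lock held. */
static int count_event(void *sch, int slow)
{
	struct subchannel *s = sch;

	(void)slow;
	assert(s != NULL);
	s->events_seen++;
	return 0;
}

/* Bind (drv != NULL) or unbind (drv == NULL), always under the lock. */
static void sch_bind(struct subchannel *sch, struct sch_driver *drv)
{
	pthread_mutex_lock(&sch->lock);
	sch->driver = drv;
	pthread_mutex_unlock(&sch->lock);
}

/*
 * Event path: check and call through ->driver under the same lock, so
 * an unbind cannot clear the pointer between the check and the call.
 * Returns -1 when no driver (or no sch_event callback) is bound.
 */
static int sch_evaluate(struct subchannel *sch, int slow)
{
	int ret = -1;

	pthread_mutex_lock(&sch->lock);
	if (sch->driver && sch->driver->sch_event)
		ret = sch->driver->sch_event(sch, slow);
	pthread_mutex_unlock(&sch->lock);
	return ret;
}
```

Because the callback runs with the lock held, it must not take the same
lock again. In this sketch count_event() only touches the counter.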