sysfs methods can race with ->remove

Tejun:

The context is that we have been talking about
drivers/scsi/scsi_scan.c:scsi_rescan_device(), which is called by the
store_rescan_field() sysfs method in scsi_sysfs.c.  The problem is
this: What happens in scsi_rescan_device if the device is unbound from
its driver before the module_put call?  The dev->driver->owner
calculation would dereference a NULL pointer.
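
To make the window concrete, here is a minimal userspace sketch of the pattern. It is not the actual scsi_scan.c code: the struct layouts, model_unbind(), model_rescan() and the unbind_midway flag are all simplified, hypothetical stand-ins, and the refcount arithmetic only imitates try_module_get()/module_put().

```c
#include <assert.h>
#include <stddef.h>

struct device;

/* Simplified stand-ins; NOT the real kernel definitions. */
struct module {
	int refcount;
};

struct device_driver {
	struct module *owner;
	void (*rescan)(struct device *dev);
};

struct device {
	struct device_driver *driver;	/* NULL once the device is unbound */
};

/* Hypothetical stand-in for a concurrent unbind clearing the binding. */
static void model_unbind(struct device *dev)
{
	dev->driver = NULL;
}

/*
 * Model of the risky pattern.  Returns 0 on success, or -1 where the
 * final dev->driver->owner lookup would have dereferenced NULL.
 */
static int model_rescan(struct device *dev, int unbind_midway)
{
	assert(dev != NULL);

	if (!dev->driver)
		return 0;			/* nothing bound, nothing to do */

	dev->driver->owner->refcount++;		/* imitates try_module_get() */
	if (dev->driver->rescan)
		dev->driver->rescan(dev);

	if (unbind_midway)
		model_unbind(dev);		/* the race window */

	if (!dev->driver)
		return -1;			/* real code would oops here */

	dev->driver->owner->refcount--;		/* imitates module_put() */
	return 0;
}
```

In this model the unbound case also leaves the module reference elevated, since the put is never reached.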

On Thu, 15 Jan 2015, Christoph Hellwig wrote:

> On Wed, Jan 14, 2015 at 10:07:00AM -0500, Alan Stern wrote:
> > and the kernfs core ensures that the underlying device won't be
> > deallocated while a sysfs method runs.
> 
> It has a reference to keep it from being freed, but so far I can't find
> anything that prevents ->remove from being called while we are in or
> just before a method call.

There are two types of methods to think about: Those registered by the 
subsystem and those registered by the driver.

If a method is registered by the driver, then the driver will
unregister it when the ->remove routine runs.  I don't know for
certain, but I would expect that the sysfs/kernfs core will make sure
that any existing method calls complete before unregister returns.  
This would prevent races.

If a method is registered by the subsystem, and if the method runs 
entirely within the subsystem's code, then ->remove doesn't matter.  
The driver could be unbound while the method is running and it would be 
okay.

The only time we have a problem is when the method is registered by the 
subsystem and the method calls into the driver.  (Note that this is 
exactly what happens with scsi_rescan_device.)
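
As a rough illustration only (not something this thread has settled on), one way to close that third case would be to serialize the subsystem method against unbinding with a per-device lock. The sketch below is a userspace model under that assumption: struct model_dev, model_unbind(), model_rescan_locked() and sample_rescan() are hypothetical names, and a pthread mutex stands in for whatever lock the unbind path would really hold.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct model_dev;

/* Hypothetical driver with a single callback the subsystem calls into. */
struct model_drv {
	int (*rescan)(struct model_dev *dev);
};

struct model_dev {
	pthread_mutex_t lock;		/* plays the role of a per-device lock */
	struct model_drv *driver;	/* NULL while no driver is bound */
};

/* Example driver callback used only for illustration. */
static int sample_rescan(struct model_dev *dev)
{
	(void)dev;
	return 0;
}

/* Unbind path: the binding is cleared only while holding the lock. */
static void model_unbind(struct model_dev *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->driver = NULL;
	pthread_mutex_unlock(&dev->lock);
}

/*
 * Subsystem-registered method that calls into the driver.  The check of
 * dev->driver and the call through it happen under the same lock, so
 * the unbind above cannot slip in between them.  Returns the driver's
 * result, or -1 if no driver (or no callback) is present.
 */
static int model_rescan_locked(struct model_dev *dev)
{
	int ret = -1;

	assert(dev != NULL);

	pthread_mutex_lock(&dev->lock);
	if (dev->driver && dev->driver->rescan)
		ret = dev->driver->rescan(dev);
	pthread_mutex_unlock(&dev->lock);

	return ret;
}
```

This only helps if every path that clears the binding takes the same lock, and it means the driver callback runs with that lock held.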

> > > But this seems like a more generic problem, and at least a quick glance at
> > > the pci_driver methods seems like others don't have a good
> > > synchronization of ->remove against random driver methods.
> > 
> > Can you give one or two examples?
> 
> I look at the sriov_configure PCI method, or the various sub-methods
> under pci_driver.err_handler.

The sriov_numvfs_store method does have the same problem, and so does 
the reset_store method (by way of pci_reset_function -> 
pci_dev_save_and_disable -> pci_reset_notify).

Tejun, is my analysis correct?  How should we fix these races?

Alan Stern
