On Tue, 16 Jul 2013 16:03:38 +0400
James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:

> On Tue, 2013-07-16 at 17:30 +0530, Reddy, Sreekanth wrote:
> > James,
> >
> > This patch seem to be fine. Please consider this patch.
>
> Where's the new version? The one that has all of this fixed:
>
> > Off list, Sreekanth from LSI tested and noticed a few issues with this
> > patch:
> >
> >   - mpt2sas_base_stop_watchdog is called twice: The call from
> >     mpt2sas_base_detach is safe, but now unnecessary (as a call was
> >     added earlier up in the PCI driver callbacks to ensure that the
> >     watchdog was out of the way.) This second invocation can be
> >     removed.
> >
> >   - If the watchdog detects a bad IOC, the watchdog remains running:
> >     The watchdog workqueue isn't cleaned up until
> >     mpt2sas_base_stop_watchdog is called, so in the case that the
> >     watchdog removes the device from SCSI topo, the workqueue will
> >     remain unused until PCI .remove/.shutdown cleans it up. Perhaps a
> >     single watchdog that iterates over all adapters would be simpler?
> >
> > Finally, if SCSI topo detachment is all that is interesting here,
> > would it make more sense to move the watchdog into the MPT "scsi"
> > code? I haven't looked at the code yet, but this might make an MPT
> > fusion patch easier (due to dependencies between its "scsi" and
> > "base" modules).

This patch fizzled out in May as other work took priority. If LSI is
still interested in these changes, I can dust off my notes and
test/rebase for the 3.11 series.

A few of the issues quoted above are easily fixed. However, I remember
having an outstanding question about how best to clean up the driver's
per-device watchdog workqueue:

The way the MPT drivers work right now, the watchdog workqueue function
_base_fault_reset_work() initiates a PCI device removal via a kthread.
The PCI callback, running in that kthread's context, then tears down the
device and cancels, flushes and destroys the watchdog workqueue.

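As a rough illustration of that flow, here is a userspace model using
pthreads. It is not driver code: struct adapter, remove_adapter(),
watchdog_work() and run_model() are names made up for this sketch, and
the mutex-guarded "removed" flag is only an assumption about how two
competing removal requests could be serialized, not a description of
what mpt2sas actually does.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-adapter driver state. */
struct adapter {
    pthread_mutex_t lock;
    bool removed;        /* set once teardown has run */
    int teardown_count;  /* number of times teardown really executed */
};

/*
 * Models the PCI .remove path. Both the watchdog's helper thread and an
 * ordinary removal request end up here; the flag ensures that only the
 * first caller performs the teardown.
 */
static void *remove_adapter(void *arg)
{
    struct adapter *a = arg;

    pthread_mutex_lock(&a->lock);
    if (!a->removed) {
        a->removed = true;
        a->teardown_count++;
    }
    pthread_mutex_unlock(&a->lock);
    return NULL;
}

/*
 * Models the watchdog work function after it has detected a fault: it
 * hands the removal to a separate thread instead of doing it inline.
 * The join below exists only so the model cleans up after itself; the
 * driver's work function does not wait for its kthread.
 */
static void *watchdog_work(void *arg)
{
    pthread_t helper;

    if (pthread_create(&helper, NULL, remove_adapter, arg) == 0)
        pthread_join(helper, NULL);
    return NULL;
}

/*
 * Runs the watchdog-initiated removal concurrently with an ordinary
 * removal request. Returns how many teardowns ran, or -1 if a thread
 * could not be created.
 */
int run_model(void)
{
    struct adapter a = { .removed = false, .teardown_count = 0 };
    pthread_t wd, user;

    pthread_mutex_init(&a.lock, NULL);
    if (pthread_create(&wd, NULL, watchdog_work, &a) != 0)
        return -1;
    if (pthread_create(&user, NULL, remove_adapter, &a) != 0) {
        pthread_join(wd, NULL);
        return -1;
    }
    pthread_join(wd, NULL);
    pthread_join(user, NULL);
    pthread_mutex_destroy(&a.lock);
    return a.teardown_count;
}
```

In this model run_model() returns 1 whichever thread gets there first,
because the second caller sees the flag already set. Drop the flag
check and both paths would tear the adapter down, which is the kind of
double removal the extra thread makes possible.
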
This patch eliminated the kthread and its call into the PCI API, simply
detaching from the SCSI midlayer instead. In my opinion, the kthread
complicated device removal and introduced potential races if the
watchdog tried to remove the device at the same time as an ordinary
device removal request occurred. At the time, the best solution I had
was to leave the unused workqueue around until its PCI device was
removed.

Regards,

-- Joe