> -----Original Message----- > From: Bjorn Helgaas [mailto:bhelgaas@xxxxxxxxxx] > Sent: Wednesday, March 06, 2013 7:55 AM > To: Desai, Kashyap > Cc: Joe Lawrence; linux-scsi@xxxxxxxxxxxxxxx; DL-MPT Fusion Linux; > Support; Reddy, Sreekanth; Nandigama, Nagalakshmi; James E.J. Bottomley; > linux-pci@xxxxxxxxxxxxxxx > Subject: Re: [PATCH 1/2] mptfusion, mpt2sas, mpt3sas: Don't remove dead > IOC PCI device > > On Tue, Mar 5, 2013 at 6:46 PM, Desai, Kashyap <Kashyap.Desai@xxxxxxx> > wrote: > > > > > >> -----Original Message----- > >> From: Bjorn Helgaas [mailto:bhelgaas@xxxxxxxxxx] > >> Sent: Wednesday, March 06, 2013 4:43 AM > >> To: Desai, Kashyap > >> Cc: Joe Lawrence; linux-scsi@xxxxxxxxxxxxxxx; DL-MPT Fusion Linux; > >> Support; Reddy, Sreekanth; Nandigama, Nagalakshmi; James E.J. > >> Bottomley; linux-pci@xxxxxxxxxxxxxxx > >> Subject: Re: [PATCH 1/2] mptfusion, mpt2sas, mpt3sas: Don't remove > >> dead IOC PCI device > >> > >> [+cc linux-pci] > >> > >> On Mon, Mar 4, 2013 at 11:02 AM, Desai, Kashyap > >> <Kashyap.Desai@xxxxxxx> > >> wrote: > >> > > >> > > >> >> -----Original Message----- > >> >> From: linux-scsi-owner@xxxxxxxxxxxxxxx [mailto:linux-scsi- > >> >> owner@xxxxxxxxxxxxxxx] On Behalf Of Joe Lawrence > >> >> Sent: Monday, March 04, 2013 9:26 PM > >> >> To: linux-scsi@xxxxxxxxxxxxxxx > >> >> Cc: DL-MPT Fusion Linux; Support; Reddy, Sreekanth; Nandigama, > >> >> Nagalakshmi; James E.J. Bottomley; Bjorn Helgaas > >> >> Subject: [PATCH 1/2] mptfusion, mpt2sas, mpt3sas: Don't remove > >> >> dead IOC PCI device > >> >> > >> >> Device removal/addition is a PCI core function, not an HBA > function. > >> >> Calling pci_stop_and_remove_bus_device() from a SCSI LLD may > >> >> introduce device removal races with PCI hotplug. Remove these > >> >> calls from mptfusion, mpt2sas, and mpt3sas, but leave remaining > >> >> dead IOC code in place that flushes outstanding commands and sets > IOC state. > >> > > >> > Joe: I agree that only mptsas/mpt2sas/mpt3sas is calling " > >> pci_stop_and_remove_bus_device" from LLD. If through sysfs we can hot > >> add/remove the device, It would be OK to allow similar simulation > >> from LLD. (In case it is really require.) I agree that calling " > >> pci_stop_and_remove_bus_device" is PCI subsystem's responsibility. > >> > Please consider that finding bad IOC on field is very rare. > >> > > >> > In our case, user wants BAD IOC to be detected immediately (due to > >> performance and mission critical IOs) and it should be removed from > >> the topology ASAP to allow other good IOC and connected topology to > >> work well. > >> > >> It doesn't matter how rare bad IOCs are. But I certainly agree that > >> when we do find a bad IOC, we want to detect it and stop using it as > >> soon as possible. > >> > >> > With the new proposed change you posted with this patch half of the > >> problem still do not resolve. > >> > > >> > Driver set ioc->remove_host = 1 when bad IOC is detected, but IOs > >> > to > >> the connected topology will still continue to flood ( if we do not > >> detach bad IOC from topology) and Scsi Mid layer be busy sending > >> inactive IOs to the bad IOC, and IOs will be return to the OS with > >> "DID_NO_CONNECT". > >> > > >> > So finally to release the topology from BAD IOC from OS is also > >> equally important. This can be achieved either calling ".remove" > >> entry point or what currently driver is doing "remove the IOC from > >> PCI layer which will allow better cleanup. > >> > >> Joe's patch remove the call to pci_stop_and_remove_bus_device(), so > >> the IOC will remain attached to the driver. What if he *did* call > >> the driver's . remove() method instead of calling > >> pci_stop_and_remove_bus_device()? Would that resolve your concern? > >> If not, why not? > > > > I initially tried using Driver's .remove call instead of " > pci_stop_and_remove_bus_device", but found inconsistent view from PCI > layer and LLD will have its own disadvantage. > > With that approach, PCI layer sill observe the bad HBA, but instance > of the same is not visible at LLD. > > What's the problem with this? It is not a problem if the PCI device > exists but no driver has claimed it. That situation exists very often, > when the PCI core has enumerated a device but we haven't loaded a driver > for it. > > > Finally, when someone try to remove module/Do some manual PCI bus scan > through sysfs functionality etc, LLD will not be able to handle those. > > I don't understand this. The mptsas driver can detach from one device > while remaining attached to other similar devices. The driver can then > be removed, and the fact that it is only attached to some of the devices > is not a problem. Obviously, there could be mptsas defects that cause > problems, but we should be able to fix those. > > Why does a manual PCI bus scan via sysfs cause an issue? PCI system still assume is_attacched for that HBA, but LLD has already removed it. I don't really remember where it crashed, but it was inconsistent if HBA is removed only from LLD.. System crash after some times when user perform some extra operations. I understood that, we need to find alternate for this case, instead of calling "pci_stop_and_remove_bus_device".. But this Patch is not correct and working well. It will remove key functionality from Driver. I will be sending one more patch which will remove " pci_stop_and_remove_bus_device" from all three mptsas/mpt2sas/mpt3sas Driver. We need to dump some Critical message for user about problem in their setup and will recommend to remove bad HBA and restart the machine, instead of expecting normal functionality even after detecting bad HBA. Let me work on this and will revert back. > > > Calling .remove entry point from Driver and still keeping instance at > PCI layer will create inconsistency between two layers. > > > >> > >> I would really like to get rid of pci_stop_and_remove_bus_device() in > >> drivers. I don't think there's anything special about > >> mptsas/mpt2sas/mpt3sas that makes it useful for these drivers but not > >> others. > > > > I agree that calling " pci_stop_and_remove_bus_device" should be the > last option, but as of now I found that is a safest way to achieve > functionality. > > What specific functionality are you trying to achieve? We want less destruction to working system because of single BAD HBA. > > If you don't want to call the driver's .remove() method, you can keep > the driver bound to the device but just disconnect the device from the > SCSI midlayer. Then the PCI device exists, the LLD is still bound to > it, but nobody will be able to use it. > > > It is nothing special about mptsas/mpt2sas/mpt3sas related. > > > >> > >> Bjorn > >> > >> >> Signed-off-by: Joe Lawrence <joe.lawrence@xxxxxxxxxxx> > >> >> Cc: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> > >> >> Cc: James E.J. Bottomley <JBottomley@xxxxxxxxxxxxx> > >> >> Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@xxxxxxx> > >> >> Cc: Sreekanth Reddy <Sreekanth.Reddy@xxxxxxx> > >> >> Cc: support@xxxxxxx > >> >> Cc: DL-MPTFusionLinux@xxxxxxx > >> >> Cc: linux-scsi@xxxxxxxxxxxxxxx > >> >> --- > >> >> drivers/message/fusion/mptbase.c | 41 +----------------------- > --- > >> --- > >> >> ------ > >> >> drivers/scsi/mpt2sas/mpt2sas_base.c | 42 > >> >> ++---------------------------- > >> >> ------- > >> >> drivers/scsi/mpt3sas/mpt3sas_base.c | 40 > >> >> ++---------------------------- > >> >> ----- > >> >> 3 files changed, 5 insertions(+), 118 deletions(-) > >> >> > >> >> diff --git a/drivers/message/fusion/mptbase.c > >> >> b/drivers/message/fusion/mptbase.c > >> >> index fb69baa..28a421d 100644 > >> >> --- a/drivers/message/fusion/mptbase.c > >> >> +++ b/drivers/message/fusion/mptbase.c > >> >> @@ -63,7 +63,6 @@ > >> >> #ifdef CONFIG_MTRR > >> >> #include <asm/mtrr.h> > >> >> #endif > >> >> -#include <linux/kthread.h> > >> >> #include <scsi/scsi_host.h> > >> >> > >> >> #include "mptbase.h" > >> >> @@ -328,31 +327,6 @@ mpt_is_discovery_complete(MPT_ADAPTER *ioc) > >> >> > >> >> > >> >> /** > >> >> - * mpt_remove_dead_ioc_func - kthread context to remove dead ioc > >> >> - * @arg: input argument, used to derive ioc > >> >> - * > >> >> - * Return 0 if controller is removed from pci subsystem. > >> >> - * Return -1 for other case. > >> >> - */ > >> >> -static int mpt_remove_dead_ioc_func(void *arg) -{ > >> >> - MPT_ADAPTER *ioc = (MPT_ADAPTER *)arg; > >> >> - struct pci_dev *pdev; > >> >> - > >> >> - if ((ioc == NULL)) > >> >> - return -1; > >> >> - > >> >> - pdev = ioc->pcidev; > >> >> - if ((pdev == NULL)) > >> >> - return -1; > >> >> - > >> >> - pci_stop_and_remove_bus_device(pdev); > >> >> - return 0; > >> >> -} > >> >> - > >> >> - > >> >> - > >> >> -/** > >> >> * mpt_fault_reset_work - work performed on workq after ioc > fault > >> >> * @work: input argument, used to derive ioc > >> >> * > >> >> @@ -366,7 +340,6 @@ mpt_fault_reset_work(struct work_struct *work) > >> >> int rc; > >> >> unsigned long flags; > >> >> MPT_SCSI_HOST *hd; > >> >> - struct task_struct *p; > >> >> > >> >> if (ioc->ioc_reset_in_progress || !ioc->active) > >> >> goto out; > >> >> @@ -380,25 +353,13 @@ mpt_fault_reset_work(struct work_struct > *work) > >> >> /* > >> >> * Call mptscsih_flush_pending_cmds callback so that > we > >> >> * flush all pending commands back to OS. > >> >> - * This call is required to aovid deadlock at block > >> layer. > >> >> + * This call is required to avoid deadlock at block > >> layer. > >> >> * Dead IOC will fail to do diag reset,and this call > >> >> is > >> safe > >> >> * since dead ioc will never return any command back > >> >> from HW. > >> >> */ > >> >> hd = shost_priv(ioc->sh); > >> >> ioc->schedule_dead_ioc_flush_running_cmds(hd); > >> >> > >> >> - /*Remove the Dead Host */ > >> >> - p = kthread_run(mpt_remove_dead_ioc_func, ioc, > >> >> - "mpt_dead_ioc_%d", ioc->id); > >> >> - if (IS_ERR(p)) { > >> >> - printk(MYIOC_s_ERR_FMT > >> >> - "%s: Running mpt_dead_ioc thread > failed > >> !\n", > >> >> - ioc->name, __func__); > >> >> - } else { > >> >> - printk(MYIOC_s_WARN_FMT > >> >> - "%s: Running mpt_dead_ioc thread > >> success !\n", > >> >> - ioc->name, __func__); > >> >> - } > >> >> return; /* don't rearm timer */ > >> >> } > >> >> > >> >> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c > >> >> b/drivers/scsi/mpt2sas/mpt2sas_base.c > >> >> index ffd85c5..a97d10c 100644 > >> >> --- a/drivers/scsi/mpt2sas/mpt2sas_base.c > >> >> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c > >> >> @@ -57,7 +57,6 @@ > >> >> #include <linux/sort.h> > >> >> #include <linux/io.h> > >> >> #include <linux/time.h> > >> >> -#include <linux/kthread.h> > >> >> #include <linux/aer.h> > >> >> > >> >> #include "mpt2sas_base.h" > >> >> @@ -115,29 +114,6 @@ module_param_call(mpt2sas_fwfault_debug, > >> >> _scsih_set_fwfault_debug, > >> >> param_get_int, &mpt2sas_fwfault_debug, 0644); > >> >> > >> >> /** > >> >> - * mpt2sas_remove_dead_ioc_func - kthread context to remove dead > >> >> ioc > >> >> - * @arg: input argument, used to derive ioc > >> >> - * > >> >> - * Return 0 if controller is removed from pci subsystem. > >> >> - * Return -1 for other case. > >> >> - */ > >> >> -static int mpt2sas_remove_dead_ioc_func(void *arg) -{ > >> >> - struct MPT2SAS_ADAPTER *ioc = (struct > MPT2SAS_ADAPTER > >> *)arg; > >> >> - struct pci_dev *pdev; > >> >> - > >> >> - if ((ioc == NULL)) > >> >> - return -1; > >> >> - > >> >> - pdev = ioc->pdev; > >> >> - if ((pdev == NULL)) > >> >> - return -1; > >> >> - pci_stop_and_remove_bus_device(pdev); > >> >> - return 0; > >> >> -} > >> >> - > >> >> - > >> >> -/** > >> >> * _base_fault_reset_work - workq handling ioc fault conditions > >> >> * @work: input argument, used to derive ioc > >> >> * Context: sleep. > >> >> @@ -152,7 +128,6 @@ _base_fault_reset_work(struct work_struct > *work) > >> >> unsigned long flags; > >> >> u32 doorbell; > >> >> int rc; > >> >> - struct task_struct *p; > >> >> > >> >> spin_lock_irqsave(&ioc->ioc_reset_in_progress_lock, flags); > >> >> if (ioc->shost_recovery) > >> >> @@ -166,29 +141,16 @@ _base_fault_reset_work(struct work_struct > >> >> *work) > >> >> > >> >> /* > >> >> * Call _scsih_flush_pending_cmds callback so that > >> >> we flush all > >> >> - * pending commands back to OS. This call is > required > >> to > >> >> aovid > >> >> + * pending commands back to OS. This call is > >> >> + required to > >> >> avoid > >> >> * deadlock at block layer. Dead IOC will fail to do > >> >> diag reset, > >> >> * and this call is safe since dead ioc will never > >> >> return any > >> >> * command back from HW. > >> >> */ > >> >> ioc->schedule_dead_ioc_flush_running_cmds(ioc); > >> >> /* > >> >> - * Set remove_host flag early since kernel thread > will > >> >> - * take some time to execute. > >> >> + * Indicate to scsi callbacks that the host has been > >> >> removed. > >> >> */ > >> >> ioc->remove_host = 1; > >> >> - /*Remove the Dead Host */ > >> >> - p = kthread_run(mpt2sas_remove_dead_ioc_func, ioc, > >> >> - "mpt2sas_dead_ioc_%d", ioc->id); > >> >> - if (IS_ERR(p)) { > >> >> - printk(MPT2SAS_ERR_FMT > >> >> - "%s: Running mpt2sas_dead_ioc thread failed > >> !!!!\n", > >> >> - ioc->name, __func__); > >> >> - } else { > >> >> - printk(MPT2SAS_ERR_FMT > >> >> - "%s: Running mpt2sas_dead_ioc thread success > >> !!!!\n", > >> >> - ioc->name, __func__); > >> >> - } > >> >> > >> >> return; /* don't rearm timer */ > >> >> } > >> >> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c > >> >> b/drivers/scsi/mpt3sas/mpt3sas_base.c > >> >> index 04f8010..24fd122 100644 > >> >> --- a/drivers/scsi/mpt3sas/mpt3sas_base.c > >> >> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c > >> >> @@ -57,7 +57,6 @@ > >> >> #include <linux/dma-mapping.h> > >> >> #include <linux/io.h> > >> >> #include <linux/time.h> > >> >> -#include <linux/kthread.h> > >> >> #include <linux/aer.h> > >> >> > >> >> > >> >> @@ -111,28 +110,6 @@ module_param_call(mpt3sas_fwfault_debug, > >> >> _scsih_set_fwfault_debug, > >> >> param_get_int, &mpt3sas_fwfault_debug, 0644); > >> >> > >> >> /** > >> >> - * mpt3sas_remove_dead_ioc_func - kthread context to remove dead > >> >> ioc > >> >> - * @arg: input argument, used to derive ioc > >> >> - * > >> >> - * Return 0 if controller is removed from pci subsystem. > >> >> - * Return -1 for other case. > >> >> - */ > >> >> -static int mpt3sas_remove_dead_ioc_func(void *arg) -{ > >> >> - struct MPT3SAS_ADAPTER *ioc = (struct MPT3SAS_ADAPTER *)arg; > >> >> - struct pci_dev *pdev; > >> >> - > >> >> - if ((ioc == NULL)) > >> >> - return -1; > >> >> - > >> >> - pdev = ioc->pdev; > >> >> - if ((pdev == NULL)) > >> >> - return -1; > >> >> - pci_stop_and_remove_bus_device(pdev); > >> >> - return 0; > >> >> -} > >> >> - > >> >> -/** > >> >> * _base_fault_reset_work - workq handling ioc fault conditions > >> >> * @work: input argument, used to derive ioc > >> >> * Context: sleep. > >> >> @@ -147,7 +124,6 @@ _base_fault_reset_work(struct work_struct > *work) > >> >> unsigned long flags; > >> >> u32 doorbell; > >> >> int rc; > >> >> - struct task_struct *p; > >> >> > >> >> > >> >> spin_lock_irqsave(&ioc->ioc_reset_in_progress_lock, flags); > >> >> @@ > >> >> - > >> >> 162,28 +138,16 @@ _base_fault_reset_work(struct work_struct *work) > >> >> > >> >> /* > >> >> * Call _scsih_flush_pending_cmds callback so that > >> >> we flush all > >> >> - * pending commands back to OS. This call is > required > >> to > >> >> aovid > >> >> + * pending commands back to OS. This call is > >> >> + required to > >> >> avoid > >> >> * deadlock at block layer. Dead IOC will fail to do > >> >> diag reset, > >> >> * and this call is safe since dead ioc will never > >> >> return any > >> >> * command back from HW. > >> >> */ > >> >> ioc->schedule_dead_ioc_flush_running_cmds(ioc); > >> >> /* > >> >> - * Set remove_host flag early since kernel thread > will > >> >> - * take some time to execute. > >> >> + * Indicate to scsi callbacks that the host has been > >> >> removed. > >> >> */ > >> >> ioc->remove_host = 1; > >> >> - /*Remove the Dead Host */ > >> >> - p = kthread_run(mpt3sas_remove_dead_ioc_func, ioc, > >> >> - "mpt3sas_dead_ioc_%d", ioc->id); > >> >> - if (IS_ERR(p)) > >> >> - pr_err(MPT3SAS_FMT > >> >> - "%s: Running mpt3sas_dead_ioc thread failed > >> !!!!\n", > >> >> - ioc->name, __func__); > >> >> - else > >> >> - pr_err(MPT3SAS_FMT > >> >> - "%s: Running mpt3sas_dead_ioc thread success > >> !!!!\n", > >> >> - ioc->name, __func__); > >> >> return; /* don't rearm timer */ > >> >> } > >> >> > >> >> -- > >> >> 1.8.1.2 > >> >> > >> >> -- > >> >> To unsubscribe from this list: send the line "unsubscribe linux- > scsi" > >> >> in the body of a message to majordomo@xxxxxxxxxxxxxxx More > >> >> majordomo info at http://vger.kernel.org/majordomo-info.html > >> > > > > > -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html