Re: [PATCH v3] mpt2sas: setpci reset kernel oops fix

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Nagarajkumar Narayanan <nagarajkumar.narayanan@xxxxxxxxxxx> writes:

> Can you please review the patch
>
> Thanks,
> Nagaraj
>
> On Tue, Jul 21, 2015 at 10:11 AM, Nagarajkumar Narayanan
> <nagarajkumar.narayanan@xxxxxxxxxxx> wrote:
>> I have fixed the lock imbalance could anyone please review the patch
>>
>> Thanks,
>> Nagarajkumar Narayanan
>>
>>
>> On Fri, Jul 17, 2015 at 11:55 AM, Nagarajkumar Narayanan
>> <nagarajkumar.narayanan@xxxxxxxxxxx> wrote:
>>> Patch Description:
>>>
>>> In mpt2sas driver due to lack of synchronization between ioctl,
>>> BRM status access through sysfs, pci resource removal kernel oops
>>> happen as ioctl path and BRM status sysfs access path still tries
>>> to access the removed resources
>>>
>>> kernel: BUG: unable to handle kernel paging request at ffffc900171e0000
>>>
>>> Oops: 0000 [#1] SMP
>>>
>>> Two locks added to provide syncrhonization
>>>
>>> 1. pci_access_mutex: Mutex to synchronize ioctl,sysfs show path and
>>> pci resource handling. PCI resource freeing will lead to free
>>> vital hardware/memory resource, which might be in use by cli/sysfs
>>> path functions resulting in Null pointer reference followed by kernel
>>> crash. To avoid the above race condition we use mutex syncrhonization
>>> which ensures the syncrhonization between cli/sysfs_show path
>>>
>>> 2. spinlock on list operations over IOCs
>>>
>>> Case: when multiple warpdrive cards(IOCs) are in use
>>> Each IOC will added to the ioc list stucture on initialization.
>>> Watchdog threads run at regular intervals to check IOC for any
>>> fault conditions which will trigger the dead_ioc thread to
>>> deallocate pci resource, resulting deleting the IOC netry from list,
>>> this deletion need to protected by spinlock to enusre that
>>> ioc removal is syncrhonized, if not synchronized it might lead to
>>> list_del corruption as the ioc list is traversed in cli path
>>>

Why is the detailed description not included in the patch? This probably
should also be tagged for stable inclusion.

>>>
>>> From 9e54273555d9a34d0d375b91d41318b4ac69a13b Mon Sep 17 00:00:00 2001
>>> From: Nagarajkumar Narayanan <nagarajkumar.narayanan@xxxxxxxxxxx>
>>> Date: Fri, 17 Jul 2015 11:28:27 +0530
>>> Subject: [PATCH] mpt2sas setpci reset oops fix
>>>
>>> setpci reset on nytro warpdrive card along with sysfs access and
>>> cli ioctl access resulted in kernel oops
>>>
>>> 1. pci_access_mutex lock added to provide synchronization between IOCTL,
>>>    sysfs, PCI resource handling path
>>>
>>> 2. gioc_lock spinlock to protect list operations over multiple
>>>    controllers
>>>
>>> Signed-off-by: Nagarajkumar Narayanan <nagarajkumar.narayanan@xxxxxxxxxxx>
>>> ---
>>> * v3
>>> - fixed lock imbalance, moved acquiring mutex lock out of if condition
>>>
>>> * v2
>>> - removed is_warpdrive condition for pci_access_mutex lock
>>>
>>> * v1
>>> - using DEFINE_SPINLOCK() to initialize the lock at compile time instead
>>>   of using spin_lock_init
>>>
>>>  drivers/scsi/mpt2sas/mpt2sas_base.c  |    7 +++++++
>>>  drivers/scsi/mpt2sas/mpt2sas_base.h  |   19 ++++++++++++++++++-
>>>  drivers/scsi/mpt2sas/mpt2sas_ctl.c   |   33 +++++++++++++++++++++++++++++----
>>>  drivers/scsi/mpt2sas/mpt2sas_scsih.c |   15 ++++++++++++++-
>>>  4 files changed, 68 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
>>> index 11248de..c0d36b3 100644
>>> --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
>>> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
>>> @@ -108,13 +108,17 @@ _scsih_set_fwfault_debug(const char *val, struct kernel_param *kp)
>>>  {
>>>         int ret = param_set_int(val, kp);
>>>         struct MPT2SAS_ADAPTER *ioc;
>>> +       unsigned long flags;
>>>
>>>         if (ret)
>>>                 return ret;
>>>
>>> +       /* global ioc spinlock to protect controller list on list operations */
>>>         printk(KERN_INFO "setting fwfault_debug(%d)\n", mpt2sas_fwfault_debug);
>>> +       spin_lock_irqsave(&gioc_lock, flags);
>>>         list_for_each_entry(ioc, &mpt2sas_ioc_list, list)
>>>                 ioc->fwfault_debug = mpt2sas_fwfault_debug;
>>> +       spin_unlock_irqrestore(&gioc_lock, flags);
>>>         return 0;
>>>  }
>>>

I don't think you need the irqsave version here, as mpt2sas_ioc_list is
never touched in an IRQ context. But feel free to prove me wrong.

>>> @@ -4435,6 +4439,8 @@ mpt2sas_base_free_resources(struct MPT2SAS_ADAPTER *ioc)
>>>         dexitprintk(ioc, printk(MPT2SAS_INFO_FMT "%s\n", ioc->name,
>>>             __func__));
>>>
>>> +       /* synchronizing freeing resource with pci_access_mutex lock */
>>> +       mutex_lock(&ioc->pci_access_mutex);
>>>         if (ioc->chip_phys && ioc->chip) {
>>>                 _base_mask_interrupts(ioc);
>>>                 ioc->shost_recovery = 1;
>>> @@ -4454,6 +4460,7 @@ mpt2sas_base_free_resources(struct MPT2SAS_ADAPTER *ioc)
>>>                 pci_disable_pcie_error_reporting(pdev);
>>>                 pci_disable_device(pdev);
>>>         }
>>> +       mutex_unlock(&ioc->pci_access_mutex);
>>>         return;
>>>  }
>>>
>>> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
>>> index caff8d1..c82bdb3 100644
>>> --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
>>> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
>>> @@ -799,6 +799,12 @@ typedef void (*MPT2SAS_FLUSH_RUNNING_CMDS)(struct MPT2SAS_ADAPTER *ioc);
>>>   * @delayed_tr_list: target reset link list
>>>   * @delayed_tr_volume_list: volume target reset link list
>>>   * @@temp_sensors_count: flag to carry the number of temperature sensors
>>> + * @pci_access_mutex: Mutex to synchronize ioctl,sysfs show path and
>>> + * pci resource handling. PCI resource freeing will lead to free
>>> + * vital hardware/memory resource, which might be in use by cli/sysfs
>>> + * path functions resulting in Null pointer reference followed by kernel
>>> + * crash. To avoid the above race condition we use mutex syncrhonization
>>> + * which ensures the syncrhonization between cli/sysfs_show path
>>>   */
>>>  struct MPT2SAS_ADAPTER {
>>>         struct list_head list;
>>> @@ -1015,6 +1021,7 @@ struct MPT2SAS_ADAPTER {
>>>         u8              mfg_pg10_hide_flag;
>>>         u8              hide_drives;
>>>
>>> +       struct mutex pci_access_mutex;
>>>  };
>>>
>>>  typedef u8 (*MPT_CALLBACK)(struct MPT2SAS_ADAPTER *ioc, u16 smid, u8 msix_index,
>>> @@ -1023,6 +1030,17 @@ typedef u8 (*MPT_CALLBACK)(struct MPT2SAS_ADAPTER *ioc, u16 smid, u8 msix_index,
>>>
>>>  /* base shared API */
>>>  extern struct list_head mpt2sas_ioc_list;
>>> +/* spinlock on list operations over IOCs
>>> + * Case: when multiple warpdrive cards(IOCs) are in use
>>> + * Each IOC will added to the ioc list stucture on initialization.
>>> + * Watchdog threads run at regular intervals to check IOC for any
>>> + * fault conditions which will trigger the dead_ioc thread to
>>> + * deallocate pci resource, resulting deleting the IOC netry from list,
>>> + * this deletion need to protected by spinlock to enusre that
>>> + * ioc removal is syncrhonized, if not synchronized it might lead to
>>> + * list_del corruption as the ioc list is traversed in cli path
>>> + */
>>> +extern spinlock_t gioc_lock;
>>>  void mpt2sas_base_start_watchdog(struct MPT2SAS_ADAPTER *ioc);
>>>  void mpt2sas_base_stop_watchdog(struct MPT2SAS_ADAPTER *ioc);
>>>
>>> @@ -1099,7 +1117,6 @@ struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
>>>      struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
>>>
>>>  void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
>>> -
>>>  void mpt2sas_scsih_reset_handler(struct MPT2SAS_ADAPTER *ioc, int reset_phase);
>>>
>>>  /* config shared API */
>>> diff --git a/drivers/scsi/mpt2sas/mpt2sas_ctl.c b/drivers/scsi/mpt2sas/mpt2sas_ctl.c
>>> index 4e50960..8f8f2ad 100644
>>> --- a/drivers/scsi/mpt2sas/mpt2sas_ctl.c
>>> +++ b/drivers/scsi/mpt2sas/mpt2sas_ctl.c
>>> @@ -427,13 +427,17 @@ static int
>>>  _ctl_verify_adapter(int ioc_number, struct MPT2SAS_ADAPTER **iocpp)
>>>  {
>>>         struct MPT2SAS_ADAPTER *ioc;
>>> -
>>> +       unsigned long flags;
>>> +       /* global ioc lock to protect controller on list operations */
>>> +       spin_lock_irqsave(&gioc_lock, flags);
>>>         list_for_each_entry(ioc, &mpt2sas_ioc_list, list) {
>>>                 if (ioc->id != ioc_number)
>>>                         continue;
>>> +               spin_unlock_irqrestore(&gioc_lock, flags);
>>>                 *iocpp = ioc;
>>>                 return ioc_number;
>>>         }
>>> +       spin_unlock_irqrestore(&gioc_lock, flags);
>>>         *iocpp = NULL;
>>>         return -1;
>>>  }


Same here.

>>> @@ -519,13 +523,19 @@ static unsigned int
>>>  _ctl_poll(struct file *filep, poll_table *wait)
>>>  {
>>>         struct MPT2SAS_ADAPTER *ioc;
>>> +       unsigned long flags;
>>>
>>>         poll_wait(filep, &ctl_poll_wait, wait);
>>>
>>> +       /* global ioc lock to protect controller on list operations */
>>> +       spin_lock_irqsave(&gioc_lock, flags);
>>>         list_for_each_entry(ioc, &mpt2sas_ioc_list, list) {
>>> -               if (ioc->aen_event_read_flag)
>>> +               if (ioc->aen_event_read_flag) {
>>> +                       spin_unlock_irqrestore(&gioc_lock, flags);
>>>                         return POLLIN | POLLRDNORM;
>>> +               }
>>>         }
>>> +       spin_unlock_irqrestore(&gioc_lock, flags);
>>>         return 0;
>>>  }
>>>

And here.


Why not do:

>>> @@ -2168,15 +2178,22 @@ _ctl_ioctl_main(struct file *file, unsigned int cmd, void __user *arg,
>>>
>>>         if (_ctl_verify_adapter(ioctl_header.ioc_number, &ioc) == -1 || !ioc)
>>>                 return -ENODEV;
>>> +       /* pci_access_mutex lock acquired by ioctl path */
>>> +       mutex_lock(&ioc->pci_access_mutex);
rc = -EAGAIN;
>>>         if (ioc->shost_recovery || ioc->pci_error_recovery ||
>>> -           ioc->is_driver_loading)
>>> +               ioc->is_driver_loading || ioc->remove_host) {
>>> +               mutex_unlock(&ioc->pci_access_mutex);
>>>                 return -EAGAIN;
>>> +       }

goto out_unlock_pciaccess;
>>>
>>>         state = (file->f_flags & O_NONBLOCK) ? NON_BLOCKING : BLOCKING;
>>>         if (state == NON_BLOCKING) {
>>> -               if (!mutex_trylock(&ioc->ctl_cmds.mutex))
>>> +               if (!mutex_trylock(&ioc->ctl_cmds.mutex)) {
>>> +                       mutex_unlock(&ioc->pci_access_mutex);
>>>                         return -EAGAIN;
goto out_unlock_pciaccess;
>>> +               }
>>>         } else if (mutex_lock_interruptible(&ioc->ctl_cmds.mutex)) {
>>> +               mutex_unlock(&ioc->pci_access_mutex);
>>>                 return -ERESTARTSYS;

rc = -ERESTARTSYS;
goto out_unlock_pciaccess;
>>>         }
>>>
>>> @@ -2258,6 +2275,7 @@ _ctl_ioctl_main(struct file *file, unsigned int cmd, void __user *arg,
>>>         }
>>>
>>>         mutex_unlock(&ioc->ctl_cmds.mutex);
out_unlock_pciaccess;
>>> +       mutex_unlock(&ioc->pci_access_mutex);
>>>         return ret;
>>>  }

This was just a quick look at the patch, not the code itself, so you'll
need to check if you accidently leave ioc->ctl_cmds.mutex locked if you
do it this way.

But it simplifies the exit points.

>>>
>>> @@ -2711,6 +2729,12 @@ _ctl_BRM_status_show(struct device *cdev, struct device_attribute *attr,
>>>                     "warpdrive\n", ioc->name, __func__);
>>>                 goto out;
>>>         }
>>> +       /* pci_access_mutex lock acquired by sysfs show path */
>>> +       mutex_lock(&ioc->pci_access_mutex);
>>> +       if (ioc->pci_error_recovery || ioc->remove_host) {
>>> +               mutex_unlock(&ioc->pci_access_mutex);
>>> +               return 0;
>>> +       }
>>>
>>>         /* allocate upto GPIOVal 36 entries */
>>>         sz = offsetof(Mpi2IOUnitPage3_t, GPIOVal) + (sizeof(u16) * 36);
>>> @@ -2749,6 +2773,7 @@ _ctl_BRM_status_show(struct device *cdev, struct device_attribute *attr,
>>>
>>>   out:
>>>         kfree(io_unit_pg3);
>>> +       mutex_unlock(&ioc->pci_access_mutex);
>>>         return rc;
>>>  }
>>>  static DEVICE_ATTR(BRM_status, S_IRUGO, _ctl_BRM_status_show, NULL);
>>> diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
>>> index 3f26147..9f7ed0f 100644
>>> --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
>>> +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
>>> @@ -79,7 +79,8 @@ static int _scsih_scan_finished(struct Scsi_Host *shost, unsigned long time);
>>>
>>>  /* global parameters */
>>>  LIST_HEAD(mpt2sas_ioc_list);
>>> -
>>> +/* global ioc lock for list operations */
>>> +DEFINE_SPINLOCK(gioc_lock);
>>>  /* local parameters */
>>>  static u8 scsi_io_cb_idx = -1;
>>>  static u8 tm_cb_idx = -1;
>>> @@ -288,13 +289,16 @@ _scsih_set_debug_level(const char *val, struct kernel_param *kp)
>>>  {
>>>         int ret = param_set_int(val, kp);
>>>         struct MPT2SAS_ADAPTER *ioc;
>>> +       unsigned long flags;
>>>
>>>         if (ret)
>>>                 return ret;
>>>
>>>         printk(KERN_INFO "setting logging_level(0x%08x)\n", logging_level);
>>> +       spin_lock_irqsave(&gioc_lock, flags);
>>>         list_for_each_entry(ioc, &mpt2sas_ioc_list, list)
>>>                 ioc->logging_level = logging_level;
>>> +       spin_unlock_irqrestore(&gioc_lock, flags);
>>>         return 0;
>>>  }
>>>  module_param_call(logging_level, _scsih_set_debug_level, param_get_int,
>>> @@ -7867,7 +7871,9 @@ _scsih_remove(struct pci_dev *pdev)
>>>         sas_remove_host(shost);
>>>         scsi_remove_host(shost);
>>>         mpt2sas_base_detach(ioc);
>>> +       spin_lock_irqsave(&gioc_lock, flags);
>>>         list_del(&ioc->list);
>>> +       spin_unlock_irqrestore(&gioc_lock, flags);
>>>         scsi_host_put(shost);
>>>  }
>>>
>>> @@ -8132,6 +8138,7 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>>>         struct MPT2SAS_ADAPTER *ioc;
>>>         struct Scsi_Host *shost;
>>>         int rv;
>>> +       unsigned long flags;
>>>
>>>         shost = scsi_host_alloc(&scsih_driver_template,
>>>             sizeof(struct MPT2SAS_ADAPTER));
>>> @@ -8142,7 +8149,9 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>>>         ioc = shost_priv(shost);
>>>         memset(ioc, 0, sizeof(struct MPT2SAS_ADAPTER));
>>>         INIT_LIST_HEAD(&ioc->list);
>>> +       spin_lock_irqsave(&gioc_lock, flags);
>>>         list_add_tail(&ioc->list, &mpt2sas_ioc_list);
>>> +       spin_unlock_irqrestore(&gioc_lock, flags);

Again, is the irqsave version really needed?

>>>         ioc->shost = shost;
>>>         ioc->id = mpt_ids++;
>>>         sprintf(ioc->name, "%s%d", MPT2SAS_DRIVER_NAME, ioc->id);
>>> @@ -8167,6 +8176,8 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>>>         ioc->schedule_dead_ioc_flush_running_cmds = &_scsih_flush_running_cmds;
>>>         /* misc semaphores and spin locks */
>>>         mutex_init(&ioc->reset_in_progress_mutex);
>>> +       /* initializing pci_access_mutex lock */
>>> +       mutex_init(&ioc->pci_access_mutex);
>>>         spin_lock_init(&ioc->ioc_reset_in_progress_lock);
>>>         spin_lock_init(&ioc->scsi_lookup_lock);
>>>         spin_lock_init(&ioc->sas_device_lock);
>>> @@ -8269,7 +8280,9 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>>>   out_attach_fail:
>>>         destroy_workqueue(ioc->firmware_event_thread);
>>>   out_thread_fail:
>>> +       spin_lock_irqsave(&gioc_lock, flags);
>>>         list_del(&ioc->list);
>>> +       spin_unlock_irqrestore(&gioc_lock, flags);
>>>         scsi_host_put(shost);
>>>         return rv;
>>>  }
>>> --
>>> 1.7.1
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>> More majordomo info at  https://urldefense.proofpoint.com/v2/url?u=http-3A__vger.kernel.org_majordomo-2Dinfo.html&d=AwIBAg&c=IGDlg0lD0b-nebmJJ0Kp8A&r=pgTjrEDGfGqPu_zr6l0Sfha_2nqkTk8bj3lFa2LT_F1MrbglgV67gsujNu2SE4YG&m=ThFlxMm2uiQyfGyRg8DjB6ripUHLJepg9p5KAhR_nyw&s=W1BHmEd6ZxZX-lsL0P1VTbdAQ9zpAXLjCCl4LS3bUhI&e=
>>
>>
>>
>> --
>> Nagarajkumar Narayanan

Thanks,
Johannes

-- 
Johannes Thumshirn                                           Storage
jthumshirn@xxxxxxx                                 +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600  D0D0 0393 969D 2D76 0850
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux