On Thu, 2012-05-17 at 08:55 +0000, Bart Van Assche wrote: > On 05/17/12 08:42, James Bottomley wrote: > > > On Wed, 2012-04-18 at 18:48 +0200, Tomas Henzl wrote: > >> On 04/12/2012 02:48 PM, Tomas Henzl wrote: > >>> Hi, > >>> > >>> the first patch waited in scsi_remove_host was until the scan thread ends, some flags were set > >>> so the thread could be aborted prematurely. > >>> > >>> This patch uses a try_module_get to lock the module and prevents a rmmod while it's locked. > >>> The disadvantage is that in the protection we use again the same host template (shost->hostt->module;), > >>> and when this were not valid I think and oops in try_module_get will follow. It seems to me > >>> that the LLDs use scsi_scan_host only in the initial attach function so this is safe now, but .. > >> Actually I did some testing and it is not so stable as it should be - after 'modprobe mpt2sas && rmmod --wait mpt2sas' > >> I sometimes see: > >> > >> [ 215.126197] mpt2sas version 12.100.00.00 loaded > >> [ 215.127509] scsi8 : Fusion MPT SAS Host > >> [ 215.128058] mpt2sas0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (6096496 kB) > >> [ 215.128167] mpt2sas 0000:03:00.0: irq 69 for MSI/MSI-X > >> [ 215.128190] mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 69 > >> [ 215.128192] mpt2sas0: iomem(0x00000000f7040000), mapped(0xffffc90006010000), size(16384) > >> [ 215.128194] mpt2sas0: ioport(0x000000000000d000), size(256) > >> [ 215.241249] mpt2sas0: Allocated physical memory: size(3988 kB) > >> [ 215.241253] mpt2sas0: Current Controller Queue Depth(1753), Max Controller Queue Depth(2000) > >> [ 215.241256] mpt2sas0: Scatter Gather Elements per IO(128) > >> [ 215.300281] mpt2sas0: LSISAS2008: FWVersion(11.00.00.00), ChipRevision(0x01), BiosVersion(07.23.01.00) > >> [ 215.300284] mpt2sas0: Protocol=(Initiator), Capabilities=(Raid,TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ) > >> [ 215.300602] mpt2sas0: sending port enable !! > >> [ 216.854720] mpt2sas0: host_add: handle(0x0001), sas_addr(0x500605b0006b8530), phys(8) > >> [ 222.978469] mpt2sas0: port enable: SUCCESS > >> [ 222.980104] scsi 8:0:0:0: Direct-Access SEAGATE ST9146852SS 0005 PQ: 0 ANSI: 5 > >> [ 222.980113] scsi 8:0:0:0: SSP: handle(0x0009), sas_addr(0x5000c5001ac10319), phy(2), device_name(0x5000c5001ac10318) > >> [ 222.980117] scsi 8:0:0:0: SSP: enclosure_logical_id(0x500605b0006b8530), slot(2) > >> [ 222.980121] scsi 8:0:0:0: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1) > >> [ 222.981149] mpt2sas version 12.100.00.00 unloading > >> [ 222.981467] BUG: unable to handle kernel NULL pointer dereference at 0000000000000079 > >> [ 222.981483] IP: [<ffffffff811f0c45>] sysfs_create_dir+0x35/0xc0 > >> [ 222.981499] PGD 0 > >> [ 222.981507] Oops: 0000 [#1] SMP > >> [ 222.981518] CPU 0 > >> [ 222.981522] Modules linked in: mpt2sas(-) lockd bnep bluetooth be2iscsi iscsi_boot_sysfs bnx2i cnic uio ip6t_REJECT cxgb4i cxgb4 cxgb3i > >> [ 222.981571] mpt2sas0: removing handle(0x0009), sas_addr(0x5000c5001ac10319) > >> [ 222.981583] libcxgbi cxgb3 nf_conntrack_ipv6 nf_defrag_ipv6 mdio nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip6table_filter ip6_tables snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep i7core_edac edac_core snd_seq snd_seq_device hp_wmi iTCO_wdt iTCO_vendor_support snd_pcm snd_timer snd microcode soundcore snd_page_alloc sparse_keymap serio_raw tg3 rfkill uinput sunrpc firewire_ohci firewire_core crc_itu_t scsi_transport_sas raid_class nouveau ttm drm_kms_helper drm i2c_core mxm_wmi video wmi [last unloaded: mpt2sas] > >> [ 222.981790] > >> [ 222.981796] Pid: 1311, comm: scsi_scan_8 Not tainted 3.3.0 #21 Hewlett-Packard HP Z400 Workstation/0B4Ch > >> [ 222.981815] RIP: 0010:[<ffffffff811f0c45>] [<ffffffff811f0c45>] sysfs_create_dir+0x35/0xc0 > >> [ 222.981829] RSP: 0018:ffff8801929a7ce0 EFLAGS: 00010246 > >> [ 222.981837] RAX: ffff880190c65810 RBX: ffff880190d5a438 RCX: 0000000000000000 > >> [ 222.981846] RDX: ffff880199034c60 RSI: ffff880192a81018 RDI: ffff880190d5a438 > >> [ 222.981854] RBP: ffff8801929a7d10 R08: ffff880199034c60 R09: ffff880195516d80 > >> [ 222.981863] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 > >> [ 222.981872] R13: 0000000000000000 R14: 0000000000000000 R15: ffff880190d5a400 > >> [ 222.981881] FS: 0000000000000000(0000) GS:ffff88019fc00000(0000) knlGS:0000000000000000 > >> [ 222.981891] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > >> [ 222.981899] CR2: 0000000000000079 CR3: 0000000001c05000 CR4: 00000000000006f0 > >> [ 222.981913] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > >> [ 222.981922] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > >> [ 222.981932] Process scsi_scan_8 (pid: 1311, threadinfo ffff8801929a6000, task ffff88018b6e8000) > >> [ 222.981941] Stack: > >> [ 222.981946] ffff8801929a7d30 ffffffff810583a5 ffff8801929a7d10 ffff880190d5a438 > >> [ 222.981965] mpt2sas0: sending diag reset !! > >> [ 222.983072] ffff880190c65810 0000000000000000 ffff8801929a7d60 ffffffff812c2e6e > >> [ 222.983644] ffff8801929a7d66 0044b82fa09b5a53 ffff8801929a7dd0 ffff880190d5a438 > >> [ 222.984215] Call Trace: > >> [ 222.984776] [<ffffffff810583a5>] ? console_unlock+0x1f5/0x270 > >> [ 222.985334] [<ffffffff812c2e6e>] kobject_add_internal+0xae/0x250 > >> [ 222.985888] [<ffffffff812c3417>] kobject_add+0x67/0xc0 > >> [ 222.986461] [<ffffffff8139eb32>] device_add+0x102/0x6c0 > >> [ 222.987050] [<ffffffff813c8288>] scsi_sysfs_add_sdev+0x198/0x340 > >> [ 222.987643] [<ffffffff813c6b24>] do_scan_async+0x84/0x170 > >> [ 222.988241] [<ffffffff813c6aa0>] ? do_scsi_scan_host+0xa0/0xa0 > >> [ 222.988845] [<ffffffff81079c63>] kthread+0x93/0xa0 > >> [ 222.989443] [<ffffffff815fd1e4>] kernel_thread_helper+0x4/0x10 > >> [ 222.990046] [<ffffffff81079bd0>] ? kthread_freezable_should_stop+0x70/0x70 > >> [ 222.990655] [<ffffffff815fd1e0>] ? gs_change+0x13/0x13 > >> [ 222.991256] Code: 83 ec 18 66 66 66 66 90 48 85 ff 48 89 fb 0f 84 98 00 00 00 48 8b 47 18 49 c7 c4 20 0e c5 81 48 85 c0 74 04 4c 8b 60 30 45 31 ed <41> 80 7c 24 79 00 75 5b 48 89 df e8 8b 2c 0d 00 48 85 c0 74 66 > >> [ 222.992012] RIP [<ffffffff811f0c45>] sysfs_create_dir+0x35/0xc0 > >> [ 222.992626] RSP <ffff8801929a7ce0> > >> [ 222.993265] CR2: 0000000000000079 > >> [ 222.993943] ---[ end trace 024b7be18310b205 ]--- > >> [ 224.049467] mpt2sas0: diag reset: SUCCESS > >> > >> This means that the module protection with try_module_get doesn't work, the driver removal starts again > >> in the middle of a scan. > > > > Not necessarily; it means that the remove proceeded too far before it > > got to the sync point, which is the scan mutex acquisition in > > scsi_remove_host(). The fix is to make sure outstanding async work is > > completed before beginning the removal. That can't really be done in > > SCSI, but it looks like this might be the trick. > > > > James > > > > --- > > diff --git a/drivers/base/driver.c b/drivers/base/driver.c > > index 3ec3896..246e652 100644 > > --- a/drivers/base/driver.c > > +++ b/drivers/base/driver.c > > @@ -10,6 +10,7 @@ > > * > > */ > > > > +#include <linux/async.h> > > #include <linux/device.h> > > #include <linux/module.h> > > #include <linux/errno.h> > > @@ -199,6 +200,13 @@ EXPORT_SYMBOL_GPL(driver_register); > > */ > > void driver_unregister(struct device_driver *drv) > > { > > + /* > > + * make sure all async bits that may be running have completed the way > > + * this works is that try_module_get() now fails, so new async work > > + * should take a reference to the module which now fails, so it should > > + * be safe to remove the driver. > > + */ > > + async_synchronize_full(); > > if (!drv || !drv->p) { > > WARN(1, "Unexpected driver unregister!\n"); > > return; > > > I'm hoping there exists another way to fix this. A patch like the above > will cause driver_unregister() to take forever if the rate at which new > devices are added is high enough. As far as I understand > async_synchronize_full() the time during which this function waits is > extended every time scanning of a newly added device starts. module removal isn't really supposed to be a common occurrence. Rusty even argued it's impossible to fix the races, which is why there's a kernel option to disallow it. I don't think inserting a checkpoint in the async domain for module removal is unreasonable. If it shows up as unreasonable in practise, we can label all the async work by THIS_MODULE and implement async_synchronize_module(THIS_MODULE) which would limit it, but I don't really think there'll be a huge problem. James -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html