RE: crash in iscsi/scsi initiator with linux-4.15.0-rc1

On Tue, 2017-12-19 at 13:31 -0600, Steve Wise wrote:
> > > Hey,
> > >
> > > I'm seeing this null pointer dereference with linux-4.15.0-rc1.  To reproduce
> > > it, I connect two ram disks via iscsi/TCP, and start an fio:
> > >
> > > iscsiadm -m discovery --op update --type sendtargets -p 172.16.1.10:3260
> > > iscsiadm -m node -p 172.16.1.10:3260 -l
> > > ISCSI_DISKS=/dev/sdd:/dev/sde; fio --rw=randrw --name=random --norandommap
> > > --ioengine=libaio --size=400m --group_reporting --exitall --fsync_on_close=1
> > > --invalidate=1 --direct=1 --filename=$ISCSI_DISKS --time_based --runtime=300
> > > --iodepth=128 --numjobs=8 --unit_base=1 --bs=64k --kb_base=1000
> > >
> > > Then on the initiator node, while the fio test is running, I detach the devices:
> > >
> > > iscsiadm -m node -p 172.16.1.10:3260 -I iser -u
> > >
> > > Then I hit this crash.  Has anyone else encountered this issue?  Wondering if
> > > there is a fix handy. :)
> > >
> > 
> > This is the same problem that is being discussed under the thread:
> > "[PATCH] scsi: fix race condition when removing target".
> > 
> > We had good test results with both Jason Yan's patch and Bart's patch
> > applied, however the ultimate solution is still in progress, see James'
> > comments.
> > 
> > You could also try reverting fbce4d97fd "scsi: fixup kernel warning
> > during rmmod()" if you just need to get past this.
> > 
> > -Ewan
> > 
> 
> Hey Ewan, Yan, Bart, 
> 
> I'm still seeing this issue with 4.15-rc4.  Is the issue still outstanding?  
> 
> Steve.
> 

Please apply the following commit from the 4.15/scsi-fixes branch of

git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git

and advise if it does not fix your issue.  It should.

----

commit 81b6c999897919d5a16fedc018fe375dbab091c5
Author: Hannes Reinecke <hare@xxxxxxx>
Date:   Wed Dec 13 14:21:37 2017 +0100

    scsi: core: check for device state in __scsi_remove_target()
    
    As it turned out device_get() doesn't use kref_get_unless_zero(), so we
    will be always getting a device pointer.  Consequently, we need to check
    for the device state in __scsi_remove_target() to avoid tripping over
    deleted objects.
    
    Fixes: fbce4d97fd43 ("scsi: fixup kernel warning during rmmod()")
    Reported-by: Jason Yan <yanaijie@xxxxxxxxxx>
    Signed-off-by: Hannes Reinecke <hare@xxxxxxxx>
    Reviewed-by: Bart Van Assche <bart.vanassche@xxxxxxx>
    Reviewed-by: Ewan D. Milne <emilne@xxxxxxxxxx>
    Signed-off-by: Martin K. Petersen <martin.petersen@xxxxxxxxxx>
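
The reasoning in the commit message can be illustrated with a small user-space
model.  This is only a sketch, not the kernel code: every name below
(toy_dev, toy_get, toy_get_unless_zero, toy_pick_checked, toy_pick_unchecked
and the TOY_* states) is hypothetical, and the model is single-threaded, so it
shows the logic of the check rather than the race itself.  The assumption is
that an unconditional "get" always hands back a pointer, so a removal loop has
to look at the device state itself to skip objects that are already being
deleted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device states, loosely modelled on "running / cancel / deleted". */
enum toy_state { TOY_RUNNING, TOY_CANCEL, TOY_DEL };

struct toy_dev {
    int refcount;           /* 0 means the object is already on its way out */
    enum toy_state state;
};

/* Unconditional get: bumps the count and returns the pointer even when
 * the count was already zero, so the caller cannot tell anything is wrong. */
static struct toy_dev *toy_get(struct toy_dev *d)
{
    if (d)
        d->refcount++;
    return d;
}

/* "Get unless zero": refuses to revive an object whose count hit zero. */
static bool toy_get_unless_zero(struct toy_dev *d)
{
    if (!d || d->refcount == 0)
        return false;
    d->refcount++;
    return true;
}

/* Removal loop without a state check: every device is picked, including
 * ones that are already deleted or being cancelled. */
static size_t toy_pick_unchecked(struct toy_dev *devs, size_t n)
{
    size_t picked = 0;

    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!toy_get(&devs[i]))
            continue;
        picked++;
    }
    return picked;
}

/* Removal loop with the state check in front of the get: devices that are
 * deleted or being cancelled are skipped and their count is left alone. */
static size_t toy_pick_checked(struct toy_dev *devs, size_t n)
{
    size_t picked = 0;

    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (devs[i].state == TOY_DEL || devs[i].state == TOY_CANCEL)
            continue;
        if (!toy_get(&devs[i]))
            continue;
        picked++;
    }
    return picked;
}
```

With three devices in the states running, deleted and cancelling, the unchecked
loop picks all three and bumps the count of the deleted one, while the checked
loop picks only the running device.  toy_get_unless_zero() is included only to
show the behaviour the commit message says the real get does not have.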

> ---
> 
> [ 1002.205103] BUG: unable to handle kernel NULL pointer dereference at           (null)
> [ 1002.213022] IP: _raw_spin_lock_irqsave+0x1e/0x40
> [ 1002.217740] PGD 0 P4D 0
> [ 1002.220382] Oops: 0002 [#1] SMP
> [ 1002.223637] Modules linked in: iw_cxgb4 cxgb4 nvme_rdma nvme_fabrics rdma_ktest(O) rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_core libcxgb vfat intel_rapl fat iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper cryptd iTCO_wdt iTCO_vendor_support mxm_wmi mei_me ipmi_si lpc_ich mei pcspkr i2c_i801 mfd_core ipmi_devintf shpchp sg ipmi_msghandler wmi nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 mlx4_en mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm sd_mod igb drm ahci libahci dca mlx4_core
> [ 1002.295663]  ptp libata pps_core crc32c_intel nvme i2c_algo_bit i2c_core nvme_core [last unloaded: cxgb4]
> [ 1002.305563] CPU: 4 PID: 5156 Comm: fio Tainted: G           O     4.15.0-rc4 #3
> [ 1002.313223] Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 3.2a 07/09/2015
> [ 1002.320555] RIP: 0010:_raw_spin_lock_irqsave+0x1e/0x40
> [ 1002.326077] RSP: 0018:ffffc900070cbd10 EFLAGS: 00010046
> [ 1002.331692] RAX: 0000000000000000 RBX: 0000000000000246 RCX: 0000000000000000
> [ 1002.339225] RDX: 0000000000000001 RSI: ffff88085fd0e038 RDI: 0000000000000000
> [ 1002.346763] RBP: ffff880855a65f18 R08: 0000000000000000 R09: 0000000000000744
> [ 1002.354315] R10: 00000000000003ff R11: 0000000000000001 R12: ffff88084992e180
> [ 1002.361873] R13: ffff880855a67000 R14: ffff880855a65800 R15: ffff880856d7d5a8
> [ 1002.369447] FS:  0000000000000000(0000) GS:ffff88085fd00000(0000) knlGS:0000000000000000
> [ 1002.377995] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1002.384209] CR2: 0000000000000000 CR3: 0000000001c09005 CR4: 00000000000606e0
> [ 1002.391826] Call Trace:
> [ 1002.394774]  scsi_device_dev_release_usercontext+0x40/0x230
> [ 1002.400858]  execute_in_process_context+0x58/0x60
> [ 1002.406085]  device_release+0x2d/0x80
> [ 1002.410277]  kobject_cleanup+0x5e/0x180
> [ 1002.414659]  scsi_disk_put+0x2b/0x40 [sd_mod]
> [ 1002.419559]  __blkdev_put+0x1b5/0x1d0
> [ 1002.423777]  ? disk_flush_events+0x24/0x60
> [ 1002.428430]  blkdev_close+0x21/0x30
> [ 1002.432484]  __fput+0xd5/0x210
> [ 1002.436111]  task_work_run+0x82/0xa0
> [ 1002.440262]  do_exit+0x2be/0xb20
> [ 1002.444074]  ? syscall_trace_enter+0x1af/0x290
> [ 1002.449110]  do_group_exit+0x39/0xa0
> [ 1002.453287]  SyS_exit_group+0x10/0x10
> [ 1002.457557]  do_syscall_64+0x61/0x1a0
> [ 1002.461829]  entry_SYSCALL64_slow_path+0x25/0x25
> [ 1002.467064] RIP: 0033:0x7f9abb1c8529
> [ 1002.471266] RSP: 002b:00007ffe53be40d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
> [ 1002.479482] RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00007f9abb1c8529
> [ 1002.487279] RDX: 0000000000000005 RSI: 000000000000000a RDI: 0000000000000005
> [ 1002.495079] RBP: 00007f9a9c9de818 R08: 000000000000003c R09: 00000000000000e7
> [ 1002.502882] R10: ffffffffffffff60 R11: 0000000000000206 R12: 0000000000000006
> [ 1002.510690] R13: 0000000000000006 R14: 0000000000000000 R15: 000000000172a440
> [ 1002.518497] Code: f4 66 90 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 9c 58 66 66 90 66 90 48 89 c3 fa 66 66 90 66 66 90 31 c0 ba 01 00 00 00 <f0> 0f b1 17 85 c0 75 05 48 89 d8 5b c3 89 c6 e8 77 06 9e ff eb
> [ 1002.538742] RIP: _raw_spin_lock_irqsave+0x1e/0x40 RSP: ffffc900070cbd10
> [ 1002.546055] CR2: 0000000000000000
> 




