Oops on scsi_remove_target

We are running 2.6.12 (or one of several descendants of it). Someone just
let loose a new device on our fabric, and it is causing one of our hosts no
end of grief:

scsi: unknown device type 12
  Vendor: ADIC      Model: SNC               Rev: 42dF
  Type:   RAID                               ANSI SCSI revision: 03
qla2300 0000:18:01.1: Waiting for LIP to complete...
qla2300 0000:18:01.1: LIP reset occured (f7f7).
qla2300 0000:18:01.1: LOOP UP detected (2 Gbps).
qla2300 0000:18:01.1: Topology - (F_Port), Host Loop address 0xffff
qla2300 0000:18:01.0: scsi(3:16:1): Abort command issued -- 197 2002.

and a while later:

Starting udev: Unable to handle kernel NULL pointer dereference at virtual address 0000004c
 printing eip:
*pde = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: sg qla2300 qla2xxx scsi_transport_fc aic7xxx scsi_transport_spi sd_mod scsi_mod
CPU:    2
EIP:    0060:[<c0191fe3>]    Not tainted VLI
EFLAGS: 00010282   (2.6.12-kdb)
EIP is at sysfs_hash_and_remove+0xc/0xfe
eax: 00000000   ebx: f7e096b0   ecx: 00000000   edx: f885f6b4
esi: f7e096a8   edi: f885f6ac   ebp: f7feee68   esp: f7feee4c
ds: 007b   es: 007b   ss: 0068
Process events/2 (pid: 12, threadinfo=f7fee000 task=f7fef530)
Stack: 00000002 00000180 f7e09400 00000000 f7e096b0 f7e096a8 f885f6ac f7feee78 c0193aaa 00000000 c02f41fe f7feee9c c0227691 f7e096b0 c02f41fe f885f640 f885f6b4 f7e096a8 c1a1fff8 c1a20030 f7feeeac c0227702 f7e096a8 f7e09400
Call Trace:
 [<c0103ec2>] show_stack+0x9a/0xd0
 [<c010408d>] show_registers+0x175/0x209
 [<c01042ac>] die+0xfa/0x19c
 [<c0115200>] do_page_fault+0x239/0x6ee
 [<c0103ad7>] error_code+0x4f/0x54
 [<c0193aaa>] sysfs_remove_link+0x1b/0x1d
 [<c0227691>] class_device_del+0x8e/0xed
 [<c0227702>] class_device_unregister+0x12/0x20
 [<f884d083>] scsi_remove_device+0x4e/0x97 [scsi_mod]
 [<f884d156>] __scsi_remove_target+0x8a/0xc9 [scsi_mod]
 [<f884d1b6>] __remove_child+0x21/0x29 [scsi_mod]
 [<c02255bb>] device_for_each_child+0x32/0x53
 [<f884d209>] scsi_remove_target+0x4b/0x5a [scsi_mod]
 [<f883bc54>] fc_timeout_blocked_rport+0x4f/0x55 [scsi_transport_fc]
 [<c012d2ee>] worker_thread+0x18f/0x238
 [<c0131367>] kthread+0xb1/0xb5
 [<c010141d>] kernel_thread_helper+0x5/0xb
Code: c0 e8 29 b2 13 00 89 5c 24 04 8b 45 0c 8b 40 0c 89 04 24 e8 1f b8 fe ff 83 c4 08 5b 5e 5d c3 55 89 e5 57 56 53 83 ec 10 8b 45 08 <8b> 50 4c 8b 48 0c f0 ff 49 74 0f 88 e2 00 00 00 8b 42 0c 8d 58

Entering kdb (current=0xf7fef530, pid 12) on processor 2 Oops: Oops
due to oops @ 0xc0191fe3
eax = 0x00000000 ebx = 0xf7e096b0 ecx = 0x00000000 edx = 0xf885f6b4
esi = 0xf7e096a8 edi = 0xf885f6ac esp = 0xf7feee4c eip = 0xc0191fe3
ebp = 0xf7feee68 xss = 0xc0260068 xcs = 0x00000060 eflags = 0x00010282
xds = 0xf885007b xes = 0x0000007b origeax = 0xffffffff &regs = 0xf7feee18
[2]kdb> bt
Stack traceback for pid 12
0xf7fef530       12        1  1    2   R  0xf7fef6f0 *events/2
EBP        EIP        Function (args)
0xf7feee68 0xc0191fe3 sysfs_hash_and_remove+0xc (0x0, 0xc02f41fe)
0xf7feee78 0xc0193aaa sysfs_remove_link+0x1b (0xf7e096b0, 0xc02f41fe, 0xf885f640, 0xf885f6b4, 0xf7e096a8)
0xf7feee9c 0xc0227691 class_device_del+0x8e (0xf7e096a8, 0xf7e09400)
0xf7feeeac 0xc0227702 class_device_unregister+0x12 (0xf7e096a8, 0x3, 0xf7e09400, 0xc1a1fff8, 0xc1a20000)
0xf7feeec8 0xf884d083 [scsi_mod]scsi_remove_device+0x4e (0xf7e09400, 0xf78fb214, 0xf7feef00, 0xf884d195)
0xf7feeee0 0xf884d156 [scsi_mod]__scsi_remove_target+0x8a (0xf78fb200, 0x0)
0xf7feeef0 0xf884d1b6 [scsi_mod]__remove_child+0x21 (0xf78fb214, 0x0, 0xf7e17840, 0xf7e17844, 0xf78fb220)
0xf7feef18 0xc02255bb device_for_each_child+0x32 (0xf7e17840, 0x0, 0xf884d195, 0xf7e17840, 0xf7e17958)
0xf7feef34 0xf884d209 [scsi_mod]scsi_remove_target+0x4b (0xf7e17840, 0xf883bece, 0xf7e178e4, 0xf7e17800)
0xf7feef4c 0xf883bc54 [scsi_transport_fc]fc_timeout_blocked_rport+0x4f (0xf7e17800, 0xf7feef7c, 0x0, 0xc193090c, 0xc1930914)
0xf7feefb8 0xc012d2ee worker_thread+0x18f (0xc1930900, 0xff, 0x0, 0xc012d15f, 0xffffffff)
0xf7feefe4 0xc0131367 kthread+0xb1
           0xc010141d kernel_thread_helper+0x5
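
For what it's worth, the faulting bytes marked in the Code: line (<8b> 50 4c) can be decoded by hand. The sketch below is only an illustration of that decoding, not a general disassembler: the helper name decode_mov_load is made up for this note, and it handles only the one instruction form seen here (opcode 0x8b with an 8-bit displacement and no SIB byte). Combined with eax = 0 from the register dump, it gives the reported fault address 0000004c, which is consistent with kdb showing 0x0 as the first argument to sysfs_hash_and_remove.

```python
# Decode the faulting instruction bytes from the oops "Code:" line: 8b 50 4c.
# Hypothetical helper for illustration; it handles only this one encoding.

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_load(insn):
    """Decode 'mov disp8(%base),%dest' (opcode 0x8b, ModRM mod=01, no SIB)."""
    opcode, modrm, disp8 = insn[0], insn[1], insn[2]
    if opcode != 0x8B:
        raise ValueError("only opcode 0x8b (mov r32, r/m32) is handled")
    mod = modrm >> 6          # addressing mode
    reg = (modrm >> 3) & 7    # destination register
    rm = modrm & 7            # base register
    if mod != 1 or rm == 4:
        raise ValueError("only the disp8, no-SIB form is handled")
    # The 8-bit displacement is sign-extended.
    disp = disp8 - 256 if disp8 >= 0x80 else disp8
    return REG32[reg], REG32[rm], disp


dest, base, disp = decode_mov_load(bytes([0x8B, 0x50, 0x4C]))

# eax as shown in the register dump of the oops.
registers = {"eax": 0x00000000}
fault_address = (registers[base] + disp) & 0xFFFFFFFF

print(f"mov {hex(disp)}(%{base}),%{dest} -> faults at {fault_address:#010x}")
```

In other words, the instruction reads a field at offset 0x4c from a pointer held in eax, and that pointer is NULL.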


Here is another example:
scsi: unknown device type 12
  Vendor: ADIC      Model: SNC               Rev: 42dF
  Type:   RAID                               ANSI SCSI revision: 03
qla2300 0000:18:01.1: scsi(4:16:1): Abort command issued -- 197 2002.
qla2300 0000:18:01.1: scsi(4:16:1): Abort command issued -- 198 2002.
qla2300 0000:18:01.1: scsi(4:16:1): Abort command issued -- 198 2002.
scsi: Device offlined - not ready after error recovery: host 4 channel 0 id 16 lun 1
scsi: Unexpected response from host 4 channel 0 id 16 lun 1 while scanning, scan aborted

followed by the same oops.

I zoned the fabric to get around the problem for now.

