RE: question on block-layer timeout change

"Shi, Harris" <Harris.Shi@xxxxxxx> · Sat, 24 Jan 2009 09:34:05 -0700

Christie, 

Sorry for the late response. I was off-track by other issues for a while. On controller sysReboot, it actually did not send any asynchronous event/message for iscsid to destroy the session. It simply went offline for a while and then back online. Now I was able to enable iscsid debug trace and capture a log for your reference. My serial console was disconnected during the weekend for some reasons, whenever it's back online, I might try some more cases. 

Thanks.
Harris

-----Original Message-----
From: Mike Christie [mailto:michaelc@xxxxxxxxxxx] 
Sent: Wednesday, January 07, 2009 2:46 PM
To: Shi, Harris
Cc: Mike Anderson; Hannes Reinecke; malahal@xxxxxxxxxx; SCSI development list
Subject: Re: question on block-layer timeout change

Shi, Harris wrote:
> What we did is to sysReboot one controller while I/O is up and running, we expected to have controller failover and did not explicitly run iscsiadm to logout the session in this case. Yes. We got the oop after conn error message. If controller reboot resulted in session gone, that might explain we had a kernel panic. However, MPP has been running on iscsi array for quite a while on SLES10/REHL distribution, it's the first time we got hit by this kernel panic. That's why we had a some concerns on iscsi stack. 
> 

The failover should not have resulted in the session getting destroyed. 
It might depend on what is in SLES though. In older tools if the chap 
values changed (we logged in with one user/pass and the target no wants 
diff values) or for some errors the initiator thought was fatal. For 
this though you should have seen a error in logs indicating that we got 
a fatal login error.

What does the portal return when you were rebooting it if the initiator 
tries to log in? Look at the open-iscsi userspace tools in SLES. Is your 
target returning a error values that 
initiator.c:__check_iscsi_status_class() would destroy the session for? 
Is it now changed upstream?

> -----Original Message-----
> From: Mike Christie [mailto:michaelc@xxxxxxxxxxx] 
> Sent: Sunday, January 04, 2009 11:12 AM
> To: Mike Anderson
> Cc: Shi, Harris; Hannes Reinecke; malahal@xxxxxxxxxx; SCSI development list
> Subject: Re: question on block-layer timeout change
> 
> Mike Anderson wrote:
>> Shi, Harris <Harris.Shi@xxxxxxx> wrote:
>>> Information from /var/log/messages:
>>> ===================================
>>> Dec 17 15:58:14 timon kernel: sd 6:0:0:2: [sdd] Sense Key : Recovered Error [current]
>>> Dec 17 15:58:14 timon kernel: sd 6:0:0:2: [sdd] <<vendor>> ASC=0x95 ASCQ=0x1ASC=0x95 ASCQ=0x1
>>> Dec 17 15:58:25 timon kernel:  connection2:0: ping timeout of 15 secs expired, last rx 19237, last ping 20487, now 24237
>>> Dec 17 15:58:25 timon kernel:  connection2:0: detected conn error (1011)
>>> Dec 17 15:58:26 timon iscsid: Kernel reported iSCSI connection 2:0 error (1011) state (3)
>>>
>>>
>>>
>>> Information from Serial output:
>>> ===============================
>>> Oops: 0002 [#1] SMP
>>> last sysfs file: /sys/devices/system/cpu/cpu3/cache/index1/shared_cpu_map
>>> Modules linked in: radeon drm agpgart crc32c libcrc32c ib_iser rdma_cm ib_cm nfs iw_cm lockd ib_sa ib_mad nfs_acl ib_core i6
>>> IP: [<c011a274>] __ticket_spin_lock+0x8/0x19
>>> *pdpt = 00000000319fe001 *pde = 0000000000000000
>>> BUG: unable to handle kernel NULL pointer dereference at 00000086
>>> IP: [<c011a274>] __ticket_spin_lock+0x8/0x19
>>> *pdpt = 0000000000546001 *pde = 0000000000000000
>>>  ipv6 af_packet microcode fuse loop dm_mod mptctl e1000 iTCO_wdt sr_mod video iTCO_vendor_support e752x_edac output shpchp ]
>>>
>>> Pid: 0, comm: swapper Not tainted (2.6.28-rc8-test-1-pae #1) PowerEdge 2850
>>> EIP: 0060:[<c011a274>] EFLAGS: 00010086 CPU: 3
>>> EIP is at __ticket_spin_lock+0x8/0x19
>>> EAX: 00000086 EBX: f10f6380 ECX: f20b5400 EDX: 00000100
>>> ESI: f18223b0 EDI: 00000000 EBP: f38a5e78 ESP: f38a5e78
>>>  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
>>> Process swapper (pid: 0, ti=f38a4000 task=f38a2fd0 task.ti=f38a4000)
>>> Stack:
>>>  f38a5e80 c0328e0f f38a5e98 f9298389 00000002 f10f6380 f18223b0 00000000
>>>  f38a5ea4 f7e13396 f11b9300 f38a5eb0 c0212539 f11b9300 f38a5ed4 c02125f2
>>>  f18225b8 00000282 f389c000 f18224f4 00000100 f389c000 c0212573 f38a5f08
>>> Call Trace:
>>>  [<c0328e0f>] ? _spin_lock+0x15/0x18
>>>  [<f9298389>] ? iscsi_eh_cmd_timed_out+0x24/0xb0 [libiscsi]
>>>  [<f7e13396>] ? scsi_times_out+0x35/0x61 [scsi_mod]
>>>  [<c0212539>] ? blk_rq_timed_out+0xc/0x46
>> I could not match my listing exactly with this output, but it appears that
>> the session is NULL when we call into iscsi_eh_cmd_timed_out. An addr2line
>> would help verify the iscsi_eh_cmd_timed_out line.
>>
>> I added Mike C to the email cc for possible comments on the error messages
>> displayed above and if that would lead to cleanup of structures referenced
>> in iscsi_eh_cmd_timed_out.
>>
> 
> Sorry for the late reply. I have been on vacation.
> 
> The iscsi error message just indicates that the initiator tried to send 
> a iscsi ping and it did not get a response, so the initiator dropped the 
> session. The error was reported as a generic connection error (1011) and 
> when the error was fired the initiator was in the logged in / full 
> feature phase (this basically means normal old use and nothing special).
> 
> Due to the sles use and MPP driver, I am not sure what exactly is 
> running, but for this code path the iscsi driver does this:
> 
>          cls_session = starget_to_session(scsi_target(scmd->device));
>          session = cls_session->dd_data;
> 
> to get the session in iscsi_eh_cmd_timed_out (we do this in all kernels 
> do that has not changed).
> 
> The session pointers are only changed when a session is destroyed, and 
> that only happens if you do a logout of the session (iscsiadm -m .... 
> -u), and at that time when the session is destroyed we should have 
> flushed all IO.
> 
> Are you guys doing a logout of the session with iscsiadm at this time?
> 
> Does MPP clone commands and is it doing something with the command's 
> pointers to the device?
> 
> Another possibilty is that commands are not getting cleaned up 
> correctly. When you see this "connection2:0: detected conn error 
> (1011)", the driver is going to kill all outstadnding commands and and 
> call scsi_done on them to requeue them with the scsi layer, so we should 
> be getting any commands timed out after you see that message (maybe only 
> in some race case where the session's commands are getting flushed at 
> the exact same time the scsi eh was firing). There was no time stamp on 
> the oops output but I doubt this happened. Did the oops happen after the 
> conn error message though?

=~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2009.01.24 09:37:15 =~=~=~=~=~=~=~=~=~=~=~=

iscsid: poll result 1
iscsid: in ctldev_handle
iscsid: in nl_read
iscsid: ctldev_handle got event type 102

iscsid: get conn context 0x6d7b30
iscsid: message real length is 72 bytes, recv_handle 0x6d9430
iscsid: in nlpayload_read
iscsid: sched conn context 0x6d7b30 event 3, tmo 0
iscsid: thread 0x6d7b30 schedule: delay 0 state 3
iscsid: exec thread 006d7b30 callback
iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
iscsid: put conn context 0x6d7b30
iscsid: re-opening session 1 (reopen_cnt 0)
iscsid: thread 006d59d0 delete: state 3
iscsid: thread 006d5a08 delete: state 3
iscsid: disconnecting conn 0x6d3508, fd 5
iscsid: in kstop_conn
iscsid: in __kipc_call
iscsid: in kwritev
BUG: unable to handle kernel NULL pointer dereference at 00000000000000ba
IP: [<ffffffff8049bd5f>] _spin_lock+0x5/0x1b
PGD 23dc067 PUD 238c067 PMD 0 
Oops: 0002 [1] SMP 
last sysfs file: /sys/devices/mppvirtual/mppadapter0/host6/target6:0:0/6:0:0:8/block/sdj/dev
CPU 0 
Modules linked in: crc32c libcrc32c radeon drm iscsi_tcp libiscsi scsi_transport_iscsi nfs lockd nfs_acl sunrpc microcode ipv6 fuse loop dm_mod dcdbas(X) rtc_cmos rtc_core rtc_lib e1000 video mptctl output floppy iTCO_wdt iTCO_vendor_support e752x_edac button pcspkr edac_core shpchp sr_mod pci_hotplug cdrom uhci_hcd ehci_hcd usbcore mppVhba edd ext3 mbcache jbd fan ide_pci_generic piix ide_core ata_generic ata_piix libata dock mptspi mptscsih mptbase scsi_transport_spi thermal processor thermal_sys hwmon mppUpper sg sd_mod crc_t10dif scsi_mod
Supported: Yes, External
Pid: 8, comm: kblockd/0 Tainted: G          2.6.27.11-1-default #1
RIP: 0010:[<ffffffff8049bd5f>]  [<ffffffff8049bd5f>] _spin_lock+0x5/0x1b
RSP: 0018:ffffffff80a36e38  EFLAGS: 00010082
RAX: 0000000000010000 RBX: 0000000000000002 RCX: ffff8800070b2a38
RDX: ffff880002c16000 RSI: 0000000000002007 RDI: 00000000000000ba
RBP: ffff88000472a240 R08: ffff88000472a5a0 R09: 0000000000000000
R10: 0000000000000000 R11: ffff8800010dc228 R12: 0000000000000000
R13: ffff880005482dc8 R14: ffffffff80a36ed0 R15: ffff88000472a5b8
FS:  0000000000000000(0000) GS:ffffffff808cac80(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000000000ba CR3: 00000000023d4000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kblockd/0 (pid: 8, threadinfo ffff880008628000, task ffff880008626200)
Stack:  ffffffffa037172f ffff8800068e5380 ffffffffa0004f99 ffff8800070b28f8
 ffffffff8033e773 ffff8800070b28f8 ffffffff8033e91a 0000000000000282
 ffff88000472a240 0000000000000100 ffffffff80ab7680 ffffffff8033e887
Call Trace:
 [<ffffffffa037172f>] iscsi_eh_cmd_timed_out+0x22/0xa3 [libiscsi]
 [<ffffffffa0004f99>] scsi_times_out+0x46/0x72 [scsi_mod]
 [<ffffffff8033e773>] blk_rq_timed_out+0xe/0x4a
 [<ffffffff8033e91a>] blk_rq_timed_out_timer+0x93/0x119
 [<ffffffff802465f0>] run_timer_softirq+0x18d/0x204
 [<ffffffff80242b4d>] __do_softirq+0x7d/0x107
 [<ffffffff8020d2dc>] call_softirq+0x1c/0x28
 [<ffffffff8020e583>] do_softirq+0x2c/0x68
 [<ffffffff8024287a>] irq_exit+0x3f/0x85
 [<ffffffff8021b4f8>] smp_apic_timer_interrupt+0x95/0xae
 [<ffffffff8020ccb6>] apic_timer_interrupt+0x66/0x70
 [<ffffffff8049bd99>] _spin_unlock_irqrestore+0x5/0x6
 [<ffffffffa0006ed1>] scsi_queue_insert+0x94/0xda [scsi_mod]
 [<ffffffffa000090a>] scsi_dispatch_cmd+0x1ec/0x235 [scsi_mod]
 [<ffffffffa0005fda>] scsi_request_fn+0x391/0x4d3 [scsi_mod]
 [<ffffffff8033ad53>] generic_unplug_device+0x21/0x2e
 [<ffffffff80339b39>] blk_unplug_work+0x5e/0x65
 [<ffffffff8024cdf9>] run_workqueue+0x7a/0x100
 [<ffffffff8024cf57>] worker_thread+0xd8/0xe7
 [<ffffffff8024ff67>] kthread+0x47/0x73
 [<ffffffff8020cf79>] child_rip+0xa/0x11

Code: a2 ec ff c3 9c 58 fa 90 81 2f 00 00 00 01 74 05 e8 a7 a2 ec ff c3 fa 90 81 2f 00 00 00 01 74 05 e8 97 a2 ec ff c3 b8 00 00 01 00 <f0> 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 07 f3 90 0f b7 17 eb f5 
RIP  [<ffffffff8049bd5f>] _spin_lock+0x5/0x1b
 RSP <ffffffff80a36e38>
CR2: 00000000000000ba
---[ end trace 1330da6161ef128a ]---
Kernel panic - not syncing: Aiee, killing interrupt handler!
------------[ cut here ]------------
WARNING: at kernel/smp.c:331 smp_call_function_mask+0x32/0x1ec()
Modules linked in: crc32c libcrc32c radeon drm iscsi_tcp libiscsi scsi_transport_iscsi nfs lockd nfs_acl sunrpc microcode ipv6 fuse loop dm_mod dcdbas(X) rtc_cmos rtc_core rtc_lib e1000 video mptctl output floppy iTCO_wdt iTCO_vendor_support e752x_edac button pcspkr edac_core shpchp sr_mod pci_hotplug cdrom uhci_hcd ehci_hcd usbcore mppVhba edd ext3 mbcache jbd fan ide_pci_generic piix ide_core ata_generic ata_piix libata dock mptspi mptscsih mptbase scsi_transport_spi thermal processor thermal_sys hwmon mppUpper sg sd_mod crc_t10dif scsi_mod
Supported: Yes, External
Pid: 8, comm: kblockd/0 Tainted: G      D   2.6.27.11-1-default #1

Call Trace:
 [<ffffffff8020da29>] show_trace_log_lvl+0x41/0x58
 [<ffffffff80499eaf>] dump_stack+0x69/0x6f
 [<ffffffff8023d9b2>] warn_on_slowpath+0x51/0x77
 [<ffffffff8025d609>] smp_call_function_mask+0x32/0x1ec
 [<ffffffff8025d7ec>] smp_call_function+0x29/0x2e
 [<ffffffff8021a34a>] native_smp_send_stop+0x1a/0x26
 [<ffffffff80499f71>] panic+0xbc/0x169
 [<ffffffff80241256>] do_exit+0x67/0x325
 [<ffffffff8049c6dc>] oops_begin+0x0/0x90
 [<ffffffff8049e42e>] do_page_fault+0x5fa/0x6ca
 [<ffffffff8049bfb9>] error_exit+0x0/0x51
 [<ffffffff8049bd5f>] _spin_lock+0x5/0x1b
 [<ffffffffa037172f>] iscsi_eh_cmd_timed_out+0x22/0xa3 [libiscsi]
 [<ffffffffa0004f99>] scsi_times_out+0x46/0x72 [scsi_mod]
 [<ffffffff8033e773>] blk_rq_timed_out+0xe/0x4a
 [<ffffffff8033e91a>] blk_rq_timed_out_timer+0x93/0x119
 [<ffffffff802465f0>] run_timer_softirq+0x18d/0x204
 [<ffffffff80242b4d>] __do_softirq+0x7d/0x107
 [<ffffffff8020d2dc>] call_softirq+0x1c/0x28
 [<ffffffff8020e583>] do_softirq+0x2c/0x68
 [<ffffffff8024287a>] irq_exit+0x3f/0x85
 [<ffffffff8021b4f8>] smp_apic_timer_interrupt+0x95/0xae
 [<ffffffff8020ccb6>] apic_timer_interrupt+0x66/0x70
 [<ffffffff8049bd99>] _spin_unlock_irqrestore+0x5/0x6
 [<ffffffffa0006ed1>] scsi_queue_insert+0x94/0xda [scsi_mod]
 [<ffffffffa000090a>] scsi_dispatch_cmd+0x1ec/0x235 [scsi_mod]
 [<ffffffffa0005fda>] scsi_request_fn+0x391/0x4d3 [scsi_mod]
 [<ffffffff8033ad53>] generic_unplug_device+0x21/0x2e
 [<ffffffff80339b39>] blk_unplug_work+0x5e/0x65
 [<ffffffff8024cdf9>] run_workqueue+0x7a/0x100
 [<ffffffff8024cf57>] worker_thread+0xd8/0xe7
 [<ffffffff8024ff67>] kthread+0x47/0x73
 [<ffffffff8020cf79>] child_rip+0xa/0x11

---[ end trace 1330da6161ef128a ]---