Hi Nab, On 6/20/2014 12:32 PM, Nicholas A. Bellinger wrote: > On Thu, 2014-06-19 at 23:43 -0700, Nicholas A. Bellinger wrote: >> On Fri, 2014-06-20 at 11:30 +0530, Santosh Kulkarni wrote: >>> Hi Nab, >>> >>> On Friday 20 June 2014 01:47 AM, Nicholas A. Bellinger wrote: >>>> On Thu, 2014-06-19 at 14:16 +0530, Santosh Kulkarni wrote: >>>>> Hi Nab, >>>>> >>>>> On Thursday 19 June 2014 01:32 AM, Nicholas A. Bellinger wrote: >>>>>> Hi Santosh, >>>>>> >>>>>> On Wed, 2014-06-18 at 15:34 +0530, Santosh Kulkarni wrote: >>>>>>> Hi Nab, >>>>>>> >>>>>>> Facing Kernel OOP when i am issuing Writes across Multiple connections >>>>>>> in a single session. >>>>>>> The crash happens immediately when the DataOut PDU is sent on the second >>>>>>> connection. >>>>>>> >>>>>> To confirm, this occurs when the initiator sends a ISCSI_OP_SCSI_CMD >>>>>> with ITT=X setting it's allegiance to one connection, and then >>>>>> (illegally) sends a ISCSI_OP_SCSI_DATA_OUT with ITT=X on another >>>>>> connection, right..? >>>>> There was a slight mistake in the way i formed my sentence earlier. >>>>> The WRITE10 request and DataOut PDUs are being issued on the "same >>>>> connection" within a session. >>>>> The Crash happens only when this is issued on multiple connections.ITT >>>>> remains same across all connections >>>>> If i carry out WRITE only on single connections the target is behaving fine. >>>>> And the behavior is not persistent.It happens intermittently.Here is a >>>>> new crash dump >>>>> >>>> Mmmmm, I need the pr_debug output to be enabled in order to make further >>>> sense of what's going on here. >>>> >>>> Please enable CONFIG_DYNAMIC_DEBUG=y, mount debugfs at /debug and do: >>>> >>>> 'echo iscsi_target_mod +p' > /debug/dynamic_debug/cpmtrp; >>> >>> Thanks for the quick response. >>> >>> The output was with dynamic debug mode ON. >>> However i double checked with the above procedure. And i see the same >>> debug logs. >> When dynamic debug is enabled, the pr_debug() messages appear in dmesg >> output, and not in the normal syslog / klog messages. >> >> Eg: You should see normal PDUs being processed from pr_debug() output in >> dmesg when enabled, along with the pr_err() output from the previous >> email. >> >>> The crash happens even with incremental ITT and also while using same >>> ITT throughout the session. >>> >>> Do let me know if you need anything else. >>> Adding Arshad in Cc. >> Can you send along the specific test off-list please..? >> > One other quick thing to try is the following patch to dump the Data-out > payload when a ITT received is not associated with a WRITE (following > the log), instead of generating a Reject for the existing ITT. > > I can't quite tell from wireshark output why your test is sending > Data-out with an ITT that is associated with a READ that has not been > acknowledged by StatSN, but the following should at least leave the READ > ITT alone. > > --nab > > diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c > index b87721a..049fe14 100644 > --- a/drivers/target/iscsi/iscsi_target.c > +++ b/drivers/target/iscsi/iscsi_target.c > @@ -1308,7 +1308,7 @@ iscsit_check_dataout_hdr(struct iscsi_conn *conn, unsigned char *buf, > if (cmd->data_direction != DMA_TO_DEVICE) { > pr_err("Command ITT: 0x%08x received DataOUT for a" > " NON-WRITE command.\n", cmd->init_task_tag); > - return iscsit_reject_cmd(cmd, ISCSI_REASON_PROTOCOL_ERROR, buf); > + return iscsit_dump_data_payload(conn, payload_length, 1); > } > se_cmd = &cmd->se_cmd; > iscsit_mod_dataout_timer(cmd); > > Santosh is traveling and I will help out with any answers you need. The ITT is unique within a session, should READ here be a problem.? And what we have observed that our the crash is intermittent as we have mention before. Tried out your patch on 3.14.0-rc6+ and 3.15.0-rc3+. With 3.15.0-rc3 the target is not crashing. The dmesg shows the line below. [root@wfsc iscsi]# dmesg -c [ 210.623207] Command ITT: 0xa3a3a3a3 received DataOUT for a NON-WRITE command. [root@wfsc iscsi]# uname -a Linux wfsc 3.15.0-rc3+ #3 SMP Fri May 30 06:11:42 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux With 3.14.0-rc6+. It is crashing with the below output. [ 124.571009] Command ITT: 0xa3a3a3a3 received DataOUT for a NON-WRITE command. [ 124.623127] ISCSI_FLAG_CMD_WRITE & ISCSI_FLAG_CMD_FINAL not set. Bad iSCSI Initiator. [ 124.632252] ISCSI_FLAG_CMD_WRITE & ISCSI_FLAG_CMD_FINAL not set. Bad iSCSI Initiator. [ 124.676209] ISCSI_FLAG_CMD_WRITE & ISCSI_FLAG_CMD_FINAL not set. Bad iSCSI Initiator. [ 124.736480] ISCSI_FLAG_CMD_WRITE & ISCSI_FLAG_CMD_FINAL not set. Bad iSCSI Initiator. [ 124.924880] ISCSI_FLAG_CMD_WRITE & ISCSI_FLAG_CMD_FINAL not set. Bad iSCSI Initiator. [ 124.940144] ISCSI_FLAG_CMD_WRITE & ISCSI_FLAG_CMD_FINAL not set. Bad iSCSI Initiator. [ 125.040862] ISCSI_FLAG_CMD_WRITE & ISCSI_FLAG_CMD_FINAL not set. Bad iSCSI Initiator. [ 125.050377] ISCSI_FLAG_CMD_WRITE & ISCSI_FLAG_CMD_FINAL not set. Bad iSCSI Initiator. [ 125.066584] ISCSI_FLAG_CMD_WRITE & ISCSI_FLAG_CMD_FINAL not set. Bad iSCSI Initiator. [ 125.078853] ISCSI_FLAG_CMD_WRITE & ISCSI_FLAG_CMD_FINAL not set. Bad iSCSI Initiator. [ 125.081819] ISCSI_FLAG_CMD_WRITE & ISCSI_FLAG_CMD_FINAL not set. Bad iSCSI Initiator. [ 127.576143] general protection fault: 0000 [#1] SMP [ 127.580095] Modules linked in: nls_utf8 isofs vboxsf(O) tcm_loop tcm_fc libfc scsi_transport_fc scsi_tgt iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod configfs ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bnep rfcomm bluetooth snd_intel8x0 snd_ac97_codec 6lowpan_iphc ac97_bus snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device snd joydev vboxguest(O) psmouse mac_hid i2c_piix4 ppdev serio_raw soundcore parport_pc lp parport hid_generic usbhid hid e1000 [ 127.580095] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 3.14.0-rc6+ #4 [ 127.580095] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 127.580095] task: ffffffff81c104a0 ti: ffffffff81c00000 task.ti: ffffffff81c00000 [ 127.580095] RIP: 0010:[<ffffffffa02f7c10>] [<ffffffffa02f7c10>] iscsit_tpg_get_node_attrib+0x10/0x20 [iscsi_target_mod] [ 127.580095] RSP: 0018:ffff88003fc03dd8 EFLAGS: 00010246 [ 127.580095] RAX: 617941580000000d RBX: ffff880037baa640 RCX: 0000000000000000 [ 127.580095] RDX: 0000000000000002 RSI: 0000000000000200 RDI: ffff88001a1d2800 [ 127.580095] RBP: ffff88003fc03dd8 R08: 0000000000016ab1 R09: 0000000081e5b240 [ 127.580095] R10: 0000000000000020 R11: 0000000000000000 R12: ffff880037b65800 [ 127.580095] R13: ffff880037baa724 R14: ffff88001a1d2800 R15: ffffffffa02f17d0 [ 127.580095] FS: 0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000 [ 127.580095] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 127.580095] CR2: 00007fe77c00b208 CR3: 000000002fdad000 CR4: 00000000000006f0 [ 127.580095] Stack: [ 127.580095] ffff88003fc03e28 ffffffffa02f183b 0000000000000000 0000000000000000 [ 127.580095] ffff88003fc0d1a0 0000000000000000 0000000000000100 ffffffffa02f17d0 [ 127.580095] ffff880037baa758 ffffffff81e5c268 ffff88003fc03e78 ffffffff8105cc15 [ 127.580095] Call Trace: [ 127.580095] <IRQ> [ 127.580095] [<ffffffffa02f183b>] iscsit_handle_dataout_timeout+0x6b/0x250 [iscsi_target_mod] [ 127.580095] [<ffffffffa02f17d0>] ? iscsit_start_dataout_timer+0xa0/0xa0 [iscsi_target_mod] [ 127.580095] [<ffffffff8105cc15>] call_timer_fn+0x45/0x160 [ 127.580095] [<ffffffffa02f17d0>] ? iscsit_start_dataout_timer+0xa0/0xa0 [iscsi_target_mod] [ 127.580095] [<ffffffff8105dfa4>] run_timer_softirq+0x244/0x290 [ 127.580095] [<ffffffff810bc4e4>] ? clockevents_program_event+0x74/0x100 [ 127.580095] [<ffffffff8105591f>] __do_softirq+0xef/0x2e0 [ 127.580095] [<ffffffff81055d0e>] irq_exit+0x7e/0xa0 [ 127.580095] [<ffffffff816c2cda>] smp_apic_timer_interrupt+0x4a/0x60 [ 127.580095] [<ffffffff816c1a8a>] apic_timer_interrupt+0x6a/0x70 [ 127.580095] <EOI> [ 127.580095] [<ffffffff8100b830>] ? default_idle+0x20/0xe0 [ 127.580095] [<ffffffff8100c02e>] arch_cpu_idle+0x1e/0x30 [ 127.580095] [<ffffffff810a8770>] cpu_startup_entry+0xa0/0x260 [ 127.580095] [<ffffffff816a3357>] rest_init+0x77/0x80 [ 127.580095] [<ffffffff81d05e68>] start_kernel+0x3c9/0x3d6 [ 127.580095] [<ffffffff81d0589f>] ? repair_env_string+0x5a/0x5a [ 127.580095] [<ffffffff81d055b2>] x86_64_start_reservations+0x2a/0x2c [ 127.580095] [<ffffffff81d056a4>] x86_64_start_kernel+0xf0/0xf7 [ 127.580095] Code: 01 0f b7 73 08 48 c7 c7 e8 df 30 a0 31 c0 41 83 cd ff e8 7d 58 3b e1 eb 9f 66 90 0f 1f 44 00 00 48 8b 87 f8 01 00 00 55 48 89 e5 <48> 8b 40 10 5d 48 2d 20 05 00 00 c3 0f 1f 40 00 0f 1f 44 00 00 [ 127.580095] RIP [<ffffffffa02f7c10>] iscsit_tpg_get_node_attrib+0x10/0x20 [iscsi_target_mod] [ 127.580095] RSP <ffff88003fc03dd8> [ 127.857183] ---[ end trace 4185931fc757e117 ]--- [ 127.857838] Kernel panic - not syncing: Fatal exception in interrupt [ 127.858779] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff) Thanks, Arshad -- To unsubscribe from this list: send the line "unsubscribe target-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html