On Tue, 2014-01-28 at 12:22 -0800, Nicholas A. Bellinger wrote: > On Tue, 2014-01-28 at 13:01 +0530, Arshad Hussain wrote: > > Hi Nicholas, > > <SNIP> > > I tried out you patch. I ran 3 cases with it. The case with runlength>0 > > and without > > htonl() conversion still crashes the machine. > > > > 1) Case 1 : begrun=0,runlength=0. It works. > > Jan 28 01:35:10 wfs kernel: [ 136.416906] Debug #1 begrun: 0 runlength: > > 0 cmd->r2t_sn: 1 > > Jan 28 01:35:10 wfs kernel: [ 136.416907] Debug #2 last_r2tsn: 1 > > > > 2) Case 2: begrun=0 and (RunLength >0 ) However, it is converted via > > htonl(1). It works. > > Jan 28 01:49:02 wfs kernel: [ 968.100236] Debug #1 begrun: 0 runlength: > > 1 cmd->r2t_sn: 1 > > Jan 28 01:49:02 wfs kernel: [ 968.100237] Debug #2 last_r2tsn: 1 > > > > 3) Case 3: begrun=0 and Runlength=1. No conversion done with htonl. It > > dump_stack(). > > Attached it the screen shot of the dump_stack(). It looks like is > > hitting the new case (last_r2tsn>0x0000FFFF) > > you have put in. > > > > So without full dmesg output it's kinda hard to tell. ;) > > In any event, I'm pretty sure it's hitting the same original sanity > check in iscsit_handle_r2t_snack() mentioned above for bogus RunLengths, > and it's the generation of the REJECT that is triggering the bug which > is likely a >= v3.10 regression. > > I'd really like to get this resolved, and the quickest way to do that > would be to send a copy of your test-case code (offlist). Otherwise, > I'll have to put together something similar. > So after generating a few large SNACK RunLengths within iscsit_handle_recovery_datain_or_r2t() code, this OOPs is reproducible for me in ERL=2 operation. It's occurring when iscsit_build_conn_drop_async_message() is called during connection recovery to send a ASYNC PDU w/ error status for the failed CID on (any available) active connection. This is hitting a iscsi_session NULL pointer dereference in iscsit_allocate_cmd(), which has been added for >= v3.12 to obtain a pre-allocated percpu_ida tag. The following fix avoids NULL session pointers in iscsit_build_conn_drop_async_message() and should handle all SNACK errors generating REJECT PDUs + causing connection recovery to occur. Btw, I'm still looking at this code, but please go ahead and give this first patch a shot. Thanks, --nab diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index 1ab9b18..4123893 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -2484,6 +2498,12 @@ static void iscsit_build_conn_drop_async_message(struct iscsi_conn *conn) if (!conn_p) return; + if (!conn_p->sess) { + dump_stack(); + iscsit_dec_conn_usage_count(conn_p); + return; + } + cmd = iscsit_allocate_cmd(conn_p, TASK_RUNNING); if (!cmd) { iscsit_dec_conn_usage_count(conn_p); -- To unsubscribe from this list: send the line "unsubscribe target-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html