Hi Nicholas, On 1/24/2014 1:25 AM, Nicholas A. Bellinger wrote:
<SNIP>Looking at the code in iscsi_target_erl1.c:iscsit_handle_r2t_snack(), it is checking the validity of received BegRun + RunLength values, in that they don't exceed what R2TSN has already been put on the wire: if (runlength) { if ((begrun + runlength) > cmd->r2t_sn) { pr_err("Command ITT: 0x%08x received R2T SNACK" " with BegRun: 0x%08x, RunLength: 0x%08x, exceeds" " current R2TSN: 0x%08x, protocol error.\n", cmd->init_task_tag, begrun, runlength, cmd->r2t_sn); return iscsit_reject_cmd(cmd, ISCSI_REASON_BOOKMARK_INVALID, buf); } last_r2tsn = (begrun + runlength); } else last_r2tsn = cmd->r2t_sn; Can you confirm your bug is with R2T SNACKs, and not a different type of SNACK..?Yes I confirm it is with R2T SNACK. My two cents. Without htonl() conversion the value 0x1 (intel) is getting converted into large value 0x10000000 on assignment. Which in turn is getting assigned to last_r2tsn. And further down your code... while (begrun < last_r2tsn) { ... // it may be looping in uncontrolled manner and failing. begrun++ }So the above check in iscsit_handle_r2t_snack():if (runlength) {if ((begrun + runlength) > cmd->r2t_sn) { <SEND REJECT> } } is what's supposed to be protecting against the reception of a bogus runlength exceeding the last R2TSN sent out by the command. How about testing with the following quick debug patch to see why runlength=0x10000000 is not hitting the above sanity check..? Thanks, --nab diff --git a/drivers/target/iscsi/iscsi_target_erl1.c b/drivers/target/iscsi/iscsi_target_erl1.c index e048d64..b5b0044 100644 --- a/drivers/target/iscsi/iscsi_target_erl1.c +++ b/drivers/target/iscsi/iscsi_target_erl1.c @@ -164,6 +164,9 @@ static int iscsit_handle_r2t_snack( ISCSI_REASON_PROTOCOL_ERROR, buf); }+ printk("Debug #1 begrun: %u runlength: %u cmd->r2t_sn: %u\n",+ begrun, runlength, cmd->r2t_sn); + if (runlength) { if ((begrun + runlength) > cmd->r2t_sn) { pr_err("Command ITT: 0x%08x received R2T SNACK" @@ -177,6 +180,14 @@ static int iscsit_handle_r2t_snack( } else last_r2tsn = cmd->r2t_sn;+ printk("Debug #2 last_r2tsn: %u\n", last_r2tsn);+ + if (last_r2tsn > 0x0000ffff) { + printk("Ignoring last_r2tsn that's too large: %u\n", last_r2tsn); + dump_stack(); + return 0; + } + while (begrun < last_r2tsn) { r2t = iscsit_get_holder_for_r2tsn(cmd, begrun); if (!r2t)
I tried out you patch. I ran 3 cases with it. The case with runlength>0 and without
htonl() conversion still crashes the machine. 1) Case 1 : begrun=0,runlength=0. It works.Jan 28 01:35:10 wfs kernel: [ 136.416906] Debug #1 begrun: 0 runlength: 0 cmd->r2t_sn: 1
Jan 28 01:35:10 wfs kernel: [ 136.416907] Debug #2 last_r2tsn: 12) Case 2: begrun=0 and (RunLength >0 ) However, it is converted via htonl(1). It works. Jan 28 01:49:02 wfs kernel: [ 968.100236] Debug #1 begrun: 0 runlength: 1 cmd->r2t_sn: 1
Jan 28 01:49:02 wfs kernel: [ 968.100237] Debug #2 last_r2tsn: 13) Case 3: begrun=0 and Runlength=1. No conversion done with htonl. It dump_stack(). Attached it the screen shot of the dump_stack(). It looks like is hitting the new case (last_r2tsn>0x0000FFFF)
you have put in. [root@wfs linux-3.12.6]# uname -aLinux wfs 3.12.6 #1 SMP Sat Dec 21 06:13:09 EST 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@wfs linux-3.12.6]# [root@wfs iscsi]# cat /sys/kernel/config/target/iscsi/lio_version Datera Inc. iSCSI Target v4.1.0 [root@wfs iscsi]# Thanks, Arshad
Attachment:
run_length_lio.png
Description: PNG image