Re: Kernel OOP - Writes across Multiple Connections within Session

Arshad Hussain <arshad.hussain@xxxxxxxxxxxxxx> · Fri, 27 Jun 2014 11:40:44 +0530



On 6/27/2014 10:34 AM, Nicholas A. Bellinger wrote:
> On Thu, 2014-06-26 at 15:30 +0530, Arshad Hussain wrote:
>> On 6/26/2014 2:06 PM, Arshad Hussain wrote:
>>> Hi Nab,
>>> On 6/25/2014 6:21 PM, Arshad Hussain wrote:
>>>> On 6/20/2014 11:17 PM, Nicholas A. Bellinger wrote:
> <SNIP>
>
>>>>> Thanks for confirming.  I think this is the correct fix, but would like
>>>>> to verify some more on my end.  Please send me the test off-list.
>>> This is what we are trying to achieve. Read from socket1 and socket2 and
>>> then write to socket1 and socket2. I have detailed the flow below. The
>>> code is modified to read and write with different ITT. And this work
>>> just fine. It is only when we read & write on two different sockets with
>>> same ITT it fails.
>>>
>>> Please notice after READ we are receiving stat_sn and incrementing the
>>> counters. This indicates that the stat_sn is being acknowledge. IMHO,
>>> the reads and writes on multiple sockets with same ITT should work. 
>>> While we can change the test case to make it pass, by reading and
>>> writing on different ITT. This however will shadow a potential bug in
>>> target. As always your valuable comments.
>>>
>>> READ10 CMD [socket1,ITT=A3A3A3A3] -> [T]
>>> /* Update the counters. At this point stat_sn is ack. */
>>> next_cmdsn = recvpdu->exp_cmd_sn;
>>> next_expstatsn = (recvpdu->stat_sn) + 0x01;
>>> maxcmdsn = recvpdu->max_cmd_sn;
>>>
>>> READ10 CMD [socket2,ITT=A3A3A3A3] -> [T]
>>> /* Update counters. At this point stat_sn is ack.*/
>>> next_cmdsn = recvpdu->exp_cmd_sn;
>>> next_expstatsn = (recvpdu->stat_sn) + 0x01;
>>> maxcmdsn = recvpdu->max_cmd_sn;
>>>
>>> WRITE10 CMD [socket1,ITT=A5A5A5A5] -> [T]
>>> R2T <- [T]
>>> DATA OUT [socket1,ITT=A5A5A5A5] -> [T]
>>>
>>> WRITE10 CMD [socket2,ITT=A5A5A5A5] -> [T]
>>> R2T <- [T]
>>> DATA OUT [socket1,ITT=A5A5A5A5] -> [T]
>>>
>>> Thanks!
>>> Arshad
>> Just tried another set of cases with multiple connections.
>>
>> // First Write to socket1
>> WRITE10 CMD [socket1,ITT=A5A5A5A5] -> [T]
>> R2T <- [T]
>> DATA OUT [socket1,ITT=A5A5A5A5] -> [T]
>>
>> // Second write to socked2, with different ITT. This works.
>> WRITE10 CMD [socket2,ITT=A3A3A3A3] -> [T]
>> R2T <- [T]
>> DATA OUT [socket1,ITT=A3A3A3A3] -> [T]
>>
>> // Third write again to socket1. ITT reuse. Please Note the writes are
>> ack at this point.
>> WRITE10 CMD [socket1,ITT=A5A5A5A5] -> [T]
>> R2T <- [T]
>>
>> Observation: First and second write works. The third write fails (No
>> crash). Printing below the dmesg.
>>
>> [16066.334410] Command ITT: 0xa5a5a5a5 received DataOUT after last
>> DataOUT received, dumping payload.
>>
> Ok, I think your hitting a race where the reused ITT is being received
> before TX thread context can set ISTATE_SENT_STATUS in
> iscsit_response_queue(), effectively preventing the first ITT from being
> removed via iscsit_ack_from_expstatsn() from the per connection command
> list used by the subsequent Data-Out to locate the second ITT.
>
> Please try the following work-around to ignore descriptors during the
> Data-Out ITT lookup if they have already received all expected Data-Out,
> instead of explicitly dumping payload in iscsit_check_dataout_hdr().
Yes your patch works. Also, I will be running the whole regression suite
tonight. Will let you know the results.

Thanks,
Arshad
>
> diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c
> index fd90b28..73355f4 100644
> --- a/drivers/target/iscsi/iscsi_target_util.c
> +++ b/drivers/target/iscsi/iscsi_target_util.c
> @@ -400,6 +400,8 @@ struct iscsi_cmd *iscsit_find_cmd_from_itt_or_dump(
>  
>  	spin_lock_bh(&conn->cmd_lock);
>  	list_for_each_entry(cmd, &conn->conn_cmd_list, i_conn_node) {
> +		if (cmd->cmd_flags & ICF_GOT_LAST_DATAOUT)
> +			continue;
>  		if (cmd->init_task_tag == init_task_tag) {
>  			spin_unlock_bh(&conn->cmd_lock);
>  			return cmd;
>
>> PS: Can you help me where to put BUG()/panic() as I am not being able to
>> provide you with bt. I cannot also get vmcore on crash (previous 3.14
>> without patch) since it is locking I am unable to enter into
>> dump-capture kernel. Overall, let me know what all more I can get you to
>> help you debug.
>>
> No need for the vmcore without the previous patch.  It's clear that the
> first patch addresses the original OOPsen, and it will be included in
> the next PULL request.
>
> Thanks again,
>
> --nab
>
>

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html