Re: [PATCH] qla_target: Check refcount in find_sess_by_*

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2012-05-17 at 12:26 -0400, Jörn Engel wrote:
> As the comment explains, we have a race between dropping the refcount to
> zero and actually removing the session from lookup tables.  Plug this by
> checking the refcount on lookup.
> 
> Signed-off-by: Joern Engel <joern@xxxxxxxxx>
> ---
>  drivers/scsi/qla2xxx/tcm_qla2xxx.c |   28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 

Hi Joern!

> diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> index 09a70aa..b1bfa08 100644
> --- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> +++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> @@ -1300,6 +1300,20 @@ static struct qla_tgt_sess *tcm_qla2xxx_find_sess_by_s_id(
>  		return NULL;
>  	}
>  
> +	/*
> +	 * We drop the hardware_lock between dropping the reference count to
> +	 * zero and removing the session from the btree.  Therefore it
> +	 * is possible to successfully lookup a session after it has been
> +	 * scheduled for removal, but before actual removal.  Result is pretty
> +	 * good chance of dereferencing freed memory in qlt_issue_task_mgmt.
> +	 *
> +	 * Proper fix would be to atomically drop the refcount and remove the
> +	 * session from all data structures that would allow lookup.  Until
> +	 * then this tasteless hack shall suffice.
> +	 */
> +	if (atomic_read(&nacl->qla_tgt_sess->se_sess->sess_kref.refcount) == 0)
> +		return NULL;
> +
>  	return nacl->qla_tgt_sess;
>  }
>  

Mmmmm, I'd still like to try to avoid (another) atomic op in the fast
path w/ hardware_lock held if at all possible..

Also thinking more about the case in your comment, I'm starting to
wonder if tcm_qla2xxx_free_session() should be obtaining hardware_lock +
clearing the s_id and loop_id lookup entries earlier than currently
done..

That is, target_wait_for_sess_cmds() is called before clearing the
nodeacl pointers from s_id + loop_id in tcm_qla2xxx_free_session(),
which means that we still perform se_node_acl lookups in the case where
the individual tcm_qla2xxx endpoint was not also being shutdown as
well..

So that said, I think your work-around is OK to address the current BUG,
but the s_id + loop_id clearing order in tcm_qla2xxx_free_session()
would be having an effect here as well..

--nab


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux