To me just removing the h->sdev = NULL seems strange because then this looks strange to me: pg = rcu_dereference_protected(h->pg, lockdep_is_held(&h->pg_lock)); rcu_assign_pointer(h->pg, NULL); Then saying If (pg) Since we just assigned that pointer, h->pg to NULL. Thanks, Brian Brian Bunker SW Eng brian@xxxxxxxxxxxxxxx > On Sep 11, 2020, at 6:44 AM, Ewan D. Milne <emilne@xxxxxxxxxx> wrote: > > On Thu, 2020-09-10 at 14:22 -0700, Brian Bunker wrote: >> A race exists where the BUG_ON(!h->sdev) will fire if the detach >> device handler >> from one thread runs removing a list entry while another thread is >> trying to >> evaluate the target portal group state. >> >> The order of the detach operation is now changed to delete the list >> entry >> before modifying the pointer and setting h->sdev to NULL. >> >> Signed-off-by: Brian Bunker <brian@xxxxxxxxxxxxxxx> >> Acked-by: Krishna Kant <krishna.kant@xxxxxxxxxxxxxxx> >> ___ >> diff -Naur a/scsi/drivers/scsi/device_handler/scsi_dh_alua.c >> b/scsi/drivers/scsi/device_handler/scsi_dh_alua.c >> --- a/scsi/drivers/scsi/device_handler/scsi_dh_alua.c 2020-09-10 >> 12:29:03.000000000 -0700 >> +++ b/scsi/drivers/scsi/device_handler/scsi_dh_alua.c 2020-09-10 >> 12:41:34.000000000 -0700 >> @@ -1146,16 +1146,18 @@ >> >> spin_lock(&h->pg_lock); >> pg = rcu_dereference_protected(h->pg, lockdep_is_held(&h- >>> pg_lock)); >> - rcu_assign_pointer(h->pg, NULL); >> - h->sdev = NULL; >> - spin_unlock(&h->pg_lock); >> if (pg) { >> spin_lock_irq(&pg->lock); >> list_del_rcu(&h->node); >> spin_unlock_irq(&pg->lock); >> - kref_put(&pg->kref, release_port_group); >> } >> + rcu_assign_pointer(h->pg, NULL); >> + h->sdev = NULL; >> + spin_unlock(&h->pg_lock); >> sdev->handler_data = NULL; >> + if (pg) { >> + kref_put(&pg->kref, release_port_group); >> + } >> kfree(h); >> } >> > > Good catch. > > This makes the code hold the h->pg_lock while holding the pg->lock > though. > > It seems like all that is needed is to remove the h->sdev = NULL > assignment, since it has to remain valid until the alua_ah_data is > removed from the pg->dh_list, and the object is going to be freed > right afterwards anyway? > > Might as well also remove the BUG_ON(!h->sdev) in 2 places since the > kernel will crash when h is dereferenced anyway if it is NULL. > > -Ewan > >