On Thu, 2017-02-23 at 11:46 -0600, Bryant G. Ly wrote: > > From: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx> > > > > When transport_clear_lun_ref() is shutting down a se_lun via > > configfs with new I/O in-flight, it's possible to trigger a > > NULL pointer dereference in transport_lookup_cmd_lun() due > > to the fact percpu_ref_get() doesn't do any __PERCPU_REF_DEAD > > checking before incrementing lun->lun_ref.count after > > lun->lun_ref has switched to atomic_t mode. > > > > This results in a NULL pointer dereference as LUN shutdown > > code in core_tpg_remove_lun() continues running after the > > existing ->release() -> core_tpg_lun_ref_release() callback > > completes, and clears the RCU protected se_lun->lun_se_dev > > pointer. > > > > During the OOPs, the state of lun->lun_ref in the process > > which triggered the NULL pointer dereference looks like > > the following on v4.1.y stable code: > > > > struct se_lun { > > lun_link_magic = 4294932337, > > lun_status = TRANSPORT_LUN_STATUS_FREE, > > > > ..... > > > > lun_se_dev = 0x0, > > lun_sep = 0x0, > > > > ..... > > > > lun_ref = { > > count = { > > counter = 1 > > }, > > percpu_count_ptr = 3, > > release = 0xffffffffa02fa1e0 <core_tpg_lun_ref_release>, > > confirm_switch = 0x0, > > force_atomic = false, > > rcu = { > > next = 0xffff88154fa1a5d0, > > func = 0xffffffff8137c4c0 <percpu_ref_switch_to_atomic_rcu> > > } > > } > > } > > > > To address this bug, use percpu_ref_tryget_live() to ensure > > once __PERCPU_REF_DEAD is visable on all CPUs and ->lun_ref > > has switched to atomic_t, all new I/Os will fail to obtain > > a new lun->lun_ref reference. > > > > Also use an explicit percpu_ref_kill_and_confirm() callback > > to block on ->lun_ref_comp to allow the first stage and > > associated RCU grace period to complete, and then block on > > ->lun_ref_shutdown waiting for the final percpu_ref_put() > > to drop the last reference via transport_lun_remove_cmd() > > before continuing with core_tpg_remove_lun() shutdown. > > > > Reported-by: Rob Millner <rlm@xxxxxxxxxxxxx> > > Tested-by: Rob Millner <rlm@xxxxxxxxxxxxx> > > Cc: Rob Millner <rlm@xxxxxxxxxxxxx> > > Tested-by: Vaibhav Tandon <vst@xxxxxxxxx> > > Cc: Vaibhav Tandon <vst@xxxxxxxxx> > > Signed-off-by: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx> > > --- > > drivers/target/target_core_device.c | 10 ++++++++-- > > drivers/target/target_core_tpg.c | 3 ++- > > drivers/target/target_core_transport.c | 31 ++++++++++++++++++++++++++++++- > > include/target/target_core_base.h | 1 + > > 4 files changed, 41 insertions(+), 4 deletions(-) > > > I have seen this and have tested this with our custom kernel. > > So this looks good from me! > Added your Tested-by to the patch. Thanks Bryant.