On 8/3/23 6:28 AM, Huang Cun wrote: > When a disk fails to attach, the struct rdac_dh_data is released, > but it is not removed from the ctlr->dh_list. When attaching another > disk, the released rdac_dh_data will be accessed and the following > BUG_ON() may be observed: > > [ 414.696167] scsi 5:0:0:7: rdac: Attach failed (8) > ... > [ 423.615364] kernel BUG at drivers/scsi/device_handler/scsi_dh_rdac.c:427! > [ 423.615731] invalid opcode: 0000 [#1] SMP NOPTI > ... > [ 423.623247] Call Trace: > [ 423.623598] rdac_bus_attach+0x203/0x4c0 > [ 423.623949] ? scsi_dh_handler_attach+0x2d/0x90 > [ 423.624300] scsi_dh_handler_attach+0x2d/0x90 > [ 423.624652] scsi_sysfs_add_sdev+0x88/0x270 > [ 423.625004] scsi_probe_and_add_lun+0xc47/0xd50 > [ 423.625354] scsi_report_lun_scan+0x339/0x3b0 > [ 423.625705] __scsi_scan_target+0xe9/0x220 > [ 423.626056] scsi_scan_target+0xf6/0x100 > [ 423.626404] fc_scsi_scan_rport+0xa5/0xb0 > [ 423.626757] process_one_work+0x15e/0x3f0 > [ 423.627106] worker_thread+0x4c/0x440 > [ 423.627453] ? rescuer_thread+0x350/0x350 > [ 423.627804] kthread+0xf8/0x130 > [ 423.628153] ? kthread_destroy_worker+0x40/0x40 > [ 423.628509] ret_from_fork+0x1f/0x40 > > Fixes: 1a5dc166cd88 ("scsi_dh_rdac: update 'access_state' field") > Signed-off-by: Huang Cun <huangcun@xxxxxxxxxxxxxx> > Signed-off-by: Ding Hui <dinghui@xxxxxxxxxxxxxx> > Cc: Donglin Peng <pengdonglin@xxxxxxxxxxxxxx> > --- > drivers/scsi/device_handler/scsi_dh_rdac.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/scsi/device_handler/scsi_dh_rdac.c b/drivers/scsi/device_handler/scsi_dh_rdac.c > index c5538645057a..9d487c2b7708 100644 > --- a/drivers/scsi/device_handler/scsi_dh_rdac.c > +++ b/drivers/scsi/device_handler/scsi_dh_rdac.c > @@ -762,8 +762,10 @@ static int rdac_bus_attach(struct scsi_device *sdev) > > clean_ctlr: > spin_lock(&list_lock); > + list_del_rcu(&h->node); > kref_put(&h->ctlr->kref, release_controller); > spin_unlock(&list_lock); > + synchronize_rcu(); > Should this be: spin_lock(&list_lock); list_del_rcu(&h->node); spin_unlock(&list_lock); synchronize_rcu(); kref_put(&h->ctlr->kref, release_controller); ? If you do the synchronize_rcu after the kref_put, then the kref_put could free the rdac_dh_data, while check_ownership is still accessing the rdac_dh_data, right?