On 9/19/20 12:07 AM, Brian Bunker wrote:
Internally we had discussions about adding a synchronize_rcu here:
--- a/scsi/drivers/scsi/device_handler/scsi_dh_alua.c 2020-09-10 12:29:03.000000000 -0700
+++ b/scsi/drivers/scsi/device_handler/scsi_dh_alua.c 2020-09-18 14:59:59.000000000 -0700
@@ -658,8 +658,6 @@
rcu_read_lock();
list_for_each_entry_rcu(h,
&tmp_pg->dh_list, node) {
- /* h->sdev should always be valid */
- BUG_ON(!h->sdev);
h->sdev->access_state = desc[0];
}
rcu_read_unlock();
@@ -705,7 +703,6 @@
pg->expiry = 0;
rcu_read_lock();
list_for_each_entry_rcu(h, &pg->dh_list, node) {
- BUG_ON(!h->sdev);
h->sdev->access_state =
(pg->state & SCSI_ACCESS_STATE_MASK);
if (pg->pref)
@@ -1147,7 +1144,6 @@
spin_lock(&h->pg_lock);
pg = rcu_dereference_protected(h->pg, lockdep_is_held(&h->pg_lock));
rcu_assign_pointer(h->pg, NULL);
- h->sdev = NULL;
spin_unlock(&h->pg_lock);
if (pg) {
spin_lock_irq(&pg->lock);
@@ -1156,6 +1152,7 @@
kref_put(&pg->kref, release_port_group);
}
sdev->handler_data = NULL;
+ synchronize_rcu();
kfree(h);
}
We were thinking that this would allow any running readers to finish before the object is freed. Would
that make sense there? From what I can tell the only RCU implementation in this kernel version in
scsi is here, so I couldn’t find many examples to follow.
That actually looks quite viable (if the intendation gets fixed :-)
Initially I was thinking of using 'h->sdev = NULL' as a marker for the
other code to figure out that the device has about to be deleted; that
worked, but with rather unintended consequences.
Have you tried with your patch?
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@xxxxxxx +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer