Hi Benjamin, Apologies for the extended delayed follow-up. Comments below. On Thu, 2016-09-01 at 11:58 +0100, Benjamin ESTRABAUD wrote: > Hi, > > I have a system where I need to configure a lot of ACLs on many targets > rather quickly. To achieve this speed I configure those ACLs using > rtslib in multiple processes (I use a framework akin to Python > multiprocessing). > Cool. > Intermittently, I hit a BUG_ON on line 365 of target_core_device.c which > is as follow when creating those ACLs: > > BUG_ON(orig->se_lun_acl != NULL); > This BUG_ON was originally added during the v4.2 RCU conversion, to catch the case where implicit demo-mode NodeACLs + LUN ACLs are being created driven by fabric login, while explicit configfs NodeACLs + LUN ACLs are being created (or converted to explicit) by user-space. This specific scenario is: 1a) Fabric login creates implicit se_node_acl for $INITIATOR_NAME when generate_node_acls=1 (demo-mode) is set. 1b) Implicit se_node_acl begins adding implicit se_lun_acls based on current /sys/kernel/config/target/$FABRIC/$WWN/$TPGT/lun/ layout. 2a) Concurrently, user-space creates the same explicit $INITIATOR_NAME in /sys/kernel/config/target/$FABRIC/$WWN/$TPGT/acls/, located from existing implicit se_node_acl $INITIATOR_NAME in target-core. 2b) User-space subsequently creates explicit se_lun_acl symlinks in ../acls/$INITIATOR_NAME/lun_*/ As observed, 1[a,b] can happen concurrently with 2[a,b], which causes the implicit se_lun_acl creation to detect the existing explicit se_lun_acl attached to the implicitly created se_node_acl. > We use Linux 4.5.1 on a custom distro and a slightly old rtslib version > 6666d30fb1d90b9195d98c8fe8fcec2708362873 from the RTS repo. > > Has anybody seen this issue while doing concurrent ACL operations? Nope. Thanks for reporting. > Is concurrency discouraged? Certainly not. ;) > Any idea where we should start looking to fix > the problem (where se_lun_acl is set or could have been set before) > and/or the reason for this BUG_ON? > > Thanks a lot in advance for your help! > After thinking about this further, the scenario is legal in pre v4.2 RCU code and should be non-fatal. Please reproduce on your setup with the following patch, and confirm the specific $INITIATOR_NAME + MappedLUN is able to function as expected once the scenario + warning occurs. diff --git a/drivers/target/target_core_device.c b/drivers/target/target_core_device.c index 6b42348..a04148d 100644 --- a/drivers/target/target_core_device.c +++ b/drivers/target/target_core_device.c @@ -351,7 +351,15 @@ int core_enable_device_list_for_node( kfree(new); return -EINVAL; } - BUG_ON(orig->se_lun_acl != NULL); + if (orig->se_lun_acl != NULL) { + pr_warn_ratelimited("Detected existing explicit" + " se_lun_acl->se_lun_group reference for %s" + " mapped_lun: %llu, ignoring\n", + nacl->initiatorname, mapped_lun); + mutex_unlock(&nacl->lun_entry_mutex); + kfree(new); + return -EINVAL; + } rcu_assign_pointer(new->se_lun, lun); rcu_assign_pointer(new->se_lun_acl, lun_acl); -- To unsubscribe from this list: send the line "unsubscribe target-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html