On 8/18/20 3:43 PM, Stephen Smalley wrote:
Convert the policy read-write lock to RCU. This is significantly simplified by the earlier work to encapsulate the policy data structures and refactor the policy load and boolean setting logic. Since the latest_granting sequence number is no longer read/written under the policy rwlock, convert it to an atomic_t counter. Alternatively it could be left as a u32, moved into the selinux_policy structure, and updated atomically upon the pointer update. At that point struct selinux_ss would contain nothing but a pointer to struct selinux_policy and we could drop selinux_ss altogether? If we leave it as an atomic_t, possibly it should be atomic_long_t instead to reduce likelihood of overflow. At present this change merely passes a hardcoded 1 to rcu_dereference_check() in the cases where we know we do not need to take rcu_read_lock(), with the preceding comment explaining why. Alternatively we could pass fsi->mutex down from selinuxfs and apply a lockdep check on it instead. We have to hand off the actual freeing of the policy data structure to a worker thread instead of doing it directly from the rcu callback because the avtab and certain other structures may be allocated/freed via vmalloc/vfree and vfree isn't permitted from rcu callback. I previously used this approach in the selinuxns series for freeing the selinux namespace, and it is based on the approach used for user namespaces in mainline. This change does not specifically do anything special with respect to the sidtab live convert; I am unclear as to whether it is safe as a result. Comments welcome. Based in part on earlier attempts to convert the policy rwlock to RCU by Kaigai Kohei [1] and by Peter Enderborg [2]. [1] https://lore.kernel.org/selinux/6e2f9128-e191-ebb3-0e87-74bfccb0767f@xxxxxxxxxxxxx/ [2] https://lore.kernel.org/selinux/20180530141104.28569-1-peter.enderborg@xxxxxxxx/ Signed-off-by: Stephen Smalley <stephen.smalley.work@xxxxxxxxx>
Looking through the RCU documentation again, it occurs to me that this would be simpler if I just called synchronize_rcu() after updating the policy pointer and then freed things directly rather than from rcu callback? I took the call_rcu() approach in this patch because that is what KaiGai did and what was done in the AVC but it seems like synchronize_rcu() is preferred when possible.