Re: How to abuse RCU to scan the children of a mount?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 04, 2020 at 11:28:16AM -0800, Paul E. McKenney wrote:

> Huh.  The mount structure isn't suffering from a shortage of list_head
> structures, is it?
> 
> So the following can happen, then?
> 
> o	The __attach_mnt() function adds a struct mount to its parent
> 	list, but in a non-RCU manner.	Unless there is some other
> 	safeguard, the list_add_tail() in this function needs to be
> 	list_add_tail_rcu().
> 
> o	I am assuming that the various non-RCU traversals that I see,
> 	for example, next_mnt(), are protected by lock_mount_hash().
> 	Especially skip_mnt_tree(), which uses mnt_mounts.prev.  (I didn't
> 	find any exceptions, but I don't claim an exhaustive search.)
> 
> o	The umount_tree() function's use of list_del_init() looks like
> 	it could trap an RCU reader in the newly singular list formed
> 	by the removal.  It appears that there are other functions that
> 	use list_del_init() on this list, though I cannot claim any sort
> 	of familiarity with this code.
> 
> 	So, do you need to add a check for child->mnt_child being in this
> 	self-referential state within fsinfo_generic_mount_children()?
> 
> 	Plus list_del_init() doesn't mark its stores, though
> 	some would argue that unmarked stores are OK in this situation.
> 
> o	There might be other operations in need of RCU-ification.
> 
> 	Maybe the list_add_tail() in umount_tree(), but it is not
> 	immediately clear that this is adding a new element instead of
> 	re-inserting an element already exposed to readers.

IMO all of that is a good argument *against* trying to pull any kind of RCU
games here.  Access to these lists is assumed to be serialized on
mount_lock spinlock component held exclusive and unless there is a very
good reason to go for something trickier, let's not.

Hash chains are supposed to be walked under rcu_read_lock(), requiring
to recheck the mount_lock seqcount *AND* with use of legitimize_mnt()
if you are to try and get a reference out of that.  ->mnt_parent chains
also can be walked in the same conditions (subject to the same
requirements).  The policy with everything else is
	* get mount_lock spinlock exclusive or
	* get namespace_sem or
	* just fucking don't do it.



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux