On Fri, Mar 23, 2018 at 05:52:03PM -0700, Darrick J. Wong wrote:
> On Thu, Mar 22, 2018 at 11:19:45AM +1100, Dave Chinner wrote:
> > On Wed, Mar 21, 2018 at 02:07:38PM -0400, Brian Foster wrote:
> > > On Wed, Mar 21, 2018 at 10:27:28AM -0700, Darrick J. Wong wrote:
> > > > On Wed, Mar 21, 2018 at 12:13:25PM -0400, Brian Foster wrote:
> > > > > Most of the generic data structures embedded in xfs_mount are
> > > > > dynamically initialized immediately after mp is allocated. A few
> > > > > fields are left out and initialized during the xfs_mountfs()
> > > > > sequence, after mp has been attached to the superblock.
> > > > >
> > > > > To clean this up and help prevent premature access of associated
> > > > > fields, refactor xfs_mount allocation and all dependent init calls
> > > > > into a new helper. This self-documents that all low level data
> > > > > structures (i.e., locks, trees, etc.) should be initialized before
> > > > > xfs_mount is attached to the superblock.
> > > > >
> > > > > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> > > > > ---
> > > > >
> > > > > I realized after the fact that this steps on the ability to instrument
> > > > > the use-before-init variant[1] of the problematic radix tree access on
> > > > > failed mount issue that Dave tracked down. This is because we'd
> > > > > immediately initialize the structure and thus a subsequent memset()
> > > > > would ultimately require another proper initialization.
> > > >
> > > > That's simply a matter of re-calling
> > > >
> > > > 	INIT_RADIX_TREE(&mp->m_perag_tree, GFP_ATOMIC);
> > > >
> > > > after the msleep, right?
> > > >
> > >
> > > I suppose that may work (I'll let Dave chime in)...
> >
> > Well, to tell the truth, this whole "VFS accesses the fs during
> > mount" problem is isolated to only the inode cache shrinker, and
> > while a mount is in progress it does not run inode cache scans
> > (can't lock s_umount because the mount holds it locked).
> >
> > IOWs, this crash can be fixed entirely by modifying the VFS code as
> > I mentioned to avoid super_cache_count() doing any work during mount
> > and unmount.
> >
> > So, really, none of these XFS changes are necessary to fix the
> > problem that was reported. I'm not opposed to it as a cleanup, but
> > let's not conflate it with the VFS superblock shrinker bug that
> > needs to be fixed....
>
> Are you working on such a patch?

Yes.

> I guessed that one could change the
> vfs to avoid register_shrinker until after fill_super returns? Or
> possibly just have super_cache_count bail out if !SB_ACTIVE, though that
> could be messy given that xfs (and others) mess with that flag.

I think it's fine for the shrinker to check SB_ACTIVE, because if it's
not set then the inodes are not placed on the LRUs that the shrinker
works on. The problem is that the shrinker can run before filesystems
assert SB_ACTIVE, even though there will be nothing for it to do and the
filesystem has not set everything up. As such, I think it's the right
thing to do to match the SB_ACTIVE checks from iput_final() in
super_cache_count().

FWIW, if a filesystem is setting SB_ACTIVE earlier than the VFS, then it
had better make sure it's already done all the setup it needs to. The
filesystem has to do that setup anyway (as the VFS requires it), so it's
just a matter of what order we do things in ->fill_super. In XFS we have
finished all the setup of the structures the shrinker depends on by the
time we set SB_ACTIVE in log recovery. Hence I don't really see any
problems with XFS or any of the other filesystems that set SB_ACTIVE in
their mount routines.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html