Hi Dave,

On Tue, Mar 27, 2012 at 08:57:09AM +1100, Dave Chinner wrote:
> On Mon, Mar 26, 2012 at 10:10:50AM -0500, Mark Tinguely wrote:
> > On 03/25/12 18:22, Dave Chinner wrote:
> > > On Fri, Mar 23, 2012 at 08:34:31AM -0500, Mark Tinguely wrote:
> > > > On 03/22/12 16:07, Dave Chinner wrote:
> > > > > On Thu, Mar 22, 2012 at 10:15:48AM -0500, Ben Myers wrote:
> > > > > > On Thu, Mar 22, 2012 at 04:15:08PM +1100, Dave Chinner wrote:
> > > > > > > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > > > > > >
> > > > > > > Because the mount process can run a quotacheck and consume lots of
> > > > > > > inodes, we need to be able to run periodic inode reclaim during the
> > > > > > > mount process. This will prevent running the system out of memory
> > > > > > > during quota checks.
> > > > > > >
> > > > > > > This essentially reverts 2bcf6e97, but that is safe to do now that
> > > > > > > the quota sync code that was causing problems during long quotacheck
> > > > > > > executions is now gone.
> > > > > >
> > > > > > Dave, I've held off on #s 3 and 4 because they appear to be racy.  Being
> > > > >
> > > > > What race?
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Dave
> > > >
> > > > 2 of the sync workers use iterators:
> > > >   xfs_inode_ag_iterator()
> > > >     xfs_perag_get()
> > > >       radix_tree_lookup(&mp->m_perag_tree, agno)
> > > >
> > > > The race I was worried about was in xfs_mount() to initialize the
> > > > mp->m_perag_lock, and the radix tree initialization:
> > > >   INIT_RADIX_TREE(&mp->m_perag_tree, GFP_ATOMIC)
> > > >
> > > > There is a lock and 2 or 3 unbuffered I/O are performed in xfs_mountfs()
> > > > before the mp->m_perag_tree is initialized.
> > >
> > > Yes they are uncached IOs so do not utilise the cache that
> > > requires the mp->m_perag_tree to be initialised.
> >
> > The point I was trying to make is the sync workers use iterators.
> > The race is to get the mp->m_perag_tree initialized before one of
> > the sync workers tries to do a xfs_perag_get().
>
> Firstly, xfs_sync_worker does not iterate AGs at all anymore - it
> pushes the log and the AIL, and nothing else. So there is no
> problems there.

xfs_sync_worker forces the log and pushes the AIL, and a sync is queued
in xfs_syncd_init before either the log or the AIL is initialized.  A
sync should not be queued before the log and AIL are initialized,
regardless of the value of xfs_syncd_centisecs.  In the error path of
xfs_fs_fill_super a sync could still be running after xfs_unmount is
called, so there is a window where it could dereference m_log after it
has been set to NULL.

> Secondly xfs_flush_worker() is only triggered by ENOSPC, and that
> can't happen until the filesystem is mounted and real work starts.

xfs_flush_worker is triggered by xfs_flush_inodes on ENOSPC in
xfs_create and xfs_iomap_write_delay.  I agree that at startup one would
not be able to trigger an ENOSPC event from either of these codepaths
until the filesystem has mounted.  Further, if we hit any error in
fill_super you could not possibly trigger ENOSPC, because the root inode
has not been allocated yet.

> Finally, the reclaim worker does iterate the perag tree,

xfs_reclaim_worker has a similar issue with the perag tree in this
patch: it could look at the tree before it has been initialized.

> but the next patch in the series ensures that is started on demand, not
> from xfs_syncd_init(). This ensures that iteration does not occur until
> after the first inode is placed into reclaim, and that must happen after
> the perag tree is initialised because otherwise we can't read in inodes,
> let alone put them into a reclaim state....

I suggest 3 and 4 should be combined into one patch.

You've reordered xfs_syncd_stop with respect to xfs_unmount in the error
path of xfs_fs_fill_super, but not in the regular unmount path
xfs_fs_put_super.
I think for consistency they should not be reordered in the error path
of xfs_fs_fill_super.

As long as workers can run before xfs_mountfs is run, they need to
protect themselves to ensure that the structures they are using are
initialized.  It looks like xfs_reclaim_worker would do this in the next
patch by using MS_ACTIVE, but FWICS xfs_sync_worker still does not
protect itself.

-Ben

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs