On Wed, May 12, 2021 at 04:11:14PM -0700, Darrick J. Wong wrote: > On Thu, May 06, 2021 at 05:20:50PM +1000, Dave Chinner wrote: > > From: Dave Chinner <dchinner@xxxxxxxxxx> > > > > xfs_dialloc_select_ag() does a lot of repetitive work. It first > > calls xfs_ialloc_ag_select() to select the AG to start allocation > > attempts in, which can do up to two entire loops across the perags > > that inodes can be allocated in. This is simply checking if there is > > spce available to allocate inodes in an AG, and it returns when it > > finds the first candidate AG. > > > > xfs_dialloc_select_ag() then does it's own iterative walk across > > all the perags locking the AGIs and trying to allocate inodes from > > the locked AG. It also doesn't limit the search to mp->m_maxagi, > > so it will walk all AGs whether they can allocate inodes or not. > > > > Hence if we are really low on inodes, we could do almost 3 entire > > walks across the whole perag range before we find an allocation > > group we can allocate inodes in or report ENOSPC. > > > > Because xfs_ialloc_ag_select() returns on the first candidate AG it > > finds, we can simply do these checks directly in > > xfs_dialloc_select_ag() before we lock and try to allocate inodes. > > This reduces the inode allocation pass down to 2 perag sweeps at > > most - one for aligned inode cluster allocation and if we can't > > allocate full, aligned inode clusters anywhere we'll do another pass > > trying to do sparse inode cluster allocation. > > > > This also removes a big chunk of duplicate code. > > Soooooo... we did an AG walk to pick the optimal starting point of an AG > walk? Heh. yup. > > @@ -1734,16 +1616,23 @@ xfs_dialloc_select_ag( > > struct xfs_perag *pag; > > struct xfs_ino_geometry *igeo = M_IGEO(mp); > > bool okalloc = true; > > + int needspace; > > + int flags; > > > > *IO_agbp = NULL; > > > > /* > > - * We do not have an agbp, so select an initial allocation > > - * group for inode allocation. > > + * Files of these types need at least one block if length > 0 > > + * (and they won't fit in the inode, but that's hard to figure out). > > Uh, what is length here? Seeing as most directories and symlinks > probably end up in local format, is this needspace computation really > true? Ah, that's the exact text from the comment in xfs_ialloc_ag_select() above the needspace calculation - I didn't actually change it at all. :/ > I guess it doesn't hurt to be cautious and assume that directories can > expand and that people are aggressively symlinking. But maybe this > comment could be rephrased to something like: > > /* > * Directories, symlinks, and regular files frequently allocate > * at least one block, so factor that potential expansion when > * we examine whether an AG has enough space for file creation. > */ > needspace = S_ISDIR(mode)...; > > With that clarified, > Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx> Changed. -Dave. -- Dave Chinner david@xxxxxxxxxxxxx