Re: [PATCH 02/42] xfs: prefer free inodes at ENOSPC over chunk allocation

On Thu, 2023-01-19 at 09:44 +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> 
> When an XFS filesystem has free inodes in chunks already allocated
> on disk, it will still allocate new inode chunks if the target AG
> has no free inodes in it. Normally, this is a good idea as it
> preserves locality of all the inodes in a given directory.
> 
> However, at ENOSPC this can lead to using the last few remaining
> free filesystem blocks to allocate a new chunk when there are many,
> many free inodes that could be allocated without consuming free
> space. This speeds up the consumption of the last few blocks, and
> inode create operations then return ENOSPC while there are still
> free inodes available, because we don't have enough blocks left in
> the filesystem for directory creation reservations to proceed.
> 
> Hence when we are near ENOSPC, we should be attempting to preserve
> the remaining blocks for directory block allocation rather than
> using them for unnecessary inode chunk creation.
> 
> This particular behaviour is exposed by xfs/294, when it drives to
> ENOSPC on empty file creation whilst there are still thousands of
> free inodes available for allocation in other AGs in the filesystem.
> 
> Hence, when we are within 1% of ENOSPC, change the inode allocation
> behaviour to prefer to use existing free inodes over allocating new
> inode chunks, even though it results in poorer locality of the data
> set. It is more important for the allocations to be space efficient
> near ENOSPC than to have optimal locality for performance, so let's
> modify the inode AG selection code to reflect that fact.
> 
> This allows generic/294 to not only pass with this allocator rework
> patchset, but to increase the number of post-ENOSPC empty inode
> allocations from ~600 to ~9080 before we hit ENOSPC on the
> directory create transaction reservation.
> 
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
Ok, makes sense
Reviewed-by: Allison Henderson <allison.henderson@xxxxxxxxxx>
 
> ---
>  fs/xfs/libxfs/xfs_ialloc.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
> index 5118dedf9267..e8068422aa21 100644
> --- a/fs/xfs/libxfs/xfs_ialloc.c
> +++ b/fs/xfs/libxfs/xfs_ialloc.c
> @@ -1737,6 +1737,7 @@ xfs_dialloc(
>         struct xfs_perag        *pag;
>         struct xfs_ino_geometry *igeo = M_IGEO(mp);
>         bool                    ok_alloc = true;
> +       bool                    low_space = false;
>         int                     flags;
>         xfs_ino_t               ino;
>  
> @@ -1767,6 +1768,20 @@ xfs_dialloc(
>                 ok_alloc = false;
>         }
>  
> +       /*
> +        * If we are near to ENOSPC, we want to prefer allocation from AGs that
> +        * have free inodes in them rather than use up free space allocating new
> +        * inode chunks. Hence we turn off allocation for the first non-blocking
> +        * pass through the AGs if we are near ENOSPC to consume free inodes
> +        * that we can immediately allocate, but then we allow allocation on the
> +        * second pass if we fail to find an AG with free inodes in it.
> +        */
> +       if (percpu_counter_read_positive(&mp->m_fdblocks) <
> +                       mp->m_low_space[XFS_LOWSP_1_PCNT]) {
> +               ok_alloc = false;
> +               low_space = true;
> +       }
> +
>         /*
>          * Loop until we find an allocation group that either has free inodes
>          * or in which we can allocate some inodes.  Iterate through the
> @@ -1795,6 +1810,8 @@ xfs_dialloc(
>                                 break;
>                         }
>                         flags = 0;
> +                       if (low_space)
> +                               ok_alloc = true;
>                 }
>                 xfs_perag_put(pag);
>         }
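
For anyone following the thread, the two-pass behaviour the patch describes can be sketched outside the kernel roughly as below. This is only a toy model under stated assumptions, not the code in xfs_dialloc(): `struct toy_ag`, `toy_pick_ag()` and `TOY_CHUNK_BLOCKS` are hypothetical names invented for illustration, the free-block counter and the 1% threshold are passed in as plain integers rather than read from `mp->m_fdblocks` and `mp->m_low_space[XFS_LOWSP_1_PCNT]`, and the trylock/blocking distinction between the passes is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for per-AG state; NOT the kernel's struct xfs_perag. */
struct toy_ag {
	uint32_t free_inodes;	/* free inodes in already-allocated chunks */
	uint64_t free_blocks;	/* free blocks that could back a new chunk */
};

/* Assumed size of one inode chunk in blocks; purely illustrative. */
#define TOY_CHUNK_BLOCKS 8

/*
 * Pick an AG to allocate an inode from, starting the scan at 'start'.
 * Returns the AG index, or -1 if no AG can satisfy the request.
 * *allocated_chunk reports whether a new inode chunk would be needed.
 */
static int toy_pick_ag(const struct toy_ag *ags, int nags, int start,
		       uint64_t fs_free_blocks, uint64_t low_space_1pct,
		       bool *allocated_chunk)
{
	bool ok_alloc = true;
	bool low_space = false;
	int pass, i;

	assert(nags > 0);
	*allocated_chunk = false;

	/* Near ENOSPC: forbid new chunk allocation on the first pass. */
	if (fs_free_blocks < low_space_1pct) {
		ok_alloc = false;
		low_space = true;
	}

	for (pass = 0; pass < 2; pass++) {
		for (i = 0; i < nags; i++) {
			int agno = (start + i) % nags;
			const struct toy_ag *ag = &ags[agno];

			/* Existing free inodes cost no free space. */
			if (ag->free_inodes > 0)
				return agno;

			/* Otherwise a new chunk, if that is currently allowed. */
			if (ok_alloc && ag->free_blocks >= TOY_CHUNK_BLOCKS) {
				*allocated_chunk = true;
				return agno;
			}
		}
		/* First pass found nothing: re-enable chunk allocation. */
		if (low_space)
			ok_alloc = true;
	}
	return -1;
}
```

In this model, a scan that starts at an AG with free blocks but no free inodes takes a new chunk there when space is plentiful, but when the free-block count is below the threshold it skips ahead to an AG that already has free inodes, and only falls back to allocating a chunk on the second pass.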




