Re: [PATCH V3 2/2] common/populate: Ensure that S_IFDIR.FMT_BTREE is in btree format

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]



On Tue, Dec 06, 2022 at 09:29:54AM -0800, Darrick J. Wong wrote:
> On Tue, Dec 06, 2022 at 06:05:17PM +0800, Ziyang Zhang wrote:
> > Sometimes "$((128 * dblksz / 40))" dirents cannot make sure that
> > S_IFDIR.FMT_BTREE could become btree format for its DATA fork.
> > 
> > Actually we just observed it can fail after apply our inode
> > extent-to-btree workaround. The root cause is that the kernel may be
> > too good at allocating consecutive blocks so that the data fork is
> > still in extents format.
> > 
> > Therefore instead of using a fixed number, let's make sure the number
> > of extents is large enough than (inode size - inode core size) /
> > sizeof(xfs_bmbt_rec_t).
> > 
> > Reviewed-by: Allison Henderson <allison.henderson@xxxxxxxxxx>
> > Suggested-by: "Darrick J. Wong" <djwong@xxxxxxxxxx>
> > Signed-off-by: Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx>
> > Signed-off-by: Ziyang Zhang <ZiyangZhang@xxxxxxxxxxxxxxxxx>
> > ---
> >  common/populate | 28 +++++++++++++++++++++++++++-
> >  common/xfs      |  9 +++++++++
> >  2 files changed, 36 insertions(+), 1 deletion(-)
> > 
> > diff --git a/common/populate b/common/populate
> > index 6e004997..1ca76459 100644
> > --- a/common/populate
> > +++ b/common/populate
> > @@ -71,6 +71,31 @@ __populate_create_dir() {
> >  	done
> >  }
> >  
> > +# Create a large directory and ensure that it's a btree format
> > +__populate_xfs_create_btree_dir() {
> > +	local name="$1"
> > +	local isize="$2"
> > +	local icore_size="$(_xfs_inode_core_bytes)"
> > +	# We need enough extents to guarantee that the data fork is in
> > +	# btree format.  Cycling the mount to use xfs_db is too slow, so
> > +	# watch for when the extent count exceeds the space after the
> > +	# inode core.
> > +	local max_nextents="$(((isize - icore_size) / 16))"
> > +
> > +	mkdir -p "${name}"
> > +	d=0
> > +	while true; do
> > +		creat=mkdir
> > +		test "$((d % 20))" -eq 0 && creat=touch
> > +		$creat "${name}/$(printf "%.08d" "$d")"
> > +		if [ "$((d % 40))" -eq 0 ]; then
> > +			nextents="$(_xfs_get_fsxattr nextents $name)"
> > +			[ $nextents -gt $max_nextents ] && break
> > +		fi
> > +		d=$((d+1))
> > +	done
> > +}
> > +
> >  # Add a bunch of attrs to a file
> >  __populate_create_attr() {
> >  	name="$1"
> > @@ -176,6 +201,7 @@ _scratch_xfs_populate() {
> >  
> >  	blksz="$(stat -f -c '%s' "${SCRATCH_MNT}")"
> >  	dblksz="$(_xfs_get_dir_blocksize "$SCRATCH_MNT")"
> > +	isize="$(_xfs_inode_size "$SCRATCH_MNT")"
> >  	crc="$(_xfs_has_feature "$SCRATCH_MNT" crc -v)"
> >  	if [ $crc -eq 1 ]; then
> >  		leaf_hdr_size=64
> > @@ -226,7 +252,7 @@ _scratch_xfs_populate() {
> >  
> >  	# - BTREE
> >  	echo "+ btree dir"
> > -	__populate_create_dir "${SCRATCH_MNT}/S_IFDIR.FMT_BTREE" "$((128 * dblksz / 40))" true
> > +	__populate_xfs_create_btree_dir "${SCRATCH_MNT}/S_IFDIR.FMT_BTREE" "$isize"
> 
> The new helper function omits the "missing" parameter, which means that
> it no longer creates the directory entry blocks with a lot of free space
> in them, unlike current TOT.

Hi Ziyang,

Is there any reason we must drop this "missing" step. If no special reason, I'd
like to keep the original operation (the missing parameter is "true") to free
some space sparsely. I think that only removes some entries, won't remove the
dir-blocks (generally), so it won't affect the btree format you create.

Thanks,
Zorro

> 
> --D
> 
> >  
> >  	# Symlinks
> >  	# - FMT_LOCAL
> > diff --git a/common/xfs b/common/xfs
> > index 5180b9d3..744f0040 100644
> > --- a/common/xfs
> > +++ b/common/xfs
> > @@ -1487,6 +1487,15 @@ _require_xfsrestore_xflag()
> >  			_notrun 'xfsrestore does not support -x flag.'
> >  }
> >  
> > +# Number of bytes reserved for a full inode record, which includes the
> > +# immediate fork areas.
> > +_xfs_inode_size()
> > +{
> > +	local mntpoint="$1"
> > +
> > +	$XFS_INFO_PROG "$mntpoint" | grep 'meta-data=.*isize' | sed -e 's/^.*isize=\([0-9]*\).*$/\1/g'
> > +}
> > +
> >  # Number of bytes reserved for only the inode record, excluding the
> >  # immediate fork areas.
> >  _xfs_inode_core_bytes()
> > -- 
> > 2.18.4
> > 
> 




[Index of Archives]     [Linux Filesystems Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux