Re: [PATCH 16/16] xfs: ATTR_REPLACE algorithm with LARP enabled needs rework

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, May 03, 2022 at 06:30:23PM -0700, Alli wrote:
> On Tue, 2022-05-03 at 17:40 +1000, Dave Chinner wrote:
> > On Thu, Apr 28, 2022 at 12:02:17AM -0700, Alli wrote:
> > > On Thu, 2022-04-14 at 19:44 +1000, Dave Chinner wrote:
> > > > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > > > 
> > > > We can't use the same algorithm for replacing an existing
> > > > attribute
> > > > when logging attributes. The existing algorithm is essentially:
> > > > 
> > > > 1. create new attr w/ INCOMPLETE
> > > > 2. atomically flip INCOMPLETE flags between old + new attribute
> > > > 3. remove old attr which is marked w/ INCOMPLETE
> > > > 
> > > > This algorithm guarantees that we see either the old or new
> > > > attribute, and if we fail after the atomic flag flip, we don't
> > > > have
> > > > to recover the removal of the old attr because we never see
> > > > INCOMPLETE attributes in lookups.
> > ....
> > > > diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
> > > > index 39af681897a2..a46379a9e9df 100644
> > > > --- a/fs/xfs/xfs_attr_item.c
> > > > +++ b/fs/xfs/xfs_attr_item.c
> > > > @@ -490,9 +490,14 @@ xfs_attri_validate(
> > > >  	if (attrp->__pad != 0)
> > > >  		return false;
> > > >  
> > > > -	/* alfi_op_flags should be either a set or remove */
> > > > -	if (op != XFS_ATTR_OP_FLAGS_SET && op !=
> > > > XFS_ATTR_OP_FLAGS_REMOVE)
> > > > +	switch (op) {
> > > > +	case XFS_ATTR_OP_FLAGS_SET:
> > > > +	case XFS_ATTR_OP_FLAGS_REMOVE:
> > > > +	case XFS_ATTR_OP_FLAGS_REPLACE:
> > > > +		break;
> > > > +	default:
> > > >  		return false;
> > > > +	}
> > > >  
> > > >  	if (attrp->alfi_value_len > XATTR_SIZE_MAX)
> > > >  		return false;
> > > > @@ -553,11 +558,27 @@ xfs_attri_item_recover(
> > > >  	args->namelen = attrp->alfi_name_len;
> > > >  	args->hashval = xfs_da_hashname(args->name, args->namelen);
> > > >  	args->attr_filter = attrp->alfi_attr_flags;
> > > > +	args->op_flags = XFS_DA_OP_RECOVERY;
> > > >  
> > > > -	if (attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_SET) {
> > > > +	switch (attr->xattri_op_flags) {
> > > > +	case XFS_ATTR_OP_FLAGS_SET:
> > > > +	case XFS_ATTR_OP_FLAGS_REPLACE:
> > > >  		args->value = attrip->attri_value;
> > > >  		args->valuelen = attrp->alfi_value_len;
> > > >  		args->total = xfs_attr_calc_size(args, &local);
> > > > +		if (xfs_inode_hasattr(args->dp))
> > > I ran into a test failure and tracked it down to the above line.  I
> > > suspect because xfs_inode_hasattr only checks to see if the inode
> > > has
> > > an attr fork, it doesnt actually check to see if it has the attr
> > > we're
> > > replacing.
> > 
> > Right, that was intentional. It is based on the fact that if we
> > are recovering a set or a replace operation, we have to remove the
> > INCOMPLETE xattr first. However, if the attr fork is empty, there's
> > no INCOMPLETE xattr to remove, and so we can just go straight to the
> > set operation to create the new value.
> > 
> > Hmmm - what was the failure? Was it a null pointer dereference
> > on ip->i_afp? I wonder if you hit the corner case where attr removal
> > can remove the attr fork, and that's when it crashed and we've tried
> > to recover from?
> No, the actual shutdown was from the error inject that the test case
> uses.  The unexpected part was a set operation returning -ENODATA
> because we incorrectly fell down the rename path. 

That's not correct - recovery of a set operation has to remove
the INCOMPLETE attr that is was in the process of being built. If
we are in recovery and we don't find an existing entry, we should
just then fall through the state machine to the set operation
as there was nothing to remove.

-ENOATTR/-ENODATA in that case is valid, it just sounds like we
didn't find an incomplete attr to remove and didn't handle the case
correctly.

> I suspect the reason
> the parent pointers exposed it was because the presence of the parent
> pointer caused the attr fork to not be empty and so xfs_inode_hasattr
> succeeds. 

I think it's the case where we fail immediately after logging the
first SET intent to the journal, and haven't actually logged
any other changes yet. That first intent can only be logged after
the attr fork has been created and logged, so -ENODATA is a case
recovery is supposed to handle for a SET operation.

> > Oh, I think I might have missed a case there. If you look at
> > xfs_attr_sf_removename() I added a case to avoid removing the attr
> > fork when XFS_DA_OP_RENAME is set because we don't want to remove it
> > when we are about to add to it again. But I didn't add the same
> > logic to xfs_attr3_leaf_to_shortform() which can also trash the attr
> > fork if the last attr we remove from the attr fork is larger than
> > would fit in a sf attr fork. Hence we go straight from leaf form to
> > no attr fork at all.
> > 
> > Ok, that's definitely a bug, I'll need to fix that, and it could be
> > the cause of this issue as removing the attr fork will set
> > forkoff to 0 and so the inode will not have an attr fork
> > instantiated when it is read into memory...
> > 
> > 
> Ah, that could be it then.  The last failing test case is: expanding
> the fork into node form, setting the inject, and attempting a rename.

Ah, so that's likely how we get the situation I suggested above - we
commit the node form expansion without other modifications but
include the SET intent, so if we recover that node form
transformation, we most definitely have a SET intent without an
INCOMPLETE attr to remove...

I'll add it to the list.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux