Re: [PATCH] xfs: do not log/recover swapext extent owner changes for deleted inodes

Eric Sandeen <sandeen@xxxxxxxxxxx> · Wed, 7 Mar 2018 13:58:31 -0600

On 2/26/18 2:56 PM, Brian Foster wrote:
> On Mon, Feb 26, 2018 at 11:39:51AM -0500, Brian Foster wrote:
>> On Fri, Feb 23, 2018 at 05:49:41PM -0600, Eric Sandeen wrote:
>>> Today if we run swapext and crash, log replay can fail because
>>> the recovery code tries to instantiate the donor inode from
>>> disk to replay the swapext, but it's been deleted and we throw
>>> corruption failures if we try to get an inode off disk with
>>> i_mode == 0.
>>>
>>> This fixes both sides: We don't log the swapext change if the
>>> inode has been deleted, and we don't try to recover it either.
>>>
>>> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx>
>>> ---
>>>
>>> diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
>>> index 26f2413..de48eb8 100644
>>> --- a/fs/xfs/xfs_inode_item.c
>>> +++ b/fs/xfs/xfs_inode_item.c
>>> @@ -436,6 +436,12 @@ xfs_inode_item_format(
>>>  			~(XFS_ILOG_ADATA | XFS_ILOG_ABROOT | XFS_ILOG_AEXT);
>>>  	}
>>>  
>>> +	/* If this inode has been deleted do not log swapext owner changes */
>>> +	if (VFS_I(ip)->i_mode == 0) {
>>> +		ilf->ilf_fields &= ~(XFS_ILOG_DOWNER | XFS_ILOG_AOWNER);
>>> +		iip->ili_fields &= ~(XFS_ILOG_DOWNER | XFS_ILOG_AOWNER);
>>> +	}
>>> +
>>
>> Do you have any more details on the context that leads to this issue?
>> More specifically, is the problem limited to/because of the case where
>> the inode is relogged and the owner change flag carries forward to the
>> transaction that ultimately frees it (which seems to me is what the
>> above prevents)? Or is there some other scenario that can lead to this?
>>
>> I guess I'm kind of wondering if this can still happen in spite of the
>> above, if the extent swap -> unlink happens in separate log formats and
>> the inode happens to be written back before a crash and the log tail
>> being unpinned. Now that I think of it I suppose the log recovery lsn
>> ordering should prevent that kind of thing on v5 filesystems, at least.
>>
> 
> After playing around a bit I think I managed to set myself straight on
> this. Indeed, I think the above recovery LSN ordering rules hold for any
> separately logged extent swap and subsequent inode free on v5
> filesystems. It essentially doesn't matter on v4 filesystems because
> there is no metadata owner update on extent swap, since that format
> doesn't have the owner info in the bmbt buffers.
> 
> So I think this covers everything. My only remaining comments are to
> perhaps add a bit more detail in the commit log and/or code comments to
> document the situation. Also, have you considered defining a new
> function to perform this update on the inode item explicitly from
> xfs_ifree() rather than burying it down in xfs_inode_item_format() (more
> for clarity than any technical reason that I can think of)?

Sorry for the late reply.

I'm not sure I see a way to do this in xfs_ifree, because we don't have
access to the inode log format to make the necessary changes at that point.
Or am I missing something?

And, um, you've probably been more methodical than I have in checking out
the change - can I ask for a suggestion of what sorts of comments you'd
like to see to make things more clear?  I fear I'm in "unknown unknowns"
territory.

Thanks,
-Eric

> Brian

--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html