Re: [PATCH V1.1] xfs_repair: Search for conflicts in inode_tree_ptrs[] when processing uncertain inodes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, May 26, 2022 at 09:57:59 AM -0700, Darrick J. Wong wrote:
> On Thu, May 26, 2022 at 05:34:41PM +0530, Chandan Babu R wrote:
>> On Tue, May 24, 2022 at 09:08:13 AM +1000, Dave Chinner wrote:
>> > On Mon, May 23, 2022 at 01:28:47PM -0700, Darrick J. Wong wrote:
>> >> On Mon, May 23, 2022 at 02:04:10PM +0530, Chandan Babu R wrote:
>> >> > When processing an uncertain inode chunk record, if we lose 2 blocks worth of
>> >> > inodes or 25% of the chunk, xfs_repair decides to ignore the chunk. Otherwise,
>> >> > xfs_repair adds a new chunk record to inode_tree_ptrs[agno], marking each
>> >> > inode as either free or used. However, before adding the new chunk record,
>> >> > xfs_repair has to check for the existance of a conflicting record.
>> >> > 
>> >> > The existing code incorrectly checks for the conflicting record in
>> >> > inode_uncertain_tree_ptrs[agno]. This check will succeed since the inode chunk
>> >> > record being processed was originally obtained from
>> >> > inode_uncertain_tree_ptrs[agno].
>> >> > 
>> >> > This commit fixes the bug by changing xfs_repair to search
>> >> > inode_tree_ptrs[agno] for conflicts.
>> >> 
>> >> Just out of curiosity -- how did you come across this bug?  I /think/ it
>> >> looks reasonable, but want to know more context...
>> >> 
>> >> > Signed-off-by: Chandan Babu R <chandan.babu@xxxxxxxxxx>
>> >> > ---
>> >> > Changelog:
>> >> > V1 -> V1.1:
>> >> >    1. Fix commit message.
>> >> >    
>> >> >  repair/dino_chunks.c | 3 +--
>> >> >  1 file changed, 1 insertion(+), 2 deletions(-)
>> >> > 
>> >> > diff --git a/repair/dino_chunks.c b/repair/dino_chunks.c
>> >> > index 11b0eb5f..80c52a43 100644
>> >> > --- a/repair/dino_chunks.c
>> >> > +++ b/repair/dino_chunks.c
>> >> > @@ -229,8 +229,7 @@ verify_inode_chunk(xfs_mount_t		*mp,
>> >> >  		/*
>> >> >  		 * ok, put the record into the tree, if no conflict.
>> >> >  		 */
>> >> > -		if (find_uncertain_inode_rec(agno,
>> >> > -				XFS_AGB_TO_AGINO(mp, start_agbno)))
>> >> > +		if (find_inode_rec(mp, agno, XFS_AGB_TO_AGINO(mp, start_agbno)))
>> >> 
>> >> ...because the big question I have is: why not check both the certain
>> >> and the uncertain records for confliects?
>> >
>> > Yeah, that was my question, too.
>> 
>> I came across this issue while reading code in order to better understand
>> xfs_repair.
>> 
>> The following steps illustrate how the code flows from phase 2 and 3 of
>> xfs_repair.
>> 
>> During phase 2,
>> 1. Scan inobt records.
>> 2. For valid records, add corresponding entries to certain inode tree
>>    (i.e. inode_tree_ptrs[agno]).
>> 3. For suspect records (e.g. Inobt leaf blocks which have a CRC mismatch), add
>>    entries to uncertain inode tree (i.e. inode_uncertain_tree_ptrs[agno]).
>> 
>> Uncertain inode chunk records are processed at the beginning of Phase 3
>> (please refer to check_uncertain_aginodes()). We pick one inode chunk at a
>> time from the uncertain inode tree and verify each inode's ondisk contents. If
>> most of the chunk's inodes turn out to be valid, we would want to treat the
>> chunk's inodes as certain i.e. move them over to the certain inode tree.
>> 
>> Existing code would check for the presence of the inode chunk in the uncertain
>> inode tree and when such an entry is found, we skip further processing of the
>> inode chunk. Since the inode chunk was obtained from the uncertain inode tree
>> in the first place, this check succeeds and the code ended up ignoring
>> uncertain inodes which were actually valid inodes.
>> 
>> I think checking uncertain inode tree for conflicts is a programming error. We
>> should actually be checking only the certain inode tree for conflicts before
>> moving the inode chunk to certain inode tree.
>
> Oh, ok, so repair is walking the uncertain inode chunks to see if they
> really correspond to inodes.  Having decided that the chunk is good, the
> last little piece is to check that the uncertain chunk doesn't overlap
> with any of the known-good chunks, and if /that/ passes, repair moves
> the uncertain chunk to inode_tree_ptrs[]?  And therefore it makes no
> sense at all to compare one uncertain chunk against the rest of the
> uncertain chunks, because (a) that's where it just came from and

Here are the exact steps involved in processing inode chunks obtained from
uncertain inode chunk tree:
1. For each inode chunk in the uncertain inode chunk tree
   1.1. Verify inodes in the chunk.
   1.2. If most of the inodes in the chunk are found to be valid,
        1.2.1. If there are no overlapping inode chunks in the uncertain inode
               chunk tree.
               1.2.1.1. Add inode chunk to certain inode tree.
   1.3. Remove inode chunk from uncertain inode chunk tree.

The check in 1.2.1 is bound to fail since the inode chunk being processed was
obtained from the uncertain inode chunk tree and it continues to be there
until step 1.3 is executed.

This patch changes 1.2.1 to check for overlapping inode chunks in the certain
inode chunk tree.

> (b) we could discard any of the remaining uncertain chunks?
>
> If the answers are yes and yes, then:
> Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx>
>
> Though you might want to augment the commit message to include that last
> sentence about why it doesn't make sense to check the uncertain ichunk
> list, since that's where I tripped up. :/
>

I think I should will add the sequence of steps described above and the cause
of failure as part of the commit message.

>> I wrote the script
>> (https://gist.github.com/chandanr/5ad2da06a7863c2918ad793636537536) to
>> illustrate the problem. This script create an inobt with two fully populated
>> leaves. It then changes 2nd leaf's lsn value to cause a CRC check
>> failure. This causes phase 2 of xfs_repair to add inodes in the 2nd leaf to
>> uncertain inode tree.
>
> Looks like a reasonably good candidate for an fstest :)
>

Ok. I will create an fstest script and post it on fstests mailing list soon.

>> Without the fix provided by the patch, phase 3 will skip converting inodes
>> from the 2nd leaf into certain inodes and hence xfs_repair ends up trashing
>> these inodes.
>
> <nod>
>

-- 
chandan



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux