Re: [PATCH] xfs: prevent multi-fsb dir readahead from reading random blocks

"Darrick J. Wong" <darrick.wong@xxxxxxxxxx> · Thu, 20 Apr 2017 10:17:29 -0700

On Thu, Apr 20, 2017 at 09:29:04AM -0400, Brian Foster wrote:
> Directory block readahead uses a complex iteration mechanism to map
> between high-level directory blocks and underlying physical extents.
> This mechanism attempts to traverse the higher-level dir blocks in a
> manner that handles multi-fsb directory blocks and simultaneously
> maintains a reference to the corresponding physical blocks.
> 
> This logic doesn't handle certain (discontiguous) physical extent
> layouts correctly with multi-fsb directory blocks. For example,
> consider the case of a 4k FSB filesystem with a 2 FSB (8k) directory
> block size and a directory with the following extent layout:
> 
>  EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
>    0: [0..7]:          88..95            0 (88..95)             8
>    1: [8..15]:         80..87            0 (80..87)             8
>    2: [16..39]:        168..191          0 (168..191)          24
>    3: [40..63]:        5242952..5242975  1 (72..95)            24
> 
> Directory block 0 spans physical extents 0 and 1, dirblk 1 lies
> entirely within extent 2 and dirblk 2 spans extents 2 and 3. Because
> extent 2 is larger than the directory block size, the readahead code
> erroneously assumes the block is contiguous and issues a readahead
> based on the physical mapping of the first fsb of the dirblk. This
> results in read verifier failure and a spurious corruption or crc
> failure, depending on the filesystem format.
> 
> Further, the subsequent readahead code responsible for walking
> through the physical table doesn't correctly advance the physical
> block reference for dirblk 2. Instead of advancing two physical
> filesystem blocks, the first iteration of the loop advances 1 block
> (correctly), but the subsequent iteration advances 2 more physical
> blocks because the next physical extent (extent 3, above) happens to
> cover more than dirblk 2. At this point, the higher-level directory
> block walking is completely off the rails of the actual physical
> layout of the directory for the respective mapping table.
> 
> Update the contiguous dirblock logic to consider the current offset
> in the physical extent to avoid issuing directory readahead to
> unrelated blocks. Also, update the mapping table advancing code to
> consider the current offset within the current dirblock to avoid
> advancing the mapping reference too far beyond the dirblock.
> 
> Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> ---
> 
> This so far survives a couple xfstests runs with default and 2xfsb
> directory block sizes. I have another run in progress with max sized
> directory blocks. This applies on top of Eric's original overrun fix.

Looks ok, will test...

Reviewed-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>

--D

> 
> Brian
> 
>  fs/xfs/xfs_dir2_readdir.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c
> index c45de72..20b7a5c 100644
> --- a/fs/xfs/xfs_dir2_readdir.c
> +++ b/fs/xfs/xfs_dir2_readdir.c
> @@ -405,7 +405,8 @@ xfs_dir2_leaf_readbuf(
>  		 * Read-ahead a contiguous directory block.
>  		 */
>  		if (i > mip->ra_current &&
> -		    map[mip->ra_index].br_blockcount >= geo->fsbcount) {
> +		    (map[mip->ra_index].br_blockcount - mip->ra_offset) >=
> +		    geo->fsbcount) {
>  			xfs_dir3_data_readahead(dp,
>  				map[mip->ra_index].br_startoff + mip->ra_offset,
>  				XFS_FSB_TO_DADDR(dp->i_mount,
> @@ -438,7 +439,7 @@ xfs_dir2_leaf_readbuf(
>  			 * The rest of this extent but not more than a dir
>  			 * block.
>  			 */
> -			length = min_t(int, geo->fsbcount,
> +			length = min_t(int, geo->fsbcount - j,
>  					map[mip->ra_index].br_blockcount -
>  							mip->ra_offset);
>  			mip->ra_offset += length;
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html