Re: [PATCH 2/7] xfs: always allocate blocks as unwritten for file data

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Oct 01, 2018 at 05:37:36AM -0700, Christoph Hellwig wrote:
> XFS historically had a small race that could lead to exposing
> uninitialized data in case of a crash.  If we are filling holes using
> buffered I/O we convert the delayed allocation to a real allocation
> before writing out the data.  If we crash after the blocks were
> allocated, but before the data was written this could lead to reading
> uninitialized blocks (or leaked data from a previous allocation that was
> reused).  Now that we have the CIL logging extent format changes is
> cheap, so we can switch to always allocating blocks as unwritten.
> Note that this is not be strictly necessary for writes that append
> beyond i_size, but given that we have to log a transaction in that
> case anyway we might as well give all block allocations a uniform
> treatment.
> 
> Signed-off-by: Christoph Hellwig <hch@xxxxxx>
> ---

It's great that we can finally fix this, particularly with such a simple
change. IIRC, the only real thing standing in the way was the buffer
head delalloc state management mess.

>  fs/xfs/xfs_aops.c  | 3 +--
>  fs/xfs/xfs_aops.h  | 2 --
>  fs/xfs/xfs_iomap.c | 4 ++--
>  3 files changed, 3 insertions(+), 6 deletions(-)
> 
...
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index 6320aca39f39..10fc93cebc42 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -662,11 +662,11 @@ xfs_iomap_write_allocate(
>  	xfs_trans_t	*tp;
>  	int		nimaps;
>  	int		error = 0;
> -	int		flags = XFS_BMAPI_DELALLOC;
> +	int		flags = XFS_BMAPI_DELALLOC | XFS_BMAPI_PREALLOC;

... though I don't quite think this is sufficient. xfs_bmapi_allocate()
has this snippet of code:

        if ((!bma->wasdel || (bma->flags & XFS_BMAPI_COWFORK)) &&
            (bma->flags & XFS_BMAPI_PREALLOC) &&
            xfs_sb_version_hasextflgbit(&mp->m_sb))
                bma->got.br_state = XFS_EXT_UNWRITTEN;

... which looks like it explicitly bypasses the PREALLOC flag for
delalloc extents. I figured this would just be an inefficiency since
prealloc conversion comes later, but if you look at
xfs_bmapi_convert_unwritten():

        /* check if we need to do real->unwritten conversion */
        if (mval->br_state == XFS_EXT_NORM &&
            (flags & (XFS_BMAPI_PREALLOC | XFS_BMAPI_CONVERT)) !=
                        (XFS_BMAPI_PREALLOC | XFS_BMAPI_CONVERT))
                return 0;

... it sees this as an existing extent and so doesn't change the state
unless the CONVERT flag is also passed. A quick test to shut down
immediately after the xfs_iomap_write_allocate() transaction commits
seems to confirm this behavior:

# xfs_io -c "fiemap -v" /mnt/file
/mnt/file:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..7]:          72..79               8   0x1

I think the right fix here is to remove the referenced logic from
xfs_bmapi_allocate(). I also think this demonstrates the need for an
xfstest. ;) Expected behavior should be easy to confirm with a new error
tag, for example.

Brian

>  	int		nres;
>  
>  	if (whichfork == XFS_COW_FORK)
> -		flags |= XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC;
> +		flags |= XFS_BMAPI_COWFORK;
>  
>  	/*
>  	 * Make sure that the dquots are there.
> -- 
> 2.19.0
> 



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux