Re: [PATCH] exfat: check disk status during buffer write

崔东亮 (Dongliang Cui) <Dongliang.Cui@xxxxxxxxxx> · Mon, 15 Jul 2024 08:51:17 +0000

> We found that when writing a large file through buffer write, if the 
> disk is inaccessible, exFAT does not return an error normally, which 
> leads to the writing process not stopping properly.
>
> To easily reproduce this issue, you can follow the steps below:
>
> 1. format a device to exFAT and then mount (with a full disk erase)
> 2. dd if=/dev/zero of=/exfat_mount/test.img bs=1M count=8192
> 3. eject the device
>
> You may find that the dd process does not stop immediately and may 
> continue for a long time.
>
> We compared this with FAT, which returns an EIO error and
> immediately stops the dd operation.
>
> The root cause of this issue is that when the exfat_inode contains the 
> ALLOC_NO_FAT_CHAIN flag, exFAT does not need to access the disk to 
> look up directory entries or the FAT table (whereas FAT would do) 
> every time data is written. Instead, exFAT simply marks the buffer as 
> dirty and returns, delegating the writeback operation to the writeback 
> process.
>
> If the disk cannot be accessed at this time, the error will only be 
> returned to the writeback process, and the original process will not 
> receive the error, so it cannot be returned to the user side.
>
> Therefore, we think that when writing files with ALLOC_NO_FAT_CHAIN, 
> it is necessary to continuously check the status of the disk.
>
> When the disk cannot be accessed normally, an error should be returned 
> to stop the writing process.
>
> Signed-off-by: Dongliang Cui <dongliang.cui@xxxxxxxxxx>
> Signed-off-by: Zhiguo Niu <zhiguo.niu@xxxxxxxxxx>
> ---
>  fs/exfat/exfat_fs.h | 5 +++++
>  fs/exfat/inode.c    | 5 +++++
>  2 files changed, 10 insertions(+)
>
> diff --git a/fs/exfat/exfat_fs.h b/fs/exfat/exfat_fs.h
> index ecc5db952deb..c5f5a7a8b672 100644
> --- a/fs/exfat/exfat_fs.h
> +++ b/fs/exfat/exfat_fs.h
> @@ -411,6 +411,11 @@ static inline unsigned int exfat_sector_to_cluster(struct exfat_sb_info *sbi,
>               EXFAT_RESERVED_CLUSTERS;
>  }
>
> +static inline bool exfat_check_disk_error(struct block_device *bdev) 
> +{
> +     return blk_queue_dying(bdev_get_queue(bdev));
Why don't you check it like ext4?

static int block_device_ejected(struct super_block *sb)
{
       struct inode *bd_inode = sb->s_bdev->bd_inode;
       struct backing_dev_info *bdi = inode_to_bdi(bd_inode);

       return bdi->dev == NULL;
}

block_device->bd_inode has been removed in the latest code.
We might be able to use super_block->s_bdi->dev for the check,
or perhaps use blk_queue_dying()?
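
For illustration, here is a minimal userspace model of the s_bdi->dev variant. The structs are stand-ins that carry only the fields used here, and the helper name exfat_block_device_ejected() is hypothetical. It assumes, as ext4's helper does, that the bdi's dev pointer is NULL once the device has gone away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct device { int unused; };
struct backing_dev_info { struct device *dev; };
struct super_block { struct backing_dev_info *s_bdi; };

/*
 * Hypothetical helper: same test as ext4's block_device_ejected(),
 * but it reaches the bdi through sb->s_bdi instead of bdev->bd_inode.
 */
static bool exfat_block_device_ejected(const struct super_block *sb)
{
	assert(sb != NULL && sb->s_bdi != NULL);
	return sb->s_bdi->dev == NULL;
}
```

In this model the helper returns false while bdi.dev points at a device and true once it is set to NULL.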

> +}
> +
>  static inline bool is_valid_cluster(struct exfat_sb_info *sbi,
>               unsigned int clus)
>  {
> diff --git a/fs/exfat/inode.c b/fs/exfat/inode.c
> index dd894e558c91..efd02c1c83a6 100644
> --- a/fs/exfat/inode.c
> +++ b/fs/exfat/inode.c
> @@ -147,6 +147,11 @@ static int exfat_map_cluster(struct inode *inode, unsigned int clu_offset,
>       *clu = last_clu = ei->start_clu;
>
>       if (ei->flags == ALLOC_NO_FAT_CHAIN) {
> +             if (exfat_check_disk_error(sb->s_bdev)) {
> +                     exfat_fs_error(sb, "device inaccessible!\n");
> +                     return -EIO;
This patch looks useful for removable storage devices.
BTW, the same problem can occur when ei->flags != ALLOC_NO_FAT_CHAIN, if the lookup can be served from lru_cache. So it would be nice to check for the disk error regardless of ei->flags.
Also, calling exfat_fs_error() seems unnecessary; let's just return -ENODEV instead of -EIO.
I believe these errors will be handled in exfat_get_block().
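
A rough userspace sketch of that control flow, to make the suggestion concrete. Everything prefixed model_ is hypothetical, the flag values are stand-ins, and the two branches are reduced to comments. The only point is that the ejection check sits before the flags test and returns -ENODEV without logging.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in flag values for this model only. */
#define ALLOC_FAT_CHAIN		0x01
#define ALLOC_NO_FAT_CHAIN	0x03

/* Hypothetical reduced types: just enough state to show the ordering. */
struct model_sb { bool ejected; };
struct model_inode { struct model_sb *sb; unsigned char flags; };

static int model_map_cluster(const struct model_inode *inode)
{
	assert(inode != NULL && inode->sb != NULL);

	/* Check first, regardless of flags, so both paths fail fast. */
	if (inode->sb->ejected)
		return -ENODEV;

	if (inode->flags == ALLOC_NO_FAT_CHAIN) {
		/* Contiguous file: the cluster is computed, no disk access. */
	} else {
		/* FAT chain walk, which may be served from a cache. */
	}
	return 0;
}
```

With the check hoisted like this, both the contiguous and the FAT-chain case return -ENODEV as soon as the device is gone.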

Thanks.
> +             }
> +
>               if (clu_offset > 0 && *clu != EXFAT_EOF_CLUSTER) {
>                       last_clu += clu_offset - 1;
>
> --
> 2.25.1