Re: [Race] data race between do_mpage_readpage() and set_blocksize()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Nov 30, 2020 at 03:41:53PM +0000, Gong, Sishuai wrote:
> We found a data race in linux kernel 5.3.11 that we are able to reproduce in x86 under specific interleavings. Currently, we are not sure about the consequence of this race so we would like to confirm with the community if this can be a harmful bug.

How are you able to reproduce it?  Normally mpage_readpage() is only called
from a filesystem, and you shouldn't be able to change the size of the
blocks in a block device while there's a mounted filesystem.

> ------------------------------------------
> Writer site
> 
> /tmp/tmp.B7zb7od2zE-5.3.11/extract/linux-5.3.11/fs/block_dev.c:135
>         120
>         121  int set_blocksize(struct block_device *bdev, int size)
>         122  {
>         123      /* Size must be a power of two, and between 512 and PAGE_SIZE */
>         124      if (size > PAGE_SIZE || size < 512 || !is_power_of_2(size))
>         125          return -EINVAL;
>         126
>         127      /* Size cannot be smaller than the size supported by the device */
>         128      if (size < bdev_logical_block_size(bdev))
>         129          return -EINVAL;
>         130
>         131      /* Don't change the size if it is same as current */
>         132      if (bdev->bd_block_size != size) {
>         133          sync_blockdev(bdev);
>         134          bdev->bd_block_size = size;
>  ==>    135          bdev->bd_inode->i_blkbits = blksize_bits(size);
>         136          kill_bdev(bdev);
>         137      }
>         138      return 0;
>         139  }
> 
> ------------------------------------------
> Reader site
> 
>  /tmp/tmp.B7zb7od2zE-5.3.11/extract/linux-5.3.11/fs/mpage.c:160
>         147  /*
>         148   * This is the worker routine which does all the work of mapping the disk
>         149   * blocks and constructs largest possible bios, submits them for IO if the
>         150   * blocks are not contiguous on the disk.
>         151   *
>         152   * We pass a buffer_head back and forth and use its buffer_mapped() flag to
>         153   * represent the validity of its disk mapping and to decide when to do the next
>         154   * get_block() call.
>         155   */
>         156  static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
>         157  {
>         158      struct page *page = args->page;
>         159      struct inode *inode = page->mapping->host;
>  ==>    160      const unsigned blkbits = inode->i_blkbits;
>         161      const unsigned blocks_per_page = PAGE_SIZE >> blkbits;
>         162      const unsigned blocksize = 1 << blkbits;
>         163      struct buffer_head *map_bh = &args->map_bh;
>         164      sector_t block_in_file;
>         165      sector_t last_block;
>         166      sector_t last_block_in_file;
>         167      sector_t blocks[MAX_BUF_PER_PAGE];
>         168      unsigned page_block;
>         169      unsigned first_hole = blocks_per_page;
>         170      struct block_device *bdev = NULL;
>         171      int length;
>         172      int fully_mapped = 1;
>         173      int op_flags;
>         174      unsigned nblocks;
>         175      unsigned relative_block;
>         176      gfp_t gfp;
>         177
>         178      if (args->is_readahead) {
>         179          op_flags = REQ_RAHEAD;
>         180          gfp = readahead_gfp_mask(page->mapping);
> 
> 
> ------------------------------------------
> Writer calling trace
> 
> - ksys_mount
> -- do_mount
> --- vfs_get_tree
> ---- mount_bdev
> ----- sb_min_blocksize
> ------ sb_set_blocksize
> ------- set_blocksize
> 
> ------------------------------------------
> Reader calling trace
> 
> - ksys_read
> -- vfs_read
> --- __vfs_read
> ---- generic_file_read_iter
> ----- page_cache_sync_readahead
> ------ force_page_cache_readahead
> ------- __do_page_cache_readahead
> -------- read_pages
> --------- mpage_readpages
> ---------- do_mpage_readpage
> 
> 
> 
> Thanks,
> Sishuai
> 



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux