On Tue 26-01-21 02:18:03, Gong, Sishuai wrote:
> Hello,
> 
> We have found out why this bug happens. When a kernel thread is executing
> loop_clr_fd(), it releases the loop_ctl_mutex lock for a short period of
> time before calling __loop_clr_fd(). However, another kernel thread may
> take advantage of this small gap, open the loop device, read it, and
> eventually cause a BLK_STS_IOERR. This bug may lead to error messages on
> the kernel console, as mentioned in the previous email.
> 
> The interleaving that triggers this bug is shown below:
> 
> Thread 1                        Thread 2
> // Execute loop_clr_fd()
> lo->lo_state = Lo_rundown
> mutex_unlock(&loop_ctl_mutex);
>                                 // Execute lo_open()
>                                 err = mutex_lock_killable(&loop_ctl_mutex);
>                                 …
>                                 lo = bdev->bd_disk->private_data;
>                                 // lo_open() returns success
>                                 // User makes a ksys_read() request
>                                 // loop_queue_rq()
>                                 if (lo->lo_state != Lo_bound)
>                                         return BLK_STS_IOERR;
> // Execute __loop_clr_fd()
> mutex_lock(&loop_ctl_mutex);
> ...

So you are right, a race like this can happen. However, it is generally
expected that you will see IO errors if you run loop_clr_fd() while other
processes may still be submitting IO. The particular scenario you've shown
above could possibly be handled by more careful checking in lo_open(), but
if we just slightly modify it like this:

Thread 1                        Thread 2
                                // Execute lo_open()
                                err = mutex_lock_killable(&loop_ctl_mutex);
                                …
                                lo = bdev->bd_disk->private_data;
                                // lo_open() returns success
// Execute loop_clr_fd()
lo->lo_state = Lo_rundown
mutex_unlock(&loop_ctl_mutex);
                                // User makes a ksys_read() request
                                // loop_queue_rq()
                                if (lo->lo_state != Lo_bound)
                                        return BLK_STS_IOERR;
// Execute __loop_clr_fd()
mutex_lock(&loop_ctl_mutex);

then it is kind of obvious that no checking in lo_open() can prevent IO
errors when the loop device is being torn down. So to summarize, the error
messages are expected when the loop device is torn down while IO is in
progress...
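
If it helps to see the pattern in isolation, below is a minimal userspace
model of that second interleaving: two pthreads stand in for the kernel
threads and an enum stands in for lo->lo_state. This is only an
illustration of why the window cannot be closed from the open path; the
names mirror the loop driver but none of this is the actual driver code.

/*
 * Minimal userspace model of the interleaving above -- NOT the loop
 * driver code.  queue_rq() mimics only the
 * "if (lo->lo_state != Lo_bound) return BLK_STS_IOERR;" check quoted
 * above; everything else is made up for the demo.
 */
#include <pthread.h>
#include <stdio.h>

enum lo_state { Lo_unbound, Lo_bound, Lo_rundown };

static pthread_mutex_t loop_ctl_mutex = PTHREAD_MUTEX_INITIALIZER;
static enum lo_state lo_state = Lo_bound;

/* Plays loop_queue_rq(): reject IO unless the device is bound.  The
 * unlocked read is deliberate -- the check quoted above sits in the IO
 * submission path, not under loop_ctl_mutex. */
static int queue_rq(void)
{
	return lo_state == Lo_bound ? 0 : -5;	/* -5 standing in for EIO */
}

/* Plays loop_clr_fd() followed by __loop_clr_fd(). */
static void *teardown(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&loop_ctl_mutex);
	lo_state = Lo_rundown;			/* loop_clr_fd() */
	pthread_mutex_unlock(&loop_ctl_mutex);	/* <-- the gap */
	pthread_mutex_lock(&loop_ctl_mutex);	/* __loop_clr_fd() */
	lo_state = Lo_unbound;
	pthread_mutex_unlock(&loop_ctl_mutex);
	return NULL;
}

/* Plays lo_open() plus a read that ends up in loop_queue_rq(). */
static void *reader(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&loop_ctl_mutex);	/* lo_open() succeeds... */
	pthread_mutex_unlock(&loop_ctl_mutex);
	printf("read returned %d\n", queue_rq());	/* ...but IO may fail */
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, teardown, NULL);
	pthread_create(&t2, NULL, reader, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return 0;
}

Built with gcc -pthread and run a few times, this prints 0 or -5 depending
on scheduling. That nondeterminism is the same window you hit on the real
device: once lo_state has left Lo_bound, nothing the open path checks can
prevent the IO path from returning an error.
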
								Honza

> 
> On Sep 28, 2020, at 10:44 AM, Gong, Sishuai <sishuai@xxxxxxxxxx> wrote:
> > 
> > Hi,
> > 
> > We found a potential concurrency bug in Linux kernel 5.3.11. We are able
> > to reproduce this bug on x86 under specific thread interleavings. This
> > bug causes a blk_update_request I/O error.
> > 
> > ------------------------------------------
> > Kernel console output
> > blk_update_request: I/O error, dev loop0, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> > 
> > ------------------------------------------
> > Test input
> > This bug occurs when the kernel functions do_vfs_ioctl() and do_readv()
> > are executed with certain parameters in two separate threads and run
> > concurrently.
> > 
> > The test program is generated in Syzkaller's format as follows:
> > Test 1 [run in thread 1]
> > syz_read_part_table(0x0, 0x1, &(0x7f00000006c0)=[{0x0, 0x0, 0x100}])
> > Test 2 [run in thread 2]
> > r0 = syz_open_dev$loop(&(0x7f0000000000)='/dev/loop#\x00', 0x0, 0x0)
> > readv(r0, &(0x7f0000000340)=[{&(0x7f0000000440)=""/4096, 0x1000}], 0x1)
> > 
> > ------------------------------------------
> > Interleaving
> > Thread 1                        Thread 2
> >                                 do_readv()
> >                                 -vfs_readv()
> >                                 --do_iter_read()
> >                                 ---do_iter_readv_writev()
> >                                 ----blkdev_read_iter()
> > do_vfs_ioctl()
> > --vfs_ioctl()
> > --blkdev_ioctl()
> > ---blkdev_driver_ioctl()
> > ----loop_set_fd()
> > -----bd_set_size()
> >                                 (fs/block_dev.c:1999)
> >                                 loff_t size = i_size_read(bd_inode);
> >                                 loff_t pos = iocb->ki_pos;
> >                                 if (pos >= size)
> >                                         return 0;
> >                                 size -= pos;
> > 
> >                                 ----generic_file_read_iter()
> >                                 (mm/filemap.c:2069)
> >                                 page = find_get_page(mapping, index);
> >                                 if (!page) {
> >                                         if (iocb->ki_flags & IOCB_NOWAIT)
> >                                                 goto would_block;
> >                                         page_cache_sync_readahead(mapping,
> > 
> >                                 -----page_cache_sync_readahead()
> >                                 ------ondemand_readahead()
> >                                 …
> >                                 -----------...blk_update_request()
> >                                 (error)
> > -----loop_sysfs_init()
> > …
> > 
> > ------------------------------------------
> > Analysis
> > We observed that when thread 2 is executed alone, without thread 1,
> > i_size_read() at fs/block_dev.c:1999 returns a size of 0, so in the
> > sequential run blkdev_read_iter() returns early at "return 0;".
> > However, when the two threads are executed concurrently, thread 1
> > changes the size of the inode that thread 2 is concurrently accessing,
> > so thread 2 takes a different path and eventually causes the
> > blk_update_request I/O error.
> > 
> > 
> > Thanks,
> > Sishuai
> 
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR