Re: BUG: iomap_dio_rw() accesses freed memory

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jan 15, 2019 at 07:27:45PM +0100, Christoph Hellwig wrote:
> So after review that thread and this thread I still don't really
> see a refcounting problem, just a use after free of the
> wait_for_completion member for fast aio completions.
> 
> Chandan, can you give this patch a spin?
> 
> diff --git a/fs/iomap.c b/fs/iomap.c
> index cb184ff68680..47362397cb82 100644
> --- a/fs/iomap.c
> +++ b/fs/iomap.c
> @@ -1813,6 +1813,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
>  	loff_t pos = iocb->ki_pos, start = pos;
>  	loff_t end = iocb->ki_pos + count - 1, ret = 0;
>  	unsigned int flags = IOMAP_DIRECT;
> +	bool wait_for_completion = is_sync_kiocb(iocb);
>  	struct blk_plug plug;
>  	struct iomap_dio *dio;
>  
> @@ -1832,7 +1833,6 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
>  	dio->end_io = end_io;
>  	dio->error = 0;
>  	dio->flags = 0;
> -	dio->wait_for_completion = is_sync_kiocb(iocb);
>  
>  	dio->submit.iter = iter;
>  	dio->submit.waiter = current;
> @@ -1887,7 +1887,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
>  		dio_warn_stale_pagecache(iocb->ki_filp);
>  	ret = 0;
>  
> -	if (iov_iter_rw(iter) == WRITE && !dio->wait_for_completion &&
> +	if (iov_iter_rw(iter) == WRITE && !wait_for_completion &&
>  	    !inode->i_sb->s_dio_done_wq) {
>  		ret = sb_init_dio_done_wq(inode->i_sb);
>  		if (ret < 0)
> @@ -1903,7 +1903,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
>  		if (ret <= 0) {
>  			/* magic error code to fall back to buffered I/O */
>  			if (ret == -ENOTBLK) {
> -				dio->wait_for_completion = true;
> +				wait_for_completion = true;
>  				ret = 0;
>  			}
>  			break;
> @@ -1925,8 +1925,24 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
>  	if (dio->flags & IOMAP_DIO_WRITE_FUA)
>  		dio->flags &= ~IOMAP_DIO_NEED_SYNC;
>  
> +	/*
> +	 * We are about to drop our additional submission reference, which
> +	 * might be the last reference to the dio.  There are three three
> +	 * different ways we can progress here:
> +	 *
> +	 *  (a) If this is the last reference we will always complete and free
> +	 *	the dio ourselves. right here.
> +	 *  (b) If this is not the last reference, and we serve an asynchronous
> +	 *	iocb, we must never touch the dio after the decrement, the
> +	 *	I/O completion handler will complete and free it.
> +	 *  (c) If this is not the last reference, but we serve a synchronous
> +	 *	iocb, the I/O completion handler will wake us up on the drop
> +	 *	of the final reference, and we will complete and free it here
> +	 *	after we got woken by the I/O completion handler.
> +	 */
> +	dio->wait_for_completion = wait_for_completion;
>  	if (!atomic_dec_and_test(&dio->ref)) {

Atomic operations don't imply a memory barrier for dependent data,
right? So in completion we do:

	if (atomic_dec_and_test(&dio->ref)) {
		if (dio->wait_for_completion) {
			/* do wakeup */
		} else ....

So if we get:

CPU 0 (submission)		CPU 1 (completion)
set dio->wfc
dec dio->ref, val = 1		dec dio->ref, val = 0
enter wait loop			check dio->wfc

Where's the memory barrier to ensure that CPU 1 sees the value
that CPU 0 set in dio->wait_for_completion?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux