On Fri 14-01-22 20:05:31, Tetsuo Handa wrote:
> On 2022/01/14 0:23, Jan Kara wrote:
> > Well, we cannot guarantee what will be state of the loop device when you
> > open it but I think we should guarantee that once you have loop device
> > open, it will not be torn down under your hands. And now that I have
> > realized there are those lo_state checks, I think everything is fine in
> > that regard. I wanted to make sure that sequence such as:
> 
> Well, we could abort __loop_clr_fd() if somebody called "open()", something
> like below. But
> 
> ----------------------------------------
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index b1b05c45c07c..960db2c484ab 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -1082,7 +1082,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
>  	return error;
>  }
>  
> -static void __loop_clr_fd(struct loop_device *lo)
> +static bool __loop_clr_fd(struct loop_device *lo, int expected_refcnt)
>  {
>  	struct file *filp;
>  	gfp_t gfp = lo->old_gfp_mask;
> @@ -1104,9 +1104,19 @@ static void __loop_clr_fd(struct loop_device *lo)
>  	 * Since this function is called upon "ioctl(LOOP_CLR_FD)" xor "close()
>  	 * after ioctl(LOOP_CLR_FD)", it is a sign of something going wrong if
>  	 * lo->lo_state has changed while waiting for lo->lo_mutex.
> +	 *
> +	 * However, if somebody called "open()" after lo->lo_state became
> +	 * Lo_rundown, we should abort rundown in order to avoid unexpected
> +	 * I/O error.
>  	 */
>  	mutex_lock(&lo->lo_mutex);
>  	BUG_ON(lo->lo_state != Lo_rundown);
> +	if (atomic_read(&lo->lo_refcnt) != expected_refcnt) {
> +		lo->lo_state = Lo_bound;
> +		lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
> +		mutex_unlock(&lo->lo_mutex);
> +		return false;
> +	}
>  	mutex_unlock(&lo->lo_mutex);

Yeah, but as I wrote in my email, I don't think this is needed anymore (and
I even think it would be counterproductive).
There can be new opens happening
before __loop_clr_fd() but any ioctl querying loop device state will return
error due to lo->lo_state == Lo_rundown. So from userspace POV the loop
device is already invalidated.

> > Currently, we may hold both. With your async patch we hold only lo_mutex.
> > Now that I better understand the nature of the deadlock, I agree that
> > holding either still creates the deadlock possibility because both are
> > acquired on loop device open. But now that I reminded myself the lo_state
> > handling, I think the following should be safe in __loop_clr_fd:
> > 
> > 	/* Just a safety check... */
> > 	if (WARN_ON_ONCE(data_race(lo->lo_state) != Lo_rundown))
> > 		return -ENXIO;
> > 
> 
> this is still racy, for somebody can reach lo_open() right after this check.

Yes, somebody can open but he cannot change lo->lo_state. Also this should
be just a safety check. We should never reach __loop_clr_fd() with
different lo_state.

> Anyway, since the problem that umount() immediately after close() (reported by
> kernel test robot) remains, we need to make sure that __loop_clr_fd() completes
> before close() returns to user mode.

I agree with this. But using task_work for __loop_clr_fd() is enough for
that. I was just arguing that we don't need extra waiting in lo_open().

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR