Re: [patch 2/3 v2] dm-writecache: convert wait queue to wake_up_process

Mikulas Patocka <mpatocka@xxxxxxxxxx> · Fri, 8 Jun 2018 17:06:05 -0400 (EDT)

On Fri, 8 Jun 2018, Mike Snitzer wrote:

> On Fri, Jun 08 2018 at 11:13P -0400,
> Mike Snitzer <snitzer@xxxxxxxxxx> wrote:
> 
> I'd prefer the following, so please help me understand why you aren't
> doing it this way.  Thanks.
> 
> diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
> index 5961c7794ef3..17cd81ce6ec3 100644
> --- a/drivers/md/dm-writecache.c
> +++ b/drivers/md/dm-writecache.c
> @@ -1103,9 +1103,9 @@ static int writecache_flush_thread(void *data)
>  
>  static void writecache_offload_bio(struct dm_writecache *wc, struct bio *bio)
>  {
> -	if (bio_list_empty(&wc->flush_list))
> -		wake_up_process(wc->flush_thread);
> +	lockdep_assert_held(&wc->lock);
>  	bio_list_add(&wc->flush_list, bio);
> +	wake_up_process(wc->flush_thread);
>  }
>  
>  static int writecache_map(struct dm_target *ti, struct bio *bio)
> @@ -1295,10 +1295,9 @@ static void writecache_writeback_endio(struct bio *bio)
>  	unsigned long flags;
>  
>  	raw_spin_lock_irqsave(&wc->endio_list_lock, flags);
> -	if (unlikely(list_empty(&wc->endio_list)))
> -		wake_up_process(wc->endio_thread);
>  	list_add_tail(&wb->endio_entry, &wc->endio_list);
>  	raw_spin_unlock_irqrestore(&wc->endio_list_lock, flags);
> +	wake_up_process(wc->endio_thread);
>  }
>  
>  static void writecache_copy_endio(int read_err, unsigned long write_err, void *ptr)
> @@ -1309,10 +1308,9 @@ static void writecache_copy_endio(int read_err, unsigned long write_err, void *p
>  	c->error = likely(!(read_err | write_err)) ? 0 : -EIO;
>  
>  	raw_spin_lock_irq(&wc->endio_list_lock);
> -	if (unlikely(list_empty(&wc->endio_list)))
> -		wake_up_process(wc->endio_thread);
>  	list_add_tail(&c->endio_entry, &wc->endio_list);
>  	raw_spin_unlock_irq(&wc->endio_list_lock);
> +	wake_up_process(wc->endio_thread);
>  }
>  
>  static void __writecache_endio_pmem(struct dm_writecache *wc, struct list_head *list)

This is incorrect.

When you drop the spinlock, the endio thread may already take the item (it 
may take it even before wake_up_process is called). When the endio thread 
consumes all the items, the user may unload the device. When the user 
unloads the device, wc is freed and wc->endio_thread points to a 
non-existing process - and now you dereference freed "wc" structure and 
call wake_up_process on non-existing process and cause a crash.

wake_up_process must be inside the spinlock to avoid this race condition.

Mikulas

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel