Re: [PATCH 3/8] fsfreeze: freeze_super and thaw_bdev don't play well together

Jan Kara <jack@xxxxxxx> · Fri, 13 Jul 2012 15:45:37 +0200



On Thu 12-07-12 18:05:43, Fernando Luis Vázquez Cao wrote:
> Changes from Dave Chinner's version:
>  - Remove s_frozen check in freeze_super which is not needed now that it is
>    re-entrant.
>  - Decrement freeze counter if the freeze_fs callback fails.
> 
> ---
> 
> thaw_bdev() has re-entrancy guards to allow freezes to nest
> together. That is, it ensures that the filesystem is not thawed
> until the last thaw command is issued. This is needed to prevent the
> filesystem from being unfrozen while an existing freezer is still
> operating on the filesystem in a frozen state (e.g. dm-snapshot).
> 
> Currently, freeze_super() and thaw_super() bypasses these guards,
> and as a result manual freezing and unfreezing via the ioctl methods
> do not nest correctly. hence mixing userspace directed freezes with
> block device level freezes result in inconsistency due to premature
> thawing of the filesystem.
> 
> Move the re-enterency guards to the superblock and into freeze_super
> and thaw_super() so that userspace directed freezes nest correctly
> again. Caveat: Things work as expected as long as direct calls to
> thaw_super are always in response to a previous sb level freeze. In
> other words an unpaired call to thaw_super can still thaw a
> filesystem frozen using freeze_bdev (this issue could be addressed
> in a follow-up patch if deemed necessary).
> 
> This patch retains the bdev level mutex and counter to keep the
> "feature" that we can freeze a block device that does not have a
> filesystem mounted yet.
> 
> Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx>
> Cc: Jan Kara <jack@xxxxxxx>
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> Signed-off-by: Fernando Luis Vazquez Cao <fernando@xxxxxxxxxxxxx>
> ---
  Some mostly minor coments below. 

 
> diff -urNp vfs-orig/fs/block_dev.c vfs/fs/block_dev.c
> --- vfs-orig/fs/block_dev.c	2012-07-12 14:31:38.936631141 +0900
> +++ vfs/fs/block_dev.c	2012-07-12 15:03:57.032627014 +0900
> @@ -257,16 +257,18 @@ int fsync_bdev(struct block_device *bdev
>  EXPORT_SYMBOL(fsync_bdev);
>  
>  /**
> - * freeze_bdev  --  lock a filesystem and force it into a consistent state
> + * freeze_bdev  --  lock a block device
>   * @bdev:	blockdevice to lock
>   *
> - * If a superblock is found on this device, we take the s_umount semaphore
> - * on it to make sure nobody unmounts until the snapshot creation is done.
> - * The reference counter (bd_fsfreeze_count) guarantees that only the last
> - * unfreeze process can unfreeze the frozen filesystem actually when multiple
> - * freeze requests arrive simultaneously. It counts up in freeze_bdev() and
> - * count down in thaw_bdev(). When it becomes 0, thaw_bdev() will unfreeze
> - * actually.
> + * Locks the block device and, if present, the associated filesystem too.
> + *
> + * The reference counter (bd_fsfreeze_count) is used to implement the feature
> + * that allows one to freeze a block device that does not have a filesystem
> + * mounted yet. For filesystems using mount_bdev the kernel takes care of
> + * things by preventing the mount operation from succeeding if the underlying
> + * block device is frozen. Other filesystems should check this counter or risk
> + * a situation where a freeze_bdev user (e.g. dm snapshot) and mount race,
> + * which may lead to inconsistencies.
>   */
>  struct super_block *freeze_bdev(struct block_device *bdev)
>  {
> @@ -274,17 +276,7 @@ struct super_block *freeze_bdev(struct b
>  	int error = 0;
>  
>  	mutex_lock(&bdev->bd_fsfreeze_mutex);
> -	if (++bdev->bd_fsfreeze_count > 1) {
> -		/*
> -		 * We don't even need to grab a reference - the first call
> -		 * to freeze_bdev grab an active reference and only the last
> -		 * thaw_bdev drops it.
> -		 */
> -		sb = get_super(bdev);
> -		drop_super(sb);
> -		mutex_unlock(&bdev->bd_fsfreeze_mutex);
> -		return sb;
> -	}
> +	bdev->bd_fsfreeze_count++;
>  
>  	sb = get_active_super(bdev);
>  	if (!sb)
> @@ -297,30 +289,33 @@ struct super_block *freeze_bdev(struct b
>  		return ERR_PTR(error);
>  	}
>  	deactivate_super(sb);
> - out:
> +out:
>  	sync_blockdev(bdev);
>  	mutex_unlock(&bdev->bd_fsfreeze_mutex);
> -	return sb;	/* thaw_bdev releases s->s_umount */
> +	return sb;
>  }
>  EXPORT_SYMBOL(freeze_bdev);
>  
>  /**
> - * __thaw_bdev  -- unlock filesystem
> + * __thaw_bdev  -- unlock a block device
>   * @bdev:	blockdevice to unlock
>   * @sb:		associated superblock
>   * @emergency:	emergency thaw
>   *
> - * Unlocks the filesystem and marks it writeable again after freeze_bdev().
> + * Unlocks the block device and, if present, the associated filesystem too.
>   */
>  static int __thaw_bdev(struct block_device *bdev, struct super_block *sb, int emergency)
>  {
>  	int error = -EINVAL;
>  
>  	mutex_lock(&bdev->bd_fsfreeze_mutex);
> +
>  	if (!bdev->bd_fsfreeze_count)
>  		goto out;
>  
> -	if (--bdev->bd_fsfreeze_count > 0 || !sb) {
> +	bdev->bd_fsfreeze_count--;
> +
> +	if (!sb) {
>  		error = 0;
>  		goto out;
>  	}
> @@ -336,13 +331,6 @@ out:
>  	return error;
>  }
>  
> -/**
> - * thaw_bdev  -- unlock filesystem
> - * @bdev:	blockdevice to unlock
> - * @sb:		associated superblock
> - *
> - * Unlocks the filesystem and marks it writeable again after freeze_bdev().
> - */
>  int thaw_bdev(struct block_device *bdev, struct super_block *sb)
>  {
>  	return __thaw_bdev(bdev, sb, 0);
  It's a bit confusing (for reviewer) to remove the documentation here when
you remove the whole function in a later patch...

...
> @@ -1233,29 +1240,33 @@ static int __thaw_super(struct super_blo
>  {
>  	int error = 0;
>  
> -	if (sb->s_frozen == SB_UNFROZEN) {
> +	mutex_lock(&sb->s_freeze_mutex);
> +	if (!sb->s_freeze_count) {
>  		error = -EINVAL;
> -		goto out;
> +		goto out_unlock;
>  	}
> +	sb->s_freeze_count = emergency ? 1 : sb->s_freeze_count;
  It would be cleaner to do this somewhere in do_thaw_one() and not here.
Also you won't have to pass the emergency parameter then... Also it might
be more logical for __thaw_super() to expect also s_freeze_mutex locked and
handle the locking in thaw_super().

								Honza
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html