Re: [PATCH 6/8] fsfreeze: add vfs ioctl to check freeze state

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2012/09/14 09:15, Dave Chinner wrote:
On Thu, Sep 13, 2012 at 05:19:21PM +0900, Fernando Luis Vazquez Cao wrote:
The problem is that we allow users to umount a frozen
filesystem but we have neither  a bdev level fsfreeze
API to thaw it after that nor check ioctls.
Yes, I know they don't exist, but you have to justify why they are
needed. So far you haven't.

I proposed returning EBUSY when userspace tries to umount a
frozen filesystem, but this would break lazy umount, which
I was told by Al Viro and Josef Bacik is a no go, so I discarded
this approach.

I will follow Al's advice a few years ago and do the following:
1- Let userspace umount frozen filesystems.
Which it already can.

Of course. I just meant that I will leave things as they are in that regard.


2- Provide a bdev level ioctl to unfreeze a umounted filesystem.
The only time a block device knows about the frozen state of the
superblock is when dm-snapshot drives the freeze from the block
device. There are special hooks in, e.g. mount_bdev() for checking
whether there is an active bdev freeze and this prevents new mounts
from occurring while a bdev freeze is in progress.

Yes,  and I tried to retain that feature in my patch set.

DM also removes the freeze state itself, so this is not a persistent state.

Unfortunately freeze_bdev and thaw bdev are symbols
exported to modules.


IOWs, there are two specific freeze types - one a superblock (user)
level freeze, and the other is a block device (kernel) level freeze.
What you are proposing here means that the user, in certain
circumstances, needs to manipulate superblock level freezes from the
block level because they superblock is no longer visible to the
user.  It's a recipe for confusion and convoluted bugs, and it sure
as hell won't work for all filesystems. e.g. see 18e9e51 ("Introduce
freeze_super and thaw_super for the fsfreeze ioctl") as the reason
why superblock level freezes exist and why trying to thaw from the
bdev level doesn't work.

I discussed the issue mentioned in that commit with Josef
Bacik and Chris Mason and decided to fix those filesystems
for which it would not work (we need to fix the btrfs issue
anyway).


Indeed, what happens when the superblock freeze is driven from
dm-snapshot, and the user unmounts the fs and runs the blockdev
ioctl to drop the freeze reference that dm-snapshot holds?
That would free the superblock out from underneath DM, so this is
a can of worms I'd prefer not to open.

The prototype I wrote for the blockdev ioctl
grabs the bdev->bd_fsfreeze_mutex and checks
bdev->bd_fsfreeze_count. If bd_fsfreeze_count>1
it returns EBUSY, else it calls freeze_super so
that the thawed and subsequently released.

I hope this addresses your concerns. That said
I am not married to this aproach,  if we decide
that filesystems should be thawed on
umount (which I was told is a no-go in a previous
LSF) I am willing to implement it.


I will also:
3-  add the check ioctls so that users can check whether a
filesystem is frozen or not and avoid surprises.
Which only solves the problem for users that know they have to check
the state in certain corner cases. That doesn't fix the underlying
problem.

The reason this problem exists is that a active superblock level
freeze holds a reference to the superblock. This has interesting
side effects:

$ sudo xfs_freeze -f /mnt/scratch
$ sudo umount /mnt/scratch
$ sudo mount /dev/vdc /mnt/scratch
$ sudo xfs_io -f -c "pwrite 0 64k" /mnt/scratch/foo
<blocked on frozen state>
^Z
[1]+  Stopped                 sudo xfs_io -f -c "pwrite 0 64k" /mnt/scratch/foo
$ sudo xfs_freeze -u /mnt/scratch
$
$ fg
sudo xfs_io -f -c "pwrite 0 64k" /mnt/scratch/foo
wrote 65536/65536 bytes at offset 0
64 KiB, 16 ops; 0.0000 sec (inf EiB/sec and inf ops/sec)
$

The freeze persists over unmount/mount, but we write to the
filesystem during unmount, and run log recovery and write stuff to
the filesystem as part of the normal mount process. IOWs, no
filesystem checks for SB_UNFROZEN in either the unmount or mount
path and so we violate the freeze condition by writing to the block
device. This means that as it stands, an unmount or mount violates
the frozen state guarantee and unmount/mount effectively imply that
the filesystem is no longer frozen.  Hence silent snapshot image
corruption is very possible, and the fact that it is silent is very
worrying.

That is why my original proposal was to return EBUSY
when userspace tries to umount a frozen filesystem.
However this breaks lazy umount and I was told that
auto-thaw on umount is not acceptable, so I came
with the current patch set. If the behaviour implemented
in this patches is considered the right one I guess
filesystems would need to get fixed to check for
SB_UNFROZEN (I need to check whether this is
realistic or not).


As it is, the requirement that we allow unmounting of frozen
filesystems implies that an unmount operation must also be a thaw
operation. The filesystem is no longer visible to the user, and they
have no method of controlling it's state anymore, so the user has
said to the kernel "it's all yours now". If the filesystem was
frozen when the unmount is issued, then the user is saying "I don't
care about the frozen state anymore". If they did care, then they
wouldn't be unmounting the filesystem without having first issued a
thaw operation.

FWIW, while we are unmounting the filesystem, we hold the s_umount lock
exclusively, so the filesystem cannot be thawed while an unmount is
in progress. Hence there is no reason why the unmount can't thaw the
superblock and take away it's reference as part of the unmount.

I was told that behaviour was too harsh during previous
discussions. I do not dislike it personally though. I guess
this is Al's call.


If we do this, all of the problems with persistent frozen superblock
state after unmount go away.  At this point, there is no need for
block device level ioctls to clean up a persistent frozen superblock
becuase they don't exist anymore. That means there is no need for
ioctls to check the frozen state of the filesystem before unmount,
either, because there is no problem to avoid....

Check ioctls are still needed IMHO. For example, there
is one issue we run into quite often in virtualization
environments:

We have Linux guest OS where fsfreeze in performed by a
guest agent controlled by the hypervisor. At the behest of
the hypervisor the agent freezes a guest filesystem and, for
whatever reason (a bug for example), the agent dies after
that. The filesystem was frozen without the guest's users
being aware of it, so these users will find that writes to
the filesystem block and they have no way to know
what is going on because we do not have check ioctls.
In such cases I was told by costumers that they would
like to use either emergency thaw (which this patch set
attempts to fix) or a check ioctl.

Thanks,
Fernando
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux