On 2013/01/10 02:20, Jan Kara wrote:
> On Mon 07-01-13 20:35:14, Fernando Luis Vázquez Cao wrote:
>> The fsfreeze userspace API is a sb level API, which means that to
thaw a frozen
>> filesystem we need to have access to it. This means that after an
unmount the
>> filesystem cannot be thawed, because it is not part of the filesystem
>> namespace anymore.
>>
>> A possible solution to the problem above is to mount the filesystem
again and
>> execute the thaw operation. However, a subsequent mount is not
guaranteed to
>> succeed and even if it does succeeded it may generate I/O in the
process,
>> which is not supposed to happen since the filesystem is frozen.
>>
>> Another approach is adding a bdev level API through which the unmounted
>> filesystem can be reached. The problem with this is that it only
works for
>> filesystems for block devices. We could also return EBUSY when the
user tries
>> to ummount an frozen filesystem, but this wold break lazy unmount.
>>
>> Automatically freezing on sys_umount fixes all the problems
mentioned above.
>
> I have to say I'm not convinced this is the right way to go. For
example
> if dm-snapshot is run on a filesystem being unmounted, thawing during
> unmount isn't really desirable (although I find this scenario unlikely to
> happen in practice).
Patch 12 addresses that scenario. With it applied, the sb level freeze
counter would be set to 1 at umount time and the actual filesystem
thaw carried out after dm-snapshot's call to thaw_bdev (which brings
the freeze counter down to zero thus triggering the unfreeze).
> And the way you implemented the idea is definitely
> wrong - the filesystem can be mounted several times and you would thaw it
> on any umount which can definitely surprise processes using freeze.
I agree, the current behavior can cause confusion. I am attaching an
alternative version of this patch that thaws the filesystem only if
this is the last mount remaining. Does it look like a better approach?
> So what if we changed propagate_mount_busy() to return true if the
> superblock is frozen. That will *not* break lazy umount because that
> doesn't care about propagate_mount_busy() returns but all other
attempts to
> umount will return EBUSY. Sure our problem with unreachable filesystem
> being frozen won't be completely solved since frozen filesystem can still
> be made unreachable by lazy umount but that does not seem to be that
> common. What do you think?
I think that we should avoid solutions that do not address
the unreachable filesystem case properly. Hopefully I got it
right this time around.
Thank you for your detailed review of the patch set!
- Fernando
Subject: [PATCH 10/17] fsfreeze: automatically thaw on umount
The fsfreeze userspace API is a sb level API, which means that to thaw a frozen
filesystem we need to have access to it. This means that after an unmount the
filesystem cannot be thawed, because it is not part of the filesystem
namespace anymore.
A possible solution to the problem above is to mount the filesystem again and
execute the thaw operation. However, a subsequent mount is not guaranteed to
succeed and even if it does succeeded it may generate I/O in the process,
which is not supposed to happen since the filesystem is frozen.
Another approach is adding a bdev level API through which the unmounted
filesystem can be reached. The problem with this is that it only works for
filesystems for block devices. We could also return EBUSY when the user tries
to ummount an frozen filesystem, but this wold break lazy unmount.
Automatically freezing on sys_umount fixes all the problems mentioned above.
Cc: linux-fsdevel@xxxxxxxxxxxxxxx
Cc: Josef Bacik <jbacik@xxxxxxxxxxxx>
Cc: Eric Sandeen <sandeen@xxxxxxxxxx>
Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Cc: Dave Chinner <dchinner@xxxxxxxxxx>
Cc: Jan Kara <jack@xxxxxxx>
Cc: Luiz Capitulino <lcapitulino@xxxxxxxxxx>
Signed-off-by: Fernando Luis Vazquez Cao <fernando@xxxxxxxxxxxxx>
---
diff -urNp linux-3.8-rc3-orig/fs/namespace.c linux-3.8-rc3/fs/namespace.c
--- linux-3.8-rc3-orig/fs/namespace.c 2013-01-10 17:51:41.651680000 +0900
+++ linux-3.8-rc3/fs/namespace.c 2013-01-10 17:47:53.419680000 +0900
@@ -1091,9 +1091,31 @@ int may_umount(struct vfsmount *mnt)
EXPORT_SYMBOL(may_umount);
+static void thaw_mount(struct mount *mnt)
+{
+ struct super_block *sb = mnt->mnt.mnt_sb;
+
+ down_write(&sb->s_umount);
+ if (sb->s_writers.frozen == SB_UNFROZEN) {
+ up_write(&sb->s_umount);
+ return;
+ }
+ /*
+ * The superblock may be in the process of being detached from the
+ * namespace which means we have to make sure the thaw of the superblock
+ * succeeds (once it has been detached the fsfreeze ioctls become
+ * unusable). Thus, Force-thaw sb so that all tasks in fsfreeze wait
+ * queue are woken up.
+ */
+ sb->s_freeze_count = 1;
+ __thaw_super(sb, true);
+ deactivate_locked_super(sb);
+}
+
void release_mounts(struct list_head *head)
{
struct mount *mnt;
+ struct super_block *sb;
while (!list_empty(head)) {
mnt = list_first_entry(head, struct mount, mnt_hash);
list_del_init(&mnt->mnt_hash);
@@ -1111,6 +1133,11 @@ void release_mounts(struct list_head *he
dput(dentry);
mntput(&m->mnt);
}
+ br_read_lock(&vfsmount_lock);
+ sb = mnt->mnt.mnt_sb;
+ if (list_is_last(&mnt->mnt_instance, &sb->s_mounts))
+ thaw_mount(mnt);
+ br_read_lock(&vfsmount_lock);
mntput(&mnt->mnt);
}
}
diff -urNp linux-3.8-rc3-orig/fs/super.c linux-3.8-rc3/fs/super.c
--- linux-3.8-rc3-orig/fs/super.c 2013-01-10 17:51:41.651680000 +0900
+++ linux-3.8-rc3/fs/super.c 2013-01-10 17:47:53.419680000 +0900
@@ -1414,6 +1414,7 @@ EXPORT_SYMBOL(freeze_super);
/**
* __thaw_super -- unlock filesystem
* @sb: the super to thaw
+ * @force: whether or not to force the thaw (read details below before using)
*
* Unlocks the filesystem and marks it writeable again.
*
@@ -1421,11 +1422,17 @@ EXPORT_SYMBOL(freeze_super);
* after this thaw if it succeeded or the corresponding error code otherwise.
* If the unfreeze fails, @sb is left in the frozen state.
*
+ * If the force flag is set the kernel will proceeed with the thaw even if the
+ * call to the filesystem specific thaw function (->unfreeze_fs()) fails. This
+ * feature should be used only in situations where there is no entity we can
+ * return an error to so that it has a chance to clean things up and retry, i.e.
+ * this is the last oportunity to wake the tasks in the fsfreeze wait queue up.
+ *
* This is the unlocked version of thaw_super, so it is the caller's job to
* to protect the superblock by grabbing the s_umount semaphore in write mode
* and release it again on return. See thaw_super() for an example.
*/
-static int __thaw_super(struct super_block *sb)
+int __thaw_super(struct super_block *sb, bool force)
{
int error = 0;
@@ -1442,11 +1449,9 @@ static int __thaw_super(struct super_blo
if (sb->s_flags & MS_RDONLY)
goto out_thaw;
- if (sb->s_op->unfreeze_fs) {
- error = sb->s_op->unfreeze_fs(sb);
- if (error) {
- printk(KERN_ERR
- "VFS:Filesystem thaw failed\n");
+ if (sb->s_op->unfreeze_fs && (error = sb->s_op->unfreeze_fs(sb))) {
+ printk(KERN_ERR "VFS: Filesystem thaw failed\n");
+ if (!force) {
sb->s_freeze_count++;
goto out;
}
@@ -1470,7 +1475,7 @@ int thaw_super(struct super_block *sb)
{
int res;
down_write(&sb->s_umount);
- res = __thaw_super(sb);
+ res = __thaw_super(sb, false);
if (!res) /* Active reference released after last thaw. */
deactivate_locked_super(sb);
else
@@ -1497,7 +1502,7 @@ static void do_thaw_one(struct super_blo
* so we can call the lockless version of thaw_super() safely.
*/
sb->s_freeze_count = 1; /* avoid multiple calls to __thaw_super */
- res = __thaw_super(sb);
+ res = __thaw_super(sb, false);
if (!res) {
deactivate_locked_super(sb);
/*
diff -urNp linux-3.8-rc3-orig/include/linux/fs.h linux-3.8-rc3/include/linux/fs.h
--- linux-3.8-rc3-orig/include/linux/fs.h 2013-01-10 17:51:41.651680000 +0900
+++ linux-3.8-rc3/include/linux/fs.h 2013-01-10 17:47:53.419680000 +0900
@@ -1882,6 +1882,7 @@ extern int user_statfs(const char __user
extern int fd_statfs(int, struct kstatfs *);
extern int vfs_ustat(dev_t, struct kstatfs *);
extern int freeze_super(struct super_block *super);
+extern int __thaw_super(struct super_block *super, bool force);
extern int thaw_super(struct super_block *super);
extern void emergency_thaw_all(void);
extern bool our_mnt(struct vfsmount *mnt);