On 2012/10/09 00:05, Jan Kara wrote:
> On Fri 05-10-12 14:43:29, Fernando Luis Vázquez Cao wrote:
>> The FIISFROZEN ioctl can be used by HA and monitoring software to check
>> the freeze state of a mounted filesystem.
>
> I was thinking about this and your use case and I think this is just a
> wrong way to fix a problem with your HA application. E.g. in case you
> would "umount -l" your filesystem, you would hit the same problem as in
> presence of freezing and the ioctl won't help you. Now I understand that
> in your specific use case you likely don't need to deal with lazy umounts
> but we shouldn't add an interface just to accommodate one use case and
> later find out we need another interface for a slightly different one.
By the way, we can end up with a detached and active superblock even
without using lazy umount; it is possible to do a regular umount of a
frozen filesystem.
> So what you rather seem to need is some interface which allows you to
> investigate filesystems that are mounted on a block device but not
> attached anywhere in the namespace. Would that be enough for you? If yes,
> some extension to /proc/self/mountinfo to do this should be possible...
Well, neither /proc/self/mount* nor /proc/mounts shows superblocks not
attached anywhere in the namespace, and changing that behavior could
wreak havoc in userspace scripts and management software. I would
rather not change that de-facto ABI unless it is strictly needed. On
the other hand, the check ioctls add a new userspace API that would
not break anything.
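
To illustrate what userspace sees today, here is a minimal sketch of how a
monitoring tool might check whether a block device is attached anywhere in
its mount namespace. It assumes the /proc/self/mountinfo line layout
described in proc(5) (mount ID, parent ID, major:minor, ...). The helper
names are hypothetical and it does not handle lines longer than the buffer.
A device whose superblock is detached but still active would simply not be
found.

```c
#include <stdio.h>

/*
 * Check one /proc/self/mountinfo line. Per proc(5) a line starts with:
 *   mount-ID parent-ID major:minor root mount-point ...
 * Returns 1 if the line describes a mount of device maj:min, else 0.
 */
int mountinfo_line_is_dev(const char *line, unsigned maj, unsigned min)
{
    unsigned lmaj, lmin;

    /* Skip the two IDs, then read the "major:minor" field. */
    if (sscanf(line, "%*d %*d %u:%u", &lmaj, &lmin) != 2)
        return 0;
    return lmaj == maj && lmin == min;
}

/*
 * Returns 1 if device maj:min is attached somewhere in this mount
 * namespace, 0 if it is not listed, -1 if mountinfo cannot be read.
 */
int dev_in_mountinfo(unsigned maj, unsigned min)
{
    char line[4096];
    int found = 0;
    FILE *f = fopen("/proc/self/mountinfo", "r");

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        if (mountinfo_line_is_dev(line, maj, min)) {
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}
```

A result of 0 from dev_in_mountinfo() therefore cannot distinguish "not
mounted" from "detached but still active", which is the gap discussed here.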
Regarding your concern about the ioctl approach, when a frozen
filesystem is detached from the namespace it can still be reached
through the block device it is sitting on (well... with the exception
of btrfs which has some issues that I am working on) and this is the
reason I added a block device level check ioctl too. That said, if one
day we have a filesystem which is not block device based and supports
fsfreeze (ioctl_fsfreeze() returns -EOPNOTSUPP if the superblock has
no ->freeze_fs operation, which is the case for all virtual
filesystems and NAS drivers that we have) the two check ioctls would
not cover that case.
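
For reference, a minimal userspace sketch of the existing superblock level
interface, using the FIFREEZE and FITHAW ioctls from <linux/fs.h>. The
function names are hypothetical. The error descriptions reflect my
understanding of the current kernel behavior and should be treated as
assumptions rather than a documented contract. The proposed check ioctls
are not used here.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>   /* FIFREEZE, FITHAW */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Issue FIFREEZE (freeze != 0) or FITHAW (freeze == 0) on the
 * filesystem containing path.
 * Returns 0 on success or a negative errno value.
 */
int fsfreeze_cmd(const char *path, int freeze)
{
    int fd, ret = 0;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -errno;
    if (ioctl(fd, freeze ? FIFREEZE : FITHAW, 0) < 0)
        ret = -errno;   /* save before close() can change errno */
    close(fd);
    return ret;
}

/* Map a return value to a short description for a monitoring log. */
const char *fsfreeze_strerror(int ret)
{
    switch (ret) {
    case 0:           return "ok";
    case -EOPNOTSUPP: return "filesystem does not support freezing";
    case -EBUSY:      return "already frozen (assumed meaning)";
    case -EINVAL:     return "not frozen (assumed meaning)";
    case -EPERM:      return "insufficient privileges";
    default:          return "other error";
    }
}
```

Note that this only works while the caller can still open a path on the
filesystem, which is exactly what is lost once it is detached from the
namespace.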
I think that to cover all cases without adding a completely new API we
need to do the following:
1) Filesystems which are not tied to a block device (virtual
filesystems, NAS, etc):
As soon as the filesystem is removed from the namespace the
superblock based fsfreeze ioctls become useless; if we let a umount
of a frozen filesystem succeed we would not be able to thaw it (well
we could use emergency thaw but it would be overkill). Since we do
not want to break lazy umounts the only viable solution is thawing
the superblock automatically on umount (releasing the active
reference taken in freeze_super() to be more precise).
2) Block device based filesystems:
These can be reached through the block device they are sitting on even
if the filesystem was detached from the namespace and have the
particularity that they can be frozen using two different APIs, a
block device level one and the ioctls. When a filesystem was frozen
using the former, which only has in-kernel users such as dm,
automatically thawing the filesystem on umount is arguably too rude
(we can end up breaking the filesystem level consistency of a
storage snapshot). If we care about this, we could modify
sys_umount() so that the filesystem is automatically thawed if and only
if there are no block device level freezes active. This behavior
would be consistent with case 1) above (the premise here is that
both fsfreeze and umount are userspace controlled operations and the
administrator should know what they are doing) and is the least likely
to cause surprises to freeze_bdev() users.
It would also be nice to have a block device level thaw ioctl for
emergency cases (for example, a scenario where thaw_bdev() was not
called and the freeze counter was left in an inconsistent state;
freeze_bdev() and thaw_bdev() are exported symbols and in many cases
we cannot control what external modules do).
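
The umount policy proposed in cases 1) and 2) above can be summarized with
a small userspace model. This is only a sketch of the decision, not kernel
code: the function and its parameters are hypothetical, and
bdev_freeze_count stands in for the number of outstanding freeze_bdev()
calls on the underlying block device.

```c
#include <stdbool.h>

/*
 * Decide whether umount should thaw the filesystem automatically.
 *   frozen            - the filesystem is currently frozen
 *   has_bdev          - false for virtual filesystems, NAS, etc.
 *   bdev_freeze_count - outstanding block device level freezes
 */
bool thaw_on_umount(bool frozen, bool has_bdev, int bdev_freeze_count)
{
    if (!frozen)
        return false;               /* nothing to thaw */
    if (!has_bdev)
        return true;                /* case 1: always thaw */
    return bdev_freeze_count == 0;  /* case 2: only without bdev freezes */
}
```

With this policy a snapshot taken through freeze_bdev() keeps the
filesystem frozen across umount, while a freeze requested through the
ioctls is released.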
Thanks,
Fernando