On Wed 10-10-12 11:17:25, Fernando Luis Vazquez Cao wrote:
> On 2012/10/09 23:55, Jan Kara wrote:
> >On Tue 09-10-12 18:46:26, Fernando Luis Vazquez Cao wrote:
> >>I think that to cover all cases without adding a completely new API we
> >>need to do the following:
> >>
> >>1) Filesystems which are not tied to a block device (virtual
> >>   filesystems, NAS, etc):
> >>
> >>   As soon as the filesystem is removed from the namespace the
> >>   superblock based fsfreeze ioctls become useless; if we let a umount
> >>   of a frozen filesystem succeed we would not be able to thaw it (well
> >>   we could use emergency thaw but it would be overkill). Since we do
> >  Actually, you can always mount the filesystem again (you will essentially
> >just attach the superblock to the namespace again) and thaw the filesystem.
> >So this is not a big issue.
>
> The problem is that we may generate write I/O during the second
> mount. We would need to audit all filesystems (which I am fine
> with if there is a sensible use case).
  Most filesystems should be fine as they use mount_bdev() so
foo_fill_super() isn't called in that case. But yes, some filesystems could
in theory do something weird.

> >>   not want to break lazy umounts the only viable solution is thawing
> >>   the superblock automatically on umount (releasing the active
> >>   reference taken in freeze_super() to be more precise).
> >  I'm not against this. As you write below, you cannot really thaw
> >freeze coming via block device so you end up with somewhat inconsistent
> >behavior (thaw only freezes by ioctl) but after all freeze of a filesystem
> >and freeze of a block device *are* somewhat different requests so the
> >inconsistency can be justified.
> >
> >Do I get right that when we do this, you won't need ioctls for querying the
> >freeze state?
>
> I would still want the check ioctls.
> For example, in some cases the
> freeze/unfreeze process is controlled by a daemon which can die
> and with the current API there is no way to check what state
> filesystems were left in (well, we have emergency thaw but thaw
> unfreezes all filesystems which may not be what we want, i.e. overkill).
> I have heard a lot of complaints about this from users.
  Well, you're going to find out pretty quickly in what state a filesystem
is :) But I understand that with ioctl() you can produce a sensible output...

> Virtualization is a special case of this where the freeze of a guest
> filesystem can be initiated from the hypervisor and carried out by
> a guest agent behind the guest's administrator's back.
  OK.

> >>2) Block device based filesystems:
> >>
> >>   These can be reached through the block device they are sitting on
> >>   even if the filesystem was detached from the namespace and have the
> >>   particularity that they can be frozen using two different APIs, a
> >>   block device level one and the ioctls. When a filesystem was frozen
> >>   using the former, which only has in-kernel users such as dm,
> >>   automatically thawing the filesystem on umount is arguably too rude
> >>   (we can end up breaking the filesystem level consistency of a
> >>   storage snapshot). If we care about this, we could modify
> >>   sys_umount() so that the filesystem is automatically thawed if and
> >>   only if there are no block device level freezes active. This behavior
> >>   would be consistent with case 1) above (the premise here is that
> >>   both fsfreeze and umount are userspace controlled operations and the
> >>   administrator should know what it is doing) and is the least likely
> >>   to cause surprises to freeze_bdev() users.
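
As a rough illustration of the umount rule quoted above (thaw automatically if and only if no block device level freeze is active), here is a toy model in Python. It is not kernel code and not taken from any patch in this thread: the `Superblock` class, its fields and `thaw_on_umount()` are hypothetical names, and keeping the ioctl freeze and the block device freeze count as separate fields is a simplifying assumption made only to show the decision.

```python
from dataclasses import dataclass


@dataclass
class Superblock:
    """Toy stand-in for a superblock's freeze state (hypothetical names)."""
    ioctl_frozen: bool = False   # frozen from userspace via the fsfreeze ioctl
    bdev_freeze_count: int = 0   # outstanding block device level freezes (e.g. dm)

    @property
    def frozen(self) -> bool:
        # The filesystem counts as frozen if either kind of freeze is active.
        return self.ioctl_frozen or self.bdev_freeze_count > 0


def thaw_on_umount(sb: Superblock) -> bool:
    """Apply the proposed rule; return True if the superblock was auto-thawed."""
    if sb.bdev_freeze_count > 0:
        # A block device level freeze is active: leave everything frozen so a
        # storage snapshot in progress keeps its filesystem level consistency.
        return False
    if sb.ioctl_frozen:
        # Only a userspace freeze is active: thaw it as part of umount.
        sb.ioctl_frozen = False
        return True
    # Nothing was frozen, so there is nothing to thaw.
    return False
```

With this model, a filesystem frozen only via the ioctl is thawed on umount, while one that also has a block device level freeze outstanding stays frozen.
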
> >>
> >>   It would also be nice to have a block device level thaw ioctl for
> >>   emergency cases (for example, a scenario where thaw_bdev() was not
> >>   called and the freeze counter was left in an inconsistent state;
> >>   freeze_bdev() and thaw_bdev() are exported symbols and in many cases
> >>   we cannot control what external modules do).
> >  Umm, I don't know. I'd rather forbid thawing via ioctl when the device is
> >frozen via block device so that should solve possible issues caused by
> >buggy userspace and the rest is a kernel bug - emergency thaw is for
> >that...
>
> That is an approach I myself considered and that I would be ok
> with. I guess I will implement both and let Al decide.
  OK.

								Honza
--
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR