On Tue, Jun 4, 2019 at 4:10 AM Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>
> On Tue, Jun 4, 2019 at 5:18 AM Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
> >
> > On Mon, Jun 3, 2019 at 10:23 PM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> > >
> > > On Mon, Jun 3, 2019 at 1:07 PM Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
> > > > Can we also discuss how useful it is to allow recovering a mount
> > > > after it has been blacklisted?  After we fail everything with EIO
> > > > and throw out all dirty state, how many applications would continue
> > > > working without some kind of restart?  And if you are restarting
> > > > your application, why not get a new mount?
> > > >
> > > > IOW, what is the use case for introducing a new debugfs knob that
> > > > isn't that much different from umount+mount?
> > >
> > > People don't like it when their filesystem refuses to umount, which
> > > is what happens when the kernel client can't reconnect to the MDS
> > > right now.  I'm not sure there's a practical way to deal with that
> > > besides some kind of admin intervention.  (Even if you umount -l,
> > > that by design doesn't make outstanding syscalls return and let the
> > > applications exit.)
> >
> > Well, that is what I'm saying: if an admin intervention is required
> > anyway, then why not make it umount+mount?  That is certainly more
> > intuitive than an obscure write-only file in debugfs...
> >
>
> I think 'umount -f' + 'mount -o remount' is better than the debugfs
> file.

A small bit of user input: for some of the places we'd like to use
CephFS, we value availability over consistency.  For example, in a
large batch processing farm it is really inconvenient (and expensive
in lost CPU-hours) if an operator needs to repair thousands of mounts
when CephFS breaks (e.g. an MDS crash or whatever).  It is preferable
to let the apps crash, drop caches, file handles, whatever else is
necessary, and create a new session to the cluster with the same
mount.

In this use case it doesn't matter if the files were inconsistent,
because a higher-level job scheduler will retry the job from scratch
somewhere else with new output files.

It would be nice if there were a mount option to allow users to
choose this mode (-o soft, for example).  Without a mount option,
we're forced to run ugly cron jobs which look for hung mounts and do
the necessary.

My 2c,
dan

> >
> > We have umount -f, which is there for tearing down a mount that is
> > unresponsive.  It should be able to deal with a blacklisted mount;
> > if it can't, it's probably a bug.
> >
> > Thanks,
> >
> >                 Ilya
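
P.S. For anyone who wants to script the umount+mount recovery being
discussed, it is roughly the two commands below.  This is only a
sketch: the mountpoint, monitor address, user name and secret file
are placeholders for whatever your site actually uses.

    # Force-detach the blacklisted mount; pending requests fail with
    # errors instead of blocking forever.
    umount -f /mnt/cephfs

    # Mount again, which creates a brand new session with the cluster.
    mount -t ceph mon1.example.com:6789:/ /mnt/cephfs \
        -o name=admin,secretfile=/etc/ceph/admin.secret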
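
And the "ugly cron job" I mentioned looks something like the sketch
below (the 10-second probe timeout and the fstab-based re-mount are
illustrative assumptions, not a recommendation):

    #!/bin/sh
    # Run from cron.  stat(1) on a hung CephFS mount can block
    # indefinitely, so a timeout is treated as "this mount is dead".
    # SIGKILL is used because the client's waits are mostly killable
    # even when SIGTERM isn't delivered.
    MNT=/mnt/cephfs
    if ! timeout -s KILL 10 stat "$MNT" >/dev/null 2>&1; then
        umount -f "$MNT"    # force-detach the dead mount
        mount "$MNT"        # re-mount using the fstab entry
    fi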