On Mon, Jun 3, 2019 at 7:54 PM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>
> On Mon, Jun 3, 2019 at 6:51 AM Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> >
> > On Fri, May 31, 2019 at 10:20 PM Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
> > >
> > > On Fri, May 31, 2019 at 2:30 PM Yan, Zheng <zyan@xxxxxxxxxx> wrote:
> > > >
> > > > echo force_reconnect > /sys/kernel/debug/ceph/xxx/control
> > > >
> > > > Signed-off-by: "Yan, Zheng" <zyan@xxxxxxxxxx>
> > >
> > > Hi Zheng,
> > >
> > > There should be an explanation in the commit message of what this is
> > > and why it is needed.
> > >
> > > I'm assuming the use case is recovering a blacklisted mount, but what
> > > is the intended semantics?  What happens to in-flight OSD requests,
> > > MDS requests, open files, etc?  These are things that should really be
> > > written down.
> > >
> > got it
> >
> > > Looking at the previous patch, it appears that in-flight OSD requests
> > > are simply retried, as they would be on a regular connection fault.  Is
> > > that safe?
> > >
> >
> > It's not safe. I'm still thinking about how to handle dirty data and
> > in-flight osd requests in this case.
>
> Can we figure out the consistency-handling story before we start
> adding interfaces for people to mis-use then please?
>
> It's not pleasant but if the client gets disconnected I'd assume we
> have to just return EIO or something on all outstanding writes and
> toss away our dirty data. There's not really another option that makes
> any sense, is there?

Can we also discuss how useful it is to allow recovering a mount after
it has been blacklisted?  After we fail everything with EIO and throw
out all dirty state, how many applications would continue working
without some kind of restart?  And if you are restarting your
application, why not get a new mount?

IOW what is the use case for introducing a new debugfs knob that isn't
that much different from umount+mount?

Thanks,

                Ilya