On Mon, May 18, 2020 at 9:56 AM Ken Dreyer <kdreyer@xxxxxxxxxx> wrote:
>
> Hi folks,
>
> I was reading https://ceph.io/community/automatic-cephfs-recovery-after-blacklisting/
> about the new recover_session=clean feature.
>
> The end of that blog post says that this setting involves a trade-off:
> "availability is more important than correctness"
>
> Are there cases where the old behavior is really safer than simply
> returning errors?

Basically: a frozen (hung mount) or dead (restarted box) application
can't have unintended side-effects. If the application is poorly
written to not handle I/O errors or to not fsync, then any undesirable
behavior resulting from that may occur after the mount reconnects.

> It seems like this feature would not make things worse for
> applications. Can we make recover_session=clean the default?

There was a proposal for recover_session=strict which would (IIRC)
basically kill any application that had any file descriptor open with
the backend file system. That would probably be the safest default but
also the most intrusive and (perhaps) surprising. Unfortunately, I
think there were implementation issues that blocked it and we tabled
the idea.

Whether or not recover_session=clean should be the default is
undecided. I think we should wait to hear back from the community
testing it before deciding.

--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx