Re: [PATCH 0/9] ceph: auto reconnect after blacklisted

"Yan, Zheng" <ukernel@xxxxxxxxx> · Tue, 9 Jul 2019 23:09:36 +0800

On Tue, Jul 9, 2019 at 10:17 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> On Tue, 2019-07-09 at 21:31 +0800, Yan, Zheng wrote:
> > On Tue, Jul 9, 2019 at 6:18 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > On Tue, 2019-07-09 at 10:14 +0800, Yan, Zheng wrote:
> > > > On Mon, Jul 8, 2019 at 9:45 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > > > On Mon, 2019-07-08 at 19:55 +0800, Yan, Zheng wrote:
> > > > > > On Mon, Jul 8, 2019 at 7:43 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > > > > > On Mon, 2019-07-08 at 19:34 +0800, Yan, Zheng wrote:
> > > > > > > > On Mon, Jul 8, 2019 at 6:59 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > > > > > > > On Mon, 2019-07-08 at 16:43 +0800, Yan, Zheng wrote:
> > > > > > > > > > On Fri, Jul 5, 2019 at 9:22 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > > > > > > > > > On Fri, 2019-07-05 at 19:26 +0800, Yan, Zheng wrote:
> > > > > > > > > > > > On Fri, Jul 5, 2019 at 6:16 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > > > > > > > > > > > On Fri, 2019-07-05 at 09:17 +0800, Yan, Zheng wrote:
> > > > > > > > > > > > > > On Thu, Jul 4, 2019 at 10:30 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > > > > > > > > > > > > > On Thu, 2019-07-04 at 09:30 +0800, Yan, Zheng wrote:
> > > > > > > > > > > > > > > > On 7/4/19 12:01 AM, Jeff Layton wrote:
> > > > > > > > > > > > > > > > > On Wed, 2019-07-03 at 20:44 +0800, Yan, Zheng wrote:
> > > > > > > > > > > > > > > > > > This series adds support for auto reconnect after being blacklisted.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Auto reconnect is controlled by recover_session=<clean|no> mount option.
> > > > > > > > > > > > > > > > > > Clean mode is enabled by default. In this mode, the client drops dirty data
> > > > > > > > > > > > > > > > > > and dirty metadata. All writable file handles are invalidated. Read-only
> > > > > > > > > > > > > > > > > > file handles continue to work and caches are dropped if necessary.
> > > > > > > > > > > > > > > > > > If an inode contains any lost file lock, read and write are not allowed
> > > > > > > > > > > > > > > > > > until all lost file locks are released.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Just giving this a quick glance:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Based on the last email discussion about this, I thought that you were
> > > > > > > > > > > > > > > > > going to provide a mount option that someone could enable that would
> > > > > > > > > > > > > > > > > basically allow the client to "soldier on" in the face of being
> > > > > > > > > > > > > > > > > blacklisted and then unblacklisted, without needing to remount anything.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > This set seems to keep the force_reconnect option (patch #7) though, so
> > > > > > > > > > > > > > > > > I'm quite confused at this point. What exactly is the goal here?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Because auto reconnect can be disabled, force_reconnect is the manual
> > > > > > > > > > > > > > > > way to fix a blacklisted mount.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Why not instead allow remounting with a different recover_session= mode?
> > > > > > > > > > > > > > > Then you wouldn't need this option that's only valid during a remount.
> > > > > > > > > > > > > > > That seems like a more natural way to use a new mount option.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > you mean something like 'recover_session=now' for remount?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > No, I meant something like:
> > > > > > > > > > > > >
> > > > > > > > > > > > >     -o remount,recover_session=brute
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > This is confusing. A user may just want to change the auto reconnect mode
> > > > > > > > > > > > for future blacklist events, without forcing a reconnect now.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Why do we need to allow the admin to manually force a reconnect? If you
> > > > > > > > > > > (hypothetically) change the mode to "brute" then it should do it on its
> > > > > > > > > > > own when it detects that it's in this situation, no?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > First, auto reconnect is limited to once every 30 seconds. Second, the
> > > > > > > > > > client may fail to detect that it has been blacklisted. So I think we
> > > > > > > > > > still need a way to force the client to reconnect.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > How does it detect that it has been blacklisted? Does it do that by
> > > > > > > > > looking at the OSD maps? I'd like to better understand how the client
> > > > > > > > > would recognize this automatically and why it might miss it.
> > > > > > > > >
> > > > > > > >
> > > > > > > > By checking OSD request replies and the session reject message from the MDS.
> > > > > > > >
> > > > > > >
> > > > > > > Ok, so is the issue that the client may become blacklisted and
> > > > > > > unblacklisted before it sends anything to either server?
> > > > > > >
> > > > > >
> > > > > > No. The issue is that old MDS versions either do not send a session
> > > > > > reject message or send one without 'error_str=blacklisted'.
> > > > >
> > > > > Is that the only way to detect that this has happened? What if we were
> > > > > to simply force a reconnect on any remount? Would that break anything?
> > > > >
> > > >
> > > > Why? Reconnecting causes all sorts of integrity issues.
> > > >
> > >
> > > Care to elaborate?
> > >
> > > My understanding was that the fact that the MDS journaled everything
> > > meant that the client would be able to reclaim all of its state if the
> > > MDS crashed and restarted, or we had a momentary loss of connection. Is
> > > that not the case?
> > >
> > > Either way, remounts should be _very_ rare events, almost always
> > > performed manually by an administrator. I suggested this under the
> > > assumption that an immediate reconnection might just be a small blip in
> > > performance. If there are data integrity issues when this occurs then
> > > that seems like a bigger problem.
> >
> > If reconnect means 're-open mds sessions', the MDS loses track of caps and
> > file locks after reconnect. It's similar to the situation where the client
> > gets blacklisted.
> >
>
> I don't have a great grasp of the way state recovery works with cephfs,
> so please bear with me here...
>
> Suppose I have a client with a bunch of caps and file locks, and the MDS
> crashes and is restarted. Will the client be able to reclaim those
> caps/locks in some fashion? If so, how is that different from the
> situation where the client reconnects its session spuriously?
>

The client can only reclaim caps/locks when the MDS is in reconnect state.
'Re-open sessions' is likely to happen when the MDS is active.
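
As a rough illustration of the difference (a Python sketch, not kernel or
MDS code; MdsState and session_open_outcome are names made up for this
example, and the model assumes the MDS simply forgets any state the client
cannot reclaim):

```python
from enum import Enum


class MdsState(Enum):
    """Hypothetical subset of MDS states, for illustration only."""
    RECONNECT = "reconnect"
    ACTIVE = "active"


def session_open_outcome(mds_state, held_caps, held_locks):
    """Return the (caps, locks) the MDS still tracks after a session is opened.

    Assumption: the client can reclaim its state only while the MDS is in
    the reconnect state. Opening a fresh session against an active MDS
    starts from nothing, much like the blacklisted case.
    """
    if mds_state is MdsState.RECONNECT:
        # MDS restarted and is replaying: the client reclaims what it held.
        return set(held_caps), set(held_locks)
    # MDS is active: the new session carries no caps or file locks.
    return set(), set()
```

In this model, a restart of the MDS keeps the client's caps and locks,
while a spurious session re-open against an active MDS loses them.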

> I'm quite leery of giving admins a knob that may cause data integrity
> problems. No other network filesystem requires something like this
> force_reconnect button, so I'm rather interested to see if we can come
> up with a more conventional way to achieve what you want.

> --
> Jeff Layton <jlayton@xxxxxxxxxx>
>