Re: [PATCH 0/9] ceph: auto reconnect after blacklisted

Jeff Layton <jlayton@xxxxxxxxxx> · Mon, 08 Jul 2019 07:43:52 -0400

On Mon, 2019-07-08 at 19:34 +0800, Yan, Zheng wrote:
> On Mon, Jul 8, 2019 at 6:59 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > On Mon, 2019-07-08 at 16:43 +0800, Yan, Zheng wrote:
> > > On Fri, Jul 5, 2019 at 9:22 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > > On Fri, 2019-07-05 at 19:26 +0800, Yan, Zheng wrote:
> > > > > On Fri, Jul 5, 2019 at 6:16 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > > > > On Fri, 2019-07-05 at 09:17 +0800, Yan, Zheng wrote:
> > > > > > > On Thu, Jul 4, 2019 at 10:30 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > > > > > > On Thu, 2019-07-04 at 09:30 +0800, Yan, Zheng wrote:
> > > > > > > > > On 7/4/19 12:01 AM, Jeff Layton wrote:
> > > > > > > > > > On Wed, 2019-07-03 at 20:44 +0800, Yan, Zheng wrote:
> > > > > > > > > > > This series add support for auto reconnect after blacklisted.
> > > > > > > > > > > 
> > > > > > > > > > > Auto reconnect is controlled by recover_session=<clean|no> mount option.
> > > > > > > > > > > Clean mode is enabled by default. In this mode, client drops dirty date
> > > > > > > > > > > and dirty metadata, All writable file handles are invalidated. Read-only
> > > > > > > > > > > file handles continue to work and caches are dropped if necessary.
> > > > > > > > > > > If an inode contains any lost file lock, read and write are not allowed.
> > > > > > > > > > > until all lost file locks are released.
> > > > > > > > > > 
> > > > > > > > > > Just giving this a quick glance:
> > > > > > > > > > 
> > > > > > > > > > Based on the last email discussion about this, I thought that you were
> > > > > > > > > > going to provide a mount option that someone could enable that would
> > > > > > > > > > basically allow the client to "soldier on" in the face of being
> > > > > > > > > > blacklisted and then unblacklisted, without needing to remount anything.
> > > > > > > > > > 
> > > > > > > > > > This set seems to keep the force_reconnect option (patch #7) though, so
> > > > > > > > > > I'm quite confused at this point. What exactly is the goal of here?
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > because auto reconnect can be disabled, force_reconnect is the manual
> > > > > > > > > way to fix blacklistd mount.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > Why not instead allow remounting with a different recover_session= mode?
> > > > > > > > Then you wouldn't need this option that's only valid during a remount.
> > > > > > > > That seems like a more natural way to use a new mount option.
> > > > > > > > 
> > > > > > > 
> > > > > > > you mean something like 'recover_session=now' for remount?
> > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > > No, I meant something like:
> > > > > > 
> > > > > >     -o remount,recover_session=brute
> > > > > > 
> > > > > 
> > > > > This is confusing. user may just want to change auto reconnect mode
> > > > > for backlist event in the future, does not want to force reconnect.
> > > > > 
> > > > 
> > > > Why do we need to allow the admin to manually force a reconnect? If you
> > > > (hypothetically) change the mode to "brute" then it should do it on its
> > > > own when it detects that it's in this situation, no?
> > > > 
> > > 
> > > First, auto reconnect is limited to once every 30 seconds. Second,
> > > client may fail to detect that itself is blacklisted. So I think we
> > > still need a way to force client to reconnect
> > > 
> > 
> > How does it detect that it has been blacklisted? Does it do that by
> > looking at the OSD maps? I'd like to better understand how the client
> > would recognize this automatically and why it might miss it.
> > 
> 
> By checking osd request reply and session reject message from mds.
> 

Ok, so is the issue is that the client may become blacklisted and
unblacklisted before it sends anything to either server?

> > If we do end up needing some sort of control to forcibly reconnect, then
> > it seems like we might be better off with something besides a mount
> > option. sysfs control file maybe?
> > 
> 
> why
> 

Because mount options are generally there to control the behavior of the
filesystem at mount time, and not to cue a mount to do some one-shot
activity. That sort of thing is usually done via other mechanisms
(sysfs, ioctl, etc.).

> > > > > > IOW, allow the admin to just change the mount to use a different
> > > > > > recover_session= mode once things are stuck.
> > > > > > 
> > > > > > > > > > There's also nothing in the changelogs or comments about
> > > > > > > > > > recover_session=brute, which seems like it ought to at least be
> > > > > > > > > > mentioned.
> > > > > > > > > 
> > > > > > > > > brute code is not enabled yet
> > > > > > > > 
> > > > > > > > Got it -- I missed that that the mount option for it was commented out.
> > > > > > > > 
> > > > > > > > Given that this is a user interface change, I think it'd be best to not
> > > > > > > > merge merge this until it's complete. Otherwise we'll have to deal with
> > > > > > > > intermediate kernel versions that don't implement some parts of the new
> > > > > > > > interface. That's makes it more difficult for admins to use (and for us
> > > > > > > > to document).
> > > > > > > > 
> > > > > > > > > > At this point, I'm going to say NAK on this set until there is some
> > > > > > > > > > accompanying documentation about how you intend for this be used and by
> > > > > > > > > > whom. A patch for the mount.ceph(8) manpage would be a good place to
> > > > > > > > > > start.
> > > > > > > > > > 
> > > > > > > > > > > Yan, Zheng (9):
> > > > > > > > > > >    libceph: add function that reset client's entity addr
> > > > > > > > > > >    libceph: add function that clears osd client's abort_err
> > > > > > > > > > >    ceph: allow closing session in restarting/reconnect state
> > > > > > > > > > >    ceph: track and report error of async metadata operation
> > > > > > > > > > >    ceph: pass filp to ceph_get_caps()
> > > > > > > > > > >    ceph: return -EIO if read/write against filp that lost file locks
> > > > > > > > > > >    ceph: add 'force_reconnect' option for remount
> > > > > > > > > > >    ceph: invalidate all write mode filp after reconnect
> > > > > > > > > > >    ceph: auto reconnect after blacklisted
> > > > > > > > > > > 
> > > > > > > > > > >   fs/ceph/addr.c                  | 30 +++++++----
> > > > > > > > > > >   fs/ceph/caps.c                  | 84 ++++++++++++++++++++----------
> > > > > > > > > > >   fs/ceph/file.c                  | 50 ++++++++++--------
> > > > > > > > > > >   fs/ceph/inode.c                 |  2 +
> > > > > > > > > > >   fs/ceph/locks.c                 |  8 ++-
> > > > > > > > > > >   fs/ceph/mds_client.c            | 92 ++++++++++++++++++++++++++-------
> > > > > > > > > > >   fs/ceph/mds_client.h            |  6 +--
> > > > > > > > > > >   fs/ceph/super.c                 | 91 ++++++++++++++++++++++++++++++--
> > > > > > > > > > >   fs/ceph/super.h                 | 23 +++++++--
> > > > > > > > > > >   include/linux/ceph/libceph.h    |  1 +
> > > > > > > > > > >   include/linux/ceph/messenger.h  |  1 +
> > > > > > > > > > >   include/linux/ceph/mon_client.h |  1 +
> > > > > > > > > > >   include/linux/ceph/osd_client.h |  2 +
> > > > > > > > > > >   net/ceph/ceph_common.c          | 38 +++++++++-----
> > > > > > > > > > >   net/ceph/messenger.c            |  5 ++
> > > > > > > > > > >   net/ceph/mon_client.c           |  7 +++
> > > > > > > > > > >   net/ceph/osd_client.c           | 24 +++++++++
> > > > > > > > > > >   17 files changed, 365 insertions(+), 100 deletions(-)
> > > > > > > > > > > 
> > > > > > > > 
> > > > > > > > --
> > > > > > > > Jeff Layton <jlayton@xxxxxxxxxx>
> > > > > > > > 
> > > > > > 
> > > > > > --
> > > > > > Jeff Layton <jlayton@xxxxxxxxxx>
> > > > > > 
> > > > 
> > > > --
> > > > Jeff Layton <jlayton@xxxxxxxxxx>
> > > > 
> > 
> > --
> > Jeff Layton <jlayton@xxxxxxxxxx>
> > 

-- 
Jeff Layton <jlayton@xxxxxxxxxx>