Re: [PATCH] libceph: fix PG split vs OSD (re)connect race

With this patch applied, the issue no longer reproduces in my
environment (more than 20 test runs).

Tested-by: Jerry Lee <leisurelysw24@xxxxxxxxx>

Thanks!

On Wed, 21 Aug 2019 at 22:56, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> On Wed, 2019-08-21 at 14:07 +0200, Ilya Dryomov wrote:
> > We can't rely on ->peer_features in calc_target() because it may be
> > called both when the OSD session is established and open and when it's
> > not.  ->peer_features is not valid unless the OSD session is open.  If
> > this happens on a PG split (pg_num increase), that could mean we don't
> > resend a request that should have been resent, hanging the client
> > indefinitely.
> >
> > In userspace this was fixed by looking at require_osd_release and
> > get_xinfo[osd].features fields of the osdmap.  However these fields
> > belong to the OSD section of the osdmap, which the kernel doesn't
> > decode (only the client section is decoded).
> >
> > Instead, let's drop this feature check.  It effectively checks for
> > luminous, so only pre-luminous OSDs would be affected in that on a PG
> > split the kernel might resend a request that should not have been
> > resent.  Duplicates can occur in other scenarios, so both sides should
> > already be prepared for them: see dup/replay logic on the OSD side and
> > retry_attempt check on the client side.
> >
> > Cc: stable@xxxxxxxxxxxxxxx
> > Fixes: 7de030d6b10a ("libceph: resend on PG splits if OSD has RESEND_ON_SPLIT")
> > Reported-by: Jerry Lee <leisurelysw24@xxxxxxxxx>
> > Signed-off-by: Ilya Dryomov <idryomov@xxxxxxxxx>
> > ---
> >  net/ceph/osd_client.c | 9 ++++-----
> >  1 file changed, 4 insertions(+), 5 deletions(-)
> >
> > diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> > index fed6b0334609..4e78d1ddd441 100644
> > --- a/net/ceph/osd_client.c
> > +++ b/net/ceph/osd_client.c
> > @@ -1514,7 +1514,7 @@ static enum calc_target_result calc_target(struct ceph_osd_client *osdc,
> >       struct ceph_osds up, acting;
> >       bool force_resend = false;
> >       bool unpaused = false;
> > -     bool legacy_change;
> > +     bool legacy_change = false;
> >       bool split = false;
> >       bool sort_bitwise = ceph_osdmap_flag(osdc, CEPH_OSDMAP_SORTBITWISE);
> >       bool recovery_deletes = ceph_osdmap_flag(osdc,
> > @@ -1602,15 +1602,14 @@ static enum calc_target_result calc_target(struct ceph_osd_client *osdc,
> >               t->osd = acting.primary;
> >       }
> >
> > -     if (unpaused || legacy_change || force_resend ||
> > -         (split && con && CEPH_HAVE_FEATURE(con->peer_features,
> > -                                            RESEND_ON_SPLIT)))
> > +     if (unpaused || legacy_change || force_resend || split)
> >               ct_res = CALC_TARGET_NEED_RESEND;
> >       else
> >               ct_res = CALC_TARGET_NO_ACTION;
> >
> >  out:
> > -     dout("%s t %p -> ct_res %d osd %d\n", __func__, t, ct_res, t->osd);
> > +     dout("%s t %p -> %d%d%d%d ct_res %d osd%d\n", __func__, t, unpaused,
> > +          legacy_change, force_resend, split, ct_res, t->osd);
> >       return ct_res;
> >  }
> >
>
> Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>
>
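
For readers following the thread, the change to the resend decision in
calc_target() can be sketched as a small standalone model. This is only
an illustration, not the kernel code: struct model_con, struct
model_flags, MODEL_FEATURE_RESEND_ON_SPLIT and both helper functions are
hypothetical names made up for this sketch. It assumes that the feature
bits of a session that is not open read as zero, which is one way the
race described in the commit message could show up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit standing in for the real RESEND_ON_SPLIT feature. */
#define MODEL_FEATURE_RESEND_ON_SPLIT (1ULL << 0)

/* Hypothetical stand-in for an OSD connection. peer_features is only
 * meaningful while the session is open (assumption: zero otherwise). */
struct model_con {
	bool open;
	uint64_t peer_features;
};

/* The four conditions that feed the resend decision in the patch. */
struct model_flags {
	bool unpaused;
	bool legacy_change;
	bool force_resend;
	bool split;
};

/* Before the patch: a split only triggers a resend if the peer
 * advertises the feature, which is unreliable when the session is
 * not open. */
static bool need_resend_old(const struct model_flags *f,
			    const struct model_con *con)
{
	return f->unpaused || f->legacy_change || f->force_resend ||
	       (f->split && con &&
		(con->peer_features & MODEL_FEATURE_RESEND_ON_SPLIT));
}

/* After the patch: a split always triggers a resend. */
static bool need_resend_new(const struct model_flags *f)
{
	return f->unpaused || f->legacy_change || f->force_resend ||
	       f->split;
}
```

With only split set and a connection that is not open, the old check in
this model skips the resend while the new one requests it. The cost, as
the commit message notes, is a possible duplicate request against
pre-luminous OSDs.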


