Re: [PATCH] libceph: fix PG split vs OSD (re)connect race

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2019-08-21 at 14:07 +0200, Ilya Dryomov wrote:
> We can't rely on ->peer_features in calc_target() because it may be
> called both when the OSD session is established and open and when it's
> not.  ->peer_features is not valid unless the OSD session is open.  If
> this happens on a PG split (pg_num increase), that could mean we don't
> resend a request that should have been resent, hanging the client
> indefinitely.
> 
> In userspace this was fixed by looking at require_osd_release and
> get_xinfo[osd].features fields of the osdmap.  However these fields
> belong to the OSD section of the osdmap, which the kernel doesn't
> decode (only the client section is decoded).
> 
> Instead, let's drop this feature check.  It effectively checks for
> luminous, so only pre-luminous OSDs would be affected in that on a PG
> split the kernel might resend a request that should not have been
> resent.  Duplicates can occur in other scenarios, so both sides should
> already be prepared for them: see dup/replay logic on the OSD side and
> retry_attempt check on the client side.
> 
> Cc: stable@xxxxxxxxxxxxxxx
> Fixes: 7de030d6b10a ("libceph: resend on PG splits if OSD has RESEND_ON_SPLIT")
> Reported-by: Jerry Lee <leisurelysw24@xxxxxxxxx>
> Signed-off-by: Ilya Dryomov <idryomov@xxxxxxxxx>
> ---
>  net/ceph/osd_client.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index fed6b0334609..4e78d1ddd441 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -1514,7 +1514,7 @@ static enum calc_target_result calc_target(struct ceph_osd_client *osdc,
>  	struct ceph_osds up, acting;
>  	bool force_resend = false;
>  	bool unpaused = false;
> -	bool legacy_change;
> +	bool legacy_change = false;
>  	bool split = false;
>  	bool sort_bitwise = ceph_osdmap_flag(osdc, CEPH_OSDMAP_SORTBITWISE);
>  	bool recovery_deletes = ceph_osdmap_flag(osdc,
> @@ -1602,15 +1602,14 @@ static enum calc_target_result calc_target(struct ceph_osd_client *osdc,
>  		t->osd = acting.primary;
>  	}
>  
> -	if (unpaused || legacy_change || force_resend ||
> -	    (split && con && CEPH_HAVE_FEATURE(con->peer_features,
> -					       RESEND_ON_SPLIT)))
> +	if (unpaused || legacy_change || force_resend || split)
>  		ct_res = CALC_TARGET_NEED_RESEND;
>  	else
>  		ct_res = CALC_TARGET_NO_ACTION;
>  
>  out:
> -	dout("%s t %p -> ct_res %d osd %d\n", __func__, t, ct_res, t->osd);
> +	dout("%s t %p -> %d%d%d%d ct_res %d osd%d\n", __func__, t, unpaused,
> +	     legacy_change, force_resend, split, ct_res, t->osd);
>  	return ct_res;
>  }
>  

Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>




[Index of Archives]     [CEPH Users]     [Ceph Large]     [Ceph Dev]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux