Re: [PATCH] ceph: fix mdsmap_decode got incorrect mds(X)

On Tue, 2019-12-03 at 09:29 -0500, xiubli@xxxxxxxxxx wrote:
> From: Xiubo Li <xiubli@xxxxxxxxxx>
> 
> The maximum possible rank may be larger than m->m_num_mds. For
> example, if max_mds == 2 in the cluster and MDS(0) is laggy and
> being replaced by a new MDS, we will temporarily receive a new mds
> map with m_num_mds == 1 and MDS(1) still active, so the rank of
> that MDS is >= m->m_num_mds.
> 
> Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
> ---
>  fs/ceph/mdsmap.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/ceph/mdsmap.c b/fs/ceph/mdsmap.c
> index 284d68646c40..a77e0ecb9a6b 100644
> --- a/fs/ceph/mdsmap.c
> +++ b/fs/ceph/mdsmap.c
> @@ -129,6 +129,7 @@ struct ceph_mdsmap *ceph_mdsmap_decode(void **p, void *end)
>  	int err;
>  	u8 mdsmap_v, mdsmap_cv;
>  	u16 mdsmap_ev;
> +	u32 possible_max_rank;
>  
>  	m = kzalloc(sizeof(*m), GFP_NOFS);
>  	if (!m)
> @@ -164,6 +165,15 @@ struct ceph_mdsmap *ceph_mdsmap_decode(void **p, void *end)
>  	m->m_num_mds = n = ceph_decode_32(p);
>  	m->m_num_active_mds = m->m_num_mds;
>  
> +	/*
> +	 * The maximum possible rank may be larger than m->m_num_mds.
> +	 * For example, if max_mds == 2 in the cluster and MDS(0) is
> +	 * laggy and being replaced by a new MDS, we will temporarily
> +	 * receive a new mds map with m_num_mds == 1 and MDS(1) still
> +	 * active, so that MDS's rank is >= m->m_num_mds.
> +	 */
> +	possible_max_rank = max((u32)m->m_num_mds, m->m_max_mds);
> +
>  	m->m_info = kcalloc(m->m_num_mds, sizeof(*m->m_info), GFP_NOFS);
>  	if (!m->m_info)
>  		goto nomem;
> @@ -238,7 +248,7 @@ struct ceph_mdsmap *ceph_mdsmap_decode(void **p, void *end)
>  		     ceph_mds_state_name(state),
>  		     laggy ? "(laggy)" : "");
>  
> -		if (mds < 0 || mds >= m->m_num_mds) {
> +		if (mds < 0 || mds >= possible_max_rank) {
>  			pr_warn("mdsmap_decode got incorrect mds(%d)\n", mds);
>  			continue;
>  		}

Thanks, Xiubo. I'll squash this one into your earlier ceph_mdsmap_decode
patch, since it's fixing that logic up.
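
For readers following along, the relaxed bounds check can be modelled in
userspace as below. This is only a sketch: struct mdsmap_counts and both
helper names are hypothetical stand-ins, not the kernel's struct ceph_mdsmap
or any function in fs/ceph/mdsmap.c, and it covers the range check alone,
not the sizing of m->m_info.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two struct ceph_mdsmap fields used here. */
struct mdsmap_counts {
	uint32_t num_mds;	/* models m->m_num_mds */
	uint32_t max_mds;	/* models m->m_max_mds */
};

/* Upper bound on acceptable ranks: max(num_mds, max_mds), as in the patch. */
static uint32_t possible_max_rank(const struct mdsmap_counts *m)
{
	return m->num_mds > m->max_mds ? m->num_mds : m->max_mds;
}

/* Mirrors the patched test: valid when 0 <= mds < possible_max_rank. */
static bool rank_in_range(const struct mdsmap_counts *m, int32_t mds)
{
	return mds >= 0 && (uint32_t)mds < possible_max_rank(m);
}

/* Scenario from the commit message: max_mds == 2, one MDS in the map. */
static void example(void)
{
	struct mdsmap_counts m = { .num_mds = 1, .max_mds = 2 };

	assert(rank_in_range(&m, 1));	/* the old num_mds-only check rejected this */
	assert(!rank_in_range(&m, 2));	/* still out of range */
	assert(!rank_in_range(&m, -1));	/* negative ranks stay invalid */
}
```

With num_mds == 1 and max_mds == 2, rank 1 passes this check, whereas
comparing against num_mds alone would have triggered the "got incorrect
mds" warning.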
-- 
Jeff Layton <jlayton@xxxxxxxxxx>



