Re: [SOLVED] Monitor rename / recreate issue -- probing state

Hello Joao,

Thanks for your help. I increased logging on the failed monitor and noticed a lot of cephx authentication errors. After verifying NTP sync, I found that the monitor keyring deployed on the working monitors differed from the one stored in the management server’s ceph.mon.keyring. Syncing the key and redeploying the monitors got them to peer and establish quorum.
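For anyone landing here with the same symptom, the keyring comparison can be scripted. The following is only a minimal sketch: it assumes the usual INI-style keyring layout (a `[mon.]` section containing a `key = ...` line), and the helper names `read_key` and `keys_match` are made up for illustration, not part of any Ceph tooling.

```python
def read_key(keyring_text, entity="mon."):
    """Return the value of `key = ...` in the [entity] section, or None.

    Assumes the INI-like keyring layout:
        [mon.]
            key = <base64 string>
            caps mon = "allow *"
    """
    section = None
    for raw in keyring_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # Entering a new section, e.g. [mon.] or [client.admin]
            section = line[1:-1]
        elif section == entity:
            name, sep, value = line.partition("=")
            # Split on the first '=' only; base64 keys end in '=' padding
            if sep and name.strip() == "key":
                return value.strip()
    return None


def keys_match(text_a, text_b, entity="mon."):
    """True only if both keyrings contain the entity and the keys are equal."""
    key_a = read_key(text_a, entity)
    key_b = read_key(text_b, entity)
    return key_a is not None and key_a == key_b
```

To use it, read the management server's ceph.mon.keyring and the keyring file from a working monitor's data directory (its location depends on the deployment, so treat any path as an assumption) and pass both contents to `keys_match`. A `False` result would be consistent with the cephx errors described above.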



On Dec 14, 2015, at 11:10 , deeepdish <deeepdish@xxxxxxxxx> wrote:

Joao,

Please see below. I think you’re totally right on:

I suspect they may already have this monitor in their map, but either
with a different name or a different address -- and are thus ignoring
probes from a peer that does not match what they are expecting.

The monitor in question was previously working (in quorum). It was removed, and I am now attempting to re-add it using a different IP address, as per the public procedure: http://docs.ceph.com/docs/master/rados/operations/add-or-rm-mons/ (I followed the ‘CHANGING A MONITOR’S IP ADDRESS (THE RIGHT WAY)’ procedure).

#  ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.smg01.asok mon_status
{
    "name": "smg01",
    "rank": 0,
    "state": "probing",
    "election_epoch": 0,
    "quorum": [],
    "outside_quorum": [
        "smg01"
    ],
    "extra_probe_peers": [
        "10.20.1.8:6789\/0",
        "10.20.10.251:6789\/0",
        "10.20.10.252:6789\/0"
    ],
    "sync_provider": [],
    "monmap": {
        "epoch": 0,
        "fsid": "693834c1-1f95-4237-ab97-a767b0c0e6e7",
        "modified": "0.000000",
        "created": "0.000000",
        "mons": [
            {
                "rank": 0,
                "name": "smg01",
                "addr": "10.20.10.250:6789\/0"
            },
            {
                "rank": 1,
                "name": "smon01s",
                "addr": "0.0.0.0:0\/1"
            },
            {
                "rank": 2,
                "name": "smon02s",
                "addr": "0.0.0.0:0\/2"
            },
            {
                "rank": 3,
                "name": "b02s08",
                "addr": "0.0.0.0:0\/3"
            }
        ]
    }
}


# ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.smon01.asok mon_status
{
    "name": "smon01",
    "rank": 1,
    "state": "peon",
    "election_epoch": 2702,
    "quorum": [
        0,
        1,
        2
    ],
    "outside_quorum": [],
    "extra_probe_peers": [],
    "sync_provider": [],
    "monmap": {
        "epoch": 12,
        "fsid": "693834c1-1f95-4237-ab97-a767b0c0e6e7",
        "modified": "2015-12-09 06:23:43.665100",
        "created": "0.000000",
        "mons": [
            {
                "rank": 0,
                "name": "b02s08",
                "addr": "10.20.1.8:6789\/0"
            },
            {
                "rank": 1,
                "name": "smon01",
                "addr": "10.20.10.251:6789\/0"
            },
            {
                "rank": 2,
                "name": "smon02",
                "addr": "10.20.10.252:6789\/0"
            }
        ]
    }
}

# ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.smon02.asok mon_status
{
    "name": "smon02",
    "rank": 2,
    "state": "peon",
    "election_epoch": 2702,
    "quorum": [
        0,
        1,
        2
    ],
    "outside_quorum": [],
    "extra_probe_peers": [],
    "sync_provider": [],
    "monmap": {
        "epoch": 12,
        "fsid": "693834c1-1f95-4237-ab97-a767b0c0e6e7",
        "modified": "2015-12-09 06:23:43.665100",
        "created": "0.000000",
        "mons": [
            {
                "rank": 0,
                "name": "b02s08",
                "addr": "10.20.1.8:6789\/0"
            },
            {
                "rank": 1,
                "name": "smon01",
                "addr": "10.20.10.251:6789\/0"
            },
            {
                "rank": 2,
                "name": "smon02",
                "addr": "10.20.10.252:6789\/0"
            }
        ]
    }
}


# ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.b02s08.asok mon_status
{
    "name": "b02s08",
    "rank": 0,
    "state": "leader",
    "election_epoch": 2702,
    "quorum": [
        0,
        1,
        2
    ],
    "outside_quorum": [],
    "extra_probe_peers": [],
    "sync_provider": [],
    "monmap": {
        "epoch": 12,
        "fsid": "693834c1-1f95-4237-ab97-a767b0c0e6e7",
        "modified": "2015-12-09 06:23:43.665100",
        "created": "0.000000",
        "mons": [
            {
                "rank": 0,
                "name": "b02s08",
                "addr": "10.20.1.8:6789\/0"
            },
            {
                "rank": 1,
                "name": "smon01",
                "addr": "10.20.10.251:6789\/0"
            },
            {
                "rank": 2,
                "name": "smon02",
                "addr": "10.20.10.252:6789\/0"
            }
        ]
    }
}
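The four outputs above can also be compared mechanically rather than by eye. This is a rough sketch, not an official tool: it assumes each mon_status output has been captured as a JSON string, and the function names `monmap_index` and `compare_monmaps` are hypothetical. It reports names that exist in only one monmap and peers that the probing monitor knows only by a 0.0.0.0 placeholder address.

```python
import json

# Address prefix seen above for peers the probing monitor has not resolved yet
PLACEHOLDER_PREFIX = "0.0.0.0:"


def monmap_index(mon_status_json):
    """Map monitor name -> address from one mon_status JSON document."""
    status = json.loads(mon_status_json)
    return {mon["name"]: mon["addr"] for mon in status["monmap"]["mons"]}


def compare_monmaps(probing, quorum):
    """Compare two name -> addr dicts and return a list of findings.

    probing: monmap as seen by the monitor stuck in "probing"
    quorum:  monmap as seen by a monitor that is in quorum
    """
    findings = []
    for name, addr in sorted(probing.items()):
        if name not in quorum:
            findings.append(f"{name}: not in the quorum monmap")
        elif addr.startswith(PLACEHOLDER_PREFIX):
            findings.append(
                f"{name}: placeholder {addr} locally, quorum has {quorum[name]}"
            )
        elif addr != quorum[name]:
            findings.append(
                f"{name}: address {addr} locally, quorum has {quorum[name]}"
            )
    for name in sorted(set(quorum) - set(probing)):
        findings.append(f"{name}: in the quorum monmap only")
    return findings
```

Fed with the smg01 output as `probing` and any of the three quorum outputs as `quorum`, this would flag smg01, smon01s and smon02s as absent from the quorum monmap, and smon01 and smon02 as present only there. Such a mismatch is only a hint about where to look, not proof of the root cause.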



On Dec 14, 2015, at 04:56 , Joao Eduardo Luis <joao@xxxxxxx> wrote:

On 12/14/2015 12:41 AM, deeepdish wrote:
Perhaps I’m not understanding something..

The “extra_probe_peers” ARE the other working monitors in quorum out of
the mon_host line in ceph.conf.

In the example below 10.20.1.8 = b02s08; 10.20.10.251 = smon01s;
10.20.10.252 = smon02s

The monitor is not reaching out to the other IPs and syncing. I’m able
to ping all IPs in the extra_probe_peers list.

Okay, so that means the other monitors are, for some reason, ignoring
the probes from this monitor.

Can you please show the result of mon_status from the monitors in the
quorum?

I suspect they may already have this monitor in their map, but either
with a different name or a different address -- and are thus ignoring
probes from a peer that does not match what they are expecting.

 -Joao


