New monitor added to a Ceph cluster remains in the probing state


 



Hi all, I have a Ceph cluster with 3 monitors, 3 mgr and 3 mds daemons (Ceph 15.2.4 Octopus on CentOS Linux release 7.8.2003), plus the OSDs.
The idea is to add a new node running a newer OS, Rocky Linux release 8.5, and then install the Ceph Pacific release on it in order to test the upgrade process from Octopus to Pacific (I know the resulting number of monitors isn't suitable for establishing a quorum).
Before upgrading to Pacific, I installed the latest Octopus release on the new node: ceph-mds-15.2.16-0 and ceph-mgr-15.2.16-0 (Octopus 15.2.16 on Rocky Linux), without trouble.
However, when I add the new monitor (Octopus 15.2.16) and start the mon daemon, it never joins the other monitors and stays in the "probing" state. The network is fine (the other daemons use the same network), and the new and old daemons have connectivity.
The configuration and log traces are shown below.
Thanks in advance.
 
 
ceph versions
{
    "mon": {
        "ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)": 3
    },
    "mgr": {
        "ceph version 15.2.16 (d46a73d6d0a67a79558054a3a5a72cb561724974) octopus (stable)": 1,
        "ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)": 4
    },
    "osd": {
        "ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)": 258
    },
    "mds": {
        "ceph version 15.2.16 (d46a73d6d0a67a79558054a3a5a72cb561724974) octopus (stable)": 1,
        "ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)": 3
    },
    "overall": {
        "ceph version 15.2.16 (d46a73d6d0a67a79558054a3a5a72cb561724974) octopus (stable)": 2,
        "ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)": 268
    }
}
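
To make the mixed-version picture easier to read, here is a minimal Python sketch that lists which daemon types run more than one version. It assumes the `ceph versions` output has already been loaded into a dict; the function name `mixed_version_daemons` and the trimmed sample are only illustrative and are not part of Ceph.

```python
def mixed_version_daemons(versions: dict) -> dict:
    """Map each daemon type running more than one version to its version strings."""
    mixed = {}
    for daemon_type, counts in versions.items():
        if daemon_type == "overall":  # aggregate entry, not a daemon type
            continue
        if len(counts) > 1:
            mixed[daemon_type] = sorted(counts)
    return mixed


# Trimmed copy of the output above (OSD and overall entries omitted).
OLD = "ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)"
NEW = "ceph version 15.2.16 (d46a73d6d0a67a79558054a3a5a72cb561724974) octopus (stable)"
sample = {
    "mon": {OLD: 3},
    "mgr": {NEW: 1, OLD: 4},
    "mds": {NEW: 1, OLD: 3},
}

for daemon_type, found in mixed_version_daemons(sample).items():
    print(daemon_type, "->", len(found), "versions")
```

With this sample only mgr and mds are reported, since all three monitors in the quorum still run 15.2.4.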
 
 
ceph --admin-daemon  /var/run/ceph/ceph-mon.MN03.asok mon_status 
{
    "name": "MN03",
    "rank": -1,
    "state": "probing",
    "election_epoch": 0,
    "quorum": [],
    "features": {
        "required_con": "2449958197560098820",
        "required_mon": [
            "kraken",
            "luminous",
            "mimic",
            "osdmap-prune",
            "nautilus",
            "octopus"
        ],
        "quorum_con": "0",
        "quorum_mon": []
    },
    "outside_quorum": [],
    "extra_probe_peers": [],
    "sync_provider": [],
    "monmap": {
        "epoch": 14,
        "fsid": "c31094f9-f9c2-41fc-9f0c-3a0fad593e72",
        "modified": "2022-03-15T12:55:53.431087Z",
        "created": "2020-02-12T15:30:05.703351Z",
        "min_mon_release": 15,
        "min_mon_release_name": "octopus",
        "features": {
            "persistent": [
                "kraken",
                "luminous",
                "mimic",
                "osdmap-prune",
                "nautilus",
                "octopus"
            ],
            "optional": []
        },
        "mons": [
            {
                "rank": 0,
                "name": "MN00",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.2.0.5:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.2.0.5:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "10.2.0.5:6789/0",
                "public_addr": "10.2.0.5:6789/0",
                "priority": 0,
                "weight": 0
            },
            {
                "rank": 1,
                "name": "MN01",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.2.0.6:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.2.0.6:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "10.2.0.6:6789/0",
                "public_addr": "10.2.0.6:6789/0",
                "priority": 0,
                "weight": 0
            },
            {
                "rank": 2,
                "name": "MN02",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.2.0.7:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.2.0.7:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "10.2.0.7:6789/0",
                "public_addr": "10.2.0.7:6789/0",
                "priority": 0,
                "weight": 0
            }
        ]
    },
    "feature_map": {
        "mon": [
            {
                "features": "0x3f01cfb8ffedffff",
                "release": "luminous",
                "num": 1
            }
        ]
    }
}
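
The relevant fields in this output can be pulled out with a short Python sketch. It assumes the `mon_status` JSON has been loaded into a dict; `diagnose_mon_status` is a hypothetical helper written for this post, not a Ceph API, and the sample is a trimmed copy of the output above.

```python
def diagnose_mon_status(status: dict) -> list:
    """Return human-readable findings about why a monitor is outside the quorum."""
    findings = []
    name = status["name"]
    monmap = status["monmap"]
    members = [mon["name"] for mon in monmap["mons"]]
    if name not in members:
        findings.append(
            f"{name} is not in monmap epoch {monmap['epoch']} "
            f"(members: {', '.join(members)})"
        )
    if status["rank"] < 0:
        findings.append(f"{name} has rank {status['rank']}, so it holds no rank in the monmap")
    if not status["quorum"]:
        findings.append(f"{name} reports an empty quorum while in state '{status['state']}'")
    return findings


# Trimmed copy of the mon_status output above.
sample = {
    "name": "MN03",
    "rank": -1,
    "state": "probing",
    "quorum": [],
    "monmap": {
        "epoch": 14,
        "mons": [{"name": "MN00"}, {"name": "MN01"}, {"name": "MN02"}],
    },
}

for finding in diagnose_mon_status(sample):
    print(finding)
```

For MN03 all three checks fire, which matches the log message that mon.MN03 does not exist in the monmap and will attempt to join an existing cluster.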
 
ceph -s
  cluster:
    id:     c31094f9-f9c2-41fc-9f0c-3a0fad593e72
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum MN00,MN01,MN02 (age 8d)
    mgr: MN02(active, since 2w), standbys: MN03, MN00, MN01
    mds: data:1 {0=mds00=up:active} 1 up:standby-replay 2 up:standby
    osd: 258 osds: 258 up (since 2d), 258 in (since 4w)
 
cat /var/log/ceph/ceph-mon.MN03.log
2022-03-24T15:06:47.111+0100 7f4e31ae26c0  0 mon.MN03 does not exist in monmap, will attempt to join an existing cluster
2022-03-24T15:06:47.112+0100 7f4e31ae26c0  0 using public_addr v2:10.2.0.4:0/0 -> [v2:10.2.0.4:3300/0,v1:10.2.0.4:6789/0]
2022-03-24T15:06:47.112+0100 7f4e31ae26c0  0 starting mon.MN03 rank -1 at public addrs [v2:10.2.0.4:3300/0,v1:10.2.0.4:6789/0] at bind addrs [v2:10.2.0.4:3300/0,v1:10.2.0.4:6789/0] mon_data /var/lib/ceph/mon/ceph-MN03 fsid c31094f9-f9c2-41fc-9f0c-3a0fad593e72
2022-03-24T15:06:47.113+0100 7f4e31ae26c0  0 mon.MN03@-1(???).mds e330085 new map
2022-03-24T15:06:47.113+0100 7f4e31ae26c0  0 mon.MN03@-1(???).mds e330085 print_map
e330085
2022-03-24T15:06:47.114+0100 7f4e31ae26c0  0 mon.MN03@-1(???).osd e106656 crush map has features 432629239337189376, adjusting msgr requires
2022-03-24T15:06:47.114+0100 7f4e31ae26c0  0 mon.MN03@-1(???).osd e106656 crush map has features 432629239337189376, adjusting msgr requires
2022-03-24T15:06:47.114+0100 7f4e31ae26c0  0 mon.MN03@-1(???).osd e106656 crush map has features 3314933000854323200, adjusting msgr requires
2022-03-24T15:06:47.114+0100 7f4e31ae26c0  0 mon.MN03@-1(???).osd e106656 crush map has features 432629239337189376, adjusting msgr requires
2022-03-24T15:39:38.488+0100 7f4e229ca700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
2022-03-24T15:39:38.488+0100 7f4e229ca700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
2022-03-24T15:39:57.855+0100 7f4e229ca700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
2022-03-24T15:39:57.855+0100 7f4e229ca700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
2022-03-24T15:40:12.180+0100 7f4e1d94f700 -1 mon.MN03@-1(probing) e14 get_health_metrics reporting 2 slow ops, oldest is log(1 entries from seq 1 at 2022-03-24T15:39:38.489173+0100)
2022-03-24T15:40:17.179+0100 7f4e1d94f700 -1 mon.MN03@-1(probing) e14 get_health_metrics reporting 2 slow ops, oldest is log(1 entries from seq 1 at 2022-03-24T15:39:38.489173+0100)
2022-03-24T15:40:22.180+0100 7f4e1d94f700 -1 mon.MN03@-1(probing) e14 get_health_metrics reporting 2 slow ops, oldest is log(1 entries from seq 1 at 2022-03-24T15:39:38.489173+0100)
2022-03-24T15:40:27.180+0100 7f4e1d94f700 -1 mon.MN03@-1(probing) e14 get_health_metrics reporting 2 slow ops, oldest is log(1 entries from seq 1 at 2022-03-24T15:39:38.489173+0100)
2022-03-24T15:40:32.180+0100 7f4e1d94f700 -1 mon.MN03@-1(probing) e14 get_health_metrics reporting 4 slow ops, oldest is log(1 entries from seq 1 at 2022-03-24T15:39:38.489173+0100)
2022-03-24T15:40:37.181+0100 7f4e1d94f700 -1 mon.MN03@-1(probing) e14 get_health_metrics reporting 4 slow ops, oldest is log(1 entries from seq 1 at 2022-03-24T15:39:38.489173+0100)
2022-03-24T15:40:42.181+0100 7f4e1d94f700 -1 mon.MN03@-1(probing) e14 get_health_metrics reporting 4 slow ops, oldest is log(1 entries from seq 1 at 2022-03-24T15:39:38.489173+0100)
2022-03-24T15:40:47.181+0100 7f4e1d94f700 -1 mon.MN03@-1(probing) e14 get_health_metrics reporting 4 slow ops, oldest is log(1 entries from seq 1 at 2022-03-24T15:39:38.489173+0100)
2022-03-24T15:40:52.181+0100 7f4e1d94f700 -1 mon.MN03@-1(probing) e14 get_health_metrics reporting 4 slow ops, oldest is log(1 entries from seq 1 at 2022-03-24T15:39:38.489173+0100)
2022-03-24T15:40:57.181+0100 7f4e1d94f700 -1 mon.MN03@-1(probing) e14 get_health_metrics reporting 4 slow ops, oldest is log(1 entries from seq 1 at 2022-03-24T15:39:38.489173+0100)
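
The monmap lists every monitor on ports 3300 (v2) and 6789 (v1), and the log shows MN03 binding to the same two ports, so the connectivity claim can be double-checked at the TCP level. Below is a minimal Python sketch using only the standard library; `port_open` and `check_mon_ports` are hypothetical helper names, and the 2-second timeout is an arbitrary assumption.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


def check_mon_ports(hosts, ports=(3300, 6789), timeout: float = 2.0) -> dict:
    """Probe every (host, port) pair and map it to a reachability flag."""
    return {
        (host, port): port_open(host, port, timeout)
        for host in hosts
        for port in ports
    }
```

Run from the new node, `check_mon_ports(["10.2.0.5", "10.2.0.6", "10.2.0.7"])` would probe the three existing monitors; run from one of them, `check_mon_ports(["10.2.0.4"])` would probe MN03. A successful TCP connect only shows that the port is reachable. It does not prove that the messenger handshake or authentication succeeds.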
 
 
 
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
