Is this the same cluster as the one you reported down OSDs for? Can
you share the logs from before the "probing" status? You may have to
increase the log level to something like debug_mon = 20, but be
cautious and monitor the used disk space; it can grow quite a lot.
Did you have any changes in the network infrastructure (links going up
and down, usually visible in dmesg)? Is NTP configured correctly on
all the nodes?
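Something like the following could cover those checks on the affected node (a sketch only; "ceph6" is the mon id from the report below, so adjust it to your environment, and note that debug_mon = 20 is very verbose):

```shell
# Raise the mon debug level at runtime via the admin socket...
ceph daemon mon.ceph6 config set debug_mon 20
# ...or persistently via the config database:
ceph config set mon.ceph6 debug_mon 20

# Keep an eye on log disk usage while debugging is raised:
du -sh /var/log/ceph/

# Look for link flaps in the kernel log:
dmesg | grep -iE 'link (is )?(up|down)'

# Check time synchronization (chrony or ntpd, whichever is in use):
timedatectl status
chronyc tracking 2>/dev/null || ntpq -p
```

Remember to set debug_mon back to its default (1/5) once you have captured the logs.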
Quoting Mosharaf Hossain <mosharaf.hossain@xxxxxxxxxxxxxx>:
Dear Concern
I am observing that a mon is out of quorum. The cluster is currently
running Ceph Octopus.
Total nodes in the cluster: 13
Mon: 3/3
Network: 20G (10G x 2) bonded link
Each node capacity: 512 GB RAM + 72-core CPU
root@ceph6:/var/run/ceph# systemctl status ceph-mon@ceph6.service
● ceph-mon@ceph6.service - Ceph cluster monitor daemon
Loaded: loaded (/lib/systemd/system/ceph-mon@.service; indirect; vendor
preset: enabled)
Active: active (running) since Mon 2023-11-13 14:18:28 +06; 55min ago
Main PID: 165299 (ceph-mon)
Tasks: 26
CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@ceph6.service
└─165299 /usr/bin/ceph-mon -f --cluster ceph --id ceph6
--setuser ceph --setgroup ceph
Nov 13 15:13:09 ceph6.bol-online.com ceph-mon[165299]:
2023-11-13T15:13:09.190+0600 7f7855600700 -1 mon.ceph6@2(probing) e6
get_health_metrics reporting 8 slow ops, oldest is log(1 entries from seq 1
at 2023-11-13T14:27:42.428071+0600)
Nov 13 15:13:14 ceph6.bol-online.com ceph-mon[165299]:
2023-11-13T15:13:14.190+0600 7f7855600700 -1 mon.ceph6@2(probing) e6
get_health_metrics reporting 8 slow ops, oldest is log(1 entries from seq 1
at 2023-11-13T14:27:42.428071+0600)
Nov 13 15:13:34 ceph6.bol-online.com ceph-mon[165299]:
2023-11-13T15:13:34.190+0600 7f7855600700 -1 mon.ceph6@2(probing) e6
get_health_metrics reporting 8 slow ops, oldest is log(1 entries from seq 1
at 2023-11-13T14:27:42.428071+0600)
Nov 13 15:13:39 ceph6.bol-online.com ceph-mon[165299]:
2023-11-13T15:13:39.191+0600 7
root@ceph6:/var/run/ceph# ceph -s
cluster:
id: f8096ec7-51db-4557-85e6-57d7fdfe9423
health: HEALTH_WARN
1/3 mons down, quorum ceph2,mon1
nodeep-scrub flag(s) set
614 pgs not deep-scrubbed in time
services:
mon: 3 daemons, quorum ceph2,mon1 (age 8h), out of quorum: ceph6
mgr: ceph4(active, since 2w), standbys: mon1, ceph3, ceph1, ceph6
mds: cephfs:1 {0=ceph8=up:active} 1 up:standby
osd: 107 osds: 102 up (since 3h), 102 in (since 3h); 24 remapped pgs
flags nodeep-scrub
rgw: 4 daemons active (ceph10.rgw0, ceph7.rgw0, ceph9.rgw0,
mon1.rgw0)
rgw-nfs: 2 daemons active (ceph7, ceph9)
root@ceph6:/var/run/ceph# ceph daemon mon.ceph6 mon_status
{
"name": "ceph6",
"rank": 2,
"state": "probing",
"election_epoch": 0,
"quorum": [],
"features": {
"required_con": "2449958747315978244",
"required_mon": [
"kraken",
"luminous",
There are 3 mons in total in the cluster, and all of their addresses
can ping and telnet to each other.
1. mon1: 10.10.10.71
2. ceph2: 10.10.10.52
3. ceph6: 10.10.10.56
Regards
Mosharaf Hossain
Manager, Product Development
IT Division
Bangladesh Export Import Company Ltd.
Level-8, SAM Tower, Plot #4, Road #22, Gulshan-1, Dhaka-1212,Bangladesh
Tel: +880 9609 000 999, +880 2 5881 5559, Ext: 14191, Fax: +880 2 9895757
Cell: +8801787680828, Email: mosharaf.hossain@xxxxxxxxxxxxxx, Web:
www.bol-online.com
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx