No active MDS after upgrade to 16.2.6

Hi, I have a 3-node cluster and ran the upgrade to 16.2.6 yesterday.
Everything looked like it was going well, but the MDS servers are not coming up.

Ceph status shows 2 failed daemons and 3 standby.


ceph status
  cluster:
    id:     fe3a7cb0-69ca-11eb-8d45-c86000d08867
    health: HEALTH_ERR
            client is using insecure global_id reclaim
            failed to probe daemons or devices
            1 filesystem is degraded
            1 filesystem has a failed mds daemon
            1 filesystem is offline
            1 filesystem is online with fewer MDS than max_mds

  services:
    mon: 2 daemons, quorum rhel1,story (age 81m)
    mgr: cube.snthzq(active, since 63s), standbys: story.gffann, rhel1.cmxwxg
    mds: 0/2 daemons up (2 failed), 3 standby
    osd: 12 osds: 12 up (since 80m), 12 in (since 26h); 41 remapped pgs
    rgw: 3 daemons active (3 hosts, 1 zones)

  data:
    volumes: 0/1 healthy, 1 failed
    pools:   11 pools, 497 pgs
    objects: 2.06M objects, 4.1 TiB
    usage:   13 TiB used, 25 TiB / 38 TiB avail
    pgs:     354383/6190401 objects misplaced (5.725%)
             456 active+clean
             35  active+remapped+backfill_wait
             6   active+remapped+backfilling

  io:
    recovery: 36 MiB/s, 17 objects/s

And the MDS metadata shows only 3 daemons:
ceph mds metadata
[
    {
        "name": "home.story.rqrdtz",
        "addr": "[v2:192.168.2.199:6800/1255725176,v1:192.168.2.199:6801/1255725176]",
        "arch": "x86_64",
        "ceph_release": "pacific",
        "ceph_version": "ceph version 16.2.6 (ee28fb57e47e9f88813e24bbf4c14496ca299d31) pacific (stable)",
        "ceph_version_short": "16.2.6",
        "container_hostname": "story.robeckert.us",
        "container_image": "quay.io/ceph/ceph@sha256:5d042251e1faa1408663508099cf97b256364300365d403ca5563a518060abac",
        "cpu": "Intel(R) Pentium(R) Silver J5005 CPU @ 1.50GHz",
        "distro": "centos",
        "distro_description": "CentOS Linux 8",
        "distro_version": "8",
        "hostname": "story.robeckert.us",
        "kernel_description": "#1 SMP Mon Jul 26 08:06:24 EDT 2021",
        "kernel_version": "4.18.0-305.12.1.el8_4.x86_64",
        "mem_swap_kb": "8093692",
        "mem_total_kb": "32367924",
        "os": "Linux"
    },
    {
        "name": "home.rhel1.ffrufi",
        "addr": "[v2:192.168.2.141:6800/169048976,v1:192.168.2.141:6801/169048976]",
        "arch": "x86_64",
        "ceph_release": "pacific",
        "ceph_version": "ceph version 16.2.6 (ee28fb57e47e9f88813e24bbf4c14496ca299d31) pacific (stable)",
        "ceph_version_short": "16.2.6",
        "container_hostname": "rhel1.robeckert.us",
        "container_image": "quay.io/ceph/ceph@sha256:5d042251e1faa1408663508099cf97b256364300365d403ca5563a518060abac",
        "cpu": "Intel(R) Core(TM) i5-2500K CPU @ 3.30GHz",
        "distro": "centos",
        "distro_description": "CentOS Linux 8",
        "distro_version": "8",
        "hostname": "rhel1.robeckert.us",
        "kernel_description": "#1 SMP Mon Jul 26 08:06:24 EDT 2021",
        "kernel_version": "4.18.0-305.12.1.el8_4.x86_64",
        "mem_swap_kb": "12378108",
        "mem_total_kb": "24408040",
        "os": "Linux"
    },
    {
        "name": "home.cube.cfrali",
        "addr": "[v2:192.168.2.142:6800/2860921355,v1:192.168.2.142:6801/2860921355]",
        "arch": "x86_64",
        "ceph_release": "pacific",
        "ceph_version": "ceph version 16.2.6 (ee28fb57e47e9f88813e24bbf4c14496ca299d31) pacific (stable)",
        "ceph_version_short": "16.2.6",
        "container_hostname": "cube.robeckert.us",
        "container_image": "quay.io/ceph/ceph@sha256:5d042251e1faa1408663508099cf97b256364300365d403ca5563a518060abac",
        "cpu": "AMD Ryzen 5 3600 6-Core Processor",
        "distro": "centos",
        "distro_description": "CentOS Linux 8",
        "distro_version": "8",
        "hostname": "cube.robeckert.us",
        "kernel_description": "#1 SMP Mon Jul 26 08:06:24 EDT 2021",
        "kernel_version": "4.18.0-305.12.1.el8_4.x86_64",
        "mem_swap_kb": "0",
        "mem_total_kb": "65595656",
        "os": "Linux"
    }
]
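
To make the list above easier to compare against what the cluster thinks it has, here is a minimal sketch that reduces the `ceph mds metadata` JSON to one line per daemon. It only reads the `name`, `hostname` and `ceph_version_short` fields shown above; the function name `summarize_mds_metadata` and the trimmed `sample` string are made up for illustration, and the script does not talk to the cluster or identify the failed ranks itself (that comparison, for example against `ceph fs dump`, is not attempted here).

```python
import json


def summarize_mds_metadata(raw_json):
    """Return (name, hostname, short version) for each daemon entry.

    `raw_json` is assumed to be the text printed by `ceph mds metadata`,
    i.e. a JSON list of objects like the ones shown above.
    """
    daemons = json.loads(raw_json)
    return [
        (d.get("name"), d.get("hostname"), d.get("ceph_version_short"))
        for d in daemons
    ]


# Trimmed copy of the output above, keeping only the fields used here.
sample = """[
    {"name": "home.story.rqrdtz", "hostname": "story.robeckert.us", "ceph_version_short": "16.2.6"},
    {"name": "home.rhel1.ffrufi", "hostname": "rhel1.robeckert.us", "ceph_version_short": "16.2.6"},
    {"name": "home.cube.cfrali", "hostname": "cube.robeckert.us", "ceph_version_short": "16.2.6"}
]"""

for name, host, version in summarize_mds_metadata(sample):
    print(f"{name}\t{host}\t{version}")
```

All three entries are on 16.2.6, one per host, which matches the 3 standby daemons in the status output.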

How do I find the two ghost (failed) MDS services and remove them, or at least force the filesystem to use the actual MDS servers?

Thanks,
Rob