This is a Proxmox cluster ... sorry for the formatting problems of my post :(

Short plot: we messed up an IP address change of the public network, so the monitors went down. We changed the monitor information in ceph.conf and rewrote the monmap with:

ceph-mon -i pve01 --extract-monmap /tmp/monmap
monmaptool --rm pve01 --rm pve02 --rm pve03 /tmp/monmap
monmaptool --add pve01 10.100.200.141 --add pve02 10.100.200.142 --add pve03 10.100.200.143 /tmp/monmap
monmaptool --print /tmp/monmap
ceph-mon -i pve01 --inject-monmap /tmp/monmap

Then we restarted all three nodes, but the OSDs don't come up. So how do we recover from this disaster?

# ceph -s
  cluster:
    id:     92d063d7-647c-44b8-95d7-86057ee0ab22
    health: HEALTH_WARN
            1 daemons have recently crashed
            OSD count 0 < osd_pool_default_size 3

  services:
    mon: 3 daemons, quorum pve01,pve02,pve03 (age 19h)
    mgr: pve01(active, since 19h)
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

cat /etc/pve/ceph.conf
[global]
        auth_client_required = cephx
        auth_cluster_required = cephx
        auth_service_required = cephx
        cluster_network = 10.112.200.0/24
        fsid = 92d063d7-647c-44b8-95d7-86057ee0ab22
        mon_allow_pool_delete = true
        mon_host = 10.100.200.141 10.100.200.142 10.100.200.143
        osd_pool_default_min_size = 2
        osd_pool_default_size = 3
        public_network = 10.100.200.0/24

[client]
        keyring = /etc/pve/priv/$cluster.$name.keyring

[mon.pve01]
        public_addr = 10.100.200.141

[mon.pve02]
        public_addr = 10.100.200.142

[mon.pve03]
        public_addr = 10.100.200.143

Gerhard W. Recher

net4sec UG (haftungsbeschränkt)
Leitenweg 6
86929 Penzing

+49 8191 4283888
+49 171 4802507

On 23.10.2020 at 13:50, Burkhard Linke wrote:
> Hi,
>
>
> your mail is formatted in a way that makes it impossible to get all
> the information, so a number of questions first:
>
>
> - Are the mons up, or are the mons up and in a quorum? You cannot
> change mon IP addresses without also adjusting them in the mon map.
> Use the daemon socket on the systems to query the current state of
> the mons.
>
> - The OSD systemd output is useless for debugging. It only states
> that the OSD is not running and not able to start.
>
>
> The real log files are located in /var/log/ceph/. If the mons are in
> quorum, you should find more information there. Keep in mind that you
> also need to change ceph.conf on the OSD hosts if you change the mon
> IP addresses, otherwise the OSDs won't be able to find the mons and
> the processes will die.
>
> And I do not understand how corosync should affect your Ceph cluster.
> Ceph does not use corosync...
>
>
> If you need fast help, I can recommend the Ceph IRC channel ;-)
>
>
> Regards,
>
> Burkhard
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
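For the record, here is the monmap rewrite from above spelled out step by step. This is only a sketch, assuming the default systemd unit names on Proxmox; the mon daemon must be stopped while its store is extracted from and injected into, and the procedure has to be repeated on each mon node:

    # stop the local monitor before touching its store
    systemctl stop ceph-mon@pve01

    # extract the current monmap from the stopped mon's store
    ceph-mon -i pve01 --extract-monmap /tmp/monmap

    # remove the stale entries and re-add the mons with their new addresses
    monmaptool --rm pve01 --rm pve02 --rm pve03 /tmp/monmap
    monmaptool --add pve01 10.100.200.141 --add pve02 10.100.200.142 --add pve03 10.100.200.143 /tmp/monmap
    monmaptool --print /tmp/monmap

    # inject the edited map and bring the mon back up
    ceph-mon -i pve01 --inject-monmap /tmp/monmap
    systemctl start ceph-mon@pve01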
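Burkhard's daemon-socket check would look like this; a sketch assuming the default admin socket path under /var/run/ceph/:

    # on each node, ask the local mon for its view of the quorum
    ceph daemon mon.pve01 mon_status

    # equivalent form with an explicit socket path
    ceph --admin-daemon /var/run/ceph/ceph-mon.pve01.asok mon_status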
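And for the OSD side: "osd: 0 osds" means no OSD has registered with the new mons, so each OSD host needs its logs checked and its daemons restarted once ceph.conf points at the new addresses. A sketch, assuming ceph-volume-managed OSDs (the Proxmox default) and OSD id 0 as an example:

    # the real startup error is in the OSD log, not in the systemd status output
    tail -n 50 /var/log/ceph/ceph-osd.0.log
    journalctl -u ceph-osd@0 --no-pager | tail -n 50

    # once mon_host in /etc/pve/ceph.conf lists the new addresses,
    # re-activate and restart all OSDs on the host
    ceph-volume lvm activate --all
    systemctl restart ceph-osd.target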