Hi experts,
I need your help. I have a running cluster with 19 OSDs and 3 MONs. I
created a separate LVM volume for /var/lib/ceph on one of the nodes. I
stopped the mon service on that node, rsynced the contents to the newly
created volume and restarted the monitor. Obviously I didn't do that
correctly, as the cluster is stuck in HEALTH_ERR and I can't repair the
affected PGs.
How would I do that correctly? I want to do the same on the remaining
nodes, but without bringing the cluster to error state.
One thing I already learned is to set the noout flag before stopping
services, but what else is needed to accomplish that?
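
For reference, the sequence described above could be sketched roughly as follows. This is only a sketch under assumptions, not a verified procedure: the monitor id (MON_ID), the temporary mount point (NEW_MOUNT) and the systemd unit name ceph-mon@<id> are hypothetical and depend on the release and distribution. The run helper only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: move /var/lib/ceph of one monitor node to a new volume.
# MON_ID, NEW_MOUNT and the unit name "ceph-mon@<id>" are assumptions.
DRY_RUN="${DRY_RUN:-1}"
MON_ID="${MON_ID:-mon1}"
NEW_MOUNT="${NEW_MOUNT:-/mnt/ceph-new}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

migrate_mon_dir() {
    # Keep OSDs from being marked out while services are down.
    run ceph osd set noout
    run systemctl stop "ceph-mon@${MON_ID}"
    # Copy with hard links, ACLs and xattrs preserved.
    run rsync -aHAX --numeric-ids /var/lib/ceph/ "${NEW_MOUNT}/"
    # (Remounting the new volume at /var/lib/ceph would happen here.)
    run systemctl start "ceph-mon@${MON_ID}"
    run ceph osd unset noout
}
```

Calling migrate_mon_dir with the default DRY_RUN=1 only lists the commands, so the order can be reviewed first. If OSDs on the same node also keep data under /var/lib/ceph, they would presumably have to be stopped before the copy as well; that is an assumption, not something confirmed here.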
But now that it is in error state, how can I repair my cluster? The
current status is:
---cut here---
ceph@node01:~/ceph-deploy> ceph -s
cluster 655cb05a-435a-41ba-83d9-8549f7c36167
health HEALTH_ERR
16 pgs inconsistent
261 scrub errors
monmap e7: 3 mons at {mon1=192.168.160.15:6789/0,mon2=192.168.160.17:6789/0,mon3=192.168.160.16:6789/0}
election epoch 356, quorum 0,1,2 mon1,mon2,mon3
osdmap e3394: 19 osds: 19 up, 19 in
pgmap v7105355: 8432 pgs, 15 pools, 1003 GB data, 205 kobjects
2114 GB used, 6038 GB / 8153 GB avail
8413 active+clean
16 active+clean+inconsistent
3 active+clean+scrubbing+deep
client io 0 B/s rd, 136 kB/s wr, 34 op/s
ceph@ndesan01:~/ceph-deploy> ceph health detail
HEALTH_ERR 16 pgs inconsistent; 261 scrub errors
pg 1.ffa is active+clean+inconsistent, acting [16,5]
pg 1.cc9 is active+clean+inconsistent, acting [5,18]
pg 1.bb1 is active+clean+inconsistent, acting [15,5]
pg 1.ac4 is active+clean+inconsistent, acting [0,5]
pg 1.a46 is active+clean+inconsistent, acting [13,4]
pg 1.a16 is active+clean+inconsistent, acting [5,18]
pg 1.9e4 is active+clean+inconsistent, acting [13,9]
pg 1.9b7 is active+clean+inconsistent, acting [5,6]
pg 1.950 is active+clean+inconsistent, acting [0,9]
pg 1.6db is active+clean+inconsistent, acting [15,5]
pg 1.5f6 is active+clean+inconsistent, acting [17,5]
pg 1.5c2 is active+clean+inconsistent, acting [8,4]
pg 1.5bc is active+clean+inconsistent, acting [9,6]
pg 1.505 is active+clean+inconsistent, acting [16,9]
pg 1.3e6 is active+clean+inconsistent, acting [2,4]
pg 1.32 is active+clean+inconsistent, acting [18,5]
261 scrub errors
---cut here---
And the number of scrub errors is increasing, although I started with
more than 400 scrub errors.
What I have tried is to manually repair single PGs as described in [1]
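
The manual per-PG repair could be scripted along these lines. This is a minimal sketch, assuming the output format of "ceph health detail" shown above. The function names list_inconsistent_pgs and repair_inconsistent_pgs are hypothetical helpers; "ceph pg repair <pgid>" is the only cluster-changing command used.

```shell
#!/bin/sh
# Sketch: pull the PG ids of inconsistent PGs out of `ceph health detail`
# output (read from stdin), assuming lines of the form
#   pg 1.ffa is active+clean+inconsistent, acting [16,5]
list_inconsistent_pgs() {
    awk '$1 == "pg" && $0 ~ /inconsistent/ { print $2 }'
}

# Ask Ceph to repair each inconsistent PG, one after the other.
repair_inconsistent_pgs() {
    ceph health detail | list_inconsistent_pgs | while read -r pgid; do
        echo "repairing $pgid"
        ceph pg repair "$pgid"
    done
}
```

Depending on the Ceph release, a repair may treat the primary's copy as authoritative, so it is probably worth checking which replica is actually good before running repair_inconsistent_pgs across all PGs.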
--
Eugen Block voice : +49-40-559 51 75
NDE Netzdesign und -entwicklung AG fax : +49-40-559 51 77
Postfach 61 03 15
D-22423 Hamburg e-mail : eblock@xxxxxx
Vorsitzende des Aufsichtsrates: Angelika Mozdzen
Sitz und Registergericht: Hamburg, HRB 90934
Vorstand: Jens-U. Mozdzen
USt-IdNr. DE 814 013 983