Recovering a mon from OSDs

Hi All,

I have a big problem. My Ceph cluster had been running for only one month with this configuration:

CentOS 7 64-bit

ceph-admin   10.0.34.10
MON1         10.0.34.11
OSD01        10.0.34.21
OSD02        10.0.34.22
OSD03        10.0.34.23
OSD04        10.0.34.24

After one month, the RAID on the MON1 server crashed and MON1 was lost.
The cluster kept working for one day without MON1. I then tried to recover MON1 from the OSDs.

I created a new MON1 with the same IP.

On all 4 OSDs:

systemctl stop ceph-osd.target

mkdir -p /tmp/mon-store

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-[1-2-3-4]/ --op update-mon-db --mon-store-path /tmp/mon-store/

 

Then I rsynced all the data to the new MON1:

rsync -avz root@CLNODE0[1-2-3-4]:/tmp/mon-store/ /tmp/mon-store/

mkdir -p /var/lib/ceph/mon/ceph-MON1/

cp -r /tmp/mon-store/* /var/lib/ceph/mon/ceph-MON1/
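
For comparison, the "recovery using OSDs" procedure in the Ceph documentation does not, as far as I know, merge separately built stores. It carries one store from host to host so that each OSD adds its maps to the same database. A minimal dry-run sketch of that flow follows; the host names are an assumption expanded from the CLNODE0[1-2-3-4] shorthand above, and the script only prints the commands so they can be reviewed before anything is run.

```shell
#!/bin/sh
# Dry-run sketch: carry ONE mon store across all OSD hosts.
# HOSTS is an assumption based on the names in this post; MS is the path used above.
HOSTS="${HOSTS:-CLNODE01 CLNODE02 CLNODE03 CLNODE04}"
MS="${MS:-/tmp/mon-store}"

plan_collect() {
    for host in $HOSTS; do
        # push the store collected so far to the next host
        echo "rsync -avz $MS/ root@$host:$MS/"
        # update it from every OSD on that host (the OSDs must be stopped)
        echo "ssh root@$host 'for osd in /var/lib/ceph/osd/ceph-*; do ceph-objectstore-tool --data-path \$osd --op update-mon-db --mon-store-path $MS; done'"
        # pull the updated store back before moving on
        echo "rsync -avz root@$host:$MS/ $MS/"
    done
}

# Prints the plan only; nothing is executed on the cluster.
plan_collect
```

Each host gets three printed lines (push, update, pull), so the store that ends up on MON1 is a single database that every OSD has contributed to.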

 

I copied keyring-mon to /var/lib/ceph/mon/ceph-MON1/

chown ceph:ceph -R /var/lib/ceph


and then ran:

ceph-monstore-tool /tmp/mon-store rebuild -- --keyring /etc/ceph/ceph.client.admin.keyring
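
The rebuild takes its auth entries from the keyring passed after `--`, so the documented procedure first gives the `mon.` and `client.admin` entries explicit caps. A minimal dry-run sketch, assuming the keyring already contains both entries (a plain client.admin keyring may not contain `mon.`, which is only a guess at what might matter here). It prints the commands rather than running them.

```shell
#!/bin/sh
# Dry-run sketch: add caps to the keyring, then rebuild the mon store.
# KEYRING and MS default to the paths used in this post.
KEYRING="${KEYRING:-/etc/ceph/ceph.client.admin.keyring}"
MS="${MS:-/tmp/mon-store}"

plan_rebuild() {
    # give the mon. entry full monitor caps
    echo "ceph-authtool $KEYRING -n mon. --cap mon 'allow *'"
    # give client.admin caps on mon, osd and mds
    echo "ceph-authtool $KEYRING -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *'"
    # rebuild the store using that keyring
    echo "ceph-monstore-tool $MS rebuild -- --keyring $KEYRING"
}

# Prints the plan only; nothing is executed.
plan_rebuild
```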

After that I copied everything to /var/lib/ceph/mon1, which now contains:
done  keyring  store.db  systemd

I created done and systemd with touch, and copied keyring-mon to keyring.

systemctl start ceph-mon@MON1.service

But after that I get:
:/1811249608 >> 10.0.34.11:6789/0 pipe(0x7f71d805c8c0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f71d805db80).fault

mon/AuthMonitor.cc: In function 'virtual void AuthMonitor::update_from_paxos(bool*)' thread 7f0db1661600 time 2017-12-25 00:31:44.810236
mon/AuthMonitor.cc: 160: FAILED assert(ret == 0)
 ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x5645fa9e5895]
 2: (AuthMonitor::update_from_paxos(bool*)+0x1953) [0x5645fa77af83]
 3: (PaxosService::refresh(bool*)+0x1a5) [0x5645fa68dc05]
 4: (Monitor::refresh_from_paxos(bool*)+0x15b) [0x5645fa6248eb]
 5: (Monitor::init_paxos()+0x95) [0x5645fa624d85]
 6: (Monitor::preinit()+0x949) [0x5645fa6378f9]
 7: (main()+0x242d) [0x5645fa5c266d]
 8: (__libc_start_main()+0xf5) [0x7f0dae9d5c05]
 9: (()+0x25ec3f) [0x5645fa615c3f]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2017-12-25 00:31:44.812415 7f0db1661600 -1 mon/AuthMonitor.cc: In function 'virtual void AuthMonitor::update_from_paxos(bool*)' thread 7f0db1661600 time 2017-12-25 00:31:44.810236
mon/AuthMonitor.cc: 160: FAILED assert(ret == 0)

ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe)
 1: (()+0x509c2a) [0x5645fa8c0c2a]
 2: (()+0xf5e0) [0x7f0db01eb5e0]
 3: (gsignal()+0x37) [0x7f0dae9e91f7]
 4: (abort()+0x148) [0x7f0dae9ea8e8]
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x267) [0x5645fa9e5a77]
 6: (AuthMonitor::update_from_paxos(bool*)+0x1953) [0x5645fa77af83]
 7: (PaxosService::refresh(bool*)+0x1a5) [0x5645fa68dc05]
 8: (Monitor::refresh_from_paxos(bool*)+0x15b) [0x5645fa6248eb]
 9: (Monitor::init_paxos()+0x95) [0x5645fa624d85]
 10: (Monitor::preinit()+0x949) [0x5645fa6378f9]
 11: (main()+0x242d) [0x5645fa5c266d]
 12: (__libc_start_main()+0xf5) [0x7f0dae9d5c05]
 13: (()+0x25ec3f) [0x5645fa615c3f]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Aborted


What did I do wrong? Maybe someone has a FAQ on how to recover a lost MON from the OSDs?

Thank you.
Best regards, Alex.
