Hi,

Yes, the old mon daemons are removed. In the first post the mon daemons had been started with mon data created from scratch. After some code searching, I suspect the cluster could be restored from all the OSDs even without the original mon data, but I may be wrong on this. For now, I think it would need less reconfiguration if I could start the mon daemons with exactly the same IDs as the original ones (i.e. k, m, o). Any thoughts on this?
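In case it is useful for the discussion, this is roughly how I understand the monmap edit from [1] below would go against the restored store. This is only an untested sketch and I may have the monmaptool syntax wrong; the mon ID, addresses and mon_data path are taken from the log further down, /tmp/monmap is just a scratch file, and the paths may differ inside the rook containers:

  # with mon.a stopped, extract the monmap from the restored store
  ceph-mon -i a --extract-monmap /tmp/monmap --mon-data /var/lib/ceph/mon/ceph-a
  monmaptool --print /tmp/monmap            # should still list k, m, o

  # drop the old mons and add the new ones at their current addresses
  monmaptool --rm k --rm m --rm o /tmp/monmap
  monmaptool --addv a '[v2:169.169.163.25:3300,v1:169.169.163.25:6789]' /tmp/monmap
  # (repeat --addv for b and d with their addresses)

  # inject the edited map and start the mon again
  ceph-mon -i a --inject-monmap /tmp/monmap --mon-data /var/lib/ceph/mon/ceph-a

If the mons then form a quorum but the OSDs still don't register, the "Recovery using OSDs" section on the same troubleshooting page is probably the fallback for rebuilding the mon store from the OSDs, which is what I meant above.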
Ben

Eugen Block <eblock@xxxxxx> wrote on Thu, Mar 9, 2023 at 20:56:

> Hi,
>
> I'm not familiar with rook so the steps required may vary. If you try
> to reuse the old mon stores you'll have the mentioned mismatch between
> the new daemons and the old monmap (which still contains the old mon
> daemons). It's not entirely clear what went wrong in the first place
> and what you already tried exactly, so it's hard to tell if editing
> the monmap is the way to go here. I guess the old mon daemons are
> removed, is that assumption correct? In that case it could be worth a
> try to edit the current monmap to contain only the new mons and inject
> it (see [1] for details). If the mons start and form a quorum you'd
> have a cluster, but I can't tell if the OSDs will register
> successfully. I think the previous approach when the original mons
> were up but the OSDs didn't start would have been more promising.
> Anyway, maybe editing the monmap will fix this for you.
>
> [1]
> https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/#recovering-a-monitor-s-broken-monmap
>
> Quoting Ben <ruidong.gao@xxxxxxxxx>:
>
> > Hi Eugen,
> >
> > Thank you for the help on this.
> >
> > Forget about the log for now. A little progress: the monitor store was
> > restored. I created a new ceph cluster to use the restored monitor
> > store, but the monitor log complains:
> >
> > debug 2023-03-09T11:00:31.233+0000 7fe95234f880  0 starting mon.a rank -1
> > at public addrs [v2:169.169.163.25:3300/0,v1:169.169.163.25:6789/0] at bind
> > addrs [v2:197.166.206.27:3300/0,v1:197.166.206.27:6789/0] mon_data
> > /var/lib/ceph/mon/ceph-a fsid 3f271841-6188-47c1-b3fd-90fd4f978c76
> >
> > debug 2023-03-09T11:00:31.234+0000 7fe95234f880  1 mon.a@-1(???) e27
> > preinit fsid 3f271841-6188-47c1-b3fd-90fd4f978c76
> >
> > debug 2023-03-09T11:00:31.234+0000 7fe95234f880 -1 mon.a@-1(???) e27 not in
> > monmap and have been in a quorum before; must have been removed
> >
> > debug 2023-03-09T11:00:31.234+0000 7fe95234f880 -1 mon.a@-1(???) e27 commit
> > suicide!
> >
> > debug 2023-03-09T11:00:31.234+0000 7fe95234f880 -1 failed to initialize
> >
> > The fact is that the original monitor IDs are k, m, o, while the new
> > ones are a, b, d. The cluster was deployed by rook. Any ideas to make
> > this work?
> >
> > Ben
> >
> > Eugen Block <eblock@xxxxxx> wrote on Thu, Mar 9, 2023 at 16:00:
> >
> >> Hi,
> >>
> >> there's no attachment to your email, please use something like
> >> pastebin to provide OSD logs.
> >>
> >> Thanks
> >> Eugen
> >>
> >> Quoting Ben <ruidong.gao@xxxxxxxxx>:
> >>
> >> > Hi,
> >> >
> >> > I ended up with the whole set of OSDs to get the original ceph
> >> > cluster back. I managed to get the cluster running, however its
> >> > status is as below:
> >> >
> >> > bash-4.4$ ceph -s
> >> >
> >> >   cluster:
> >> >     id:     3f271841-6188-47c1-b3fd-90fd4f978c76
> >> >     health: HEALTH_WARN
> >> >             7 daemons have recently crashed
> >> >             4 slow ops, oldest one blocked for 35077 sec, daemons
> >> >             [mon.a,mon.b] have slow ops.
> >> >
> >> >   services:
> >> >     mon: 3 daemons, quorum a,b,d (age 9h)
> >> >     mgr: b(active, since 14h), standbys: a
> >> >     osd: 4 osds: 0 up, 4 in (since 9h)
> >> >
> >> >   data:
> >> >     pools:   0 pools, 0 pgs
> >> >     objects: 0 objects, 0 B
> >> >     usage:   0 B used, 0 B / 0 B avail
> >> >     pgs:
> >> >
> >> > All OSDs are down.
> >> >
> >> > I checked the OSD logs and attached them to this mail.
> >> >
> >> > Please help; I wonder if it's possible to get the cluster back. I
> >> > have a backup of the monitors' data, but I haven't restored it so
> >> > far in the course of this.
> >> >
> >> > Thanks,
> >> >
> >> > Ben
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx