Re: restoring ceph cluster from osds

Hi,

I'm not familiar with Rook, so the required steps may vary. If you try to reuse the old mon stores you'll get the mentioned mismatch between the new daemons and the old monmap (which still contains the old mon daemons). It's not entirely clear what went wrong in the first place and what exactly you already tried, so it's hard to tell whether editing the monmap is the way to go here. I assume the old mon daemons have been removed, is that correct? In that case it could be worth a try to edit the current monmap so it contains only the new mons and then inject it (see [1] for details; a rough sketch is below). If the mons start and form a quorum you'd have a cluster again, but I can't tell whether the OSDs will register successfully. I think the earlier approach, when the original mons were still up but the OSDs didn't start, would have been more promising. Anyway, maybe editing the monmap will fix this for you.

[1] https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/#recovering-a-monitor-s-broken-monmap
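
Just to sketch what [1] describes (the mon names and the address below are only examples taken from your log, so adjust everything to your setup; with Rook you'd probably have to run this inside the mon pod or a debug container while the mon daemon itself is stopped, I can't say for sure):

  # extract the monmap from the restored store (the mon_data path from your log is the default)
  ceph-mon -i a --extract-monmap /tmp/monmap
  # see which mons it currently contains (presumably k, m, o)
  monmaptool --print /tmp/monmap
  # drop the old mons and add the new ones with their public addresses
  monmaptool --rm k /tmp/monmap
  monmaptool --rm m /tmp/monmap
  monmaptool --rm o /tmp/monmap
  monmaptool --addv a [v2:169.169.163.25:3300,v1:169.169.163.25:6789] /tmp/monmap
  # write the edited map back into the store, then start the mon again
  ceph-mon -i a --inject-monmap /tmp/monmap

The same edited map (with all three new mons added) would have to be injected into b and d as well. If they then start and form a quorum, 'ceph mon stat' or 'ceph quorum_status' should list only a, b and d.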

Quoting Ben <ruidong.gao@xxxxxxxxx>:

Hi Eugen,

Thank you for help on this.

Forget about the log. A little progress: the monitor stores were restored. I
created a new Ceph cluster to use the restored monitor stores, but the
monitor log complains:

debug 2023-03-09T11:00:31.233+0000 7fe95234f880  0 starting mon.a rank -1 at public addrs [v2:169.169.163.25:3300/0,v1:169.169.163.25:6789/0] at bind addrs [v2:197.166.206.27:3300/0,v1:197.166.206.27:6789/0] mon_data /var/lib/ceph/mon/ceph-a fsid 3f271841-6188-47c1-b3fd-90fd4f978c76

debug 2023-03-09T11:00:31.234+0000 7fe95234f880  1 mon.a@-1(???) e27 preinit fsid 3f271841-6188-47c1-b3fd-90fd4f978c76

debug 2023-03-09T11:00:31.234+0000 7fe95234f880 -1 mon.a@-1(???) e27 not in monmap and have been in a quorum before; must have been removed

debug 2023-03-09T11:00:31.234+0000 7fe95234f880 -1 mon.a@-1(???) e27 commit suicide!

debug 2023-03-09T11:00:31.234+0000 7fe95234f880 -1 failed to initialize


The fact is that the original cluster's monitor IDs are k, m and o, while the
new ones are a, b and d. The cluster was deployed by Rook. Any ideas to make this work?


Ben

Eugen Block <eblock@xxxxxx> wrote on Thu, Mar 9, 2023 at 16:00:

Hi,

there's no attachment to your email; please use something like
pastebin to provide the OSD logs.

Thanks
Eugen

Quoting Ben <ruidong.gao@xxxxxxxxx>:

> Hi,
>
> I ended up with the whole set of OSDs to get the original Ceph cluster
> back. I managed to get the cluster running. However, its status is as
> below:
>
> bash-4.4$ ceph -s
>
>   cluster:
>     id:     3f271841-6188-47c1-b3fd-90fd4f978c76
>     health: HEALTH_WARN
>             7 daemons have recently crashed
>             4 slow ops, oldest one blocked for 35077 sec, daemons
>             [mon.a,mon.b] have slow ops.
>
>   services:
>     mon: 3 daemons, quorum a,b,d (age 9h)
>     mgr: b(active, since 14h), standbys: a
>     osd: 4 osds: 0 up, 4 in (since 9h)
>
>   data:
>     pools:   0 pools, 0 pgs
>     objects: 0 objects, 0 B
>     usage:   0 B used, 0 B / 0 B avail
>     pgs:
>
>
> All OSDs are down.
>
> I checked the OSD logs and attached them to this email.
>
>
> Please help; I wonder if it's possible to get the cluster back. I have a
> backup of the monitors' data, which I haven't restored so far.
>
>
> Thanks,
>
> Ben





_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



