Re: question on restoring mons

Glad to hear it, and happy to help more if needed :) Pretty sure I made exactly the same reading error you did...
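For the archive, here is a minimal sketch of how the loop from the docs could look with the issues discussed below addressed. This is not the script from the thread, just an illustration of the accumulate-and-copy-back idea: it assumes $hosts lists the OSD hosts, passwordless ssh/rsync as "user", and that all OSDs on each host are stopped.

```shell
#!/bin/bash
# Sketch only -- assumptions: $hosts is set, passwordless ssh as "user",
# OSDs stopped on every host. Untested against a live cluster.
set -e
ms=/root/mon-store
mkdir -p "$ms"

for host in $hosts; do
  # Push the store accumulated from earlier hosts to this OSD host.
  rsync -avz "$ms"/ "user@$host:$ms.remote"/
  # Fold this host's OSD maps into the store. update-mon-db accumulates
  # into an existing store; it does not overwrite earlier contributions.
  ssh "user@$host" "for osd in /var/lib/ceph/osd/ceph-*; do
    ceph-objectstore-tool --data-path \"\$osd\" --no-mon-config \
      --op update-mon-db --mon-store-path \"$ms.remote\"
  done"
  # Pull the grown store back before moving on to the next host.
  rsync -avz "user@$host:$ms.remote"/ "$ms"/
done
```

The copy back and forth around each host is what makes the result cumulative rather than last-host-wins.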

-Alex

On 10/21/21, 11:23 AM, "Marcel Kuiper" <ceph@xxxxxxxx> wrote:

    Hi Alex

    Thanks for your answer. That was very helpful. I apparently completely 
    misread the script.

    Marcel

    Alexander Closs wrote on 2021-10-21 16:39:
    > Hi Marcel - I'm not sure if this message will make it to the list as
    > well (seems my emails there are blackholed for some reason), but we
    > just did this procedure. What happens in the script (which does need
    > some tweaking to work) is, or is supposed to be, that
    > 
    > - the monstore is created on the first OSD host against all local
    > OSDs, then copied back to the local machine
    > - that monstore is then copied to the next OSD host, where the
    > procedure is repeated, then copied back
    > - repeat for all OSD hosts
    > 
    > The thing preventing overwrites is that the monstore is copied back
    > and forth before and after each host's set of OSDs are added.
    > 
    > Happy to write a less-hasty reply with more detail (and hopefully the
    > actual script I ended up using, though I'm not sure that host survived
    > the Cephpocalypse we just had) if that would be helpful?
    > 
    > Good luck,
    > -Alex
    > 
    > On 10/21/21, 9:51 AM, "Marcel Kuiper" <ceph@xxxxxxxx> wrote:
    > 
    >     Hi
    > 
    >     A while ago we came close to losing our monitors (free disk space
    >     ran thin after a 3-hour network outage), so I am trying to get some
    >     grip on restoring the mon DBs from the OSDs with
    >     ceph-objectstore-tool, according to this page:
    > 
    > https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/#recovery-using-osds.
    >     Just in case.
    > 
    >     However, the first lines of the shell script do not seem to be
    >     accurate. I've written some comments in between to clarify:
    > 
    >     <snip>
    >     ms=/root/mon-store
    >     mkdir $ms
    > 
    >     # collect the cluster map from stopped OSDs
    >     for host in $hosts; do
    >        rsync -avz $ms/. user@$host:$ms.remote
    > 
    >        # my comment1: this makes the previous rsync fail in the
    >        # next loop. Easy to fix.
    >        rm -rf $ms
    > 
    >        ssh user@$host <<EOF
    >          for osd in /var/lib/ceph/osd/ceph-*; do
    >            ceph-objectstore-tool --data-path \$osd --no-mon-config \
    >              --op update-mon-db --mon-store-path $ms.remote
    >          done
    >     EOF
    > 
    >        # my comment2: since the previous command produces filenames
    >        # that are mostly the same for each host, this sync will
    >        # overwrite all data with the data of the last host. Not sure
    >        # how to fix this in order to rebuild a new mon database from
    >        # all host data.
    >        rsync -avz user@$host:$ms.remote/. $ms
    >     done
    > 
    >     </snip>
    > 
    >     Does anyone have an idea how this is supposed to work?
    > 
    >     Marcel
    >     _______________________________________________
    >     ceph-users mailing list -- ceph-users@xxxxxxx
    >     To unsubscribe send an email to ceph-users-leave@xxxxxxx

