Re: monitor removal and re-add

The issue, Sage, is that we have to deal with the cluster being
re-expanded.  If we start with 5 monitors and scale back to 3, running
the "ceph mon remove N" command after stopping each monitor, and we
don't restart the remaining monitors, we cannot re-add the same
monitors that were previously removed.  They will suicide at startup.
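
For reference, the scale-down sequence described above can be sketched
roughly as follows. The two commands are the ones quoted later in this
thread; the run and remove_mons helpers, the DRY_RUN variable and the
monitor IDs "d" and "e" are hypothetical names used only for
illustration. The sketch also assumes each "service ceph stop" is run
on the host that owns that monitor, which it does not handle itself.

```shell
#!/bin/sh
# Dry-run sketch: print (or run) the commands for removing monitors.
# DRY_RUN defaults to 1, so nothing touches a real cluster unless it
# is explicitly set to 0. (Hypothetical helper, not part of Ceph.)
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# remove_mons ID...: stop each monitor, then drop it from the monmap.
# The surviving monitors are deliberately NOT restarted here.
remove_mons() {
    for id in "$@"; do
        run service ceph stop "mon.$id"
        run ceph mon remove "$id"
    done
}

# Example: scale from 5 monitors to 3 by removing two (IDs assumed).
remove_mons d e
```

With DRY_RUN left at its default this only prints the four commands,
which makes it easy to review the order before running anything.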

On Mon, Jun 24, 2013 at 4:22 PM, Sage Weil <sage@xxxxxxxxxxx> wrote:
> On Mon, 24 Jun 2013, Mandell Degerness wrote:
>> Hmm.  This is a bit ugly from our perspective, but not fatal to your
>> design (just our implementation).  At the time we run the rm, the
>> cluster is smaller and so the restart of each monitor is not fatal to
>> the cluster.  The problem is on our side in terms of guaranteeing
>> order of behaviors.
>
> Sorry, I'm still confused about where the monitor gets restarted.  It
> doesn't matter if the removed monitor is stopped or failed/gone; 'ceph mon
> rm ...' will remove it from the monmap and quorum.  It sounds like you're
> suggesting that the surviving monitors need to be restarted, but they do
> not, as long as enough of them are alive to form a quorum and pass the
> decree that the mon cluster is smaller.  So 5 -> 2 would be problematic,
> but 5 -> 3 (assuming there are 3 currently up) will work without
> restarts...
>
> sage
>
>
>>
>> On Mon, Jun 24, 2013 at 1:54 PM, Sage Weil <sage@xxxxxxxxxxx> wrote:
>> > On Mon, 24 Jun 2013, Mandell Degerness wrote:
>> >> I'm testing the change (actually re-starting the monitors after the
>> >> monitor removal), but this brings up the issue with why we didn't want
>> >> to do this in the first place:  When reducing the number of monitors
>> >> from 5 to 3, we are guaranteed to have a service outage for the time
>> >> it takes to restart at least one of the monitors (and, possibly, for
>> >> two of the restarts, now that I think on it).  In theory, the
>> >> stop/start cycle is very short and should complete in a reasonable
>> >> time.  What I'm concerned about, however, is the case that something
>> >> is wrong with our re-written config file.  In that case, the outage is
>> >> immediate and will last until the problem is corrected on the first
>> >> server to have the monitor restarted.
>> >
>> > I'm jumping into this thread late, but: why would you follow the second
>> > removal procedure for broken clusters?  To go from 5->3 mons, you should
>> > just stop 2 of the mons and do 'ceph mon rm <addr1>' 'ceph mon rm
>> > <addr2>'.
>> >
>> > sage
>> >
>> >>
>> >> On Mon, Jun 24, 2013 at 10:07 AM, John Nielsen <lists@xxxxxxxxxxxx> wrote:
>> >> > On Jun 21, 2013, at 5:00 PM, Mandell Degerness <mandell@xxxxxxxxxxxxxxx> wrote:
>> >> >
>> >> >> There is a scenario where we would want to remove a monitor and, at a
>> >> >> later date, re-add the monitor (using the same IP address).  Is there
>> >> >> a supported way to do this?  I tried deleting the monitor directory
>> >> >> and rebuilding from scratch following the add monitor procedures from
>> >> >> the web, but the monitor still suicides when started.
>> >> >
>> >> >
>> >> > I assume you're already referencing this:
>> >> > http://ceph.com/docs/master/rados/operations/add-or-rm-mons/
>> >> >
>> >> > I have done what you describe before. There were a couple hiccups, let's see if I remember the specifics:
>> >> >
>> >> > Remove:
>> >> > Follow the first two steps under "removing a monitor (manual)" at the link above:
>> >> >         service ceph stop mon.N
>> >> >         ceph mon remove N
>> >> > Comment out the monitor entry in ceph.conf on ALL mon, osd and client hosts.
>> >> > Restart services as required to make everyone happy with the smaller set of monitors
>> >> >
>> >> > Re-add:
>> >> > Wipe the old monitor's directory and re-create it
>> >> > Follow the steps for "adding a monitor (manual)" at the link above. Instead of adding a new entry you can just un-comment your old ones in ceph.conf. You can also start the monitor with "service ceph start mon.N" on the appropriate host instead of running it yourself (step 8). Note that you DO need to run ceph-mon as specified in step 5. I was initially confused about the '--mkfs' flag there: it doesn't refer to the OS's filesystem; you should use a directory or mountpoint that is already prepared/mounted.
>> >> >
>> >> > HTH. If you run into trouble post exactly the steps you followed and additional details about your setup.
>> >> >
>> >> > JN
>> >> >
>> >> _______________________________________________
>> >> ceph-users mailing list
>> >> ceph-users@xxxxxxxxxxxxxx
>> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >>
>> >>
>>
>>
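
Sage's point about 5 -> 3 versus 5 -> 2 comes down to majority
arithmetic: the monitors that are still up must be a strict majority
of the current monmap in order to form a quorum and accept the
removal. A minimal sketch of that check, assuming a simplified model
in which only the monmap size and the number of live monitors matter
(the majority and can_shrink function names are hypothetical):

```shell
#!/bin/sh
# majority N: smallest count that is a strict majority of N members.
majority() {
    echo $(( $1 / 2 + 1 ))
}

# can_shrink MONMAP_SIZE UP_COUNT: succeed if UP_COUNT live monitors
# can still form a quorum of the current MONMAP_SIZE-member monmap.
can_shrink() {
    [ "$2" -ge "$(majority "$1")" ]
}

can_shrink 5 3 && echo "5 -> 3: quorum holds"
can_shrink 5 2 || echo "5 -> 2: no quorum"
```

This only models stopping all the extra monitors up front; removing
them one at a time changes the monmap size at each step, so the check
would have to be repeated against the new size.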