On Wed, Sep 13, 2017 at 11:04 AM, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> On Wed, Sep 13, 2017 at 10:54 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
>>
>>> On 13 September 2017 at 10:38, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>>>
>>>
>>> Hi Blair,
>>>
>>> You can add/remove mons on the fly -- connected clients will learn
>>> about all of the mons as the monmap changes, and there won't be any
>>> downtime as long as quorum is maintained.
>>>
>>> There is one catch when it comes to OpenStack, however.
>>> Unfortunately, OpenStack persists the mon IP addresses at volume
>>> creation time. So the next time you hard reboot a VM, it will try
>>> connecting to the old set of mons.
>>> Whatever you have in ceph.conf on the hypervisors is irrelevant (after
>>> a volume has been created) -- libvirt uses the IPs in each instance's
>>> XML directly.
>>>
>>
>> That's why I always recommend that people use DNS, preferably a round-robin DNS record, to avoid these situations.
>>
>> That should work with OpenStack as well:
>>
>> ceph-mon.storage.local. AAAA 2001:db8::101
>> ceph-mon.storage.local. AAAA 2001:db8::102
>> ceph-mon.storage.local. AAAA 2001:db8::103
>>
>> And then use *ceph-mon.storage.local* as the MON in your OpenStack configuration.
>>
>
> Does that work? Last time I checked, it works like this when a new
> volume is attached:
>
> - OpenStack connects to Ceph using ceph.conf, DNS, whatever...
> - Retrieves the monmap.
> - Extracts the list of IPs from the monmap.
> - Persists the IPs in the block-device-mapping table.
>
> I still find that logic here:
> https://github.com/openstack/nova/blob/master/nova/virt/libvirt/storage/rbd_utils.py#L163
>

And here's that same approach in the Cinder driver:
https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/rbd.py#L350

-- dan
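
For illustration, the gist of that helper is roughly the following. This is
a paraphrased sketch, not the actual nova/cinder code -- the function name
is made up here, and it assumes the ceph CLI is available on the host:

import json
import subprocess

def get_mon_addrs(conf='/etc/ceph/ceph.conf'):
    # Ask the cluster for the current monmap. This resolves whatever is
    # in ceph.conf / mon_host (hostnames, a round-robin DNS name such as
    # ceph-mon.storage.local, ...) down to the concrete set of monitors.
    out = subprocess.check_output(
        ['ceph', '--conf', conf, 'mon', 'dump', '--format=json'],
        universal_newlines=True)
    lines = out.splitlines()
    if lines and lines[0].startswith('dumped monmap epoch'):
        # some versions prepend a human-readable status line to the JSON
        lines = lines[1:]
    monmap = json.loads('\n'.join(lines))

    hosts, ports = [], []
    for mon in monmap['mons']:
        # each 'addr' looks like "10.0.0.1:6789/0" -- keep host:port only
        addr = mon['addr'].split('/')[0]
        host, _, port = addr.rpartition(':')
        hosts.append(host)
        ports.append(port)
    # It is these literal addresses -- not the DNS name from ceph.conf --
    # that end up persisted with the volume and later written into the
    # instance's libvirt XML.
    return hosts, ports

if __name__ == '__main__':
    print(get_mon_addrs())

In other words, even if mon_host in ceph.conf is just the round-robin DNS
name, it is the resolved, literal IPs that get persisted alongside the
volume.
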
> Next time you hard-reboot the VM, it will connect to those IPs directly
> -- it will not use ceph.conf or DNS.
>
> -- Dan
>
>
>
>> Wido
>>
>>> There is an old ticket here: https://bugs.launchpad.net/cinder/+bug/1452641
>>> It's recently gone unassigned, but there is a new proposed fix here:
>>> http://lists.openstack.org/pipermail/openstack-dev/2017-June/118040.html
>>>
>>> As of today, you will need to manually update nova's
>>> block-device-mapping table for every volume when you re-IP the mons.
>>>
>>> Cheers, Dan
>>>
>>>
>>> On Wed, Sep 13, 2017 at 4:57 AM, Blair Bethwaite
>>> <blair.bethwaite@xxxxxxxxx> wrote:
>>> > Hi all,
>>> >
>>> > We're looking at readdressing the mons (moving them to a different subnet)
>>> > on one of our clusters. Most of the existing clients are OpenStack
>>> > guests on libvirt+KVM, and we have a major upgrade to do for those in
>>> > the coming weeks that will mean they have to go down briefly; that will
>>> > give us an opportunity to update their libvirt config to point them at
>>> > the new mon addresses. We plan to do the upgrade in a rolling fashion and
>>> > thus need to keep Ceph services up the whole time.
>>> >
>>> > So the question is: can we, for example, have our existing 3 mons on network
>>> > N1, add another 2 mons on network N2, and reconfigure VMs to use the 2 new
>>> > mon addresses, all whilst not impacting running clients? You can
>>> > assume we'll set up routing such that the new mons can talk to the old
>>> > mons and OSDs, and vice versa.
>>> >
>>> > Perhaps flipping the question on its head -- if you configure a librbd
>>> > client with only a subset of the mon addresses, will it *only* talk to
>>> > those mons, or will it just use that config to bootstrap and then talk
>>> > to any mons that are up in the current map? Or likewise, is there
>>> > anything the client has to talk to the mon master for?
>>> >
>>> > --
>>> > Cheers,
>>> > ~Blairo
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
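
On the "manually update nova's block-device-mapping table" point above, the
per-volume fix-up boils down to something like the sketch below. It is only
an illustration of the idea, not a tested tool -- the assumption that
connection_info is a JSON blob with data.hosts/data.ports comes from the
nova/cinder code linked earlier and should be verified against your own
database, the old/new address mapping is hypothetical, and you would want a
DB backup first plus a hard reboot (or migration) of the affected instances
afterwards so the libvirt XML is regenerated from the updated records.

import json

# Hypothetical old -> new monitor address mapping for the re-IP.
OLD_TO_NEW = {
    '192.0.2.1': '198.51.100.1',
    '192.0.2.2': '198.51.100.2',
    '192.0.2.3': '198.51.100.3',
}

def remap_connection_info(raw):
    """Rewrite the mon addresses inside one connection_info JSON blob."""
    info = json.loads(raw)
    data = info.get('data', {})
    if info.get('driver_volume_type') == 'rbd' and 'hosts' in data:
        data['hosts'] = [OLD_TO_NEW.get(h, h) for h in data['hosts']]
    return json.dumps(info)

# You would run each non-deleted row's connection_info column from nova's
# block_device_mapping table through remap_connection_info() and write the
# result back.
if __name__ == '__main__':
    sample = json.dumps({
        'driver_volume_type': 'rbd',
        'data': {'name': 'volumes/volume-0001',
                 'hosts': ['192.0.2.1', '192.0.2.2', '192.0.2.3'],
                 'ports': ['6789', '6789', '6789']}})
    print(remap_connection_info(sample))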