Hi guys, any updates here?

On Sun, Jun 17, 2012 at 12:59 PM, Igor Laskovy <igor.laskovy@xxxxxxxxx> wrote:
> John, Jason, can you please clarify concretely what these bad things are?
> For example, the worst case.
>
> Yoshi, Kei, can you please clarify the current status of Kemari? How far
> is it from production usage?
>
> On Fri, Jun 15, 2012 at 5:48 PM, Jason Hedden <jhedden@xxxxxxxxxxx> wrote:
>> I'm running 2 full nova controllers behind an NGINX load balancer. While
>> there is still that chance of half-completed tasks, it's been working
>> very well.
>>
>> Each nova controller is running (full time) nova-scheduler, nova-cert,
>> keystone, and 6 nova-api processes. All API requests go through NGINX,
>> which reverse-proxies the traffic to these 2 systems.
>>
>> Example NGINX nova-api config:
>>
>> upstream nova-api {
>>     server hostA:8774  fail_timeout=30s;
>>     server hostB:8774  fail_timeout=30s;
>>     server hostA:18774 fail_timeout=30s;
>>     server hostB:18774 fail_timeout=30s;
>>     server hostA:28774 fail_timeout=30s;
>>     server hostB:28774 fail_timeout=30s;
>>     server hostA:38774 fail_timeout=30s;
>>     server hostB:38774 fail_timeout=30s;
>>     server hostA:48774 fail_timeout=30s;
>>     server hostB:48774 fail_timeout=30s;
>>     server hostA:58774 fail_timeout=30s;
>>     server hostB:58774 fail_timeout=30s;
>> }
>>
>> server {
>>     listen x.x.x.x:8774;
>>     server_name public.name;
>>
>>     location / {
>>         proxy_pass http://nova-api;
>>         proxy_set_header Host "public.address:8774";
>>         proxy_set_header X-Real-IP $remote_addr;
>>         proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
>>     }
>> }
>>
>> Attached is a diagram that gives a brief overview of the HA environment
>> I've set up.
>>
>> --Jason Hedden
>>
>> On Jun 15, 2012, at 5:36 AM, John Garbutt wrote:
>>
>>> I know there is some work in the XenAPI driver to make it resilient to
>>> these kinds of failures (to allow frequent updates of the nova code),
>>> and I think there were plans for that work to be reused in the libvirt
>>> driver.
>>>
>>> AFAIK, in Essex and lower, bad things can happen if you don't wait for
>>> all the tasks to finish. You may well be OK some of the time.
>>>
>>> It boils down to an issue of consuming a message from Rabbit but not
>>> completing the task, and not being able to recover from half-completed
>>> tasks.
>>>
>>> Hope that helps,
>>> John
>>>
>>> From: Igor Laskovy [mailto:igor.laskovy@xxxxxxxxx]
>>> Sent: 15 June 2012 11:31
>>> To: Christian Parpart
>>> Cc: John Garbutt; openstack-operators@xxxxxxxxxxxxxxxxxxx;
>>> <openstack@xxxxxxxxxxxxxxxxxxx>
>>> Subject: Re: [Openstack-operators] Nova Controller HA issues
>>>
>>> I have been using OpenStack in my little lab for only a short time
>>> too :)
>>>
>>> OK, you are right of course, but I meant a somewhat different design
>>> when I talked about virtualizing the controller nodes.
>>>
>>> It could be just two dedicated hypervisors with a dedicated share/DRBD
>>> between them. These hypervisors would be standalone, not part of nova.
>>> Then maybe Pacemaker or another tool could provide the availability
>>> function and restart the VM on the surviving node when the active one
>>> dies.
>>>
>>> The main question here is how bad it can get if the controller node
>>> suffers an unexpected power-off. In other words, when the VM restarts
>>> it will be in a crash-consistent state.
>>> Will some nova services lose anything here?
>>> Will RabbitMQ lose some data here? (I am new to RabbitMQ too.)
>>>
>>> Igor Laskovy
>>> facebook.com/igor.laskovy
>>> Kiev, Ukraine
>>>
>>> On Jun 15, 2012 10:54 AM, "Christian Parpart" <trapni@xxxxxxxxx> wrote:
>>> Hey,
>>>
>>> well, I said "I might be wrong" because I have no "clear" vision of how
>>> OpenStack works in its deepest detail; however, I would not like to
>>> depend on a controller node that is inside a virtual machine, controlled
>>> by compute nodes, which are in turn controlled by the controller node.
>>> This sounds quite like a chicken-and-egg problem.
>>>
>>> However, at the time of this writing, I think you'll have to have a
>>> working nova-scheduler process, which is responsible for deciding which
>>> compute node to spawn your VM on (what else?). And think about what you
>>> do when this (or all of your controller) VMs die terribly and you want
>>> to rebuild them - how do you plan to do that when your controller node
>>> is out of service?
>>>
>>> In my case I have put the controller services onto two compute nodes
>>> and use Pacemaker to switch between them; in case one node goes down,
>>> the other can take over (via a shared service IP).
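>>> To make that concrete, here is a minimal sketch of such a floating
>>> service IP in crm shell syntax (the resource name and address are only
>>> illustrative, not my actual values):
>>>
>>> # virtual IP that Pacemaker moves to whichever node is alive;
>>> # clients always reach the controller services through this address
>>> primitive p_nova_vip ocf:heartbeat:IPaddr2 \
>>>     params ip="10.0.0.100" cidr_netmask="24" \
>>>     op monitor interval="10s"
>>>
>>> The controller services themselves can then be grouped with that VIP so
>>> they always fail over together.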
>>> Again, these are just my thoughts, and I have been using OpenStack for
>>> only about a month now :-)
>>> But I hope this helps a bit...
>>>
>>> Best regards,
>>> Christian Parpart.
>>>
>>> On Fri, Jun 15, 2012 at 8:16 AM, Igor Laskovy <igor.laskovy@xxxxxxxxx>
>>> wrote:
>>> Why? Can you please clarify?
>>>
>>> Igor Laskovy
>>> facebook.com/igor.laskovy
>>> Kiev, Ukraine
>>>
>>> On Jun 15, 2012 1:55 AM, "Christian Parpart" <trapni@xxxxxxxxx> wrote:
>>> I don't think putting the controller node completely into a VM is good
>>> advice, at least when speaking of nova-scheduler and nova-api (if
>>> central).
>>>
>>> I may be wrong, and if so, please correct me.
>>>
>>> Christian.
>>>
>>> On Thu, Jun 14, 2012 at 7:20 PM, Igor Laskovy <igor.laskovy@xxxxxxxxx>
>>> wrote:
>>> Hi, are there any updates here?
>>> Can anybody clarify what happens if a controller node just goes through
>>> a hard shutdown?
>>>
>>> I am thinking about a solution with two hypervisors and putting the
>>> controller node in a VM on shared storage, so it can be relaunched when
>>> the active hypervisor dies.
>>> Any ideas or advice?
>>>
>>> On Tue, Jun 12, 2012 at 3:52 PM, John Garbutt <John.Garbutt@xxxxxxxxxx>
>>> wrote:
>>> > Sure, I get your point.
>>> >
>>> > I think Florian is working on some docs to help with that.
>>> >
>>> > Not sure how much has been done already.
>>> >
>>> > Cheers,
>>> > John
>>> >
>>> > From: Christian Parpart [mailto:trapni@xxxxxxxxx]
>>> > Sent: 12 June 2012 13:47
>>> > To: John Garbutt
>>> > Cc: openstack-operators@xxxxxxxxxxxxxxxxxxx
>>> > Subject: Re: [Openstack-operators] Nova Controller HA issues
>>> >
>>> > Hey, yeah, I also found that page, but didn't find it that helpful
>>> > yet; it reads much more like a theoretical paper on how they
>>> > implemented it than a guide to how to actually make it happen (from
>>> > the sysop point of view :-).
>>> >
>>> > I had hoped that someone had faced this already, since I really find
>>> > it very unintuitive to get right, or I'll need to wait until I get
>>> > more time to investigate it properly. :-)
>>> >
>>> > Regards,
>>> > Christian.
>>> >
>>> > On Tue, Jun 12, 2012 at 12:52 PM, John Garbutt
>>> > <John.Garbutt@xxxxxxxxxx> wrote:
>>> >
>>> > I thought Rabbit had a built-in HA solution these days:
>>> > http://www.rabbitmq.com/ha.html
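>>> > Roughly, the idea (a sketch, untested, and the exact commands depend
>>> > on your RabbitMQ version; host names here are illustrative) is to
>>> > cluster the brokers first and then mark the queues as mirrored. On
>>> > RabbitMQ 3.0+ the mirroring can be set with a policy:
>>> >
>>> > # on the second node, join it to the existing cluster
>>> > rabbitmqctl stop_app
>>> > rabbitmqctl join_cluster rabbit@hostA
>>> > rabbitmqctl start_app
>>> >
>>> > # mirror every queue across all cluster nodes
>>> > rabbitmqctl set_policy ha-all ".*" '{"ha-mode":"all"}'
>>> >
>>> > On the 2.x releases there is no set_policy; each queue instead has to
>>> > be declared with the x-ha-policy argument, as that page describes.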
>>> > From: openstack-operators-bounces@xxxxxxxxxxxxxxxxxxx
>>> > [mailto:openstack-operators-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of
>>> > Christian Parpart
>>> > Sent: 12 June 2012 09:59
>>> > To: openstack-operators@xxxxxxxxxxxxxxxxxxx
>>> > Subject: [Openstack-operators] Nova Controller HA issues
>>> >
>>> > Hi all,
>>> >
>>> > after spending the whole evening making our cloud controller node
>>> > highly available using Corosync/Pacemaker - of which I am really
>>> > proud - I have just a few problems left, and the one that freaks me
>>> > out the most is rabbitmq-server.
>>> >
>>> > For that beast I just can't seem to find any good documentation on
>>> > how to set rabbitmq-server up properly for HA.
>>> >
>>> > Has anyone ever tried to set up a nova controller (including the
>>> > rabbitmq dependency) for HA? If so, I'd be pleased to share
>>> > experiences, especially on the latter part. :-)
>>> >
>>> > Best regards,
>>> > Christian Parpart
>>> >
>>> > _______________________________________________
>>> > Openstack-operators mailing list
>>> > Openstack-operators@xxxxxxxxxxxxxxxxxxx
>>> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>>
>>> --
>>> Igor Laskovy
>>> Kiev, Ukraine
>>>
>>> _______________________________________________
>>> Mailing list: https://launchpad.net/~openstack
>>> Post to     : openstack@xxxxxxxxxxxxxxxxxxx
>>> Unsubscribe : https://launchpad.net/~openstack
>>> More help   : https://help.launchpad.net/ListHelp
>>
>> _______________________________________________
>> Openstack-operators mailing list
>> Openstack-operators@xxxxxxxxxxxxxxxxxxx
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
> --
> Igor Laskovy
> facebook.com/igor.laskovy
> Kiev, Ukraine

--
Igor Laskovy
facebook.com/igor.laskovy
Kiev, Ukraine
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html