Summary/Minutes from today's Fedora Infrastructure meeting (2013-02-21)

============================================
#fedora-meeting: Infrastructure (2013-02-21)
============================================


Meeting started by nirik at 19:00:01 UTC. The full logs are available at
http://meetbot.fedoraproject.org/fedora-meeting/2013-02-21/infrastructure.2013-02-21-19.00.log.html
.



Meeting summary
---------------
* welcome y'all  (nirik, 19:00:01)

* New folks introductions and Apprentice tasks.  (nirik, 19:02:15)
  * new easyfix tasks welcome, team members are encouraged to try and
    file tickets for them.  (nirik, 19:05:28)

* Applications status / discussion  (nirik, 19:06:17)
  * pingou has vastly simplified the pkgdb db.  (nirik, 19:07:42)
  * new pkgdb-cli pushed out as well as copr-cli  (nirik, 19:08:16)
  * fas release being tested in staging, for 2013-02-28 release to prod.
    (nirik, 19:08:57)
  * askbot is now sending fedmsg's.  (nirik, 19:11:56)
  * more fas-openid testing welcome. Has worked for those folks that
    have tried it so far.  (nirik, 19:15:29)
  * fedocal ready for 0.1.0 tag and review process.  (nirik, 19:16:16)
  * LINK: http://elections-dev.cloud.fedoraproject.org/   (abadger1999,
    19:16:30)
  * testing on new elections version welcome:
    http://elections-dev.cloud.fedoraproject.org/ (make account in
    fakefas)  (nirik, 19:17:04)
  * will try out an f18 server for mm3 staging testing and feel out an
    updates policy, etc. Possibly using snapshots more.  (nirik,
    19:33:27)
  * will look at moving fas-openid to prod as soon as is feasible.
    (nirik, 19:33:46)
  * feedback on github reviews of all commits welcome.  (nirik,
    19:39:04)
  * mirrormanager update to 1.4 soon.  (nirik, 19:39:11)

* Sysadmin status / discussion  (nirik, 19:43:00)
  * smooge got our bnfs01 server's disks working again.  (nirik,
    19:43:56)
  * nagios adjustments in progress  (nirik, 19:44:30)
  * arm boxes will get new net friday hopefully  (nirik, 19:45:07)
  * mass reboot next wed (tentative) for rhel 6.4 upgrades.  (nirik,
    19:47:52)

* Private Cloud status update / discussion  (nirik, 19:52:50)
  * euca cloudlet limping along after upgrade.  (nirik, 19:55:11)
  * work ongoing to bring the openstack cloudlet closer to production
    (nirik, 19:55:26)
  * please see skvidal if you want to get involved in our private cloud
    setup  (nirik, 20:01:29)

* Upcoming Tasks/Items  (nirik, 20:01:33)
  * 2013-02-28 end of 4th quarter  (nirik, 20:01:44)
  * 2013-03-01 nag fi-apprentices  (nirik, 20:01:44)
  * 2013-03-07 remove inactive apprentices.  (nirik, 20:01:44)
  * 2013-03-19 to 2013-03-26 - koji update  (nirik, 20:01:44)
  * 2013-03-29 - spring holiday.  (nirik, 20:01:44)
  * 2013-04-02 to 2013-04-16 ALPHA infrastructure freeze  (nirik,
    20:01:46)
  * 2013-04-16 F19 alpha release  (nirik, 20:01:48)
  * 2013-05-07 to 2013-05-21 BETA infrastructure freeze  (nirik,
    20:01:50)
  * 2013-05-21 F19 beta release  (nirik, 20:01:52)
  * 2013-05-31 end of 1st quarter  (nirik, 20:01:54)
  * 2013-06-11 to 2013-06-25 FINAL infrastructure freeze.  (nirik,
    20:01:56)
  * 2013-06-25 F19 FINAL release  (nirik, 20:01:58)

* Open Floor  (nirik, 20:02:49)

Meeting ended at 20:04:14 UTC.




Action Items
------------





Action Items, by person
-----------------------
* **UNASSIGNED**
  * (none)




People Present (lines said)
---------------------------
* nirik (143)
* skvidal (99)
* abadger1999 (47)
* pingou (24)
* abompard (15)
* smooge (10)
* mdomsch (10)
* threebean (6)
* zodbot (5)
* SmootherFrOgZ (4)
* cyberworm54 (4)
* lmacken (2)
* maayke (1)
* ricky (0)
* dgilmore (0)
* CodeBlock (0)
--
19:00:01 <nirik> #startmeeting Infrastructure (2013-02-21)
19:00:01 <zodbot> Meeting started Thu Feb 21 19:00:01 2013 UTC.  The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:01 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
19:00:01 <nirik> #meetingname infrastructure
19:00:01 <zodbot> The meeting name has been set to 'infrastructure'
19:00:01 <nirik> #topic welcome y'all
19:00:01 <nirik> #chair smooge skvidal CodeBlock ricky nirik abadger1999 lmacken dgilmore mdomsch threebean
19:00:01 <zodbot> Current chairs: CodeBlock abadger1999 dgilmore lmacken mdomsch nirik ricky skvidal smooge threebean
19:00:13 * skvidal is here
19:00:15 <nirik> hello everyone. whos around for an infrastructure meeting?
19:00:15 <smooge> not guilty
19:00:23 * cyberworm54 is here
19:00:25 * lmacken 
19:00:26 * threebean is kinda here
19:00:28 * maayke is here
19:00:33 * abadger1999 here
19:00:40 * pingou here
19:00:52 * SmootherFrOgZ here
19:02:08 <nirik> ok, I guess lets go ahead and dive in...
19:02:15 <nirik> #topic New folks introductions and Apprentice tasks.
19:02:30 <nirik> any new folks like to introduce themselves? or apprentices with questions or comments?
19:03:04 <cyberworm54> Hi I am an apprentice and hopefully I can learn and contribute as much as I can
19:03:31 <nirik> welcome (back) cyberworm54
19:03:57 <cyberworm54> Thanks!
19:04:01 <nirik> to digress a bit... do folks think our apprentice setup is working well? or is there anything we can do to improve it?
19:04:20 <nirik> I think the biggest problem is new people getting up to speed and finding things they can work on.
19:04:52 <skvidal> nirik: also - we have a fair amount more code-related tasks than general admin tasks that newcomers can get into
19:04:56 <nirik> we are also low on new easyfix tickets, particularly in the sysadmin side.
19:05:02 <nirik> yeah.
19:05:14 <cyberworm54> it is a bit ...confusing but once you get to the docs and actually read it you have a start point
19:05:28 <nirik> #info new easyfix tasks welcome, team members are encouraged to try and file tickets for them.
19:06:06 <nirik> ok, moving on then I guess.
19:06:17 <nirik> #topic Applications status / discussion
19:06:27 <nirik> any application / development news this week or upcoming?
19:06:46 <pingou> I've been doing some cleanup on the pkgdb db scheme
19:06:49 <pingou> before: http://ambre.pingoured.fr/public/pkgdb.png
19:06:57 <pingou> after: http://ambre.pingoured.fr/public/pkgdb2.png
19:07:25 <pingou> that's with the help of abadger1999 :)
19:07:29 <nirik> wow. nice!
19:07:29 <lmacken> nice ☺
19:07:42 <nirik> #info pingou has vastly simplified the pkgdb db.
19:07:46 * abadger1999 just reviews and makes suggestions to what pingou writes ;-)
19:07:54 <pingou> pushed a new version of pkgdb-cli (waiting to arrive in testing) and pushed upstream a new version of copr-cli
19:08:16 <nirik> #info new pkgdb-cli pushed out as well as copr-cli
19:08:19 <abadger1999> New fas release is finally out the door.  Planning to upgrade production on Feb 28.
19:08:29 <pingou> abadger1999 and I have started to think about pkgdb2 basically, schema update is the first step
19:08:56 <abadger1999> pkgdb -- yeah, and pkgdb2 api is probably going to be the second step
19:08:57 <nirik> #info fas release being tested in staging, for 2013-02-28 release to prod.
19:09:19 <abadger1999> as a note for admins -- the fas release that introduced fedmsg introduced a bug that you should know about
19:09:40 <SmootherFrOgZ> btw, there's a bunch of locale fixes in the new fas release
19:09:41 <abadger1999> email verification when people change their email address was broken.
19:09:50 <nirik> thats the one we have in prod, but we have hotfixed it right?
19:10:00 <SmootherFrOgZ> would be good to test fas with different languages
19:10:32 <nirik> cool.
19:10:39 <abadger1999> it would change the email when the user first entered the updated email in the form instead of waiting for them to confirm that they received the verification email.
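
The intended behaviour abadger1999 describes (keep the old address in effect until the verification mail is confirmed) can be sketched as follows. This is a minimal illustration, not FAS code: the `Account` class, its attributes and both method names are hypothetical.

```python
import hmac
import secrets


class Account:
    """Hypothetical stand-in for a user record; not the real FAS model."""

    def __init__(self, email):
        self.email = email          # address currently in effect
        self.pending_email = None   # requested address, not yet verified
        self.pending_token = None   # token mailed to the requested address

    def request_email_change(self, new_email):
        """Record the request and return the token to mail out.

        The active address is deliberately left untouched at this point.
        """
        self.pending_email = new_email
        self.pending_token = secrets.token_urlsafe(16)
        return self.pending_token

    def confirm_email_change(self, token):
        """Apply the pending change only if the supplied token matches."""
        if self.pending_token is None:
            return False
        # Constant-time comparison on bytes, so odd input cannot raise.
        if not hmac.compare_digest(token.encode(), self.pending_token.encode()):
            return False
        self.email = self.pending_email
        self.pending_email = None
        self.pending_token = None
        return True
```

The bug described above corresponds to assigning `self.email` inside `request_email_change`; in this sketch the assignment happens only in `confirm_email_change`.
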
19:10:45 <nirik> I saw in stg that it also has the 'no longer accept just yubikey for password' in.
19:11:37 <threebean> askbot got fedmsg hooks in production this week.  there are some new bugs to chase down regarding invalid sigs and busted links..
19:11:41 <nirik> any other application news? oh...
19:11:56 <nirik> #info askbot is now sending fedmsg's.
19:11:58 <threebean> Latest status -> http://www.fedmsg.com/en/latest/status/
19:12:08 <skvidal> fedmsg.com? wow
19:12:25 <nirik> Has anyone had a chance to test patrick's fas-openid dev instance? any feedback for him?
19:12:26 <abadger1999> nirik: Hmm... looks like production isn't hotfixed.
19:12:30 <skvidal> threebean: what's the status on fedmsg emitters from outside of the vpn?
19:12:35 <abadger1999> nirik: but next fas release will have the fix.
19:12:40 <nirik> abadger1999: :( I thought we did. ok.
19:12:47 <threebean> skvidal: no material progress yet, but I've been thinking it over.
19:12:50 <abadger1999> Can we wait until Thursday?
19:13:01 <skvidal> threebean: okay thanks
19:13:04 <threebean> skvidal: I have some janitorial work to do.. then that's next on my list.
19:13:21 <skvidal> threebean: that's the limiting factor for adding notices from coprs, I think
19:13:29 <nirik> abadger1999: I suppose
19:14:12 * threebean nods
19:14:18 <abadger1999> I've used fas-openid but not tested it heavily.  It has worked and looks nice.  puiterwijk has a flask-fas-openid auth plugin that he's tested and converted fedocal, IIRC, to use it.
19:14:41 <nirik> yeah, it's worked for me for a small set of sites I tested.
19:15:22 <pingou> speaking of fedocal, I need to tag 0.1.0 and put it up for review
19:15:29 <nirik> #info more fas-openid testing welcome. Has worked for those folks that have tried it so far.
19:15:41 <pingou> the current feature requests will have to wait for the next release...
19:15:57 <nirik> pingou: yeah. Will be good to get it setup. :)
19:16:15 <abadger1999> Oh, fchiulli has a new version of elections that's ready for some light testing
19:16:16 <nirik> #info fedocal ready for 1.0 tag and review process.
19:16:24 <pingou> abadger1999: oh cool!
19:16:30 <abadger1999> http://elections-dev.cloud.fedoraproject.org/
19:16:31 <nirik> abadger1999: cool. Is there an instance up?
19:16:34 <nirik> nice.
19:16:44 <skvidal> nirik: should be
19:16:48 <abadger1999> You need to make an account on fakefas in order to try it out.
19:17:04 <nirik> #info testing on new elections version welcome: http://elections-dev.cloud.fedoraproject.org/ (make account in fakefas)
19:17:06 <abadger1999> Please do try it out.
19:17:06 <skvidal> abadger1999: is elections switching to fas-openid, too?
19:17:24 <pingou> abadger1999: and the code is ?
19:17:45 <abadger1999> skvidal: I believe it is using flask-fas right now because flask-fas-openid isn't in a released python-fedora yet.
19:18:03 <abadger1999> pingou: https://github.com/fedora-infra/elections
19:18:14 <skvidal> abadger1999: got it
19:18:19 <pingou> abadger1999: great
19:18:20 <skvidal> abadger1999: thx
19:18:36 <abadger1999> np
19:18:47 <nirik> I have one more application type thing to discuss... dunno if abompard is still awake, but we should discuss mailman3. ;)
19:18:51 <abadger1999> I am all for moving more things over to the flask-fas-openid plugin though.
19:19:15 * nirik is too.
19:19:33 <nirik> anyhow, we are looking at setting up a mailman3 staging to do some more testing and shake things out.
19:19:41 <nirik> however, mailman3 needs python 2.7
19:19:43 <abompard> nirik: yeah
19:20:06 <nirik> so, it seems: a) rhel6 + a bunch of python rpms we build and maintain against python 2.7
19:20:12 <nirik> or b) fedora 18 instance
19:20:30 <smooge> abadger1999, congrats on election stuff
19:20:38 <abompard> yes, and MM3 really does not work on python 2.6, sadly
19:20:47 * pingou question: which one will be out first: EL7 or MM3? :-p
19:20:55 <nirik> we are starting to have more fedora in our infra (for example, the arm builders are all f18)
19:21:09 <nirik> so, we might want to come up with some policy/process around them. Like when to do updates, etc.
19:21:09 <abadger1999> smooge: thanks.  It was all fchiulli though :-)  I told him he can be the new owner of the code too :-)
19:21:13 <abompard> I've already rebuilt an application for a non-system python, and it's not much fun
19:21:33 <smooge> bwahahahah
19:21:33 <abompard> as in non-scriptable
19:21:58 <nirik> yeah, it's pain either way...
19:21:59 * abadger1999 thinks fedora boxes are going to be preferable to non-system python.
19:22:07 <pingou> +1
19:22:09 <skvidal> nirik: an idea
19:22:11 <nirik> I'm leaning that way as well.
19:22:16 <abompard> by the way, Debian has a strange but nifty packaging policy for python package that make them work with all the installed versions of python
19:22:21 <smooge> I think we should make a bunch of servers rawhide
19:22:40 <skvidal> abompard: I assume the db /data for mm3 is all separate from where it needs to run, right
19:22:46 <abadger1999> abompard: yeah -- I've looked at the policy but not the implementation.  But every time I've run it by dmalcolm, he's said he doesn't like it.
19:23:04 <abadger1999> abompard: i think some of that might be because he has looked at the implementation :-)
19:23:05 <abompard> abadger1999: understandably, it's symlink-based
19:23:17 <abompard> skvidal: yeah, to some extent
19:23:23 <skvidal> nirik: I wonder if we could have 2 instances - talking to the same db - so we could update f18 to latest - run mm3 on it in r/o mode - to make sure it is working
19:23:27 <abompard> skvidal: it has local spool directories
19:23:30 <skvidal> nirik: then just pass the ip over to the other one
19:23:40 <nirik> in the past we have been shy of fedora instances because of the massive updates flow I think, as well as possible bugs around those updates. I think it's gotten much better in the last few years (I like to think due to the updates policy, but hard to say)
19:23:59 <skvidal> nirik: which is why I was thinking we don't do updates to the RUNNING instance
19:24:07 <skvidal> we just swap out the instance that is in use/has that ip
19:24:08 <abadger1999> ... or less contributors?   /me ducks and runs
19:24:16 <nirik> :)
19:24:22 <skvidal> nirik: so we test the 'install'
19:24:24 <nirik> skvidal: right, so a extra level of staging?
19:24:31 <skvidal> nirik: one level, really
19:24:32 <abompard> skvidal: I don't know how MM3 will handle a read-only DB
19:24:37 <skvidal> prod and staging
19:25:02 <nirik> well, right now we are talking about a staging instance only, but yeah, I see what you mean. we could do something along those lines.
19:25:17 <nirik> I also think for some use cases it's not as likely to break...
19:25:36 <nirik> ie, for mailman, postfix and mailman and httpd all need to work, but it doesn't need super leaf nodes right?
19:25:39 <skvidal> abompard: understood
19:26:02 <skvidal> nirik: anyway - just an idea
19:26:04 <skvidal> nirik: ooo - actually
19:26:13 <skvidal> nirik: I just had a second idea that you will either hate or love
19:26:14 <nirik> where as for something like a pyramid app, it would be a much more complex stack
19:26:16 <skvidal> nirik: snapshots
19:26:16 <abompard> skvidal: we may get bugs because of that, not because of the upgrade
19:26:30 <skvidal> nirik: we snapshot the running instance in the cloud
19:26:32 <nirik> yeah, we could do that too.
19:26:32 <skvidal> nirik: upgrade it
19:26:36 <skvidal> and if it dies - roll it out
19:27:04 <abompard> for the moment it will only be low-traffic lists anyway
19:27:22 <abompard> and I must check that, but if MM is not running, I think postfix keeps the message
19:27:30 <abadger1999> skvidal: how would that work in terms of data?  would we keep the db and local spool directory separate from the snapshots?
19:27:33 <abompard> and re-delivers when MM starts
19:27:34 <skvidal> abompard: yes
19:27:35 <nirik> FWIW, I run f18 servers at home here, and they have been pretty darn stable. (as they were when f17... earlier releases had more breakage from my standpoint)
19:27:41 <skvidal> err
19:27:41 <skvidal> abadger1999: yes
19:27:44 <abadger1999> Cool.
19:28:11 <skvidal> abadger1999: no reason we can't have a mm3-db server in the cloud :)
19:28:12 * abadger1999 kinda likes that.  although possibly he just doesn't know all the corner cases there :-)
19:28:16 <nirik> yeah. I'm sure we could do something with snapshots.
19:28:21 <skvidal> anyway - just an idea
19:28:23 <skvidal> nothing in stone
19:28:27 <nirik> yeah.
19:29:06 <nirik> also, for updates, we may just do them on the same schedule as rhel ones, unless something security comes up in an exposed part... ie, just look at the httpd, etc not the entire machine.
19:29:42 <nirik> anyhow, all to be determined, we can feel out a policy.
19:29:49 <nirik> anything else on the applications side?
19:29:56 <abadger1999> I have two more
19:30:00 <abadger1999> Do we have a schedule for getting fas-openid into production?
19:30:28 <nirik> abadger1999: I think it's ready for stg for sure now... but not sure when prod...
19:30:58 <nirik> I'm fine with rolling it out as fast as we are comfortable with.
19:31:03 <nirik> I'd like to see it get more use. ;)
19:31:04 <abadger1999> I think we're coming along great.  But if we're going to start migrating apps to use fas-openid/telling people to use it when developing their apps (like elections), then we need to have a plan for getting it into prod
19:31:09 <abadger1999> <nod>
19:31:19 <abadger1999> nirik: it's setup to replace the current fas urls?
19:31:34 <nirik> abadger1999: not fully sure on that. I think so...
19:31:36 * abadger1999 was wondering if we could deploy it and just not announce it for a few weeks
19:31:46 <nirik> thats a thought.
19:32:22 <abadger1999> alright -- I guess let's talk about this more on Friday after our classroom session with puiterwijk :-)
19:32:26 <nirik> Oddly I have noticed that for things like askbot you get two different "users" with different urls.
19:32:28 <nirik> yeah
19:33:05 <abadger1999> Other thing is for all the devs here, how's the "review all changes" idea working out?
19:33:27 <nirik> #info will try out an f18 server for mm3 staging testing and feel out an updates policy, etc. Possibly using snapshots more.
19:33:39 <abadger1999> I've liked how it works with pingou, puiterwijk, and SmootherFrogZ for fas, python-fedora, and packagedb.
19:33:46 <nirik> #info will look at moving fas-openid to prod as soon as is feasible.
19:33:55 <skvidal> abompard: how much space do you need on the mm server itself - if you are not storing the db there?
19:33:59 <abadger1999> lmacken: Is it working okay for bodhi and such too?
19:34:07 <abadger1999> anything that's falling through the cracks?
19:34:14 <abompard> skvidal: I need to check that
19:34:21 <nirik> skvidal: if we are doing this as a real staging, we might want to just make a real 'lists01.stg.phx2' virthost instead of cloud?
19:34:26 <pingou> abompard: I definitely like it
19:34:53 <abadger1999> Do we want to say that certain things are okay to push without review?  (making a release would be a candidate...I was going to suggest documentation earlier but pingou found a number of problems with my documentation patch :-)
19:34:53 <pingou> abadger1999: ^ :)
19:34:54 <skvidal> nirik: okay - I didn't know if we wanted to be cloud-er-fic about it or not
19:35:01 <skvidal> nirik: thx
19:35:31 <nirik> skvidal: yeah, I'm open to either, but I think right now until we have less fog in our clouds, a real one might be better for this... but either way
19:35:53 <skvidal> nirik: well - with attached persistent volumes - using one of the qcow imgs is non-harmful
19:35:55 <nirik> abadger1999: I like seeing the extra review. I've not done much reviewing myself. ;)
19:36:06 <abompard> skvidal: not much, a few hundred MB
19:36:08 <skvidal> nirik: but I agree about fog
19:36:17 * abadger1999 notes that threebean is in another meeting but said he still likes the idea but hasn't done it consistently all the time.  So more experimentation with it needed.
19:36:43 * abadger1999 liked that nb reviewed a documentation update the other day :-)
19:37:02 <pingou> I think it can bring us new contributor
19:37:21 <pingou> some of them are easyfix
19:37:31 <pingou> other are bigger and then might need more experienced reviewers
19:37:57 <nirik> yeah
19:38:21 <nirik> welcome mdomsch
19:38:41 <abadger1999> Yeah.  I agree.  it's nice to have someone else's eyes on the bigger fixes even if they're relatively new too, though.  It's better than before where I would have committed it without any review at all.
19:38:49 <mdomsch> better late than never
19:38:51 <nirik> that reminds me, mdomsch was going to look at updating mm in prod to 1.4 on friday... if not then, then sometime soon. ;)
19:39:04 <nirik> #info feedback on github reviews of all commits welcome.
19:39:11 <mdomsch> anyone have any grief with doing a major MM upgrade tomorrow afternoon?
19:39:11 <nirik> #info mirrormanager update to 1.4 soon.
19:39:47 <abadger1999> mdomsch: If you're around in case it goes sideways it would be very nice.
19:39:51 <mdomsch> everything I know I've broken, I've fixed.  Now it's time to test in production. :-)
19:39:52 <nirik> I think it should be fine. We can be somewhat paranoid and not touch one of the apps so we have an easy fallback.
19:40:11 <abadger1999> get the fixes in that you've had pending and get us onto a single codebase for development.
19:40:13 <mdomsch> k
19:40:25 <nirik> (until we are sure the others are all working right I mean)
19:40:31 <mdomsch> right
19:40:34 <mdomsch> so bapp02, then app01
19:40:47 * nirik nods.
19:40:48 <mdomsch> and I'll stop the automatic push from bapp02 to app*
19:40:58 <nirik> sounds good.
19:41:00 <mdomsch> until we're comfortable.  Worst case, we have slightly stale data for a few hours
19:41:21 * nirik nods.
19:41:28 <abadger1999> instead of "if you're around"  it would've been clearer for me to say "as long as you're around" :-)
19:41:43 <nirik> mdomsch: you've picked up all the hotfixes into 1.4 right?
19:41:46 <mdomsch> abadger1999: naturally; I'm not around nearly as much
19:41:56 <abadger1999> Yeah.  we miss you ;-)
19:42:16 <nirik> abadger1999: +1 :)
19:42:40 <nirik> anyhow, any other application news? or shall we move on?
19:43:00 <nirik> #topic Sysadmin status / discussion
19:43:06 <mdomsch> nirik: yes I pulled them all in while at FUDCon
19:43:17 <nirik> lets see... this week smooge was out at phx2 for a whirlwind tour.
19:43:22 <nirik> mdomsch: cool.
19:43:45 <nirik> #info smooge got out bnfs01 server's disks working again.
19:43:51 <nirik> #undo
19:43:51 <zodbot> Removing item from minutes: <MeetBot.items.Info object at 0x281d8c50>
19:43:56 <nirik> #info smooge got our bnfs01 server's disks working again.
19:44:09 <smooge> kind of sort of
19:44:19 <nirik> I've been tweaking nagios of late... hopefully making it better.
19:44:30 <nirik> #info nagios adjustments in progress
19:44:56 <nirik> We should have net for the rest of the arm boxes friday.
19:45:07 <nirik> #info arm boxes will get new net friday hopefully
19:45:14 <skvidal> I had a discussion with the author of pynag this morning
19:45:49 <nirik> cool. Worth using for a tool for us to runtime manage nagios?
19:45:50 <skvidal> if we have people willing to spend some time - we could easily build a query tool/cli-tool for nagios downtimes/acknowledgements/etc
19:46:07 <nirik> that would be quite handy, IMHO
19:46:12 <skvidal> nirik: it needs some code to make it work - but I think the basic functionality  is available
19:46:41 <nirik> for some things the ansible nagios module would do, but for others it would be nice to have a command line.
19:47:15 <nirik> I'd like to look at doing a mass reboot next wed or so... upgrade everything to rhel 6.4.
19:47:17 <SmootherFrOgZ> skvidal: interesting!
19:47:37 <nirik> Might do staging today/tomorrow to let it soak there and see if any of our stuff breaks. ;)
19:47:52 <nirik> #info mass reboot next wed (tentative) for rhel 6.4 upgrades.
19:47:59 <skvidal> nirik: right - I'd like to be able to enhance the ansible nagios module to be more idempotent and 'proper'
19:48:04 <skvidal> nirik: pynag _could_ do that
19:48:17 <nirik> yeah, it looks very bare bones right now.
19:48:35 <nirik> in particular we could use a 'downtime for host and all dependent hosts' type thing
19:48:52 <skvidal> nirik: we could also use a 'give me the state of this host'
19:48:58 <skvidal> without having to go to the webpage
19:49:14 <skvidal> according to palli (a pynag developer) it can read status.dat
19:49:15 <skvidal> from nagios
19:49:18 <smooge> I am looking at lldpd for our PHX2 systems http://vincentbernat.github.com/lldpd/ Mainly to better get an idea of where things are
19:49:20 <skvidal> to determine ACTUAL state
19:49:26 <nirik> finally in the sysadmin world, I'd really like to poke ansible more and get it to where we can use it for more hosts. Keep getting sidetracked, but it will happen! :)
19:51:04 <nirik> smooge: another thing we could look at there is http://linux-ha.org/source-doc/assimilation/html/index.html (it uses lldpd type stuff). They are about to have their first release... so very early days.
19:51:33 <smooge> ah cool
19:51:38 <smooge> will look at that also
19:51:55 <nirik> oh, on nagios, I set an option: soft_state_dependencies=1
19:52:22 <nirik> this hopefully will help us not get the flurry of notices when a machine is dropping on and off the net, or has too high a load to answer, then answers again.
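
The "give me the state of this host" query skvidal mentions above can be sketched without pynag by reading status.dat directly. This assumes the usual `blocktype { key=value }` layout of that file and the standard Nagios host state codes (0 UP, 1 DOWN, 2 UNREACHABLE). The function names and the sample hosts are made up for illustration, and none of this is pynag's API.

```python
# Standard Nagios host state codes, as stored in current_state.
HOST_STATES = {"0": "UP", "1": "DOWN", "2": "UNREACHABLE"}

# Tiny made-up excerpt in the status.dat block layout (hypothetical hosts).
SAMPLE = """\
hoststatus {
\thost_name=host-a
\tcurrent_state=0
\t}

hoststatus {
\thost_name=host-b
\tcurrent_state=1
\t}
"""


def parse_status_blocks(text):
    """Yield (block_type, fields) for each 'type { key=value ... }' block."""
    block_type = None
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if block_type is None:
            # Opening line of a block, e.g. "hoststatus {".
            if line.endswith("{"):
                block_type = line[:-1].strip()
                fields = {}
        elif line == "}":
            yield block_type, fields
            block_type = None
        elif "=" in line:
            # Split on the first "=" only; values may contain "=".
            key, _, value = line.partition("=")
            fields[key] = value


def host_state(text, host_name):
    """Return the state name for host_name, or None if it is not listed."""
    for block_type, fields in parse_status_blocks(text):
        if block_type == "hoststatus" and fields.get("host_name") == host_name:
            return HOST_STATES.get(fields.get("current_state"), "UNKNOWN")
    return None
```

A command-line wrapper would read the real status file (its path depends on the local Nagios configuration) and call `host_state(contents, name)`.
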
19:52:50 <nirik> #topic Private Cloud status update / discussion
19:53:01 <nirik> skvidal: want to share your pain where we are with cloudlets? :)
19:53:08 <skvidal> sure
19:53:23 <skvidal> last week I did the euca upgrade and the wheels came right off
19:53:29 <skvidal> and then it plunged over a cliff
19:53:31 <skvidal> into a volcano
19:53:41 <pingou> sounds like a lot of fun
19:53:42 <skvidal> where it was eaten by a volcano monster
19:53:54 <smooge> who was riding a yak
19:53:56 <skvidal> anyway the euca instance is limping along at the moment with not-occasional failures :(
19:54:04 <skvidal> smooge: and the yak had to be shaven
19:54:17 <pingou> brought back some pictures >
19:54:19 <pingou> ?
19:54:21 <skvidal> so...
19:54:35 <skvidal> I've been working on porting our imgs/amis/etc over to openstack
19:54:44 <skvidal> and getting things more production-y in the openstack instance -
19:54:58 <skvidal> I got ssl working around the ec2 api for openstack
19:55:11 <nirik> #info euca cloudlet limping along after upgrade.
19:55:12 <skvidal> working on ssl'ing the other items
19:55:18 <skvidal> for the past couple of days
19:55:26 <nirik> #info work on going to bring openstack cloudlet up to more production
19:55:28 <skvidal> I've been in a fist fight with openstack and qcow images
19:55:33 <skvidal> and resizing disks
19:55:47 <skvidal> I just got confirmation from someone that what we want to do is just not possible at the moment :)
19:55:54 <nirik> lovely. ;(
19:56:10 <skvidal> nirik: not until we get the initramdisk to resize the partitions :(
19:56:17 <skvidal> so - I'm punting on this
19:56:24 <skvidal> I just put in a new ami and kernel/ramdisk combo
19:56:29 <skvidal> that's rhel6.4 latest
19:56:30 <smooge> sometimes that is best
19:56:35 <nirik> yeah. I think that could work, but needs some time to get working right. Hopefully by the cloud-utils maintainer. ;)
19:56:38 <skvidal> and since it is an AMI  it resizes the disks
19:56:50 <skvidal> what it DOES NOT DO is follow the kernel on the disk - it uses the one(s) in the cloud
19:56:54 <skvidal> which is suck
19:57:00 <skvidal> but at least it is known/obvious suck
19:57:08 <nirik> but it should also get us moving past it for now.
19:57:11 <skvidal> I've also just built a new qcow from rhel6.4
19:57:27 <skvidal> so for systems that don't need to be on-the-fly made - we can spin them up
19:57:31 <skvidal> growpart the partition
19:57:33 <skvidal> reboot
19:57:35 <skvidal> resize
19:57:37 <skvidal> and go
19:57:47 <skvidal> and i'm working on a playbook to handle all of the above for you
19:57:51 <skvidal> and, yes, it makes me cry inside
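
The manual sequence skvidal lists (growpart, reboot, resize) could be captured as data before it is turned into a playbook. The sketch below only builds the command lists and does not run anything. The disk name `/dev/vda`, partition number 1, and the reading of "resize" as `resize2fs` on an ext filesystem are assumptions, and the function name is hypothetical.

```python
def grow_root_steps(disk="/dev/vda", partition=1):
    """Return (before_reboot, after_reboot) command lists.

    Assumes a virtio-style disk where the partition device is simply
    the disk name followed by the partition number.
    """
    device = f"{disk}{partition}"
    # Step 1: grow the partition table entry to fill the disk.
    before_reboot = [["growpart", disk, str(partition)]]
    # Step 2 (after the reboot): grow the filesystem to fill the partition.
    after_reboot = [["resize2fs", device]]
    return before_reboot, after_reboot
```

Each inner list is in the form `subprocess.run(cmd, check=True)` expects, so a caller can execute the first list, reboot the instance, then execute the second.
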
19:58:08 <nirik> ;(
19:58:12 <skvidal> that's where we are at the moment
19:58:26 <skvidal> I am making new keys/accounts/tenants/whatever
19:58:35 <skvidal> for our lockbox 'admin' user
19:58:40 <skvidal> for making persistent instances
19:58:53 * nirik nods.
19:58:57 <skvidal> the next step is to start making use of the resource tags in openstack
19:59:02 <skvidal> so we can more easily track all this shit
19:59:15 <skvidal> also I have to make a bunch of volumes and rsync over all the data from the euca volumes :(
19:59:30 <skvidal> I fully expect that last part to be a giant example of suffering
19:59:46 <nirik> yeah. we should probably move one set of instances first and sort out if there's any doom
19:59:57 <skvidal> if I sound kinda 'bleah' there's a reason
20:00:02 <skvidal> nirik: I thought I'd start with the fartboard
20:00:07 <nirik> heh. ok
20:00:37 <skvidal> nirik: also - now that we have instance tags - it should be doable to write a simple 'start me up' script using ansible to spin out the instances
20:00:40 <skvidal> and KNOW where they are
20:00:48 <nirik> ok, we are running over time... let me quickly do upcoming and open floor. ;)
20:00:52 <skvidal> sorry
20:00:55 <skvidal> thx
20:00:57 <nirik> thats fine. ;) all good info
20:01:04 <skvidal> one last thing
20:01:08 <skvidal> if anyone wants to get involved
20:01:08 <skvidal> ping me
20:01:29 <nirik> #info please see skvidal if you want to get involved in our private cloud setup
20:01:33 <nirik> #topic Upcoming Tasks/Items
20:01:42 <nirik> (big paste)
20:01:44 <nirik> #info 2013-02-28 end of 4th quarter
20:01:44 <nirik> #info 2013-03-01 nag fi-apprentices
20:01:44 <nirik> #info 2013-03-07 remove inactive apprentices.
20:01:44 <nirik> #info 2013-03-19 to 2013-03-26 - koji update
20:01:44 <nirik> #info 2013-03-29 - spring holiday.
20:01:46 <nirik> #info 2013-04-02 to 2013-04-16 ALPHA infrastructure freeze
20:01:48 <nirik> #info 2013-04-16 F19 alpha release
20:01:50 <nirik> #info 2013-05-07 to 2013-05-21 BETA infrastructure freeze
20:01:52 <nirik> #info 2013-05-21 F19 beta release
20:01:54 <nirik> #info 2013-05-31 end of 1st quarter
20:01:56 <nirik> #info 2013-06-11 to 2013-06-25 FINAL infrastructure freeze.
20:01:58 <nirik> #info 2013-06-25 F19 FINAL release
20:02:00 <nirik> anything people want to schedule/note etc?
20:02:07 <nirik> I'll add the fas update and the mass reboot.
20:02:20 <abadger1999> Sounds good.
20:02:49 <nirik> #topic Open Floor
20:02:54 <nirik> Anyone have items for open floor?
20:03:32 <pingou> I have a series of blog post 'Fedora-Infra: Did you know?' coming, like once a week for the coming 4 weeks
20:03:32 <nirik> ok.
20:03:42 <skvidal> pingou: wow
20:03:46 <nirik> pingou: awesome. More blog posts would be great.
20:03:49 <pingou> short stuff, speaking about some cool features/ideas
20:03:52 <skvidal> pingou: looking forward to seeing those
20:04:10 <nirik> Thanks for coming everyone. Do continue over on our regular channels. :)
20:04:14 <nirik> #endmeeting

_______________________________________________
infrastructure mailing list
infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/infrastructure