============================================
#fedora-meeting: Infrastructure (2011-07-21)
============================================

Meeting started by nirik at 19:00:01 UTC. The full logs are available at
http://meetbot.fedoraproject.org/fedora-meeting/2011-07-21/infrastructure.2011-07-21-19.00.log.html

Meeting summary
---------------
* Robot Roll Call  (nirik, 19:00:01)

* New folks introductions and apprentice tasks/feedback  (nirik, 19:01:50)

* Phoenix on-site work recap/summary  (nirik, 19:03:30)

* QA network setup brainstorming  (nirik, 19:11:17)
  * ACTION: nirik to continue talks with qa and move stuff  (nirik, 19:24:09)

* Outstanding RFR (Request for Resources)  (nirik, 19:24:44)

* Hotfixes  (nirik, 19:40:17)
  * LINK: https://fedorahosted.org/fedora-infrastructure/query?status=new&status=assigned&status=reopened&group=component&summary=~hotfix&order=priority  (nirik, 19:41:17)

* Upcoming Tasks/Items  (nirik, 19:48:54)
  * LINK: http://fpaste.org/HTSO/  (skvidal, 19:53:13)

* Open Floor  (nirik, 19:58:45)

Meeting ended at 20:12:10 UTC.

Action Items
------------
* nirik to continue talks with qa and move stuff

Action Items, by person
-----------------------
* nirik
  * nirik to continue talks with qa and move stuff
* **UNASSIGNED**
  * (none)

People Present (lines said)
---------------------------
* nirik (126)
* skvidal (56)
* smooge (28)
* abadger1999 (14)
* athmane (8)
* dgilmore (6)
* Southern_Gentlem (5)
* CodeBlock (5)
* pingou (5)
* zodbot (4)
* ciphernaut (4)
* lmacken (3)
* StylusEater (1)
* ricky (0)
* codeblock (0)

--
19:00:01 <nirik> #startmeeting Infrastructure (2011-07-21)
19:00:01 <zodbot> Meeting started Thu Jul 21 19:00:01 2011 UTC. The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:01 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
19:00:01 <nirik> #meetingname infrastructure
19:00:01 <zodbot> The meeting name has been set to 'infrastructure'
19:00:01 <nirik> #topic Robot Roll Call
19:00:01 <nirik> #chair smooge skvidal codeblock ricky nirik abadger1999
19:00:01 <zodbot> Current chairs: abadger1999 codeblock nirik ricky skvidal smooge
19:00:05 * skvidal is here
19:00:07 * CodeBlock waves
19:00:14 <smooge> Here
19:00:44 * athmane here
19:00:49 * nirik waits a minute more for folks.
19:01:50 <nirik> #topic New folks introductions and apprentice tasks/feedback
19:02:06 <nirik> Any new folks like to say hi? or apprentices have questions/comments/tasks to talk about?
19:03:04 * nirik listens to the sound of silence. ;)
19:03:16 <CodeBlock> good song. :P
19:03:24 <nirik> ok, do feel free to chime in on list or in our regular channels anytime...
19:03:30 <nirik> #topic Phoenix on-site work recap/summary
19:03:51 <nirik> So, smooge and I were out at phx the other week. Just thought I would summarize what all we did...
19:04:05 <abadger1999> hola
19:04:20 <nirik> We setup a new backup box and tape drive. We will be transitioning to this one from our existing one in the coming weeks/months.
19:04:46 <nirik> We pulled old machines out and sent some to the great place in the sky for old hardware.
19:05:37 <skvidal> we sent the machines to colorado?
19:05:47 <nirik> We took 5 of the newest/most useful looking boxes and stuck them in a new rack as 'junk01' to 'junk05'. We were thinking we might be able to use these as a testbed for things that we want to see if might someday work for us.
19:05:52 <nirik> ha. ;)
19:06:11 <nirik> We moved all the qa machines to a qa rack and network
19:06:32 <nirik> we took inventory/tried to make sure all management/serial/power was setup and known
19:06:55 <nirik> we were going to rack new machines, but they didn't arrive in time... so they will be added as we can get someone there to rack them for us.
19:07:20 <nirik> smooge: can you think of anything I left out? I probibly did miss some things...
19:07:34 <skvidal> you broke the backups?
19:07:35 <smooge> I got sick
19:07:42 <skvidal> oh - you meant things you did intentionally...
19:08:12 <nirik> yeah, I have no idea how backups broke. ;( We must have nudged the tape drive some how... it was in a weird state.
19:08:22 <smooge> yeah..
19:08:25 <nirik> but that should be fixed now.
19:08:34 <smooge> we learned that various boxes have only one power supply
19:08:40 <smooge> even if they have multiple plugs
19:08:51 <nirik> oh, I have some pics, need to see if they came out at all.
19:09:04 <skvidal> nirik: hmm - can we put them into the infra-hosts repo? :)
19:09:15 <smooge> my 'love' of ppc found new depths
19:09:38 <nirik> sure, if they are usable. ;) It wasn't easy getting far enough away to get anything... so they might be out of focus, etc.
19:09:43 <smooge> I wish I had another week out there so I could get the rest of the hardware
19:10:18 <nirik> I think it was a very productive trip... we got a lot done/cleaned up/etc
19:11:09 <nirik> which brings us to the next topic...
19:11:17 <nirik> #topic QA network setup brainstorming
19:12:12 <nirik> so, we have 2 racks in a qa network... this includes some qa test boxes, a virtual host that has autoqa01 and autoqa01.stg and bastion-comm01 on it along with the junk boxes and the secondary arch stuff
19:12:36 <nirik> qa folks have expressed interest in monitoring and puppet or puppet like setup.
19:12:57 <nirik> how seperate do we want the qa setup from our main setup?
19:13:17 <athmane> so we separate monitoring too ?
19:13:45 * skvidal thinks separate monitoring is overkill. having said that we have a lot of legacy in our existing nagios layout
19:13:47 <nirik> athmane: thats a question, yeah. It seems a pain to have more nagios to me... but it would allow them to monitor their own stuff without ours
19:14:25 <nirik> I think we could get them to possibly use bcfg2 there, as a testbed. They have many fewer machines than we do.
19:14:38 <nirik> I'm not sure how usefull bastion-comm01 is.
19:15:12 <nirik> I guess we wanted seperate from our bastion for access there.
19:16:42 <CodeBlock> hmm
19:16:45 <nirik> anyone have thoughts or ideas? I guess I am leaning toward no seperate monitoring, stick bcfg2 on bastion-comm01 and make it the config host there... then move bastion-comm01 and virthost-comm01 out of our puppet.
19:17:03 <dgilmore> i think we do one of 2 things
19:17:21 <dgilmore> fully integrate it wheich means move it back to a fedora network
19:17:25 <dgilmore> or fully seperate
19:17:34 <athmane> yes, I don't see the need to separate monitoring, so I agree with skvidal
19:18:04 * nirik nods. we could easily just add their hosts to our puppet, but then qa folks would need access to our puppet. ;) (which I don't really know if it's a big issue or not)
19:18:18 <dgilmore> its not until it is
19:18:30 <athmane> separate config is good for security imho (qa net is more like a lab)
19:19:40 <abadger1999> Did we ever update to the new version of nagios?
19:19:44 <smooge> We went the seperate fact since the boxes there can run stuff
19:19:45 <nirik> abadger1999: yep.
19:19:48 <abadger1999> k
19:20:06 <nirik> abadger1999: we are on nagios3 now
19:21:02 <nirik> ok, I'll gather more info and talk to qa folks and set it up one way or the other. If anyone has strong ideas on it, let me know soon...
19:21:19 <abadger1999> Proposal sounds good... the only question I have is how long well run both bcfg2 and puppet
19:21:46 <nirik> abadger1999: well, it would depend on how well it works out there... and then if it did we would need some kind of transition plan.
19:21:53 <abadger1999> Yeah.
19:21:54 <skvidal> abadger1999: and if we like it at all
19:22:26 <nirik> I think this is a nice small group to test with...
19:22:29 <abadger1999> Would we want a transition plan either way? If we do like it migrate fi-main to bcfg2, if we don't like it migrate fi-qa to puppet?
19:22:41 <nirik> 8 qa machines, 2 autoqa instances, a bastion and a virthost.
19:23:16 <nirik> abadger1999: yeah. either migrate back to puppet there, or fold them into our puppet.
19:23:27 <nirik> but if it's already seperate, probibly just migrate them back to their own.
19:23:35 <abadger1999> Sounds like a plan.
19:24:09 <nirik> #action nirik to continue talks with qa and move stuff
19:24:21 <nirik> anything more on this?
19:24:44 <nirik> #topic Outstanding RFR (Request for Resources)
19:24:53 <nirik> I noticed we have a number of RFR's open.
19:24:58 <smooge> oi
19:25:01 <nirik> I added a list to the agenda email
19:25:20 <nirik> many of them are old or in an unknown to me state. ;)
19:25:35 <smooge> close
19:26:05 <pingou> #1591 is two years old
19:26:08 <nirik> yeah, if anyone wants to update them, please do. Otherwise I will look at closing...
19:26:09 <pingou> and fpaste.org is running
19:26:23 <nirik> pingou: yeah, but that one it turns out is active. ;)
19:26:30 <nirik> herlo is going to be updating it.
19:26:50 <pingou> I saw something about it but didn't get the issue
19:26:56 <nirik> the fpaste.org folks are tired of running it, and want us to.
19:27:10 <pingou> do we ?
19:27:18 <nirik> it's been finally packages up.
19:27:20 * StylusEater is late
19:27:27 <nirik> packaged.
19:27:30 * nirik can't type
19:27:48 <smooge> Have we taken over fpaste?
19:27:54 <Southern_Gentlem> nope
19:28:10 <nirik> not yet... it's unclear to me the status of the domain...
19:28:45 * pingou wonders what make them tired (eg that wouldn't make us tired)
19:28:48 <nirik> askbot is also active recently. Others I am not too clear on.
19:28:59 <ciphernaut> maximum 24hour lifecycle. Is that enough ?
19:29:33 <nirik> pingou: spam, dealing with upkeep, paying for the instance that runs it, etc
19:29:41 <nirik> ciphernaut: for what?
19:30:17 <ciphernaut> nirik, for anything/everthing.
19:30:42 <Southern_Gentlem> !
19:30:59 <nirik> Related to RFR's: I am going to try and whip up a SOP page on process around them...
19:31:04 <Southern_Gentlem> ciphernaut, if you are referring to fpaste we have found that works very well
19:31:23 <Southern_Gentlem> its a pastebin not permanet hosting
19:31:44 <ciphernaut> most pastebins I've dealt with have 1 month or forever.. though if thats the majority of required usage cool
19:31:58 <nirik> well, thats all details we can tune later right?
19:32:08 <Southern_Gentlem> yep
19:32:13 <ciphernaut> true
19:33:14 <smooge> Of the items what should be at PHX2 and what outside (and where)
19:33:19 <nirik> for ask and paste I would like to try a new process: applicationname01.dev -> applicationname01.stg -> application01 (production). Create the group from the start that will work on it, etc.
19:33:57 <nirik> smooge: yeah. That should be determined at least at the stg point in the process... should it be load balanced/cached or not.
19:35:00 <nirik> I think both ask and paste are good to be nice and seperate as we can easily make them... ie, their own instance/db. I don't know how well clustering/replication will work for them, thats something we also need to find out.
19:36:30 <nirik> anyhow, will try and update the RFR page and make a SOP and send to list for more comment.
19:36:51 <nirik> so we can try and have a process for these.
19:37:27 <nirik> anything else on RFR's? any others folks want to save/comment on?
19:37:31 <smooge> yeah I like that process
19:37:40 <smooge> nitrate sounds like another one for that
19:37:58 <nirik> yeah, it's stalled in review... so not sure whats going to happen there.
19:38:04 <athmane> nitrate is not yet packaged
19:38:42 <nirik> perhaps step 0 of the rfr process should be: "get it packaged, then come back here" to avoid filing RFR's too far in advance.
19:39:11 <smooge> hehehe that would make a lot of stuff easier on us.
19:39:17 <athmane> we (qa team) still use wiki for record test results but i heard about a pilot project to use nitrate
19:39:34 <nirik> nitrate looks cool from a quick glance...
19:39:49 * athmane forgets if for f15 or f16
19:40:02 <nirik> anyhow, if nothing else on this, moving on...
19:40:17 <nirik> #topic Hotfixes
19:40:28 <abadger1999> I think dmalcolm and dgilmore were working on the python buildbot one.
19:40:32 <nirik> So, we also have a pile of hotfixes built up over the last while.
19:40:58 <nirik> abadger1999: ok, will ping them for status. ;)
19:41:17 <nirik> https://fedorahosted.org/fedora-infrastructure/query?status=new&status=assigned&status=reopened&group=component&summary=~hotfix&order=priority
19:41:38 * nirik waits while hosted dies because we all clicked.
19:41:49 <smooge> dead dead dead
19:41:56 <athmane> nirik: :)
19:42:00 <smooge> we need our own hosted just for us
19:42:12 <CodeBlock> :S
19:42:17 <dgilmore> abadger1999: i need to work on that
19:42:29 <dgilmore> it was part of why we moved some builders tp be virtual hosts
19:42:56 <CodeBlock> Yeah, I did some testing/pingdom loading on fedorahosted's gitweb index, and it took over 16 seconds to load on average. That's just bad :(
19:43:13 <nirik> anyhow, what are the chances we could get a pkgdb release update, supybot-koji, pinglists, mediawiki-116? ;)
19:44:00 <nirik> dgilmore: huh... which ones are virtual?
19:44:44 <smooge> I will be working on mediawiki-116/117
19:45:04 <smooge> xb01 I thought was the virtual builder
19:46:09 <nirik> smooge: not sure if the mediawiki bug was filed upstream yet... I guess ping ricky on it.
19:46:23 <smooge> ok will do so
19:46:38 <smooge> oooh long trac traceback
19:47:21 <abadger1999> nirik: pkgdb update is something I want to do in the next month. It's probably third on my list of "not-a-fire" tasks, though.
19:47:35 <nirik> abadger1999: ok, cool. Would close a number of hotfixes. ;)
19:47:45 <abadger1999> Seems that I always run into one freeze or another before getting them out :-)
19:48:09 <nirik> yeah.
19:48:19 <abadger1999> nirik: hehe. The only thing is a lot of hotfixes go in just after a release due to finding new bugs in the code :-)
19:48:29 <nirik> yep. it's a never ending cycle. ;)
19:48:48 <nirik> which brings us to the next topic:
19:48:54 <nirik> #topic Upcoming Tasks/Items
19:49:34 <nirik> basically the only things on my list are the freezes for right now:
19:49:37 <nirik> 2011-08-01 mail fi-apprentice folks.
19:49:37 <nirik> 2011-08-02 - 16: Alpha change freeze
19:49:37 <nirik> 2011-08-09 Remove inactive fi-apprentice people.
19:49:37 <nirik> 2011-08-16: Fedora 16 alpha
19:49:37 <nirik> 2011-09-06 - 20: Beta change freeze
19:49:38 <nirik> 2011-09-20: Fedora 16 Beta
19:49:40 <nirik> 2011-10-11 - 25: Final change freeze
19:49:43 <nirik> 2011-10-25: Fedora 16 release.
19:49:53 <nirik> so, we have until the 2rd before our first freeze.
19:50:10 <nirik> If anyone wants to work on/schedule things, please let me know.
19:50:33 <nirik> more moving things to rhel6.
19:50:35 <abadger1999> We should get the change freezes into the infra calendar
19:50:49 <nirik> yeah, keep meaning to, then getting distracted. ;(
19:50:59 <skvidal> nirik: app servers, proxies, hosted... what else is on the migrate to rhel6 thing?
19:51:02 <nirik> anyone interested in updating the calendars? :)
19:51:39 <nirik> skvidal: last I looked we were just over 50% rhel6, so all the rest.
19:51:44 <skvidal> nirik: :)
19:51:47 <skvidal> smartass
19:51:58 <nirik> I think ibiblio01 we can move over once we have a ibiblio02 we can migrate things to
19:52:06 <smooge> infra calender?
19:52:16 <nirik> tummy01 might be a good one to remote re-install.
19:52:30 <nirik> value's might not be hard to migrate over.
19:52:48 <skvidal> looks like 64 hosts on 5server
19:53:11 <nirik> once we have new machines racked up in phx2, we can move more things there.
19:53:13 <skvidal> http://fpaste.org/HTSO/
19:53:43 <abadger1999> smooge: http://kevin.fedorapeople.org/infrastructure-*.ics
19:53:58 <nirik> smooge: they are in the git infra repo too.
19:54:07 <skvidal> serverbeach1 should be doable and would be an interesting case to find out if the serverbeach boxes will be able to survive el6
19:54:35 <nirik> skvidal: yeah. I was meaning to talk to them about a hardware refresh at the same time, but didn't get to that either. ;)
19:55:24 <nirik> for many of the rhel5 instances, we need to move their host to rhel6/kvm before moving them.
19:55:28 <skvidal> nod
19:55:38 * skvidal grimaces at torrent
19:55:44 <skvidal> hmm
19:55:50 <skvidal> cnode01...
19:55:56 <skvidal> and dhcp02.c
19:55:59 <skvidal> not our problem soon
19:56:23 <nirik> tummy01 and bodhost might be good ones to re-install as they don't have any critical stuff on them I don't think. we could even leave the guests lvm alone and bring them back up after the re-install
19:56:36 <skvidal> nod
19:57:07 <smooge> also what are we using serverbeach1 for?
19:57:24 <nirik> bxen03 only has releng01 on it... once we get a new machine racked in phx2 in the build rack I can move that over and we can reinstall bxen03
19:57:28 <skvidal> smooge: a mirror istr
19:57:34 <nirik> smooge: another download mirror I think is all.
19:57:40 <nirik> thats not phx2.
19:57:56 <skvidal> sb1 has had a host of issues trying to make it be a virthost
19:58:24 <nirik> another possibly good reason asking about a hw refresh. ;)
19:58:33 <skvidal> nod
19:58:45 <nirik> #topic Open Floor
19:58:59 <nirik> running low on time, any other plans/ideas/dreams?
19:59:24 <skvidal> dreams
19:59:25 <skvidal> yes
19:59:36 <skvidal> anyone here looked at salt? http://saltstack.org/
19:59:48 <skvidal> I've been playing with it a bit and looking over the features in it
19:59:51 <nirik> I glanced at it the other day... first I had heard of it.
20:00:00 <skvidal> it's more or less func + zeromq for the communication layer
20:00:07 <skvidal> fairly fascinating, actually.
20:00:08 <skvidal> all in python
20:00:23 <skvidal> and the devs definitely have a use case like we have in mind
20:01:06 <nirik> cool. So it means clients listen to a bus for actions?
20:01:16 <skvidal> more or less, yes.
20:01:29 <skvidal> it means the clients don't need a port open
20:01:31 <smooge> heheh I have that calender in my system already. I will update the calenders this week
20:01:32 <skvidal> like we have right now wit hfunc
20:01:41 <skvidal> so it means one more port closed off
20:01:45 <smooge> what is zeromq?
20:01:45 <nirik> cool.
20:01:49 <skvidal> which is good
20:01:58 <skvidal> smooge: google is your friend :)
20:02:03 * nirik remembers lots of talk about message bussing a few years ago, but it never seems to have taken off.
20:02:20 <skvidal> there are a couple of things here that are interesting to me.
20:02:35 <smooge> amcq or something :)
20:02:57 <skvidal> 1. whether or not this adequately covers the functionality of what func has been providing for us?
20:03:18 <skvidal> 2. I'm looking at if I can port functionality like func-yum to it and have it all work the same (which would be amusing)
20:03:45 <nirik> cool.
20:03:46 <skvidal> 3. one of the things we wanted out of qpid/amqp is notifications/events as well. - the question is if we can get there from here with zeromq
20:03:57 * nirik nods. That was my next question.
20:04:09 <abadger1999> smooge: It's a library that implements easy to program buffered network sockets.. depending on who you ask, it's a lightweight message bus or nearly everything you need to make a message bus.
20:04:48 <skvidal> nirik: much to wonder and play with...
20:04:59 <skvidal> anywya - just wanted to ask if anyone here already had experience
20:05:01 <nirik> yeah, keeps things fun/interesting. ;)
20:05:02 <smooge> want to use a couple of cloud instance to do so?
20:05:18 <smooge> I read about it yesterday.. interesting to see if its packaged etc?
20:05:22 <skvidal> smooge: right now I'm just dinking with it on guests on my laptop
20:05:36 <skvidal> smooge: there are pkgs - not in fedora - b/c of our zeromq ver
20:05:50 <skvidal> the authors of salt appear to be rpm-friendly people, though
20:06:20 <lmacken> salt looks interesting... do you see that potentially obsoleting func?
20:06:46 * nirik notes we are over time, but I don't think anyone else is scheduled to meet, so we should be able to just keep going. ;)
20:06:55 <skvidal> lmacken: it has a lot of the same functionality
20:07:04 <skvidal> lmacken: and it offers a very similar plugin infrastructure
20:07:21 <skvidal> lmacken: I talked to the lead dev - the reason it is similar is b/c he had investigated func before working on salt
20:07:29 <skvidal> lmacken: it's not accidental.
20:08:07 <skvidal> lmacken: he doesn't have the same modules but a goodly number of func's modules are bound up with some xmlrpc-isms.
20:08:46 <skvidal> and the connect-out-only is useful for us.
20:08:54 <skvidal> which is the main thing pulling me at it
20:09:07 <skvidal> also that it doesn't require us to tie up qpidd on a systems-mgmt tool is nice
20:09:16 <lmacken> yeah, true
20:09:16 <skvidal> so qpidd can be used for other apps that need it w/o any conflict
20:09:44 <lmacken> speaking of our message bus vision, hopefully we'll pickup some momentum on that in the near future
20:10:13 <nirik> lmacken: cool.
20:10:41 <nirik> ok, anything else, or shall we call it a meeting?
20:11:28 <skvidal> that's all I have
20:12:07 <nirik> cool. Thanks everyone. Lets get back to #fedora-admin and #fedora-noc. ;) Thanks for coming everyone...
20:12:10 <nirik> #endmeeting