20:00 < mmcgrath> #startmeeting
20:00 < fedbot> Meeting started Thu Jul 2 20:00:29 2009 UTC. The chair is mmcgrath.
20:00 < fedbot> Information about MeetBot at http://wiki.debian.org/MeetBot , Useful Commands: #action #agreed #halp #info #idea #link #topic.
20:00 < dgilmore> gday mmcgrath
20:00 * ricky
20:00 < mmcgrath> #topic Infrastructure -- Who's here?
20:00 -!- fedbot changed the topic of #fedora-meeting to: Infrastructure -- Who's here?
20:00 * johe|home takes a seat
20:00 < mmcgrath> dgilmore: how's it going?
20:00 * SmootherFrOgZ is
20:01 * sijis sijis is here.
20:01 * ke4qqq is
20:01 < dgilmore> mmcgrath: 2 builders to go
20:01 < SmootherFrOgZ> dgilmore: for stg ?
20:01 < smooge> hello
20:01 < mmcgrath> dgilmore: excellent, happy to hear it.
20:01 < mmcgrath> Well lets get started
20:01 -!- StabbyMc [n=StabbyMc@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx] has left #fedora-meeting ["Stab ya later!"]
20:01 < mmcgrath> #topic Infrastructure -- Tickets
20:01 -!- fedbot changed the topic of #fedora-meeting to: Infrastructure -- Tickets
20:01 < mmcgrath> .tiny https://fedorahosted.org/fedora-infrastructure/query?status=new&status=assigned&status=reopened&group=milestone&keywords=~Meeting&order=priority
20:01 < zodbot> mmcgrath: http://tinyurl.com/47e37y
20:02 < mmcgrath> .ticket 1503
20:02 < mmcgrath> abadger1999: take it
20:02 < zodbot> mmcgrath: #1503 (Licensing Guidelines for apps we write) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/1503
20:02 < dgilmore> SmootherFrOgZ: nope
20:02 < abadger1999> So we've had a new license pop up in apps we've written recently
20:02 < abadger1999> AGPLv3+
20:02 < abadger1999> That's incompatible with GPLv2 which is what the majority of our apps use presently.
20:03 < abadger1999> After looking over the situation with spot, it seems like it would be good to move everything to AGPLv3+.
20:03 < dgilmore> im ok with the move
20:03 < abadger1999> (With libraries going to LGPLv2+)
20:03 < smooge> abadger1999, when you say we use.. do you mean we right or other stuff
20:03 < abadger1999> We write.
20:03 < smooge> s/right/write/
20:03 < smooge> thanks
20:04 < abadger1999> smooge: This would not affect code that we don't write.
20:04 < abadger1999> And it's a recommendation rather than a hard and fast rule.
20:04 < mmcgrath> abadger1999: have you run into anyone saying "ehh, I don't think we should do this." ?
20:04 < abadger1999> ie: mdomsch wants mirrormanager to be MIT; mediawiki plugins should follow mediawiki's license
20:04 < abadger1999> mmcgrath: So far everyone's been positive.
20:04 < mmcgrath> abadger1999: ok, so how do we actually _do_ it?
20:05 < mmcgrath> sed?
20:05 < abadger1999> yeah, we have to replace COPYING files with AGPL/LGPL and then change the headers in source files.
20:05 < smooge> well you need to look at each app and see if its something we wrote or pulled in from somewhere else
20:06 < sijis> do you need to get written proof from author before changing?
20:06 < ricky> How urgent is this time-wise?
20:06 < smooge> if its pulled in we need to deal with it.. if its something we wrote 100% we should be able to replace COPYING/headers
20:06 < abadger1999> sijis: for the majority of things no, but I am going to notify authors of pkgdb and python-fedora before I make chanes.
20:06 < mmcgrath> ricky: I'd say not real urgent, but the longer we wait... the longer we're going to wait I suspect.
20:06 < ricky> For example, with FAS, I'd like to eventually rewrite the OpenID provider part instead of dealing with licensing pain because of samadhi or anything.
20:06 < abadger1999> sijis: The CLA gives us the ability to do a relicense if the contribution was made without an explicit license.
20:07 < mmcgrath> abadger1999: some seemed timid about that on f-a-b. I'm less timid.
20:07 < abadger1999> <nod> ricky the other option is to find out what jcollie thinks about AGPLv3+
20:07 < mmcgrath> but we should ask
20:07 < mmcgrath> abadger1999: lets take an app like fas first.
20:07 < mmcgrath> just see how it goes.
20:08 < abadger1999> yeah, it's common courtesy and also gives people a chancce to holler "Oh wait, I actually didn't own the copyright to that code.. sorry."
20:08 -!- mcepl [n=mcepl@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx] has left #fedora-meeting []
20:08 < mmcgrath> abadger1999: are you going to lead the effort on this?
20:08 < abadger1999> I'd like to do python-fedora soon It's moving to LGPLv2+ which is more permissive
20:08 < mmcgrath> should we open a ticket for each app?
20:08 < sijis> how many apps are we talking about for this? +/-15?
20:08 < abadger1999> mmcgrath: I can. Yes, each app.
20:08 < mmcgrath> sijis: less then 15
20:08 < abadger1999> sijis: Less htan 15
20:09 < mmcgrath> abadger1999: sounds good, so anything else?
20:09 < abadger1999> A ticket for each app will let us come back next week and say -- half of our app authors like a licensing policy but don't want to change *their* app.
20:09 < abadger1999> Which would mean we need to rethink.
20:10 < abadger1999> I think that's all unless someone wants to shout that it's a bad idea now :-)
20:10 -!- kolesovdv [n=kolesovd@xxxxxxxxxxxxx] has joined #fedora-meeting
20:10 < mmcgrath> anyone have anything to say? If not now, take it to the list.
20:10 < mmcgrath> and do it sooner, not later.
20:10 < mmcgrath> Ok, so next topic
20:10 < mmcgrath> #topic Infrastructure -- The merge, outages and issues.
20:10 -!- fedbot changed the topic of #fedora-meeting to: Infrastructure -- The merge, outages and issues.
20:11 < mmcgrath> So we had a merge last week.
20:11 < mmcgrath> and since the merge we've had some issues
20:11 < mmcgrath> and it's not something obvious.
20:11 < smooge> define merge for me?
20:11 < mmcgrath> and, in fact, could be completely unrelated.
20:11 < mmcgrath> smooge: merge from staging to master branches in puppet.
20:11 < ricky> smooge: We made a ton of changes in the staging branch and merged them to production :-)
20:11 < mmcgrath> Which basically involved refactoring a bunch of puppet code, cleaning things up, creating some new modules, etc, etc.
20:11 < mmcgrath> I've not seen a wiki outage since yesterday.
20:12 < mmcgrath> I need to go through the logs and look.
20:12 -!- cassmodiah [n=cass@fedora/cassmodiah] has quit Remote closed the connection
20:12 < mmcgrath> while doing some digging we, just in general, found strange issues in our environment.
20:13 < smooge> mmcgrath, ricky thanks..
20:13 < smooge> what have been the strange ones
20:13 < mmcgrath> for example - http://mmcgrath.fedorapeople.org/proxy-errors.html
20:13 < mmcgrath> 200,000+ 502's per day.
20:13 < mmcgrath> just seems massive to me.
20:14 < ricky> In terms of the big outages, they've all seemed to happen during mysql database backups (which lock tables) or smolt render stats jobs.
20:14 < ricky> The proxy errors and 500s seem to be something else though.
20:14 < mmcgrath> <nod>
20:14 < mmcgrath> and our current lead on the 500's errors for fas is a new mod_wsgi
20:14 < ricky> Have the 500 errors stayed normal?
20:14 < mmcgrath> jbowes is working on that.
20:15 < ricky> (As in, have they gone up after the merge or not?)
20:15 < mmcgrath> ricky: hard to say
20:15 < mmcgrath> http://mmcgrath.fedorapeople.org/JuneErrors.html
20:15 < mmcgrath> I'll re-check today now that it's been a few more days.
20:15 < mmcgrath> clearly we had a major spike
20:16 < sijis> mmcgrath: the first graph shows it being mostly proxy2
20:16 < mmcgrath> but it seems to have gone back down.
20:16 < ricky> Strange.
20:16 < mmcgrath> sijis: yeah, and proxy2 is an odd beast.
20:16 < mmcgrath> proxy2 is load balanced with proxy1 behind the PHX balancer.
20:16 < mmcgrath> _however_
20:16 < mmcgrath> anything in phx uses proxy2 directly to get to the account system.
20:16 < mmcgrath> which not only includes shell accounts.
20:17 < mmcgrath> but also includes our web applications contacting fas for session, auth, etc.
20:17 < mmcgrath> which is a significant amount of traffic.
20:17 < smooge> interesting.. is there a reason for just proxy2?
20:17 < ricky> Funny that proxy1 seems fine.
20:17 < mmcgrath> ricky: well it does get a lot less traffic.
20:17 < ricky> Like it didn't jump significantly at all.
20:17 < ricky> I guess.
20:17 < mmcgrath> smooge: the network team won't let us contact the balancer IP directly.
20:18 < sijis> so you are forced to pick a proxy?
20:18 < smooge> ah ok could we setup another proxy?
20:18 < mmcgrath> smooge: we have two of them there.
20:18 < mmcgrath> but no good way to balance between the two of them.
20:18 < mmcgrath> we could put a load balancer in there, but it'd be just another box, and would need to be rebooted as often as proxy2 is anyway
20:18 < ricky> Is the problem really coming from our PHX admin.fp.o setup though?
20:19 < smooge> mmcgrath, no what I meant was one that was just for that so we could cut down on what might be causing the erorrs?
20:19 < ricky> The 502s really jumped everywhere, so that's what I want to know the root cause of.
20:20 < smooge> so if its a bruteforce attack on stuff we could get an idea of what app is being targeted or soemthing
20:20 < mmcgrath> I think the errors are on our end, I need to do more log checking to know for sure though
20:20 < ricky> But the brute force shouldn't be causing 502, it should be working :-)
20:20 < mmcgrath> but yeah we can add and remove more proxy servers in PHX if we want to
20:20 -!- JSchmitt [n=s4504kr@fedora/JSchmitt] has quit Read error: 104 (Connection reset by peer)
20:20 < ricky> mmcgrath: Can we separate that graph into apache 502s and haproxy 502s?
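
A possible first step toward the log checking discussed here is to tally 502 responses per day from a proxy's access log. This is only a minimal sketch, assuming the logs use Apache's common/combined format; the sample lines, addresses, and the function name `count_status_per_day` are hypothetical and not taken from the Fedora setup. Status codes in an access log do not say which layer produced the error, so this gives totals only.

```python
import re
from collections import Counter

# Pulls the day and the status code out of a common/combined log line
# (assumed format): ... [02/Jul/2009:20:13:01 +0000] "GET / HTTP/1.1" 502 ...
LOG_LINE = re.compile(r'\[(?P<day>[^:]+):[^\]]+\] "[^"]*" (?P<status>\d{3}) ')


def count_status_per_day(lines, status="502"):
    """Return a Counter mapping a day string to the number of hits with `status`."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == status:
            counts[match.group("day")] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample = [
    '10.0.0.1 - - [02/Jul/2009:20:13:01 +0000] "GET /accounts/ HTTP/1.1" 502 401 "-" "curl"',
    '10.0.0.2 - - [02/Jul/2009:20:13:02 +0000] "GET /wiki/ HTTP/1.1" 200 5120 "-" "curl"',
    '10.0.0.3 - - [01/Jul/2009:09:00:00 +0000] "GET /pkgdb/ HTTP/1.1" 502 401 "-" "curl"',
]
for day, hits in sorted(count_status_per_day(sample).items()):
    print(day, hits)
```

Running the same function over each proxy's log separately would give a per-host comparison like the one in the graphs mentioned above.
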
20:21 < ricky> Right now they're lumped together in the source where you're getting it from, right?
20:21 -!- ddumas [n=ddumas@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx] has joined #fedora-meeting
20:21 < mmcgrath> ricky: I don't think so, because if haproxy or the app server returned a 502, apache would log a 502.
20:21 < mmcgrath> so proxyX will always have our largest number of 502's
20:21 < mmcgrath> then haproxy (if we're logging that, not even sure)
20:21 < mmcgrath> then the app server
20:22 < mmcgrath> although the app servers probably don't throw 502
20:22 < ricky> mmcgrath: But some 502s are coming from apache, as in proxy1 couldn't contact locahost:10009
20:22 < ricky> Those are the strangest ones to me.
20:22 < mmcgrath> I'll have to look closer then.
20:22 < sijis> firewall?
20:23 < ricky> sijis: I don't think so - it definitely works a large percent of the time
20:23 < mmcgrath> sijis: I'd actually think that's the app server not responding to haproxy, and thus not responding to the proxy server.
20:23 < ricky> But that should strictly cause haproxy 502s not apache 502s, correct?
20:23 < mmcgrath> and I'm not seeing us hitting our haproxy limit.
20:23 < ricky> and we've seen both :-(
20:23 < mmcgrath> ricky: when looking at the logs, how can you tell the difference?
20:24 < mmcgrath> oh from it saying it couldn't contact localhost:10009
20:24 < ricky> I'm not sure. I'd expect the apache 502s to show up in the apache error log and both types of 502s to show up in the error log.
20:24 < ricky> I'll have to verify that tohugh.
20:24 < ricky> **though
20:24 < mmcgrath> hm
20:24 < mmcgrath> hm
20:24 < mmcgrath> hmmmm
20:24 < ricky> Was your source for these graphs the error log or the access log?
20:25 < mmcgrath> acciess I believe
20:25 < sijis> is haproxy on a different server or on proxy2?
20:25 * mmcgrath looks
20:25 < mmcgrath> sijis: each proxy server has it's own haproxy service on the same host
20:25 < mmcgrath> ricky: access.log
20:26 < mmcgrath> perhaps we should continue discussing this after the meeting.
20:26 < ricky> Ah, OK.
20:26 < mmcgrath> any objections?
20:26 < ricky> Sure thing
20:26 < sijis> nope.
20:27 < mmcgrath> # topic Infrastructure -- Eye in know db. - INNODB
20:27 -!- kolesovdv [n=kolesovd@xxxxxxxxxxxxx] has quit Remote closed the connection
20:27 < mmcgrath> #topic Infrastructure -- Eye in know db. - INNODB
20:27 -!- fedbot changed the topic of #fedora-meeting to: Infrastructure -- Eye in know db. - INNODB
20:27 < mmcgrath> ricky: this one's you. Talk about your plans, what's going on, what's going wrong, etc.
20:27 < smooge> is that a rock band?
20:27 < ricky> Any MySQL experts around, by the way? :-)
20:27 < mmcgrath> ricky: abadger1999 is a mysql expert
20:27 < Jeff_S> ricky: for some definition of expert
20:27 < mmcgrath> :-P
20:28 < ricky> Part of the big outages we've seen since the merge seems to be due to mysql backups (and smolt's stats refresh script, which might be a separate problem)
20:28 < ricky> We've seen this behavior with the zabbix database, where the backup would lock entire tables
20:28 < abadger1999> ricky: Yep, of the yum erase '*ysql' ; yum install 'postgres*' variety
20:28 < ricky> abadger1999: Hehe
20:28 * mmcgrath notes we've always had a small problem with backups and outages. But they've been tiny blips. Lately they've been throwing nagios alerts.
20:29 < smooge> how many mysql databases do we have?
20:29 < ricky> We'd like to move to using the --single-transaction option to mysqldump, which combined with InnoDB, should make backups not lock the entire table
20:30 < Jeff_S> ricky: yes!
20:30 < ricky> THe main mysql usage we have is mediawiki, smolt, and zabbix
20:30 < ricky> Although we have a few others for stuff like cacti, prelude/prewikka, etc.
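
The backup change ricky describes could look roughly like the sketch below. It builds a `mysqldump` command line with the `--single-transaction` option named in the discussion, plus a query against `information_schema.tables` to list tables that are not yet InnoDB (and so would not get a consistent snapshot). The helper names are hypothetical, the database name `smolt` is used only as an example, and nothing here reflects the actual Fedora backup scripts.

```python
import shlex


def build_dump_command(database, single_transaction=True):
    """Build a mysqldump argument list for one database (hypothetical helper)."""
    cmd = ["mysqldump"]
    if single_transaction:
        # Dump inside one transaction so InnoDB tables are read from a
        # consistent snapshot instead of being locked for the whole dump.
        cmd.append("--single-transaction")
    cmd.append(database)
    return cmd


def non_innodb_tables_query(database):
    """SQL listing the tables in `database` that are not InnoDB.

    The name is interpolated directly, so only pass trusted values.
    """
    return (
        "SELECT table_name, engine FROM information_schema.tables "
        "WHERE table_schema = '%s' AND engine <> 'InnoDB';" % database
    )


print(" ".join(shlex.quote(arg) for arg in build_dump_command("smolt")))
print(non_innodb_tables_query("smolt"))
```

The command list can be handed to `subprocess.run` as-is; it is only printed here so the sketch has no side effects.
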
20:30 < Jeff_S> ricky: FWIW, we've also had good luck with http://www.zmanda.com/backup-mysql.html (community edition)
20:30 < smooge> ricky, are they seperate servers or one single one
20:30 -!- kolesovdv [n=kolesovd@xxxxxxxxxxxxx] has joined #fedora-meeting
20:30 < ricky> Jeff_S: Thanks, I'll take a look at that later
20:30 < ricky> smooge: They're all on db1
20:30 < mmcgrath> smooge: all mysql db's are on db1
20:31 < ricky> So far, the biggest pain we've had so far is the host_links table in smolt
20:31 < mmcgrath> ricky: and how big is it?
20:31 < mmcgrath> O:-)
20:31 < ricky> It has above 70M rows, and I haven't gotten a single successful conversion to InnoDB yet.
20:31 < ricky> And the thing with --single-transaction is that the tables need to be InnoDB to be sure that everything gets dumped in a consistent state
20:31 < Jeff_S> but single-transaction will probably solve your main problem of locking the table(s)
20:32 < abadger1999> ricky: We're able to dump that table? Are we able to reload it except as innodb?
20:32 < smooge> wow thats quite a bit
20:32 < mmcgrath> ricky: and what are the downsides to innodb? (space, etc, etc)
20:32 < abadger1999> slower
20:32 < Jeff_S> mmcgrath: slower at certain operations
20:32 < ricky> So the approaches that we've tried so far are: converting using alter table, and sedding a dump to change the table type, and loading it.
20:32 < mmcgrath> how much slower?
20:32 < ricky> The first didn't finish after some large number of hours, and the second is going now.
20:32 < ricky> mmcgrath: I'm actually not that sure about the downsides yet. Apparently loading huge tables is a huge pain.
20:33 < mmcgrath> ricky: I'm going to want render-stats metrics too
20:33 < Jeff_S> mmcgrath: depends on the dataset & queries. the locking though more than makes up for it IMO
20:33 < ricky> Also, some tables needed MyISAM for full text search - the only table affected by this is mediawiki's searchindex tables
20:33 < abadger1999> :-(
20:33 < ricky> (Which is just a copy of another InnoDB table, I believe)
20:33 < mmcgrath> ricky: and, in theory, we'll be able to get rid of that when we have a fedora search engine.
20:33 < ricky> Hopefully.
20:34 < ricky> Anyway, we'll probably have a mysql outage some time in the future once we get a successful test in staging.
20:34 < Jeff_S> mmcgrath: one of our past employees wrote this, I think it explains the reasons for using InnoDB pretty well http://tag1consulting.com/MySQL_Engines_MyISAM_vs_InnoDB
20:34 < mmcgrath> ricky: yeah, how have the other conversions gone?
20:34 < ricky> what might be the case now is that maybe our configs aren't tuned for large innodb tables.
20:34 < smooge> ok what books/sites should I read to catch up how to help this. (DB's are not my specialty :/)
20:35 -!- hanthana [n=hanthana@xxxxxxxxxxxx] has quit Remote closed the connection
20:35 < ricky> mmcgrath: All of the other tables in the smolt db other than host_links have finished in <20 minutes
20:35 < ricky> Apart from the smolt db, most of the mediawiki db is already innodb
20:36 < ricky> The other databases that need conversions are: cacti, prelude._format, prewikka, and transifex (which isn't used anymore anyway)
20:36 < mmcgrath> ricky: I believe I went through and did some innodb conversions back in the day on some of those.
20:36 -!- openpercept_ [n=openperc@fedora/openpercept] has joined #fedora-meeting
20:36 -!- tatica is now known as tatica-out
20:36 -!- sharkcz [n=dan@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx] has quit "Ukončuji"
20:36 < ricky> prelude and prewikka are pretty much dispensable since that stuff is still being tested (lmacken even purged and recreated some of those dbs recently)
20:37 < mmcgrath> ricky: how big were those dumps?
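
The "sedding a dump" approach ricky mentions can be sketched as a small stream filter. This is an illustration under stated assumptions, not the command that was actually run: it assumes the dump closes each `CREATE TABLE` with a line such as `) ENGINE=MyISAM DEFAULT CHARSET=latin1;`, and the `convert_engine` function, its `keep_myisam` parameter, and the sample dump lines are all hypothetical. Only lines starting with `)` are rewritten so that row data containing the same text is left alone, and tables that still need MyISAM for full text search (such as mediawiki's searchindex) can be skipped.

```python
import re

# Start of a table definition in the dump, capturing the table name.
CREATE_TABLE = re.compile(r"^CREATE TABLE `([^`]+)`")
# Table-options line that closes a CREATE TABLE statement (assumed layout).
TABLE_OPTIONS = re.compile(r"^\)(.*)\bENGINE=MyISAM\b")


def convert_engine(lines, keep_myisam=()):
    """Yield dump lines with ENGINE=MyISAM changed to ENGINE=InnoDB.

    Tables named in `keep_myisam` are passed through unchanged.
    """
    current_table = None
    for line in lines:
        created = CREATE_TABLE.match(line)
        if created:
            current_table = created.group(1)
        if current_table not in keep_myisam:
            line = TABLE_OPTIONS.sub(r")\g<1>ENGINE=InnoDB", line)
        yield line


# Hypothetical dump fragment, for illustration only.
dump = [
    "CREATE TABLE `host_links` (\n",
    "  `id` int(11) NOT NULL\n",
    ") ENGINE=MyISAM DEFAULT CHARSET=latin1;\n",
    "INSERT INTO `host_links` VALUES (1);\n",
    "CREATE TABLE `searchindex` (\n",
    "  `si_text` mediumtext\n",
    ") ENGINE=MyISAM DEFAULT CHARSET=latin1;\n",
]
converted = list(convert_engine(dump, keep_myisam={"searchindex"}))
print("".join(converted), end="")
```

Because it works line by line, the same function can wrap an open file object, so a multi-gigabyte dump does not have to fit in memory.
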
20:37 -!- openpercept [n=openperc@fedora/openpercept] has quit Nick collision from services.
20:37 < ricky> So smolt is basically the big hurdle - although I have some questoins about the smolt upgrade and the db changes there
20:37 < ricky> The dump of the smolt database is 2.5G
20:37 * lmacken looks at the time, and rolls in late
20:37 < mmcgrath> ricky:
20:38 < mmcgrath> alter table host modify column cpu_model varchar(80);
20:38 < mmcgrath> alter table host add column cpu_stepping int(11) DEFAULT NULL;
20:38 < mmcgrath> alter table host add column cpu_family int(11) DEFAULT NULL;
20:38 < mmcgrath> alter table host add column cpu_model_num int(11) DEFAULT NULL;
20:38 < mmcgrath> that's the smolt upgrade.
20:38 < ricky> mmcgrath: Oh, OK - that's no problem at all then.
20:38 < ricky> The host table took <20 minutes, so we can do that before or after, and it's fine
20:38 * mmcgrath doesn't really even know what "int(11)" means
20:38 < lmacken> have you guys been using SQLAlchemy-migrate for that stuff? or doing it by hand?
20:38 < mmcgrath> I need to look that up :)
20:39 < mmcgrath> lmacken: honestly I can't stand alchemy-migrate so I've been doing it by hand.
20:39 < lmacken> mmcgrath: heh. I've never used it before
20:39 < mmcgrath> :)
20:39 < mmcgrath> ricky: ok, so anything else on the db front?
20:40 < ricky> Nope, but if anybody knows a lot about MySQL, let us know about your experiences with stuff like this
20:40 < ricky> Jeff_S: Thanks again for the links!
20:40 < mmcgrath> k
20:40 < Jeff_S> ricky: np. I'm glad to have our current DBA lend a hand if needed
20:40 < mmcgrath> #topic Infrastructure -- Posse
20:40 -!- fedbot changed the topic of #fedora-meeting to: Infrastructure -- Posse
20:41 < mmcgrath> So I haven't been as transparent with this as I should be
20:41 < mmcgrath> It's basically this
20:41 < mmcgrath> #link http://teachingopensource.org/index.php/POSSE_2009
20:41 < mmcgrath> we're providing some guests for a week for them to use.
20:41 < mmcgrath> +1 to open source :)
20:41 < ricky> Is it going to be on fasClient? :-)
20:41 < mmcgrath> ricky: nope, they're completely disconnected atm.
20:42 < mmcgrath> this is their first time through this.
20:42 < ricky> Ah, OK
20:42 < mmcgrath> maybe next year.
20:42 < mmcgrath> but all of these guests are on cnode1
20:42 -!- opossum1er [n=opossum1@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx] has joined #fedora-meeting
20:42 < mmcgrath> part of the cloud stuff.
20:42 < smooge> what servers are their guest on
20:42 < ricky> Hehe
20:42 < smooge> ah
20:42 < mmcgrath> I ended up not using osuosl1
20:42 < mmcgrath> since it's RHEL5 and for some reason xen+fedora 11 seems to be my white whale.
20:42 -!- opossum1er [n=opossum1@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx] has quit Client Quit
20:42 < mmcgrath> but cnode1 was F10, and using KVM worked just fine
20:43 < mmcgrath> Anyone have any other questions on that?
20:43 -!- Pikachu_2014 [n=Pikachu_@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx] has quit Read error: 60 (Operation timed out)
20:43 -!- Pikachu_2014 [n=Pikachu_@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx] has joined #fedora-meeting
20:43 < mmcgrath> Ok
20:43 < mmcgrath> #topic Infrastructure -- Open Floor
20:43 -!- fedbot changed the topic of #fedora-meeting to: Infrastructure -- Open Floor
20:43 < mmcgrath> anyone have anything they'd like to discuss?
20:44 < lmacken> I'm going to be deploying a new version of bodhi tonight/tomorrow to support EPEL :)
20:44 < lmacken> hopefully we'll be able to start queueing updates up tonight
20:44 < smooge> yeah
20:44 < lmacken> and ideally mashing repos tomorrow
20:45 < mmcgrath> lmacken: sounds good
20:45 < mmcgrath> and on a related note, I need to rebuild relepel1
20:45 * mmcgrath fail built it
20:45 < mmcgrath> anyone have anything else?
20:45 < mmcgrath> smooge: ?
20:46 < smooge> sorry
20:46 < smooge> keyboard problems
20:46 < smooge> I am checking to see what boxes need updates and I am working on seeing what ones I can do
20:46 < smooge> I should have that done by tonight/tomorrow.
20:47 < smooge> After that I am checking to see that func and puppet are working on the boxes
20:47 < smooge> and then finding out all the secret handshakes and such
20:47 < mmcgrath> heheh
20:47 < mmcgrath> fun times
20:47 < smooge> I should have the func done by friday and then it will be time to work on zabbix
20:48 < mmcgrath> smooge: excellent.
20:48 < mmcgrath> Ok, and with that if no one has anything else we'll close in 30
20:48 < smooge> zabbix will be next weeks project
20:48 < smooge> done
20:49 < mmcgrath> ok everyone, thanks for coming!
20:49 < mmcgrath> #endmeeting
20:49 -!- fedbot changed the topic of #fedora-meeting to: Channel is used by various Fedora groups and committees for their regular meetings | Note that meetings often get logged | For questions about using Fedora please ask in #fedora | See http://fedoraproject.org/wiki/Meeting_channel for meeting schedule
20:49 < fedbot> Meeting ended Thu Jul 2 20:49:12 2009 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot .
20:49 < fedbot> Minutes: http://www.scrye.com/~kevin/fedora/fedora-meeting/2009/fedora-meeting.2009-07-02-20.00.html
20:49 < fedbot> Log: http://www.scrye.com/~kevin/fedora/fedora-meeting/2009/fedora-meeting.2009-07-02-20.00.log.html