On 2009-12-31 15:13, Noob Centos Admin wrote:
> Just a concluding update to anybody who might be interested :)
>
> My apologies for blaming spamassassin in the earlier email. It was
> taking so long because of the real problem.
>
> Apparently the odd exim processes that were related to the mail loop
> problem I nipped were still the culprit. I had overlooked the fact that
> by the time I caught onto the mail loop issue, there were actually
> hundreds if not thousands of bounced and rebounced messages in the
> queue already. Attempting to deliver these messages queued before I
> terminated the mail loop was what those exim processes were trying to
> do.
>
> This would have been OK if not for the other problem. The user
> apparently went on a 2-week vacation on the 15th and thought it was a
> good idea to enlarge his mailbox before doing so. So there was this
> 2.5GB mailbox chock-full of both valid & rebounced mails, plus the
> queue of more rebounced mails. So every time exim attempted to add the
> queued mails to the user's account, the quota system rejected it. The
> cpu load was probably due to this never-ending ping-pong match between
> exim and the quota.
>
> Yeah, I can't help but feel this must be such a noob mistake, allowing
> that to develop without realizing it.
>
> Now that I've purged the queue of those bounced messages and done other
> housekeeping for that user, server load has finally gone back to the
> expected sub-1.0 levels so I can finally go and enjoy my holiday :)
>
>
>
> On 1/1/10, Noob Centos Admin <centos.admin@xxxxxxxxx> wrote:
>> I initiated services shutdown as previously planned and once the
>> external services like exim, dovecot, httpd and crond (because it kept
>> restarting these services) were stopped, the problem child stood out
>> like a sore thumb.
>>
>> There were two exim instances that didn't go away despite service exim
>> stop. Once I killed these two PIDs, the load average started dropping
>> rapidly.
>> After a minute or so, the server went back to a happy 0.2~0.3
>> load and disk activity became almost negligible.
>>
>> I think these, orphaned? zombied?, exim instances were related to a
>> mail loop problem I discovered earlier today where one of my clients on
>> holiday had a full mailbox and kept bouncing mails from a contact
>> whose site was suspended. Although I terminated that loop, it seemed
>> that exim had gotten those two instances stuck in limbo sucking up
>> processing power and hitting the disk somewhere unknown since they
>> weren't showing up in my exim logs.
>>
>> After observing a while, I brought the services back and once exim got
>> started, my load went back to 2.x ~ 3.x. Unfortunately while I was
>> typing this email, I realized it didn't stop there. I'm up to 4.x ~ 5.x
>> load level by now.
>>
>> So the application that is the cause of the load is definitely exim,
>> more specifically I think it's spamassassin because now that the mail
>> log entries are slow, I can read the spamd details and mails are
>> taking between 3 and 8 seconds to be checked.
>>
>> Thanks again to everybody who offered suggestions and advice and do
>> have a Happy New Year :)
>>
>>
>> On 1/1/10, Noob Centos Admin <centos.admin@xxxxxxxxx> wrote:
>>> Hi,
>>>
>>>> I do not know about now but I had to unload the modules in question.
>>>> Just clearing the rules was not enough to ensure that the netfilter
>>>> connection tracking modules were not using any cpu at all.
>>>
>>> Thanks for pointing this out. Being a noob admin as my pseudonym
>>> states, I'd assumed stopping apf and restarting iptables was
>>> sufficient. I'll have to look up unloading modules later.
>>>
>>>> /me shrugs. When I was the mta admin at Outblaze Ltd. (messaging
>>>> business now owned by IBM and called Lotus Live) spammers always ensured
>>>> I got called.
>>>> All they do is just press the big red button (aka start
>>>> the script/system) and then go and play while I would have to deal with
>>>> whatever was started.
>>>
>>> Based on the almost precise timing of around 9:30 to 5:30 India time,
>>> I'm inclined to think in my case it wasn't so much a spammer pressing
>>> a red button but a compromised machine in an office starting up when
>>> the user gets into the office and knocks off on time at 5:30 :D
>>>
>>>> I remember only one occasion when the spams were
>>>> launched but neutralized very soon because they were pushing a website
>>>> and I found a sample real early and so the anti spam system could just
>>>> dump the spams and knock out accounts being used to send the crap.
>>>
>>> Could I ask how I would knock out the accounts sending the crap if they
>>> are not within my systems?
>>>
>>>> First, try rmmod'ing the netfilter modules after you have cleared away
>>>> the state related rules to make sure that you are only using static
>>>> rules in netfilter...unless you have done that already..
>>>
>>> I think I'm only using static rules because after I restart iptables,
>>> I would then do a service iptables status to check my rules were in,
>>> and that list was very short compared to when APF was active.
>>>
>>> The good news is, I think I've fixed the big problem after doing my
>>> shutdown tests and returned to the original problem.
>>>
>>
If you (and other people) have learned something, it was worth it :).

Ugo
_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos