On Tue, 26 Aug 2008 23:40:58 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@xxxxxxxxxxx> wrote:

> If you want to, a tcpdump from normal, working case wouldn't hurt either
> to show the "normal pattern" on network level and that is trivial to
> produce in no time now that you know the commands etc. I guess... :-)

Ok, here it is:
http://www.abusar.org/htb/dump-normal.log

Just port 995... I checked email, then received a message, then checked
again: just the normal behaviour.

> They might not be that interested until we have something more concrete
> than what we know currently... :-)

Ok :) And you're right, because if I disable frto and htb *and* the
problem goes away, there's a good chance it is something related to the
kernel. Or a mix of kernel and user space problems that shows up only
when frto and/or htb are used.

> Can you explain a bit more. Does it resolve during it or some time after
> it? And more importantly how do you know that it resolves? Ie., what is
> the normal behavior (be more specific than "it works" :-), how do know
> that it's working).

Ok. For example:

1) the connection is normal, then suddenly it stalls. I cannot receive
   mail, download nntp messages, access ftp etc.
2) On my client machine I run "nmap -sS server" and...
3) ...immediately the connection is not stalled anymore.

Now I remembered one thing and I'd like to ask a question (I hope it
isn't a stupid one): dynticks (tickless) was implemented for x86-64 in
the 2.6.24 kernel, and I started to use dynticks in 2.6.24. Could it be
affecting the server behaviour? I have dynticks enabled on all my
machines, but does it make sense to use it in a server environment?
Could dynticks cause this? So far I don't think so, but... who knows?
http://kernelnewbies.org/Linux_2_6_24#head-4edc562fa1b9fa8e5da5adaf1beab057237c325d

> It seems that either we lack some traffic between the parties or simply
> need to find out what the userspace is doing, and in the latter case what
> happens in the network might not be relevant at all. Is there possibility
> that we miss an alternative route by using the host rule for tcpdump (at
> the server)? Nmap starts at 22:26:26.613098, the last packet in the client
> log is at 22:26:01.452842. Alternatively, the port 995 was not the right
> one to track (though there's clearly this on network level visible problem
> with it too)... :-(

I tracked port 995 because I have problems reading email over pop3s
(995). Should I run tcpdump differently?

> You might jump into conclusions too quickly every now and then, more
> time might be needed to really ensure something is working. Obviously
> if any non-workingness is noticed, it's always a counter-proof even if
> long working periods occur in between.

Ok. It seems to be a complex issue. You're right. I need more patience ;)

> In syscall terms this ListenOverflow means that int listen(int sockfd, int
> backlog); (see man -S 2 listen) is given some size as backlog for those
> connections that are not yet accept()'ed, and that is exhausted when the
> ListenOverflow gets incremented (ie., if I'm not completely wrong :-)).

Hmm, interesting.

> You might want to look on dovecot how to make it accept more concurrent
> connections, perhaps the login_max_processes_count might the right one
> (I quickly glanced http://wiki.dovecot.org/LoginProcess) though this is
> somewhat site configuration dependant according to that page.

Yes, I have login_max_processes_count = 128 (the default) and I have
just a few users (only 10), so I think that's not the problem.
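
To keep an eye on the ListenOverflow counter mentioned above while a stall is happening, something like the following might help. It is only a rough sketch: the function name netstat_counter is made up for illustration, and I am assuming the counter shows up as "ListenOverflows" on a TcpExt line of /proc/net/netstat, where each header line of names is followed by a line of values with the same prefix.

```sh
#!/bin/sh
# Hypothetical helper (not from the thread): print one counter from a file
# laid out like /proc/net/netstat, i.e. pairs of lines sharing a prefix,
# the first holding counter names and the second holding their values.
# Usage: netstat_counter NAME [FILE]   (FILE defaults to /proc/net/netstat)
netstat_counter() {
    awk -v name="$1" '
        # A line whose prefix matches the previous line is a value line:
        # look up the wanted name in the remembered header fields.
        $1 == prev {
            for (i = 2; i <= nf; i++)
                if (hdr[i] == name) { print $i; exit }
        }
        # Remember this line so the next one can be matched against it.
        { prev = $1; nf = NF; for (i = 2; i <= NF; i++) hdr[i] = $i }
    ' "${2:-/proc/net/netstat}"
}
```

Running "netstat_counter ListenOverflows" before and after a stall and comparing the two numbers should show whether the listen backlog was exhausted in between. If the name is spelled differently on a given kernel, the function simply prints nothing.
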
> You could try setting up some script which does something along these
> lines and then redirect its during the event to some file (+ tcpdumping
> the thing obviously):
>
> while [ : ]; do
> 	date "+%s.%N"
> 	cat /proc/net/{netstat,snmp}
> 	sleep 1
> done

Ok. You're helping a lot. Thanks Ilpo ;)

--
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html