Re: scheduling while atomic followed by oops upon conntrackd -c execution

Hi Pablo,

On 04/03/2012 11:01, Pablo Neira Ayuso wrote:
> Hi Kerin,
>
> On Sat, Mar 03, 2012 at 06:47:27PM +0000, Kerin Millar wrote:
>> Hi,
>>
>> On 03/03/2012 13:30, Pablo Neira Ayuso wrote:
>>> I just posted another patch to the ML that is a relative fix to
>>> Jozsef's patch. You have to apply that as well.
>>
>> I've now tested 3.3-rc5 with the addition of the above mentioned
>> follow-on patch. The behaviour during conntrackd -c execution is
>> clearly much improved - in so far as it doesn't generate much noise
>> - but the crash that follows remains. Here's a netconsole capture:-
>>
>> http://paste.pocoo.org/raw/560439/
>
> Great to know :-).

I apologize, but I think I may have led you astray on the nf_nat issue. At the time of submitting my original report, I now believe that the nf_nat module wasn't loaded prior to starting conntrackd, although it was definitely available. For all tests that followed, however, I am entirely certain that the nf_nat module was loaded in advance. The upshot is that my claim that things had improved may have been premature; to be sure that things are improving, I need to test specifically under both circumstances - that is, both with and without the module loaded in advance.
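
To be clear, by "loaded in advance" I simply mean checking for, and if need be inserting, the module before conntrackd is started; roughly along these lines:

# verify that nf_nat is present before conntrackd starts; load it if not
lsmod | grep -qw nf_nat || modprobe nf_nat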

Following my own advice then, I first tried going through my test case *without* loading nf_nat in advance. Alas, conntrackd -c triggered hard lockups and didn't return to prompt. Here are the results:-

http://paste.pocoo.org/raw/561350/

In case it matters, the existing ssh session continued to respond to input but I was no longer able to initiate any new sessions.


> Regarding your previous email: I'm sorry, from reading it I thought you
> were using 2.6.32, which was not the case; your configuration is
> perfectly reasonable.
>
> It seems we still have problems regarding early_drop, but this time
> with reliable event delivery enabled (15 seconds is the time required
> to retry sending the destroy event).
>
> If you can test the following patch, I'd appreciate it.

Gladly. I applied the patch to my 3.3-rc5 tree, which is still carrying the two patches discussed earlier in the thread. I then went through my test case under normal circumstances, i.e. with all firewall rules in place, nf_nat confirmed present before starting conntrackd, and so on. Again, conntrackd -c did not return to prompt. Here are the results:-

http://paste.pocoo.org/raw/561354/

Well, at least there was no oops this time. I should also add that the patch was present for both of the tests mentioned in this email.

---
Incidentally, I found out why the internal cache on the master was filling up to capacity. It was apparently due to the use of "iptables -I PREROUTING -t raw -j CT --ctevents assured". Perhaps I'm missing something, but doesn't this stop events such as new and destroy from being propagated? An inspection with conntrack -E suggests so. Once I removed the above rule, I could see destroy events being propagated, and the number of active connections in the cache no longer exceeded my chosen limit of 2097152 ...

# conntrack -S | head -n1; conntrackd -s | head -n2
entries                 725826
cache internal:
current active connections:          1409472
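
For reference, here's the rule in question, along with the variant that I assume would be needed if one wanted to keep the CT target while still having new and destroy events delivered to conntrackd (untested; event names as listed by the xt_CT extension):

# the rule I removed, which restricts event generation to "assured" only
iptables -I PREROUTING -t raw -j CT --ctevents assured

# presumed alternative: name the event types explicitly, so that new and
# destroy events are still generated for conntrackd to pick up
iptables -I PREROUTING -t raw -j CT --ctevents new,destroy,assured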

Whatever the case, I'm quite happy to go without this rule as these systems are coping fine with the load incurred by conntrackd.

Cheers,

--Kerin
