Re: null-pointer deref in ulogd2

Hi,

ARGH! I found my problem. Apparently Postgres was too slow on INSERT. Although the CPU load looked fine (and even IOWait wasn't out of the ordinary, 20% on one CPU), it seems to have blocked. After sacrificing consistency for speed by setting fsync=no in Postgres, I saw IOWait drop to 0.5%, and I now have 100 flows, all of them with start and end!

Looks like I spoke too early :-(

We have now passed peak time, which means about 450 Mbps of traffic, 60k concurrent sessions and about 300 flows/s in a 1-hour average.

First of all, ulogd has segfaulted again. Unfortunately I didn't get a coredump; I've restarted it in gdb now.

Second, the number of flow records without a time stamp is climbing again: by now 20% lack either the start or the end time.

ulogd=# SELECT count(*) FROM ulog2_ct;
  count
---------
 3278208
(1 row)

ulogd=# SELECT count(*) FROM ulog2_ct WHERE flow_start_sec IS NULL;
 count
--------
 270690
(1 row)

ulogd=# SELECT count(*) FROM ulog2_ct WHERE flow_end_sec IS NULL;
 count
--------
 306740
(1 row)

This seems to get worse the longer ulogd runs: shortly before the segfault there were 8000 flows in a row without an end time. The recent ones are fine again.

I'm still getting (very occasionally)
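
As a rough cross-check of the figure above, the three counts can be turned into percentages. This is only a sketch: the variable and function names are made up for illustration, and because the queries don't show how many rows lack both fields, only a lower and an upper bound for "lacking either" can be derived.

```python
# Counts copied from the psql output above.
total_rows = 3278208
missing_start = 270690   # flow_start_sec IS NULL
missing_end = 306740     # flow_end_sec IS NULL


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


pct_start = percent(missing_start, total_rows)
pct_end = percent(missing_end, total_rows)

# Rows lacking start OR end: at least the larger of the two groups
# (full overlap), at most their sum (no overlap at all).
pct_either_min = max(pct_start, pct_end)
pct_either_max = pct_start + pct_end

print(f"missing start: {pct_start:.2f}%")
print(f"missing end:   {pct_end:.2f}%")
print(f"missing either: between {pct_either_min:.2f}% and {pct_either_max:.2f}%")
```

So roughly 8% of the rows have no start time and roughly 9% have no end time; if the two groups barely overlap, that is a bit under 18% of all rows in total.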

Wed Jun 24 00:31:21 2009 <5> ulogd_inpflow_NFCT.c:656 Maximum buffer size (17367040) in NFCT has been reached. Please, consider rising `netlink_socket_buffer_size` and `netlink_socket_buffer_maxsize` clauses.

Does it make sense to increase the buffer even more? If 17 MB of buffer isn't enough, I don't think it can keep up with any setting. And now that fsync is disabled in Postgres, the box is really not that heavily loaded. CPUs 3&4 (serving the interrupts of the two NICs) are near 100% interrupt load at peak time, but 1&2 are >80% idle.

Does anyone else run this setup with similar numbers and can shed some light on tuning?

Oh, and we're dumping conntrack -L every minute. That works fine during the day with 30k connections, but it starts to segfault frequently with 60k connections in the evening. I'm also trying to get a coredump of that now.
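
One common reason for not getting a core file is a core size limit of 0 in the environment that starts the process. The sketch below is a hypothetical wrapper (the function name is invented here) that raises the soft core limit to the hard limit before running a command. It assumes the limit is the culprit, which I haven't verified; where the core file ends up also depends on the kernel's core_pattern setting.

```python
import resource
import subprocess


def run_with_core_dumps(argv):
    """Run argv with the soft RLIMIT_CORE raised to the hard limit.

    Returns the exit status as reported by subprocess
    (negative if the child was killed by a signal).
    """
    def raise_core_limit():
        # Runs in the child just before exec; raising the soft limit
        # up to the hard limit needs no special privileges.
        _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
        resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))

    return subprocess.run(argv, preexec_fn=raise_core_limit).returncode
```

For the per-minute dump, the call would be something like run_with_core_dumps(["conntrack", "-L"]); a return value of -11 would mean the child died with SIGSEGV.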

Bernhard