Hi,

I posted the message below to the netfilter list yesterday. Since Patrick asked that crash reports also be sent to netfilter-devel, I'm posting it here as well. Please let me know if you need additional information.

-Rainer

-------------------------------------------------------------------

Hi,

I'm using conntrackd and keepalived (for a pair of redundant firewalls in an active/backup configuration), and from time to time I experience kernel panics or other random system crashes. I'm new to conntrackd, so it's likely that I just made some mistakes in my configuration.

The crashes occur when keepalived switches the backup host to active. I can trigger the kernel panic manually by executing "conntrackd -c" on the backup host (sometimes "conntrackd -c" executes successfully, but it crashes at the latest when I repeat the command a few times).

This is my setup:

* Ubuntu Linux with the distribution's kernel 2.6.24-18-server
* libnfnetlink 0.0.38 (compiled from sources)
* libnetfilter-conntrack 0.0.94 (compiled from sources)
* conntrack-tools 0.9.7 (compiled from sources)

My conntrackd.conf is attached below. Does anybody have an idea why I get these crashes and what I could do to avoid them?

Best regards,
-Rainer

---- /etc/conntrackd.conf -----
Sync {
    Mode FTFW {
        ResendBufferSize 262144
        CommitTimeout 180
        ACKWindowSize 20
    }
    Multicast {
        IPv4_address 225.0.0.50
        IPv4_interface 10.0.1.204 # IP of dedicated link
        Interface eth0
        Group 3780
    }
    Checksum on
}

General {
    HashSize 8192
    HashLimit 65535
    LockFile /var/lock/conntrack.lock
    UNIX {
        Path /tmp/sync.sock
        Backlog 20
    }
    SocketBufferSize 262142
    SocketBufferSizeMaxGrown 655355
}

IgnoreTrafficFor {
    IPv4_address 127.0.0.1 # loopback
    IPv4_address 10.0.1.203
    IPv4_address 10.0.1.204
    IPv4_address 10.0.0.1
    IPv4_address 10.9.62.1
    IPv4_address 10.9.62.203
    IPv4_address 10.9.62.204
}

IgnoreProtocol {
    ICMP
    IGMP
    VRRP
}
---------------------------------------------

Some additional information:

I've now turned on logging to syslog in conntrackd.conf to see if I can get more information about my problem.

1.) Now I can see lots of the following messages in the syslog:

Jun 9 18:52:49 fw1b conntrack-tools[7385]: Received seq=1213034051 before expected seq=1213034052

2.) When I do "conntrackd -c" I get:

Jun 9 18:52:50 fw1b conntrack-tools[10678]: committing external cache
Jun 9 18:52:50 fw1b conntrack-tools[10678]: commit: Invalid argument
[...]
Jun 9 18:52:50 fw1b conntrack-tools[10678]: commit: Cannot allocate memory
[...]
Jun 9 18:52:50 cfw1b conntrack-tools[10678]: Committed 2 new entries
Jun 9 18:52:50 cfw1b conntrack-tools[10678]: 89 entries can't be committed

3.) Since I turned on logging, "conntrackd -c" now seems to be more stable. At first I thought my problem was fixed. But then I started a script which executed this command repeatedly in a loop.
It eventually triggered a kernel oops:

# while sleep 1 ; do conntrackd -c ; done

fw1b kernel: [ 6714.379206] ------------[ cut here ]------------
fw1b kernel: [ 6714.381285] invalid opcode: 0000 [#1] SMP
fw1b kernel: [ 6714.388793] Process kjournald (pid: 2267, ti=c79ac000 task=c5121140 task.ti=c79ac000)
fw1b kernel: [ 6714.388824] Stack: c5c40a80 00000000 c5c40a80 00000000 c1422000 c5121140 c50e0000 00000002
fw1b kernel: [ 6714.389418] c79adf84 c79adf7c 00000000 c04980e0 c049b480 c049b480 c049b480 c79adf88
fw1b kernel: [ 6714.390152] 00000286 c013b547 c79882ec ffffffff c79882ec 00000286 c013b5c5 00000286
fw1b kernel: [ 6714.391701] Call Trace:
fw1b kernel: [ 6714.392871] [<c013b547>] lock_timer_base+0x27/0x60
fw1b kernel: [ 6714.393652] [<c013b5c5>] try_to_del_timer_sync+0x45/0x50
fw1b kernel: [ 6714.394210] [<c8ace740>] kjournald+0xa0/0x200 [jbd]
fw1b kernel: [ 6714.394780] [<c0145fc0>] autoremove_wake_function+0x0/0x40
fw1b kernel: [ 6714.395354] [<c8ace6a0>] kjournald+0x0/0x200 [jbd]
fw1b kernel: [ 6714.395907] [<c0145d02>] kthread+0x42/0x70
fw1b kernel: [ 6714.396440] [<c0145cc0>] kthread+0x0/0x70
fw1b kernel: [ 6714.396994] [<c010900b>] kernel_thread_helper+0x7/0x10
fw1b kernel: [ 6714.397580] =======================
fw1b kernel: [ 6714.398150] Code: ff f3 90 8b 03 a9 00 00 20 00 0f 84 1f f5 ff ff eb ef 0f 0b eb fe f3 90 8b 03 a9 00 00 20 00 0f 84 af f3 ff ff eb ef 0f 0b eb fe <0f> 0b eb fe 0f 0b eb fe 56 53 89 d3 8d 34 90 eb 16 8d b4 26 00
fw1b kernel: [ 6714.399468] EIP: [<c8acb958>] journal_commit_transaction+0xd88/0xd90 [jbd] SS:ESP 0068:c79adf2c

At first glance this oops seems to be unrelated, because it happens within kjournald. But it is triggered by the "conntrackd -c" command, so I suspect (rather naively) that conntrackd calls some kernel function which corrupts some kernel memory (stack?), causing a crash later on.

Does anybody have a hint what could be wrong with my setup?

Best regards,
-Rainer
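
P.S. In case it helps anyone compare runs: the commit messages quoted under 2.) can be tallied per "conntrackd -c" run with a few lines of Python. This is only a rough sketch. It assumes syslog lines in exactly the format quoted above, and the function name summarize_commit_log is made up for illustration; it is not part of conntrack-tools.

```python
import re
from collections import Counter

# Patterns are assumptions based only on the syslog lines quoted in this mail.
# "commit: <reason>" lines report a single failed entry.
COMMIT_ERROR = re.compile(r"conntrack-tools\[\d+\]: commit: (?P<reason>.+?)\s*$")
# Summary lines printed at the end of a commit run.
COMMITTED = re.compile(r"Committed (?P<n>\d+) new entries")
FAILED = re.compile(r"(?P<n>\d+) entries can't be committed")


def summarize_commit_log(lines):
    """Tally commit errors and summary counts from syslog lines (hypothetical helper)."""
    errors = Counter()
    committed = 0
    failed = 0
    for line in lines:
        match = COMMIT_ERROR.search(line)
        if match:
            errors[match.group("reason")] += 1
            continue
        match = COMMITTED.search(line)
        if match:
            committed += int(match.group("n"))
            continue
        match = FAILED.search(line)
        if match:
            failed += int(match.group("n"))
    return {"errors": dict(errors), "committed": committed, "failed": failed}


if __name__ == "__main__":
    # The lines quoted under 2.), without the elided "[...]" parts.
    sample = [
        "Jun 9 18:52:50 fw1b conntrack-tools[10678]: committing external cache",
        "Jun 9 18:52:50 fw1b conntrack-tools[10678]: commit: Invalid argument",
        "Jun 9 18:52:50 fw1b conntrack-tools[10678]: commit: Cannot allocate memory",
        "Jun 9 18:52:50 cfw1b conntrack-tools[10678]: Committed 2 new entries",
        "Jun 9 18:52:50 cfw1b conntrack-tools[10678]: 89 entries can't be committed",
    ]
    print(summarize_commit_log(sample))
```

Feeding it the relevant part of the syslog (for example, the lines of one "conntrackd -c" run) gives the error reasons with their counts, next to the committed and failed totals.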