Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:

> On Mon, Oct 10, 2016 at 9:28 AM, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> So as I already answered to Dave, I'm not actually sure that this was
>> the buggy code, or that my patch would make any difference at all.
>
> My patch does seem to fix things, and in fact the warning about "hook
> not found" now triggers.
>
> So I think the bug really was that the singly-linked list handling
> code did not correctly handle the case of not finding the entry, and
> then freed (incorrectly) the last one that wasn't actually unlinked.
>
> In fact, I get quite a few warnings (56 total) about 30 seconds after
> logging in:
>
> [ 54.213170] WARNING: CPU: 1 PID: 111 at net/netfilter/core.c:151
> nf_unregister_net_hook+0x8e/0x170
> ... repeat 54 times ...
> [ 54.445520] WARNING: CPU: 7 PID: 111 at net/netfilter/core.c:151
> nf_unregister_net_hook+0x8e/0x170
>
> and looking in the journal, the first one is (again) immediately
> preceded by that systemd-hostnamed service stopping:
>
> Oct 10 11:45:47 i7 audit[1546]: USER_LOGIN
> ...
> Oct 10 11:46:11 i7 audit[1]: SERVICE_STOP pid=1 uid=0
> auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0
> msg='unit=fprintd comm="systemd" exe="/usr/lib/systemd/systemd"
> hostname=? addr=? terminal=? res=success'
> Oct 10 11:46:13 i7 pulseaudio[1697]: [pulseaudio] bluez5-util.c:
> GetManagedObjects() failed: org.freedesktop.DBus.Error.NoReply: Did
> not receive a reply. Possible causes include: the remote application
> did not send a reply, the message bus security policy blocked the
> reply, the reply timeout expir
> Oct 10 11:46:13 i7 dbus-daemon[1003]: [system] Failed to activate
> service 'org.bluez': timed out
> Oct 10 11:46:20 i7 audit[1]: SERVICE_STOP pid=1 uid=0
> auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0
> msg='unit=systemd-hostnamed comm="systemd"
> exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=?
> res=success'
> Oct 10 11:46:20 i7 kernel: ------------[ cut here ]------------
> Oct 10 11:46:20 i7 kernel: WARNING: CPU: 1 PID: 111 at
> net/netfilter/core.c:151 nf_unregister_net_hook+0x8e/0x170
>
> so I do think it's something to do with some network startup service
> thing (perhaps dhcp, perhaps chrome, who knows) as I do my initial
> login.
>
> David - I think that also explains what was wrong with the old code.
> In the old code, this loop:
>
>     while (hooks_entry && nf_entry_dereference(hooks_entry->next)) {
>
> would exit with "hooks_entry" pointing to the last list entry (because
> ->next was NULL). Nothing was ever unlinked in the loop itself,
> because it never actually found a matching entry, but then after the
> loop it would free that last entry because it *thought* that was the
> match.
>
> My list rewrite fixes that.
>
> Anyway, I'm assuming it will come to me from the networking tree after
> more testing by the maintainers. You can add my
>
>     Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>
> to the patch, though.
>
> David, if you want me to just commit that thing directly, I can
> obviously do so, but I do think somebody should look at
>
> (a) that I actually got the priority list ordering right on the
> insertion side

It looks correct.

Reviewed-by: Aaron Conole <aconole@xxxxxxxxxx>

> (b) what it is that makes it try to unregister that hook that isn't
> on the list in the first place

This is still a problem, I think. I wasn't able to reproduce the issue
on a fedora-23 VM. My fedora 24 bare-metal system does trigger this,
though. Not sure what changed on the userspace/kernel interaction side
(not an excuse, but just an observation).

> but on the whole I consider this issue explained and solved. I'll
> continue to run with my patch on my machine (just not committed).

Okay. Very sorry for this, again.
> Linus
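
For readers following the thread, the failure mode described above (a
singly-linked list walk that exits on the last entry when nothing matches,
after which that entry is treated as the match) can be sketched in plain C.
This is only an illustration under stated assumptions, not the kernel code
and not the actual patch: struct entry, unlink_buggy() and unlink_fixed()
are hypothetical names, entries are matched on a made-up integer id, and RCU
and locking are left out entirely. The pointer-to-pointer walk in
unlink_fixed() is one common way to make "not found" unambiguous; whether
the real rewrite looks like this is not stated in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a hook entry; not the kernel's structure. */
struct entry {
    int id;
    struct entry *next;
};

/*
 * Buggy pattern: returns the entry the caller would go on to free.
 * If no entry matches, the loop stops on the tail (its ->next is NULL)
 * and the tail is returned even though it was never unlinked.
 */
struct entry *unlink_buggy(struct entry **headp, int id)
{
    struct entry *cur;

    assert(headp != NULL);
    cur = *headp;

    /* Head is handled as a special case. */
    if (cur && cur->id == id) {
        *headp = cur->next;
        return cur;
    }

    while (cur && cur->next) {
        struct entry *next = cur->next;

        if (next->id != id) {
            cur = next;
            continue;
        }
        cur->next = next->next;  /* unlink the match */
        cur = next;
        break;
    }

    /* BUG: on "not found" this is the still-linked last entry. */
    return cur;
}

/*
 * Pointer-to-pointer walk: returns the unlinked entry, or NULL when
 * nothing matched, so the caller can warn instead of freeing.
 */
struct entry *unlink_fixed(struct entry **headp, int id)
{
    struct entry **pp;

    assert(headp != NULL);

    for (pp = headp; *pp; pp = &(*pp)->next) {
        struct entry *cur = *pp;

        if (cur->id == id) {
            *pp = cur->next;  /* works for head and interior alike */
            return cur;
        }
    }
    return NULL;
}
```

With a three-entry list and a lookup for an id that is not present,
unlink_buggy() hands back the tail while its predecessor still points at
it, so freeing the result would leave a dangling ->next. unlink_fixed()
returns NULL in the same situation, which is the case where a "hook not
found" style warning belongs.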