Patrick McHardy wrote:
> Pablo Neira Ayuso wrote:
>> Patrick McHardy wrote:
>>> Pablo Neira Ayuso wrote:
>>
>>> But unless I'm missing something, there's nothing wrong with this
>>> as long as the error is ignored. The fact that something was received
>>> by some listener doesn't have any meaning anyways, it might have
>>> been "ip monitor". Which somehow raises doubt about your proposed
>>> interface change though, I think anything that wants a reliable
>>> answer whether a packet was delivered to a process handling it
>>> appropriately should use unicast.
>>
>> Don't get me wrong, I agree with you that all netlink_broadcast callers
>> in the kernel should ignore the return value...
>>
>> ... unless they have "some way" (like in Netfilter) to make event
>> delivery reliable: I have attached a patch that I didn't send you yet,
>> I'm still reviewing and testing it. It adds an entry to /proc to enable
>> reliable event delivery over netlink by dropping packets whose events
>> were not delivered, you mentioned that possibility once during one of
>> our conversations ;).
>
> I know, but in the mean time I think its wrong :) The delivery
> isn't reliable and what the admin is effectively expressing by
> setting your sysctl is "I don't have any listeners besides the
> synchronization daemon running". So it might as well use unicast.

No :), this setting means "state-changes over ctnetlink will be reliable
at the cost of dropping packets (if needed)"; it's an optional
trade-off. You may also have other listeners, such as a logging daemon
(ulogd). The option is useful there too: it ensures that ulogd doesn't
lose logging information, which may happen under very heavy load. This
option is *not* only oriented to state-synchronization.
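To make the meaning of the setting concrete, here is a minimal
user-space sketch of the policy. It is not the attached patch and not
kernel code: struct listener, deliver_event() and handle_packet() are
hypothetical names, and the per-listener queue limit is only an
assumption that stands in for a full netlink receive buffer.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one listener's receive queue (not kernel code). */
struct listener {
    int queued;    /* events currently waiting to be read */
    int capacity;  /* assumed queue limit; beyond it, delivery fails */
};

enum verdict { VERDICT_ACCEPT, VERDICT_DROP };

/*
 * Try to queue one event for every listener.
 * Returns -ENOBUFS if at least one listener could not take it.
 * Simplification: listeners with room still receive the event.
 */
static int deliver_event(struct listener *ls, int n)
{
    int err = 0;

    for (int i = 0; i < n; i++) {
        if (ls[i].queued >= ls[i].capacity)
            err = -ENOBUFS;   /* this listener misses the event */
        else
            ls[i].queued++;
    }
    return err;
}

/*
 * reliable == false: the delivery error is ignored (default behaviour).
 * reliable == true:  the packet is dropped when its event was not
 *                    delivered to every listener.
 */
static enum verdict handle_packet(struct listener *ls, int n, bool reliable)
{
    int err = deliver_event(ls, n);

    if (err < 0 && reliable)
        return VERDICT_DROP;
    return VERDICT_ACCEPT;
}
```

With two listeners (say, a synchronization daemon and a logging daemon)
the same check covers both, which is why the option is not tied to
state-synchronization alone.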
Using unicast would not be any different from broadcast, as you may have
two listeners receiving state-changes from ctnetlink via unicast, so
the problem would be basically the same as above if you want reliable
state-change information at the cost of dropping packets.

BTW, the netlink_broadcast return value looked inconsistent to me before
the patch. It returned ENOBUFS if it could not clone the skb, but zero
when at least one message was delivered. How useful can this return
value be for the callers? I would expect behaviour similar to that of
netlink_unicast (reporting an EAGAIN error when it could not deliver the
message), even if most callers should ignore the return value as it is
of no help to them.

>> I'm aware of that this option may be dangerous if used by a buggy
>> process that trigger frequent overflows but it the cost of having
>> realible logging for ctnetlink (still, this behaviour is not the one by
>> default!).
>>
>> And I need this option to make conntrackd synchronize state-changes
>> appropriately under very heavy load: I've testing the daemon with these
>> patches and it reliably synchronizes state-changes (my system were 100%
>> busy filtering traffic and fully synchronizing all TCP state-changes in
>> near real-time effort, with a noticeable performance drop of 30% in
>> terms of filtered connections).
>
> So you're dropping the packet if you can't manage to synchronize.
> Doesn't that defeat the entire purpose of synchronizing, which is
> *increasing* reliability? :)

This reduces communications reliability a bit under very heavy load,
yes, because it may drop some packets, but it adds reliable flow-based
logging accounting / state-synchronization in return. Both refer to
reliability, in different contexts. In the end, it's a trade-off world.
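Coming back to the return value point above, the two behaviours can be
compared with a small user-space model. This is only a sketch under
stated assumptions: struct bcast_result, ret_before() and ret_expected()
are hypothetical names, the -ESRCH case for "nobody received it" is an
assumption that is not stated in this thread, and -EAGAIN is used in
ret_expected() only to mirror the netlink_unicast comparison.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Outcome of one simulated broadcast attempt (hypothetical model). */
struct bcast_result {
    bool clone_failed;  /* the message could not be copied */
    size_t delivered;   /* listeners that received the message */
    size_t failed;      /* listeners whose delivery failed */
};

/* Behaviour described above for the code before the patch. */
static int ret_before(const struct bcast_result *r)
{
    if (r->delivered > 0)
        return 0;         /* failures for other listeners are hidden */
    if (r->clone_failed)
        return -ENOBUFS;
    return -ESRCH;        /* assumption: nobody received the message */
}

/* Behaviour I would expect: any undelivered message is reported. */
static int ret_expected(const struct bcast_result *r)
{
    if (r->clone_failed || r->failed > 0)
        return -EAGAIN;   /* assumed code, as netlink_unicast reports */
    return 0;
}
```

For one delivered and one failed listener, ret_before() returns 0 while
ret_expected() returns -EAGAIN, which is the case that matters for a
caller that wants reliable delivery.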
At some point you have to choose which one you prefer: reliable
communications when the system is under heavy load, or reliable logging
(no gaps in the logs) / state-synchronization (the backup firewall is
able to follow the state-changes of the master under heavy load).

In my experiments, reaching 100% CPU consumption, the number of packets
dropped was in fact very small, but without this option the harm to
logging and state-synchronization reliability is considerable in the
long run: the backup starts getting unsynchronized (thus becoming
useless for increasing cluster reliability, while still consuming
resources), and in the case of logging you have to interpret the log
information while keeping the margin of error in mind.

BTW, I did not tell you: I can give you access to my testbed platform at
any time, of course ;).

-- 
"Los honestos son inadaptados sociales" -- Les Luthiers