Pablo Neira Ayuso wrote:
Patrick McHardy wrote:
- /* New conntrack */
- IPCT_NEW_BIT = 0,
- IPCT_NEW = (1 << IPCT_NEW_BIT),
-
-...
+ IPCT_NEW = 0, /* new conntrack */
Why this change? Further down, you change the code to use
(1 << IPCT_*), which isn't really an improvement.
Oh, this is not intended to be an improvement. I needed to change this
to use bitopts operations, that's all. Or I could use IPCT_*_BIT and
change every reference to this in the whole conntrack table, but the
patch would be much bigger and noisy :).
The major part is in ctnetlink I think, and you do seem to touch every
one of them :) But OK, using _BIT for all the events doesn't seem too
appealing either.
+
+ e = nf_ct_ecache_find(ct);
+ if (e == NULL)
+ return;
+
+ set_bit(event, &e->cache);
This looks quite expensive, given how often this operation is performed.
Did you benchmark this?
I'll do that benchmark. I was initially using nf_conntrack_lock to
protect the per-conntrack event cache, I think that use bitops is a
bit better?
I actually meant the lookup done potentially multiple times per packet.
But I incorrectly thought that it was more expensive, that seems fine.
@@ -8,12 +8,14 @@ enum nf_ct_ext_id
NF_CT_EXT_HELPER,
NF_CT_EXT_NAT,
NF_CT_EXT_ACCT,
+ NF_CT_EXT_ECACHE,
NF_CT_EXT_NUM,
Quoting nf_conntrack_extend.c:
/* This assumes that extended areas in conntrack for the types
whose NF_CT_EXT_F_PREALLOC bit set are allocated in order */
Is that actually the case here? It might be beneficial to move
this before accounting if possible, I guess its used more often.
I think that accounting information is updated more often. Events are
only updated for very few packet specifically the setup and the
tear-down packets of a flow.
No, events are only sent to userspace every seldom. But f.i. TCP
conntrack generates at least one event per packet.
But what I actually meant was that its used more often I think.
Never mind, also forget about the PREALLOC question, I should
have read what I pasted :) Of course you could add the PREALLOC
flag, when events are enabled you add the extension for every
conntrack anyways.
@@ -738,6 +739,9 @@ nf_conntrack_in(struct net *net, u_int8_t pf,
unsigned int hooknum,
NF_CT_ASSERT(skb->nfct);
+ /* We may have pending events, deliver them and clear the cache */
+ nf_ct_deliver_cached_events(ct);
How does this guarantee that an event will be delivered in time?
As far as I can see, it might never be delivered, or at least not
until a timeout occurs.
Yes, that's the idea. In short, if we have cached events, we trigger a
delivery via ctnetlink. If the delivery fails, we keep the cached
events and the next packet will trigger a new delivery. We can lose
events but at worse case, the destroy will be delivered.
OK, so this is essentially replacing the delivery we did previously when
beginning to cache events for a different conntrack. Thats fine and
necessary
in case a packet triggering an event didn't made it past POST_ROUTING.
If we add a new conntrack extension to store the creation and
destruction time of the conntrack entries, we can have reliable
flow-accouting since, at least, the destroy event will be delivered.
In the case of the state synchronization, we aim to ensure that
long-standing flows survive failures, thus, under event loss, the
backup nodes would gett the state after some tries, specially for
long-standing flows since every packet would trigger another delivery
in case of problems.
*conntrackd* aims at that I think :) Anyways, on second - third thought, I
agree that this is fine, the previous guarantees weren't any stronger.
BTW, I have removed that line locally. So the next delivery try for
pending events is done only in nf_conntrack_confirm(), not twice, one
in nf_conntrack_in() and nf_conntrack_confirm() as it happens in this
patch.
That means we're only delivering events if a packet actually made
it through. I guess this is fine too, ideally we wouldn't even have
state transistions.
@@ -1123,6 +1123,8 @@ ctnetlink_change_conntrack(struct nf_conn *ct,
struct nlattr *cda[])
err = ctnetlink_change_helper(ct, cda);
if (err < 0)
return err;
+
+ nf_conntrack_event_cache(IPCT_HELPER, ct);
Why are we suddenly caching a lot more events manually?
Currently, in user-space triggered events, we are including in the
event message some fields that may not have been updated. Now we can
provide more accurante events by notifying only the conntrack object
fields that have been updated.
The patch is already pretty large, please seperate that part if
doesn't has to be in this patch to make it work.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html