Re: [RFC] High Performance Packet Classifiction for tc framework

jamal <hadi@cyberus.ca> · 03 Aug 2003 14:14:28 -0400

Hi,

Apologies for late response. Its funny how i thought i was going to have
more time in the last 2 weeks but due to bad scheduling that wasnt the
case.

On Thu, 2003-07-17 at 09:13, Michael Bellion and Thomas Heinz wrote:
> Hi Jamal
> 
> You wrote:
> > This is good.I may have emailed you about this topic before?
> 
> Yes, but at that time we had not any concrete plans to
> integrate hipac into tc. We focussed on making nf-hipac as
> expressive as iptables first.
> 

Good goal.

> > It's a classifier therefore it makes sense ;->
> 
> :-)
> 
> > nice. What would be interesting is to see your rule update rates vs
> > iptables (i expect iptables to suck) - but how do you compare aginst any
> > of the tc classifiers for example?
> 
> Regarding the rule update rates we have not done any measurements
> yet but nf-hipac should be faster than iptables (even more when
> we have implemented the selective cloning stuff). On the other
> hand we are probably slower than tc because in addition to the
> insert operation into an internal chain there is the actual hipac
> insert operation. The insertion in the internal chain is quicker
> than the tc insert operation because we use doubly linked lists.
> 

I think i will have to look at your code to make comments.

> Regarding the matching performance one has to consider a few things.
> The currently existing tc classifiers are an abstraction for rules
> (iptables "slang") whilst hipac is an abstraction for a set of rules
> (including the chain semantics known from iptables), i.e. a table in
> the iptables world. 

Not entirely accurate. Depends which tc classifier. u32 hash tables are
infact like iptables chains.
Note, the concept of priorities which is used for conflict resolution as
well as further separating sets of rules doesnt exist in iptables.

> Of course it is possible to have some sort
> of extended classifying in tc too, 

True, i overlooked this. 

> i.e. you can add several fw or u32
> filters with the same prio which allows the filters to be hashed.

You can also have them use different priorities and with the continue
operator first clasify based on packet data then on metadata or on
another packet header filter.

> One disadvantage of this concept is that the hashed filters
> must be compact, i.e. there cannot be other classifiers in between.

I didnt understand this. Are you talking about conflict resolving of
overlapping filters?

> Another major disadvantage is caused by the hashing scheme.
> If you use the hash for 1 dimension you have to make sure that
> either all filters in a certain bucket are disjoint or you must have
> an implicit ordering of the rules (according to the insertion order
> or something). This scheme is not extendable to 2 or more dimensions,
> i.e. 1 hash for src ip, #(src ip buckets) many dst ip hashes and so
> on, because you simply cannot express arbitrary rulesets.

If i understood you - you are refering to a way to reduce the number of
lookups by having disjoint hashes. I suppose for something as simple as 
a five tuple lookup, this is almost solvable by hardcoding the different
fields into multiway hashes. Its when you try to generalize that it
becomes an issue. 

> Another general problem is of course that the user has to manually
> setup the hash which is rather inconvenient.
> 

Yes. Take a look at Werners tcng - he has a clever way to hide things
from the user. I did experimentation on u32 with a kernel thread which
rearranged things when they seemed out of balance but i havent
experimented with a lot of rules.

> Now, what are the implications on the matching performance:
> tc vs. nf-hipac? As long as the extended hashing stuff is not used
> nf-hipac is clearly superior to tc. 

You are refering to u32. You mean as long as u32 stored things in a
single linked list, you win - correct?

> When hashing is used it _really_
> depends. If there is only one classifier (with hashing) per interface
> and the number of rules per bucket is very small the performance should
> be comparable. As soon as you add other classifiers nf-hipac will
> outperform tc again.
> 

If we take a simple user interface abstraction like tcng which hides the
evil of u32 and then take simple 5 tuple rules - i doubt you will see
any difference. For more generic setup, the kernel thread i refer to
would work - but may slow insertion. 

> >>The tc framework is very flexible with respect to where filters can be
> >>attached. Unfortunately this cannot be mapped into one HIPAC data
> >>structure. Our current design allows to attach filters anywhere but
> >>only the filters attached to the top level qdisc would benefit from the
> >>HIPAC algorithm. Would this be a noticeable restriction?
> > 
> > I dont think so, but can ytou describe this restriction?
> 
> Well, we thought a little more about the design and came to the
> conclusion that it is not necessary to have a HIPAC qdisc at root
> but it suffices to ensure that the HIPAC classifier occurs only
> once per interface. As you can guess from the last sentence we
> dropped the HIPAC qdisc design and changed to the following scheme:
> 
> - there no special HIPAC qdisc at all :-)
> - the HIPAC classifier is no longer a simple rule but represents
>    the whole table
> - the HIPAC classifier can occur in any qdisc but at most once
>    per interface
>
> So, basically HIPAC is just a normal classifier like any other
> with two exceptions:
>    a) it can occur only once per interface
>    b) the rules within the classifier can contain other classifiers,
>       e.g. u32, fw, tc_index, as matches
> 

But why restriction a)? 
Also why should we need hipac to hold other filters when the
infrastructure itself can hold the extended filters just fine?
I think you may actually be trying to say why somewhere in the email,
but it must not be making a significant impression on my brain.

> There is just one problem with the current tc framework. Once
> a new filter is inserted into the chain it is not removed even
> if the change function of the classifier returns < 0
> (2.6.0-test1: net/sched/cls_api.c: line 280f).
> This should be changed anyway, shouldn't it?
> 

Are you refering to this piece of code?:
----
        err = tp->ops->change(tp, cl, t->tcm_handle, tca, &fh);
        if (err == 0)
                tfilter_notify(skb, n, tp, fh, RTM_NEWTFILTER);

errout:
        if (cl)
                cops->put(q, cl);
        return err;
---

change() should not return <0 if it has installed the filter i think.
Should the top level code be responsible for removing filters?

> >>- new HIPAC classifier which supports all native nf-hipac matches
> >>  (src/dst ip, proto, src/dst port, ttl, state, in_iface, icmp type,
> >>  tcpflags, fragments) and additionally fwmark
> > 
> > I would think for cleanliness fwmark or any metadata related
> > classification would be separate from one that is based on packet bits.
> 
> Since our classifier represents a table of rules and the rules are
> based on different matches, like src/dst ip and also fwmark (native)
> or u32 (subclassifier as match), this is definitely a clean design.
> 

I think we need to have the infrastructure in the main tc code. Its
already there - may not be very clean right now. 

> >>- the HIPAC classifier can only be attached to the HIPAC qdisc and vice
> >>  versa the HIPAC qdisc only accepts HIPAC classifiers
> > 
> > <puke> 
> > We do have an issue with being able to do extended classification
> > but building a qdisc for it is a no no. Building a qdisc that will force
> > other classifier to structure themselves after it is even a bigger sin.
> > Look at the action code i have (i can send you an updated patch); a
> > better idea is to make extended classifiers an action based on another
> > filter match. At least this is what i have been toying with and i dont
> > think it is clean enough. what we need is to extend the filtering
> > framework itself to have extended classifiers.
> 
> The new design should be much cleaner. Originally we also thought about
> making HIPAC a classifier only but we expected some problems related
> to this approach. Finally we discovered that this is not the case :)
> 

Consider what i said above. I'll try n cobble together some examples to
demonstrate (although it seems you already know this).
To allow for anyone to install classifiers-du-jour without being
dependet on hipac would be very useful. So ideas that you have for
enabling this cleanly should be moved to cls_api.

cheers,
jamal

-
: send the line "unsubscribe linux-net" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html