Re: Some more test on ingress, ifb, fwmark

Linux Advanced Routing and Traffic Control


 



On Wed, 2012-05-23 at 17:28 +0200, Marco Gaiarin wrote:
> Mandi! John A. Sullivan III
>   In chel di` si favelave...
> 
> OK, I'm ready: some comments on the u32 script for ingress. First,
> credit to the author:
> 
> > # tcfilters
> > # Version 0: February 22, 2012; John A. Sullivan III
> 
> 
> I've read your script and also http://b42.cz/notes/u32_classifier/,
> which still seems to be the only decent reference on u32.
> 
> I assume that you use the same ifb interface for several real
> interfaces, so I think ${IH} is just an integer that keeps things more
> readable. It seems that an integer of at most 2 digits is needed.
> 
> Go forward.
> 
> 
> 1) create three filter lists, with handles (''numbers'') ${IH}6:, ${IH}7:,
>  ${IH}8: ; because we don't need heavy matching on src or dst
> addresses, we create each list with the minimum of 1 bucket, so
> hashing is not used (handles a:b:c will have b=0).
> 
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 2 handle ${IH}6: u32 divisor 1
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 2 handle ${IH}7: u32 divisor 1
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 3 handle ${IH}8: u32 divisor 1
> 
> 
> 2) add filters to the ''main'' list:
> 
>   Directly match the 'netkey' (IPsec ESP, IP protocol 50) traffic and classify it to ${IH}:50
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 1 u32 match ip protocol 50 0xff flowid ${IH}:50
> 
>   Match TCP first, then UDP; you say:
>   ''We must sort TCP from UDP first because tcp and udp u32 matches are
>     identical unless the protocol is specified; sorting first allows for
>     simpler rules later''
>   On a match, jump to filter list ${IH}6: or ${IH}7: respectively.
>   I think that 'offset at 0 mask 0x0f00 shift 6 plus 0' computes the IP
>   header length, so that the subsequent matches still work when the
>   'options' field is set.
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 2 u32 match ip protocol 6 0xff link ${IH}6: offset at 0 mask 0x0f00 shift 6 plus 0
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 2 u32 match ip protocol 17 0xff link ${IH}7: offset at 0 mask 0x0f00 shift 6 plus 0
> 
> 
> 3) on filter list ${IH}6: (TCP)
> 
> > # DR backup - tos 0 0x10 means the minimize latency bit is not set; DR01 backup traffic would match interactive filters if we did not process this first
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 3 u32 ht ${IH}6:0 match ip src 192.168.124.120 match ip tos 0 0x10 match tcp dst 922 0xffff at nexthdr+2 flowid ${IH}:10
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 3 u32 ht ${IH}6:0 match ip src 192.168.124.120 match ip tos 0 0x10 match tcp src 922 0xffff at nexthdr+0 flowid ${IH}:10
> > # Interactive
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 6 u32 ht ${IH}6:0 match ip dst 208.46.93.8 flowid ${IH}:40
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 6 u32 ht ${IH}6:0 match tcp dst 922 0xffff at nexthdr+2 flowid ${IH}:40
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 6 u32 ht ${IH}6:0 match tcp src 922 0xffff at nexthdr+0 flowid ${IH}:40
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 6 u32 ht ${IH}6:0 match tcp dst 1022 0xffff at nexthdr+2 flowid ${IH}:40
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 6 u32 ht ${IH}6:0 match tcp src 1022 0xffff at nexthdr+0 flowid ${IH}:40
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 6 u32 ht ${IH}6:0 match tcp dst 22 0xffff at nexthdr+2 flowid ${IH}:40
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 6 u32 ht ${IH}6:0 match tcp src 22 0xffff at nexthdr+0 flowid ${IH}:40
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 6 u32 ht ${IH}6:0 match tcp dst 3389 0xffff at nexthdr+2 flowid ${IH}:40
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 6 u32 ht ${IH}6:0 match tcp src 3389 0xffff at nexthdr+0 flowid ${IH}:40
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 6 u32 ht ${IH}6:0 match tcp dst 4443 0xffff at nexthdr+2 flowid ${IH}:40
> > # Send packets <64 bytes (u16 0 0xffc0 at 2) with only the ACK flag set (match u8 16 0xff at nexthdr+13) to the low latency queue
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 6 u32 ht ${IH}6:0 match u16 0 0xffc0 at 2 match u8 16 0xff at nexthdr+13 flowid ${IH}:40
> > # Web
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 8 u32 ht ${IH}6:0 match tcp dst 80 0xffff at nexthdr+2 flowid ${IH}:30
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 8 u32 ht ${IH}6:0 match tcp dst 443 0xffff at nexthdr+2 flowid ${IH}:30
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 8 u32 ht ${IH}6:0 match tcp dst 8080 0xffff at nexthdr+2 flowid ${IH}:30
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 8 u32 ht ${IH}6:0 match tcp dst 8443 0xffff at nexthdr+2 flowid ${IH}:30
> 
> Mmmh... it is not clear to me why you use 'nexthdr+0' with src and
> 'nexthdr+2' with dst... isn't 'src' already the same as 'nexthdr+0',
> and conversely 'dst' the same as 'nexthdr+2'?!
> 
> 
> 4) on filter list ${IH}7:0 (UDP)
> 
>   Link to filter list ${IH}8: for these matches.
>   Does 'u16 0 0xff00 at 2' mean 'non-fragmented packets'?
>   Or does it mean 'less than 256 bytes'?!
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 4 u32 ht ${IH}7:0 match ip dst 172.30.14.0/24 match u16 0 0xff00 at 2 link ${IH}8:
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 4 u32 ht ${IH}7:0 match ip dst 208.46.93.14 match u16 0 0xff00 at 2 link ${IH}8:
> 
> > # Prioritized UDP traffic
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 7 u32 ht ${IH}7:0 match udp dst 53 0xffff at nexthdr+2 flowid ${IH}:40
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 7 u32 ht ${IH}7:0 match udp src 53 0xffff at nexthdr+0 flowid ${IH}:40
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 7 u32 ht ${IH}7:0 match udp dst 500 0xffff at nexthdr+2 flowid ${IH}:40
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 7 u32 ht ${IH}7:0 match udp src 500 0xffff at nexthdr+0 flowid ${IH}:40
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 7 u32 ht ${IH}7:0 match udp dst 4500 0xffff at nexthdr+2 flowid ${IH}:40
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 7 u32 ht ${IH}7:0 match udp src 4500 0xffff at nexthdr+0 flowid ${IH}:40
> 
> 
> 5) on filter list ${IH}8:0 (UDP, more specific)
> 
>   OK, these match ports with a high bit set.
> > # VoIP - UDP packets to the VoIP network under 256 Bytes over port 1024
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 5 u32 ht ${IH}8:0 match udp dst 32768 0x8000 at nexthdr+2 flowid ${IH}:20
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 5 u32 ht ${IH}8:0 match udp dst 16384 0x4000 at nexthdr+2 flowid ${IH}:20
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 5 u32 ht ${IH}8:0 match udp dst 8192 0x2000 at nexthdr+2 flowid ${IH}:20
> > ${TC} filter ${ACTION} dev ${IIFB} parent ${IH}:0 protocol ip prio 5 u32 ht ${IH}8:0 match udp dst 4096 0x1000 at nexthdr+2 flowid ${IH}:20
> 
> 
> OK, after that work I think I am starting to understand something. I
> still have many questions.
> 
> 1) I probably need a good IP header poster on my office wall. ;)
> 
> 2) I have no clue about the relation between the filter priority (prio X)
>  and the filter item number in the handle (A:B:C, so C). E.g., if I
>  specify 'prio 1' with handle 'a:b:100', and next 'prio 100' with handle
>  'a:b:1', which one executes first?
> 
> 3) the whole 'hashing' topic is still a mystery; but for now I don't need
>  it... ;-)
> 
<snip>
Hi, Marco.  My apologies for not being able to respond specifically and
in depth.  I only have time to spit out bits we have already documented.
I do in fact have an 18 page document on building a test WAN environment
using HFSC, IFB, and netem which walks through all of these features one
at a time.  I hesitate to paste something that large into an email.
However, here is the excerpt from that document on the hash tables:

Linked filters
Well, to be safe, we probably want to change that single filter into a
linked filter just in case we ever hit the oddball case where there are
IP options and thus the IP header is 24 rather than 20 bytes.  That
would throw off the bits being used to match the sport as they would
then be 4 bytes further into the packet than expected.  To prevent that,
we will create a first filter which will filter on TCP packets (IP
protocol 6) and read the IP packet header length.  It will then set a
new offset calculated from that length to link to a second filter which
will search for the source port from that offset.
First, we create the new hash table (basically a new list of filters to
which we can link from the first, IP protocol based filter):
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 handle 6: u32 divisor 1
This creates the hash table with 1 bucket (divisor 1) and with a handle
of 6 so we can reference it.
Now we can link to it:
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 match ip protocol 6 0xff link 6: offset at 0 mask 0f00 shift 6 plus 0 eat
What in the world does that mean?
The link parameter is the number of the linked hash table (set of
filters) we will jump to.  By our convention, we will number them after
the IP protocol, e.g., we will use 17 for UDP.  Starting from offset 0,
we will grab two bytes and look at only the second half of the
first byte (mask 0f00); in other words, we want to read the IP header
length field of the packet.  This is the length of the IP header counted
in 32-bit (4-byte) words.  A typical header has 5 words = 20 bytes.
What is shift 6? In binary math, shifting six bits to the right is the
same as dividing by 64.  Why 64? This is easier to explain in binary.
Remember, we look at two bytes and mask off all but the last four bits of
the first byte.  With a typical IP header with 5 groups of 4 bytes, the
binary looks like this:
0000 0101 0000 0000 - value of the non-masked four bits = 5, i.e., 4 +
1.  The problem is, when viewed as part of the two bytes it is not 5
because it has eight binary zeros after it.  So we need to shift it by 8
bits to the right to get 5.  So why do we only shift 6? Because 5 is the
count of 4 byte units.  To get the number of bytes, we need to multiply
by four.  Multiplying by 4 is the same as shifting to the left by 2
bits.  8 to the right minus 2 to the left = 6 to the right.
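The mask-and-shift arithmetic can be checked with plain shell arithmetic.
This is only a sketch of the calculation, not of what tc does internally;
ip_header_bytes is a hypothetical helper name, and its argument stands for
the first two bytes of the IP header (version/IHL byte followed by the TOS
byte):

```shell
#!/bin/sh
# Sketch of the arithmetic behind "offset at 0 mask 0f00 shift 6".
# ip_header_bytes is a hypothetical helper, not part of tc.
ip_header_bytes() {
    # $1: first 16 bits of the IP header, e.g. 0x4500
    first16=$(( $1 ))
    # Keep only the IHL nibble, then shift right by 6
    # (8 right to isolate the value, 2 left to multiply by 4).
    echo $(( (first16 & 0x0f00) >> 6 ))
}

ip_header_bytes 0x4500   # IHL 5, no options: prints 20
ip_header_bytes 0x4600   # IHL 6, one 4-byte option word: prints 24
```

The second call corresponds to the 24-byte header case mentioned earlier.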
The plus 0 means we are not adding any additional bytes to what we
calculated by the mask and the shift.  The eat means we are going to
automatically add this offset to any explicit or implicit offsets in the
link to which we are about to jump.  If we do not use it, we must
specify the offset for the upper layer header (e.g., TCP or UDP) with
nexthdr+<upper layer offset>.  This may be preferred if some of the new
hash table filters require both IP and upper layer matches.
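As a rough sketch of the variant without eat, the three commands below
mirror the pattern used in the quoted script: the link carries the
computed offset, and the linked filter mixes an IP-layer match with a TCP
match addressed through nexthdr+.  The device, parent, and handle are the
ones from this excerpt; the address and port are borrowed from the quoted
script purely for illustration.  The TC variable defaulting to "echo tc"
is an assumption added here so that the sketch prints the commands instead
of applying them; set TC=tc and run as root to apply them.

```shell
#!/bin/sh
# Dry run by default: only print the commands. Use TC=tc (as root) to apply.
TC="${TC:-echo tc}"

link_without_eat() {
    # Hash table 6: with a single bucket
    ${TC} filter add dev ifb0 parent 1:0 protocol ip prio 1 handle 6: u32 divisor 1
    # Link TCP packets to it, computing the IP header length but without "eat"
    ${TC} filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 match ip protocol 6 0xff link 6: offset at 0 mask 0x0f00 shift 6 plus 0
    # IP-layer match (plain offset) plus TCP destination port via nexthdr+2
    ${TC} filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match ip src 192.168.124.120 match tcp dst 922 0xffff at nexthdr+2 flowid 1:10
}

link_without_eat
```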
Now we can create the tcp filter in the hash table to which we want
to link:
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match tcp src 443 0xffff flowid 1:10
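
The length matches in the quoted script ('match u16 0 0xffc0 at 2' and
'match u16 0 0xff00 at 2') use the same kind of masking.  Bytes 2 and 3 of
the IP header hold the total packet length, so requiring the masked bits to
be zero puts an upper bound on the length, as the script's own comments
(<64 bytes, under 256 Bytes) indicate.  A minimal sketch of that test in
shell arithmetic, where length_matches is a hypothetical helper name:

```shell
#!/bin/sh
# Mimics the test behind "match u16 0 <mask> at 2": the packet matches
# when (total length AND mask) equals 0. length_matches is hypothetical.
length_matches() {
    # $1: IP total length in bytes, $2: mask
    if [ $(( $1 & $2 )) -eq 0 ]; then
        echo match
    else
        echo no-match
    fi
}

length_matches 63 0xffc0    # below 64 bytes: match
length_matches 64 0xffc0    # 64 bytes or more: no-match
length_matches 255 0xff00   # below 256 bytes: match
length_matches 256 0xff00   # 256 bytes or more: no-match
```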

