htb + netem + pfifo

Hullo Stephen & all,

We wanted to create several traffic classes, each with its own bandwidth limits
and different netem configuration. For this purpose we created a tree using
htb classification.
To each htb leaf class a netem qdisc was attached, and netem's internal tfifo
was replaced by pfifo.
A simplified configuration, with one class only, is shown below:

tc qdisc del dev enp12s0f1 root
tc qdisc add dev enp12s0f1 root handle 1: htb default 2
tc class add dev enp12s0f1 parent 1: classid 1:2 htb rate 100kbit
tc qdisc add dev enp12s0f1 parent 1:2 handle 2: netem limit 1000
tc qdisc add dev enp12s0f1 parent 2:1 pfifo limit 1000

But then we were observing total system failures.

The problem appeared when the queue of the pfifo was filled (load much higher
than the bandwidth limit allowed by htb). In such a case,
when the netem_dequeue function called

if (q->qdisc) {
	int err = qdisc_enqueue(skb, q->qdisc);
	...
}

qdisc_enqueue (enqueue to pfifo) dropped a packet, which is something htb
apparently did not like or expect in a dequeue function. The result was a
kernel panic.
(We checked that this is really the place by creating a pfifo modification
which did not drop the packet in such a situation, but only returned the
value NET_XMIT_DROP.)

Then we tried to solve the issue by inserting a condition that does not allow
netem to enqueue more packets than the limit of its internal qdisc,
if any. It is in the attached patch. With such a modification, the qdisc_enqueue
call in netem_dequeue should not fail, at least when using the pfifo qdisc.

As our general understanding of the netem and/or htb and/or other qdisc
implementations and/or the kernel is, I think, not sufficient, I would like
to ask you for help or any comments you could share with us.

Is the way we are going generally correct?
How nonsensical is our solution?

Thank you.
Vojtech Stepan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sch_netem.c.patch
Type: text/x-diff
Size: 837 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/netem/attachments/20150806/45ed28a4/attachment.bin>

From vojtech.stepan at yandex.ru  Fri Aug  7 12:29:46 2015
From: vojtech.stepan at yandex.ru (Vojtěch Štěpán)
Date: Fri, 7 Aug 2015 14:29:46 +0200
Subject: htb + netem + pfifo
In-Reply-To: <20150806134311.GA6840@xxxxxxxxxxxxxxxxx>
References: <20150806134311.GA6840@xxxxxxxxxxxxxxxxx>
Message-ID: <20150807122945.GA3072@xxxxxxxxxxxxxxxxx>

Hullo all,

The reason why we wanted to attach the fifo to netem is here:

http://www.linuxfoundation.org/collaborate/workgroups/networking/netem#How_to_reorder_packets_based_on_jitter.3F
[BEGIN CITATION]
Starting with version 1.1 (in 2.6.15), netem will reorder packets if the
delay value has lots of jitter.

If you don't want this behaviour then replace the internal queue
discipline tfifo with a pure packet fifo pfifo. The following example
has lots of jitter, but the packets will stay in order.

# tc qdisc add dev eth0 root handle 1: netem delay 10ms 100ms
# tc qdisc add dev eth0 parent 1:1 pfifo limit 1000
[END CITATION]

We think that it does not work very well. Netem, in netem_dequeue, takes
packets from its backlog in the order specified by time_to_send, and then
enqueues them to the fifo. So packets _are_ reordered (apart from the
problems mentioned earlier, caused by possible drops).

We propose to implement netem's internal fifo to get the desired behaviour.
Please see the attached patch.

# git diff --no-prefix v4.2-rc5 net/sched/sch_netem.c > sch_netem.c.patch

It is only a development version and there should probably be a configuration
parameter for it. (Now the behaviour is driven by the presence/non-presence
of a qdisc, which is nonsense.)

Generally, we have a feeling that using a qdisc attached to netem should be
avoided, at least with the current implementation. And we confess that so far
we don't have any idea how to do it better (and whether anyone wants that at all).

Please share with us your point of view on this.
We are definitely going to spend some more time with netem.

Thank you.
Best Regards,
Vojtech Stepan & Jakub Nepozitek
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sch_netem.c.patch
Type: text/x-diff
Size: 2815 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/netem/attachments/20150807/6f2b43b8/attachment.bin>

From stephen at networkplumber.org  Fri Aug  7 16:17:01 2015
From: stephen at networkplumber.org (Stephen Hemminger)
Date: Fri, 7 Aug 2015 09:17:01 -0700
Subject: htb + netem + pfifo
In-Reply-To: <6a382c73fe2f4f17b5fd46485b2a4853@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
References: <20150806134311.GA6840@xxxxxxxxxxxxxxxxx>
	<6a382c73fe2f4f17b5fd46485b2a4853@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
Message-ID: <20150807091701.2324c3a3@urahara>

On Fri, 7 Aug 2015 12:29:46 +0000
Vojtěch Štěpán <vojtech.stepan at yandex.ru> wrote:

> Hullo all,
> 
> The reason why we wanted to attach the fifo to netem is here:
> 
> http://www.linuxfoundation.org/collaborate/workgroups/networking/netem#How_to_reorder_packets_based_on_jitter.3F
> [BEGIN CITATION]
> Starting with version 1.1 (in 2.6.15), netem will reorder packets if the
> delay value has lots of jitter.
> 
> If you don't want this behaviour then replace the internal queue
> discipline tfifo with a pure packet fifo pfifo. The following example
> has lots of jitter, but the packets will stay in order.
> 
> # tc qdisc add dev eth0 root handle 1: netem delay 10ms 100ms
> # tc qdisc add dev eth0 parent 1:1 pfifo limit 1000
> [END CITATION]
> 
> We think that it does not work very well. Netem, in netem_dequeue, takes
> packets from its backlog in the order specified by time_to_send, and then
> enqueues them to the fifo. So packets _are_ reordered (apart from the
> problems mentioned earlier, caused by possible drops).
> 
> We propose to implement netem's internal fifo to get the desired behaviour.
> Please see the attached patch.
> 
> # git diff --no-prefix v4.2-rc5 net/sched/sch_netem.c > sch_netem.c.patch
> 
> It is only a development version and there should probably be a configuration
> parameter for it. (Now the behaviour is driven by the presence/non-presence
> of a qdisc, which is nonsense.)
> 
> Generally, we have a feeling that using a qdisc attached to netem should be
> avoided, at least with the current implementation. And we confess that so far
> we don't have any idea how to do it better (and whether anyone wants that at all).
> 
> Please share with us your point of view on this.
> We are definitely going to spend some more time with netem.
> 
> Thank you.
> Best Regards,
> Vojtech Stepan & Jakub Nepozitek

Netem has gone through lots of revisions on this.
Initially it allowed attaching an interior qdisc, then that was dropped after
a number of bugs, then the original ability to add an interior qdisc was fixed.

The interior qdisc must be work conserving for netem to work. A pfifo
should be fine.

Tfifo is the component that gets packets in order based on the timestamp
in skb->cb[]. If you replace tfifo with pfifo, packets are always sent
in arrival order, but there will be bursts if the jitter is larger than the
arrival rate.


From vojtech.stepan at yandex.ru  Mon Aug 10 09:35:47 2015
From: vojtech.stepan at yandex.ru (Vojtěch Štěpán)
Date: Mon, 10 Aug 2015 11:35:47 +0200
Subject: htb + netem + pfifo
In-Reply-To: <20150807091701.2324c3a3@urahara>
References: <20150806134311.GA6840@xxxxxxxxxxxxxxxxx>
	<6a382c73fe2f4f17b5fd46485b2a4853@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
	<20150807091701.2324c3a3@urahara>
Message-ID: <20150810093546.GB4043@xxxxxxxxxxxxxxxxx>

> > Hullo all,
> > 
> > The reason why we wanted to attach the fifo to netem is here:
> > 
> > http://www.linuxfoundation.org/collaborate/workgroups/networking/netem#How_to_reorder_packets_based_on_jitter.3F
> > [BEGIN CITATION]
> > Starting with version 1.1 (in 2.6.15), netem will reorder packets if the
> > delay value has lots of jitter.
> > 
> > If you don't want this behaviour then replace the internal queue
> > discipline tfifo with a pure packet fifo pfifo. The following example
> > has lots of jitter, but the packets will stay in order.
> > 
> > # tc qdisc add dev eth0 root handle 1: netem delay 10ms 100ms
> > # tc qdisc add dev eth0 parent 1:1 pfifo limit 1000
> > [END CITATION]
> > 
> > We think that it does not work very well. Netem, in netem_dequeue, takes
> > packets from its backlog in the order specified by time_to_send, and then
> > enqueues them to the fifo. So packets _are_ reordered (apart from the
> > problems mentioned earlier, caused by possible drops).
> > 
> > We propose to implement netem's internal fifo to get the desired behaviour.
> > Please see the attached patch.
> > 
> > # git diff --no-prefix v4.2-rc5 net/sched/sch_netem.c > sch_netem.c.patch
> > 
> > It is only a development version and there should probably be a configuration
> > parameter for it. (Now the behaviour is driven by the presence/non-presence
> > of a qdisc, which is nonsense.)
> > 
> > Generally, we have a feeling that using a qdisc attached to netem should be
> > avoided, at least with the current implementation. And we confess that so far
> > we don't have any idea how to do it better (and whether anyone wants that at all).
> > 
> > Please share with us your point of view on this.
> > We are definitely going to spend some more time with netem.
> > 
> > Thank you.
> > Best Regards,
> > Vojtech Stepan & Jakub Nepozitek
> 
> Netem has gone through lots of revisions on this.
> Initially it allowed attaching an interior qdisc, then that was dropped after
> a number of bugs, then the original ability to add an interior qdisc was fixed.

We understood that a lot of effort has been put into netem's interior
qdisc, so it should stay there. Correct?

> The interior qdisc must be work conserving for netem to work. A pfifo
> should be fine.

Yes, it makes sense. But what about the problem mentioned in my previous
question? Summary:

# tc qdisc del dev enp12s0f1 root
# tc qdisc add dev enp12s0f1 root handle 1: htb default 2
# tc class add dev enp12s0f1 parent 1: classid 1:2 htb rate 100kbit
# tc qdisc add dev enp12s0f1 parent 1:2 handle 2: netem limit 1000
# tc qdisc add dev enp12s0f1 parent 2:1 pfifo limit 1000

If you do this, it will kill your system as soon as the fifo buffer is
filled (1000 packets), because of the drop in the dequeue function.

So either,
a) It is nonsense to do that (htb+netem+pfifo) and everyone should
avoid it. Which may be OK; if that is the case, just say so.

or
b) It is a problem in htb. (We also plan some tests with other classful
qdiscs, like hfsc, but we don't like this option.)

or
c) It is a problem in netem, which could be avoided, for example, by
moving the drop from dequeue to enqueue, as we suggested:

if (q->qdisc)
	if (unlikely(skb_queue_len(&sch->q) >= q->qdisc->limit
		- skb_queue_len(&q->qdisc->q)))
		return qdisc_reshape_fail(skb, sch);

(i.e. if there is no space in the pfifo backlog and there is a risk that
the enqueue to pfifo would fail, drop the packet now.)

This works with pfifo. But I am not sure whether this could be done with all
(work conserving) qdiscs with equal success. But at least with this
modification, it works. Almost; please see below.

> Tfifo is the component that gets packets in order based on the timestamp
> in skb->cb[].

Yes.

> If you replace tfifo with pfifo, packets are always sent
> in arrival order, but there will be bursts if the jitter is larger than the
> arrival rate.

No, this is not correct. We first believed this too, and when the results
were not satisfactory with htb+netem+pfifo, we dropped the htb and made
a couple of tests with netem+pfifo only. And only when that did not work,
we started to analyse the algorithm. And now we are sure that in general
it does not work, and it cannot work, because if you do not want to
reorder and you attach a pfifo to netem, _all_ the packets are first
enqueued to tfifo exactly as you wrote above, in the enqueue function.
And these packets stay there, and are taken from there and enqueued to
pfifo, in the dequeue function, only when

netem_skb_cb(skb)->time_to_send <= psched_get_time()

So if you want a delay of 100ms and the arrival rate is 100 packets per
second, which is just around 1.2 Mbit/s, there are always around 10
packets in the tfifo queue. These packets are taken from the queue in the
order specified by

netem_skb_cb(skb)->time_to_send.

This is done in rb_first(&q->t_root), which returns the left-most packet,
i.e. the one with the smallest time_to_send.

Which is, in general, _not_ the arrival order if the jitter is large enough.

The packet is then enqueued to pfifo, from where another packet, put there
exactly the same way, is taken and returned.

----------
| packet |
----------
   |
   |   --------
   |   | ---- |
   |   | |  | | tfifo
   |   | ---- |
   | 1 | ---- |
   ----->|  | |          fifo
       | ---- |   --------------- 
       | ---- | 2 | --- --- --- |
       | |  |------>| | | | | |------->
       | ---- |   | --- --- --- |
       --------   ---------------

1) The packet is enqueued in tfifo at a position based on the computed
time_to_send.

2) Packets are dequeued from tfifo based on that position and enqueued in
the fifo (the fifo, however, does not change the order of packets => packets
are still ordered by time_to_send).

That's why we are proposing another sorting, in case one does
not want to reorder: a tree sorted not by time_to_send, but by
skb->tstamp:
In enqueue: take a packet, compute its delay, i.e. its time_to_send, and
place it into tfifo sorted by skb->tstamp.
In dequeue: take the left-most packet from tfifo and, if its time_to_send
is <= psched_get_time() (it's time to send it), send it; if not, return
NULL.

(It will be just a little bit more complex with the internal qdisc
possibility preserved. And it will require, as mentioned earlier,
a new configuration parameter. We will try to prepare the patch.)

Thank you.
Best Regards,
Vojtech Stepan & Jakub Nepozitek

