Hello Stephen & all,

We wanted to create several traffic classes, each with its own bandwidth limit and a different netem configuration. For this purpose we created a tree using htb classification. To each htb leaf class a netem qdisc was attached, and netem's internal tfifo was replaced by pfifo. A simplified configuration, with one class only, is shown below:

tc qdisc del dev enp12s0f1 root
tc qdisc add dev enp12s0f1 root handle 1: htb default 2
tc class add dev enp12s0f1 parent 1: classid 1:2 htb rate 100kbit
tc qdisc add dev enp12s0f1 parent 1:2 handle 2: netem limit 1000
tc qdisc add dev enp12s0f1 parent 2:1 pfifo limit 1000

But then we were observing total system failures. The problem appeared when the queue of the pfifo was filled (load much higher than the bandwidth limit allowed by htb). In such a case, when netem_dequeue called

	if (q->qdisc) {
		int err = qdisc_enqueue(skb, q->qdisc);
		...
	}

qdisc_enqueue (the enqueue to pfifo) dropped a packet, which is something htb apparently did not like or expect in a dequeue function. The result was a kernel panic. (We checked that this is really the place by creating a pfifo modification which did not drop the packet in such a situation, but only returned the value NET_XMIT_DROP.)

Then we tried to solve the issue by inserting a condition that does not allow netem to enqueue more packets than the limit of its internal qdisc, if any. It is in the attached patch. With such a modification, the qdisc_enqueue call in netem_dequeue should not fail, at least when using the pfifo qdisc.

As our general view of the netem and/or htb and/or other qdisc implementations and/or the kernel is, I think, not sufficient, I would like to ask you for help or any comments you could share with us. Is the way we are going generally correct? How nonsensical is our solution?

Thank you.
Vojtech Stepan

-------------- next part --------------
A non-text attachment was scrubbed...
Name: sch_netem.c.patch
Type: text/x-diff
Size: 837 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/netem/attachments/20150806/45ed28a4/attachment.bin>

From vojtech.stepan at yandex.ru  Fri Aug  7 12:29:46 2015
From: vojtech.stepan at yandex.ru (Vojtěch Štěpán)
Date: Fri, 7 Aug 2015 14:29:46 +0200
Subject: htb + netem + pfifo
In-Reply-To: <20150806134311.GA6840@xxxxxxxxxxxxxxxxx>
References: <20150806134311.GA6840@xxxxxxxxxxxxxxxxx>
Message-ID: <20150807122945.GA3072@xxxxxxxxxxxxxxxxx>

Hello all,

The reason why we wanted to attach the fifo to netem is here:

http://www.linuxfoundation.org/collaborate/workgroups/networking/netem#How_to_reorder_packets_based_on_jitter.3F

[BEGIN CITATION]
Starting with version 1.1 (in 2.6.15), netem will reorder packets if the delay value has lots of jitter.

If you don't want this behaviour then replace the internal queue discipline tfifo with a pure packet fifo pfifo. The following example has lots of jitter, but the packets will stay in order.

# tc qdisc add dev eth0 root handle 1: netem delay 10ms 100ms
# tc qdisc add dev eth0 parent 1:1 pfifo limit 1000
[END CITATION]

We think that it does not work very well. Netem, in netem_dequeue, takes packets from its backlog in the order specified by time_to_send, and then enqueues them to the fifo. So packets _are_ reordered (apart from the problems mentioned earlier caused by eventual drops).

We propose to implement netem's internal fifo to get the desired behaviour. Please see the attached patch.

# git diff --no-prefix v4.2-rc5 net/sched/sch_netem.c > sch_netem.c.patch

It is only a development version and there should probably be a configuration parameter for it. (Now the behaviour is driven by the presence/non-presence of a qdisc, which is nonsense.)

Generally, we have a feeling that using a qdisc attached to netem should be avoided, at least with the current implementation. And we confess that so far we don't have any idea how to do it better (or whether anyone wants that at all).

Please share your point of view on this with us.
We are definitely going to spend some more time with netem.

Thank you.
Best Regards,
Vojtech Stepan & Jakub Nepozitek

-------------- next part --------------
A non-text attachment was scrubbed...
Name: sch_netem.c.patch
Type: text/x-diff
Size: 2815 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/netem/attachments/20150807/6f2b43b8/attachment.bin>

From stephen at networkplumber.org  Fri Aug  7 16:17:01 2015
From: stephen at networkplumber.org (Stephen Hemminger)
Date: Fri, 7 Aug 2015 09:17:01 -0700
Subject: htb + netem + pfifo
In-Reply-To: <6a382c73fe2f4f17b5fd46485b2a4853@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
References: <20150806134311.GA6840@xxxxxxxxxxxxxxxxx> <6a382c73fe2f4f17b5fd46485b2a4853@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
Message-ID: <20150807091701.2324c3a3@urahara>

On Fri, 7 Aug 2015 12:29:46 +0000
Vojtěch Štěpán <vojtech.stepan at yandex.ru> wrote:

> Hello all,
>
> The reason why we wanted to attach the fifo to netem is here:
>
> http://www.linuxfoundation.org/collaborate/workgroups/networking/netem#How_to_reorder_packets_based_on_jitter.3F
> [BEGIN CITATION]
> Starting with version 1.1 (in 2.6.15), netem will reorder packets if the
> delay value has lots of jitter.
>
> If you don't want this behaviour then replace the internal queue
> discipline tfifo with a pure packet fifo pfifo. The following example
> has lots of jitter, but the packets will stay in order.
>
> # tc qdisc add dev eth0 root handle 1: netem delay 10ms 100ms
> # tc qdisc add dev eth0 parent 1:1 pfifo limit 1000
> [END CITATION]
>
> We think that it does not work very well. Netem, in netem_dequeue, takes
> packets from its backlog in the order specified by time_to_send, and then
> enqueues them to the fifo. So packets _are_ reordered (apart from the
> problems mentioned earlier caused by eventual drops).
>
> We propose to implement netem's internal fifo to get the desired behaviour.
> Please see the attached patch.
>
> # git diff --no-prefix v4.2-rc5 net/sched/sch_netem.c > sch_netem.c.patch
>
> It is only a development version and there should probably be a configuration
> parameter for it. (Now the behaviour is driven by the presence/non-presence
> of a qdisc, which is nonsense.)
>
> Generally, we have a feeling that using a qdisc attached to netem should be
> avoided, at least with the current implementation. And we confess that so far
> we don't have any idea how to do it better (or whether anyone wants that at all).
>
> Please share your point of view on this with us.
> We are definitely going to spend some more time with netem.
>
> Thank you.
> Best Regards,
> Vojtech Stepan & Jakub Nepozitek

Netem has gone through lots of revisions on this. Initially it allowed attaching an interior qdisc, then that was dropped after a number of bugs, then the original ability to add an interior qdisc was restored.

The interior qdisc must be work conserving for netem to work. A pfifo should be fine.

Tfifo is the component that gets packets in order based on the timestamp in skb->cb[]. If you replace tfifo with pfifo, packets are always sent in arrival order, but there will be bursts if the jitter is larger than the arrival rate.

From vojtech.stepan at yandex.ru  Mon Aug 10 09:35:47 2015
From: vojtech.stepan at yandex.ru (Vojtěch Štěpán)
Date: Mon, 10 Aug 2015 11:35:47 +0200
Subject: htb + netem + pfifo
In-Reply-To: <20150807091701.2324c3a3@urahara>
References: <20150806134311.GA6840@xxxxxxxxxxxxxxxxx> <6a382c73fe2f4f17b5fd46485b2a4853@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx> <20150807091701.2324c3a3@urahara>
Message-ID: <20150810093546.GB4043@xxxxxxxxxxxxxxxxx>

> > Hello all,
> >
> > The reason why we wanted to attach the fifo to netem is here:
> >
> > http://www.linuxfoundation.org/collaborate/workgroups/networking/netem#How_to_reorder_packets_based_on_jitter.3F
> > [BEGIN CITATION]
> > Starting with version 1.1 (in 2.6.15), netem will reorder packets if the
> > delay value has lots of jitter.
> >
> > If you don't want this behaviour then replace the internal queue
> > discipline tfifo with a pure packet fifo pfifo. The following example
> > has lots of jitter, but the packets will stay in order.
> >
> > # tc qdisc add dev eth0 root handle 1: netem delay 10ms 100ms
> > # tc qdisc add dev eth0 parent 1:1 pfifo limit 1000
> > [END CITATION]
> >
> > We think that it does not work very well. Netem, in netem_dequeue, takes
> > packets from its backlog in the order specified by time_to_send, and then
> > enqueues them to the fifo. So packets _are_ reordered (apart from the
> > problems mentioned earlier caused by eventual drops).
> >
> > We propose to implement netem's internal fifo to get the desired behaviour.
> > Please see the attached patch.
> >
> > # git diff --no-prefix v4.2-rc5 net/sched/sch_netem.c > sch_netem.c.patch
> >
> > It is only a development version and there should probably be a configuration
> > parameter for it. (Now the behaviour is driven by the presence/non-presence
> > of a qdisc, which is nonsense.)
> >
> > Generally, we have a feeling that using a qdisc attached to netem should be
> > avoided, at least with the current implementation. And we confess that so far
> > we don't have any idea how to do it better (or whether anyone wants that at all).
> >
> > Please share your point of view on this with us.
> > We are definitely going to spend some more time with netem.
> >
> > Thank you.
> > Best Regards,
> > Vojtech Stepan & Jakub Nepozitek
>
> Netem has gone through lots of revisions on this.
> Initially it allowed attaching an interior qdisc, then that was dropped after
> a number of bugs, then the original ability to add an interior qdisc was restored.

We understood that a lot of effort has been put into netem's interior qdisc, so it should stay there. Correct?

> The interior qdisc must be work conserving for netem to work. A pfifo
> should be fine.

Yes, that makes sense. But what about the problem mentioned in my previous question?
Summary:

# tc qdisc del dev enp12s0f1 root
# tc qdisc add dev enp12s0f1 root handle 1: htb default 2
# tc class add dev enp12s0f1 parent 1: classid 1:2 htb rate 100kbit
# tc qdisc add dev enp12s0f1 parent 1:2 handle 2: netem limit 1000
# tc qdisc add dev enp12s0f1 parent 2:1 pfifo limit 1000

If you do this, it will kill your system as soon as the fifo buffer is filled (1000 packets), because of the drop in the dequeue function. So either,

a) It is nonsense to do that (htb+netem+fifo) and everyone should avoid it. Which may be OK; if that is the case, just say so.

or

b) It is a problem in htb. (We also plan some tests with other classful qdiscs, like hfsc, but we don't like this option.)

or

c) It is a problem in netem, which could be avoided, for example, by moving the drop from dequeue to enqueue, as we suggested:

	if (q->qdisc)
		if (unlikely(skb_queue_len(&sch->q) >=
			     q->qdisc->limit - skb_queue_len(&q->qdisc->q)))
			return qdisc_reshape_fail(skb, sch);

(i.e. if there is no space in the pfifo backlog and there is a risk that the enqueue to pfifo would fail, drop the packet now.) This works with pfifo, but I am not sure if this could be done with all (work conserving) qdiscs with equal success. At least with this modification, it works. Almost - please see below.

> Tfifo is the component that gets packets in order based on the timestamp
> in skb->cb[].

Yes.

> If you replace tfifo with pfifo, packets are always sent
> in arrival order, but there will be bursts if the jitter is larger than the
> arrival rate.

No. This is not correct. We first believed this, and when the results were not satisfactory with htb+netem+pfifo, we dropped the htb and made a couple of tests with netem+pfifo only. And only when that did not work did we start to analyse the algorithm. Now we are sure that in general it does not work, and it cannot work, because if you do not want to reorder and you attach a pfifo to netem, _all_ the packets are first enqueued to tfifo exactly as you wrote above - in the enqueue function.
And these packets stay there, and are taken from there and enqueued to pfifo, in the dequeue function, only when

	netem_skb_cb(skb)->time_to_send <= psched_get_time()

So if you want a delay of 100ms and the arrival rate is 100 packets per second, which is just around 1.2 Mbit/s, there are always around 10 packets in the tfifo queue. These packets are taken from the queue in the order specified by netem_skb_cb(skb)->time_to_send. It is done in rb_first(&q->t_root), which returns the left-most packet, i.e. the one with the smallest time_to_send. Which is, in general, _not_ arrival order if the jitter is large enough. The packet is then enqueued to pfifo, from where another packet, put there exactly the same way, is taken and returned.

	 --------     ------------------      ------------------
	| packet |   |      tfifo       |    |       fifo       |
	|        |-->| (sorted by       |--->| (does not change |---> out
	 --------  1 |  time_to_send)   |  2 |  the order)      |
	              ------------------      ------------------

1) The packet is enqueued in tfifo at a position based on its computed time_to_send.
2) Packets are dequeued from tfifo, based on position, and enqueued in the fifo (the fifo, however, does not change the order of packets => packets are still ordered by time_to_send).

That is why we are proposing another sorting, for the case when one does not want to reorder - a tree sorted not by time_to_send, but by skb->tstamp:

In enqueue - take a packet, compute its delay, i.e. time_to_send, and place it into tfifo sorted by skb->tstamp.

In dequeue - take the left-most packet from tfifo and, if its time_to_send is <= psched_get_time() (it's time to send it), send it; if not, return NULL.

(It will be just a little bit more complex with the internal qdisc possibility preserved. And it will require, as mentioned earlier, a new configuration parameter. We will try to prepare the patch.)

Thank you.
Best Regards,
Vojtech Stepan & Jakub Nepozitek