Hi all,

I noticed that some explanations about packet loss correlation have been
added on the web site (http://linux-net.osdl.org/index.php/Netem). But
it seems that a mistake has been made. Correct me if I'm wrong, but
wouldn't it be as follows:

*Packet loss*

Random packet loss is specified in the 'tc' command in percent. The
smallest possible non-zero value is:

\fig{
1/2^{32} = 0.0000000232%
}

# tc qdisc change dev eth0 root netem loss 0.1%

This causes 1/10th of a percent (i.e. 1 out of 1000) packets to be
randomly dropped.

An optional correlation may also be added. This causes the random number
generator to be less random and can be used to emulate packet burst
losses.

# tc qdisc change dev eth0 root netem loss 0.3% 33.33%

This will cause 0.3% of packets to be lost, and each successive
probability depends by about a third on the last one.

\fig{
Prob_n = [Prob_{n-1} * 33.33/100] + [Rand() * (1-(0.3/100))]
}

The first term in brackets represents the correlation between two
successive packets and the second one represents the effective packet
loss probability of one packet.

Once again, tell me if I'm wrong. Thanking you in advance,

H
--
Hugues Van Peteghem
PhD Student
Computer Science Institute
FUNDP - The University of Namur
Belgium
http://www.info.fundp.ac.be/~hvp/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.linux-foundation.org/pipermail/netem/attachments/20060904/cd9b3646/attachment.htm

From shemminger at osdl.org Tue Sep 5 09:25:06 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 12:51:19 2007
Subject: Concerning last changes on the web site
In-Reply-To: <1157361002.16618.163.camel@xxxxxxxxxxxxxxxxxxxxxxxxx>
References: <1157361002.16618.163.camel@xxxxxxxxxxxxxxxxxxxxxxxxx>
Message-ID: <20060905092506.5aebab4f@localhost.localdomain>

On Mon, 04 Sep 2006 11:10:02 +0200
Hugues Van Peteghem <hvp@xxxxxxxxxxxxxxxx> wrote:

> Hi all,
>
> I noticed that some explanations about packet loss correlation have been
> added on the web site (http://linux-net.osdl.org/index.php/Netem). But
> it seems that a mistake has been made. Correct me if I'm wrong, but
> wouldn't it be as follows:
>
> *Packet loss*
>
> Random packet loss is specified in the 'tc' command in percent. The
> smallest possible non-zero value is:
>
> \fig{
> 1/2^{32} = 0.0000000232%
> }
>
> # tc qdisc change dev eth0 root netem loss 0.1%
>
> This causes 1/10th of a percent (i.e. 1 out of 1000) packets to be
> randomly dropped.
>
> An optional correlation may also be added. This causes the random number
> generator to be less random and can be used to emulate packet burst
> losses.
>
> # tc qdisc change dev eth0 root netem loss 0.3% 33.33%
>
> This will cause 0.3% of packets to be lost, and each successive
> probability depends by about a third on the last one.
>
> \fig{
> Prob_n = [Prob_{n-1} * 33.33/100] + [Rand() * (1-(0.3/100))]
> }
>
> The first term in brackets represents the correlation between two
> successive packets and the second one represents the effective packet
> loss probability of one packet.
>
> Once again, tell me if I'm wrong. Thanking you in advance,
>
> H

Looks right. Feel free to fix errors in wiki any time :-)

--
Stephen Hemminger <shemminger@xxxxxxxx>

From exairetos at tele2.it Tue Sep 12 09:10:34 2006
From: exairetos at tele2.it (Ferdinando Formica)
Date: Wed Apr 18 12:51:19 2007
Subject: no loss on ping
Message-ID: <web-45273940@xxxxxxxxxxxxxxxxx>

An HTML attachment was scrubbed...
URL: http://lists.linux-foundation.org/pipermail/netem/attachments/20060912/92326901/attachment.htm

From shemminger at osdl.org Tue Sep 12 21:48:44 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 12:51:19 2007
Subject: no loss on ping
In-Reply-To: <web-45273940@xxxxxxxxxxxxxxxxx>
References: <web-45273940@xxxxxxxxxxxxxxxxx>
Message-ID: <20060913134844.4cfa191d@localhost.localdomain>

On Tue, 12 Sep 2006 18:10:34 +0200
"Ferdinando Formica" <exairetos@xxxxxxxx> wrote:

>
> Hi everybody,
> Some time ago I set up netem on my Gentoo laptop and it worked fine, now I'm trying to set it up on a SUSE box (kernel 2.6.16) and I'm facing a problem I don't really understand.
> The command I enter is:
>
> # tc qdisc add dev eth0 root netem delay 20ms loss 20%

Try:
  tc qdisc show dev eth0 root netem

To see if kernel was ignoring parameter it didn't understand (like loss).

>
> Then I try pinging my laptop, which is connected to eth0, and while I get a 24.1ms delay (on my laptop I got 21ms) there isn't any packet loss (on my laptop I got values between 18 and 22%). The weird thing is that if I try pinging the box from my laptop the packets get lost in the right percentage. How is this possible?

Perhaps the ping response isn't going through the normal queue disc path
and is going back directly to device?

>
> As a side note, is the following command correct?
>
> # tc qdisc add dev eth0 root handle 1: netem delay 20ms
> # tc qdisc add dev eth0 parent 1:1 handle 10: netem loss 20%
>
> If I try running this, I get only the packet loss when pinged (still no packet loss when pinging), and less than 1ms of delay, but shouldn't it be the same as the above? A similar behaviour happens also on my laptop, where the first command works.
>
> Thanks in advance,
> Ferdinando Formica
>

From exairetos at tele2.it Wed Sep 13 07:49:49 2006
From: exairetos at tele2.it (Ferdinando Formica)
Date: Wed Apr 18 12:51:19 2007
Subject: no loss on ping
In-Reply-To: <20060913134844.4cfa191d@localhost.localdomain>
References: <web-45273940@xxxxxxxxxxxxxxxxx>
	<20060913134844.4cfa191d@localhost.localdomain>
Message-ID: <web-48852534@xxxxxxxxxxxxxxxxx>

An HTML attachment was scrubbed...
URL: http://lists.linux-foundation.org/pipermail/netem/attachments/20060913/7e335022/attachment.htm

From exairetos at tele2.it Thu Sep 14 03:55:59 2006
From: exairetos at tele2.it (Ferdinando Formica)
Date: Wed Apr 18 12:51:19 2007
Subject: no loss on ping
In-Reply-To: <web-48852534@xxxxxxxxxxxxxxxxx>
References: <web-45273940@xxxxxxxxxxxxxxxxx>
	<20060913134844.4cfa191d@localhost.localdomain>
	<web-48852534@xxxxxxxxxxxxxxxxx>
Message-ID: <web-43174629@xxxxxxxxxxxxxxxxx>

An HTML attachment was scrubbed...
URL: http://lists.linux-foundation.org/pipermail/netem/attachments/20060914/0a19c302/attachment.htm

From lyonnet at ipanematech.com Thu Sep 14 08:44:55 2006
From: lyonnet at ipanematech.com (frank@xxxxxxxxxxx)
Date: Wed Apr 18 12:51:19 2007
Subject: Subtil variations in NetEm behavior as time goes by
Message-ID: <00a401c6d814$baf67f60$0202fea9@ipanema.local>

Hello,

I've set up WAN emulation on a Dell SC1425 with 4x1Gbps Ethernet ports
and a Xeon EMT64.

I have NetEm set up with 100ms delay and no other impairment on the
egress of 3 of my interfaces.

I'm using ping to check NetEm behaviour, and it reports ~200ms RTT
between each of my branches.

However, when measuring the response time of some applications over this
setup, I'm seeing a changing behaviour after my router has been up for a
few days: the response time is improving significantly ... but the ping
stays the same!

*Rebooting the router brings the response time back to what it was
originally.*

Well ... I don't know if anybody can help with this. My kernel is 2.6.17
on Fedora Core 5, compiled in 32 bits with SMP disabled (to minimize
risks ..).

Cheers,
Frank
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.linux-foundation.org/pipermail/netem/attachments/20060914/6cb7bc61/attachment.htm

From shemminger at osdl.org Thu Sep 14 17:31:17 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 12:51:19 2007
Subject: no loss on ping
In-Reply-To: <web-43174629@xxxxxxxxxxxxxxxxx>
References: <web-45273940@xxxxxxxxxxxxxxxxx>
	<20060913134844.4cfa191d@localhost.localdomain>
	<web-48852534@xxxxxxxxxxxxxxxxx>
	<web-43174629@xxxxxxxxxxxxxxxxx>
Message-ID: <20060915093117.1a5269e1@localhost.localdomain>

On Thu, 14 Sep 2006 12:55:59 +0200
"Ferdinando Formica" <exairetos@xxxxxxxx> wrote:

> Update on the problem; surprisingly enough, it seems that the pings *are* dropped.
>
>
> # tc -s qdisc
> qdisc netem 1: dev eth0 limit 1000 delay 20.0ms
>  Sent 28826 bytes 301 pkt (dropped 85, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
> qdisc netem 10: dev eth0 parent 1:1 limit 1000 loss 20%
>  Sent 28826 bytes 301 pkt (dropped 85, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
> qdisc pfifo_fast 0: dev eth1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
>
> Now I'm starting to think it's a problem with ICMP; also, if I set the loss parameter to 90% it still acknowledges every packet as if it was correctly transmitted, but after a while I get messages like "no buffer space available" and "destination host unreachable".
>
> Maybe I'll try getting another box and going to bridge mode; would this solve anything?
>
> Thank you very much,
> Ferdinando Formica
>

There was a bug in older kernels where packets dropped with loss parameter
were not being freed properly. It was fixed long ago in the mainline kernel,
but it may still be an issue with vendor kernel.

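A note on the loss-correlation question from the first thread above
("Concerning last changes on the web site"). The following is a minimal
user-space sketch, assuming the correlated generator behaves like
get_crandom() in net/sched/sch_netem.c of that era: it blends a fresh
32-bit random value with the previous output, using the correlation as
the weight. Under that assumption the weight on the fresh value is
(1 - correlation) rather than (1 - loss); the loss percentage only
enters as the threshold the blended value is compared against, as in the
"q->loss && q->loss >= get_crandom(&q->loss_cor)" test. The names
crnd_state, crnd_next, pct_to_u32 and should_drop are made up for this
illustration, and the caller supplies the uniform random value so that
the sketch stays deterministic.

```c
#include <stdint.h>

/* Hypothetical user-space mirror of netem's correlated random state. */
struct crnd_state {
	uint32_t last;	/* previous output of the generator */
	uint32_t rho;	/* correlation scaled to 0..UINT32_MAX */
};

/* Scale a percentage (0..100) to the full 32-bit range. This is assumed
 * to resemble how tc scales percent arguments; input is clamped. */
static uint32_t pct_to_u32(double pct)
{
	if (pct <= 0.0)
		return 0;
	if (pct >= 100.0)
		return UINT32_MAX;
	return (uint32_t)(pct / 100.0 * (double)UINT32_MAX + 0.5);
}

/* Blend a fresh uniform value with the previous output:
 *   next = (uniform * (2^32 - rho) + last * rho) / 2^32
 */
static uint32_t crnd_next(struct crnd_state *s, uint32_t uniform)
{
	uint64_t rho, answer;

	if (s->rho == 0)	/* no correlation: pass the value through */
		return uniform;

	rho = (uint64_t)s->rho + 1;
	answer = ((uint64_t)uniform * ((1ULL << 32) - rho)
		  + (uint64_t)s->last * rho) >> 32;
	s->last = (uint32_t)answer;
	return (uint32_t)answer;
}

/* A packet is dropped when the blended value is at or below the loss
 * threshold; loss == 0 disables dropping altogether. */
static int should_drop(uint32_t loss, uint32_t r)
{
	return loss && loss >= r;
}
```

With "loss 0.3% 33.33%", the state would be set up with
rho = pct_to_u32(33.33) and each packet tested with
should_drop(pct_to_u32(0.3), crnd_next(&state, fresh_random_value)).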
From baumann at tik.ee.ethz.ch Thu Sep 21 23:12:11 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.16.19 0/2] LARTC: trace control for netem
Message-ID: <45137EBB.2030707@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range
dependency and self-similarity of cross-traffic.

A new option (trace) has been added to the netem command. If the trace
option is used, the values for packet delay etc. are read from a
pregenerated trace file; afterwards the packets are processed by the
normal netem functions. The packet action values are read out from the
trace file in user space and sent to kernel space via configfs.

After our patches from the 2nd and 22nd of August we have integrated the
comments from Stephen and hope we are on the right way now.

We are looking forward to any comments, feedback and suggestions!

From baumann at tik.ee.ethz.ch Thu Sep 21 23:15:13 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.16.19 2/2] LARTC: trace control for netem: kernelspace
Message-ID: <45137F71.2000404@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range
dependency and self-similarity of cross-traffic.

kernel space:
The delay, drop, duplication and corruption values are read out in user
space and sent to kernel space via configfs. The userspace process will
"hang on write" until the kernel needs new data.

In order to always have packet action values ready to apply, there are
two buffers that hold these values. Packet action values can be read
from one buffer while the other buffer is refilled with new values. The
synchronization of "need more delay values" and "return from write" is
done with the use of wait queues.

Having applied the delay value to a packet, the packet gets processed by
the original netem functions.

Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>

---

Patch for linux kernel 2.6.16.19: http://tcn.hypert.net/tcnKernel_procfs.patch

From baumann at tik.ee.ethz.ch Thu Sep 21 23:13:54 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.16.19 1/2] LARTC: trace control for netem: userspace
Message-ID: <45137F22.4000304@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range
dependency and self-similarity of cross-traffic.

user space (iproute2):
The directory tc/netem was split in two parts, one containing the
original distribution tables and the other the tools to generate trace
files as well as the program responsible for reading the delay values
from the trace file and sending them to the kernel (called flowseed).

If the trace option is set, netem initializes the kernel and starts the
flowseed process. The flowseed process does not send data to the kernel
until the registration is completed.

The data is sent to the kernel module via configfs. For each qdisc
applied, a new directory (in /config/tcn/) is created. The write returns
when the kernel needs new data, or when the corresponding qdisc was
deleted. In the first case new data is sent, and in the latter case the
flowseed process terminates itself.

Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>

---

Patch for iproute2-2.6.16-060323: http://tcn.hypert.net/tcn_iproute2.patch

From shemminger at osdl.org Fri Sep 22 10:20:56 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.16.19 2/2] LARTC: trace control for netem: kernelspace
In-Reply-To: <45137F71.2000404@xxxxxxxxxxxxxx>
References: <45137F71.2000404@xxxxxxxxxxxxxx>
Message-ID: <20060922102056.0069f944@localhost.localdomain>

On Fri, 22 Sep 2006 08:15:13 +0200
Rainer Baumann <baumann@xxxxxxxxxxxxxx> wrote:

> Trace Control for Netem: Emulate network properties such as long range dependency and self-similarity of cross-traffic.
>
> kernel space:
> The delay, drop, duplication and corruption values are read out in user space and sent to kernel space via configfs. The userspace process will "hang on write" until the kernel needs new data.
>
> In order to always have packet action values ready to apply, there are two buffers that hold these values. Packet action values can be read from one buffer while the other buffer is refilled with new values. The synchronization of "need more delay values" and "return from write" is done with the use of wait queues.
>
> Having applied the delay value to a packet, the packet gets processed by the original netem functions.
>
> Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>
>
> ---
>
> Patch for linux kernel 2.6.16.19: http://tcn.hypert.net/tcnKernel_procfs.patch

I like the concept of the trace based delay stuff, it is just that the
implementation needs more work.

Style:
 * whitespace around operators, keywords etc
 * use /* for comments not //
 * indentation: scripts/Lindent may help
 * accidental blank line changes introduced in patch as well
 * You don't really change Makefile

Code:
 * now netem depends on CONFIG_PROC_FS
 * why not use a miscdevice (/dev/netem_trace?) instead of /proc
 * still has signal flow control to process. This is an awkward way to do
   flow control and I don't think it is safe.
 * hard coding MAX_FLOWS leads to scaling problems. Not all users will
   want to waste the memory, and what if there are more flows. Can't
   you figure out a way to allocate and scale flow buffers.

--
Stephen Hemminger <shemminger@xxxxxxxx>

From hagen at jauu.net Fri Sep 22 08:19:06 2006
From: hagen at jauu.net (Hagen Paul Pfeifer)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.16.19 2/2] LARTC: trace control for netem: kernelspace
In-Reply-To: <45137F71.2000404@xxxxxxxxxxxxxx>
References: <45137F71.2000404@xxxxxxxxxxxxxx>
Message-ID: <20060922151906.GA25483@xxxxxxxxxxxxxx>

* Rainer Baumann | 2006-09-22 08:15:13 [+0200]:

>Patch for linux kernel 2.6.16.19: http://tcn.hypert.net/tcnKernel_procfs.patch

Coding style needs at least some work ...
Whitespace around operators and parentheses, useless parentheses, braces
for the else branch, mixed C99/C89 comments, indentation, ....
proc_read_stats() looks unclean (bzero) and maybe some other stuff too -
the code as a whole looks a little bit grubby.

HGN

--
43rd Law of Computing:
 Anything that can go wr
fortune: Segmentation violation -- Core dumped

From baumann at tik.ee.ethz.ch Sat Sep 23 00:04:45 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.17.13 0/2] LARTC: trace control for netem
Message-ID: <4514DC8D.2010405@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range
dependency and self-similarity of cross-traffic.

A new option (trace) has been added to the netem command. If the trace
option is used, the values for packet delay etc. are read from a
pregenerated trace file; afterwards the packets are processed by the
normal netem functions. The packet action values are read out from the
trace file in user space and sent to kernel space via configfs.

Sorry, what we sent yesterday was the old version; this is now the new
version!

After our patches from the 2nd and 22nd of August we have integrated the
comments from Stephen and hope we are on the right way now.

We are looking forward to any comments, feedback and suggestions!

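The two-buffer hand-off described in the kernelspace patch mails above
can be sketched in plain user-space C. This is only an illustration
under simplifying assumptions: there is no locking and no wait queue,
the buffer size is tiny, and the names (trace_bufs, tb_init, tb_refill,
tb_next, TB_LEN) are made up here; the actual patch uses its own
structures. The reader consumes one buffer while the writer may refill
the other. When the reader exhausts its buffer it marks that buffer
empty and switches over, and a read from a buffer that has not been
refilled is counted as an underrun.

```c
#include <stdint.h>
#include <string.h>

#define TB_LEN 4	/* values per buffer; deliberately tiny for the example */

struct trace_bufs {
	uint32_t buf[2][TB_LEN];
	int empty[2];		/* 1 = buffer may be refilled by the writer */
	int in_use;		/* index of the buffer the reader consumes */
	int pos;		/* next read position inside buf[in_use] */
	unsigned int underruns;	/* reads that found no filled buffer */
};

static void tb_init(struct trace_bufs *tb)
{
	memset(tb, 0, sizeof(*tb));
	tb->empty[0] = tb->empty[1] = 1;
}

/* Writer side: copy TB_LEN values into an empty buffer, preferring the
 * one the reader is waiting on so that values stay in order.
 * Returns 0 on success, or -1 if both buffers still hold unread data
 * (the point at which a real writer would block). */
static int tb_refill(struct trace_bufs *tb, const uint32_t *vals)
{
	int i;

	for (i = 0; i < 2; i++) {
		int idx = tb->in_use ^ i;

		if (tb->empty[idx]) {
			memcpy(tb->buf[idx], vals, sizeof(tb->buf[idx]));
			tb->empty[idx] = 0;
			return 0;
		}
	}
	return -1;
}

/* Reader side: fetch the next value. Returns 0 and stores the value in
 * *out, or -1 on underrun (the caller applies its default action). */
static int tb_next(struct trace_bufs *tb, uint32_t *out)
{
	if (tb->pos == TB_LEN) {
		/* current buffer exhausted: release it and switch over */
		tb->empty[tb->in_use] = 1;
		tb->in_use ^= 1;
		tb->pos = 0;
	}
	if (tb->empty[tb->in_use]) {
		tb->underruns++;
		return -1;
	}
	*out = tb->buf[tb->in_use][tb->pos++];
	return 0;
}
```

In the patch, the place where tb_refill() returns -1 corresponds to the
writer "hanging on write", and the buffer switch in tb_next() is where
the kernel would wake the writer up.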
From baumann at tik.ee.ethz.ch Sat Sep 23 00:04:58 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem: kernelspace
Message-ID: <4514DC9A.2000505@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range
dependency and self-similarity of cross-traffic.

kernel space:
The delay, drop, duplication and corruption values are read out in user
space and sent to kernel space via configfs. The userspace process will
"hang on write" until the kernel needs new data.

In order to always have packet action values ready to apply, there are
two buffers that hold these values. Packet action values can be read
from one buffer while the other buffer is refilled with new values. The
synchronization of "need more delay values" and "return from write" is
done with the use of wait queues.

Having applied the delay value to a packet, the packet gets processed by
the original netem functions.

Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>

---

Patch for linux kernel 2.6.17.13: http://tcn.hypert.net/tcn_kernel_configfs.patch

From baumann at tik.ee.ethz.ch Sat Sep 23 00:04:49 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.17.13 1/2] LARTC: trace control for netem: userspace
Message-ID: <4514DC91.2070507@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range
dependency and self-similarity of cross-traffic.

user space (iproute2):
The directory tc/netem was split in two parts, one containing the
original distribution tables and the other the tools to generate trace
files as well as the program responsible for reading the delay values
from the trace file and sending them to the kernel (called flowseed).

If the trace option is set, netem initializes the kernel and starts the
flowseed process. The flowseed process does not send data to the kernel
until the registration is completed.

The data is sent to the kernel module via configfs. For each qdisc
applied, a new directory (in /config/tcn/) is created. The write returns
when the kernel needs new data, or when the corresponding qdisc was
deleted. In the first case new data is sent, and in the latter case the
flowseed process terminates itself.

Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>

---

Patch for iproute2-2.6.16-060323: http://tcn.hypert.net/tcn_iproute2.patch

From shemminger at osdl.org Mon Sep 25 13:28:00 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem: kernelspace
In-Reply-To: <4514DC9A.2000505@xxxxxxxxxxxxxx>
References: <4514DC9A.2000505@xxxxxxxxxxxxxx>
Message-ID: <20060925132800.09856e10@xxxxxxxxxxxxxxxxx>

Some changes:

1. need to select CONFIGFS into configuration
2. don't add declarations after code.
3. use unsigned not int for counters and mask.
4. don't return a structure (ie pkt_delay)
5. use enum for magic values
6. don't use GFP_ATOMIC unless you have to
7. check error values on configfs_init
8. map initialization is unneeded. static's always init to zero.

------------------
diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index d10f353..a51de64 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -430,6 +430,8 @@ enum
 	TCA_NETEM_DELAY_DIST,
 	TCA_NETEM_REORDER,
 	TCA_NETEM_CORRUPT,
+	TCA_NETEM_TRACE,
+	TCA_NETEM_STATS,
 	__TCA_NETEM_MAX,
 };
 
@@ -445,6 +447,35 @@ struct tc_netem_qopt
 	__u32	jitter;		/* random jitter in latency (us) */
 };
 
+struct tc_netem_stats
+{
+	int packetcount;
+	int packetok;
+	int normaldelay;
+	int drops;
+	int dupl;
+	int corrupt;
+	int novaliddata;
+	int uninitialized;
+	int bufferunderrun;
+	int bufferinuseempty;
+	int noemptybuffer;
+	int readbehindbuffer;
+	int buffer1_reloads;
+	int buffer2_reloads;
+	int tobuffer1_switch;
+	int tobuffer2_switch;
+	int switch_to_emptybuffer1;
+	int switch_to_emptybuffer2;
+};
+
+struct tc_netem_trace
+{
+	__u32	fid;		/* flowid */
+	__u32	def;		/* default action 0 = no delay, 1 = drop */
+	__u32	ticks;		/* number of ticks corresponding to 1ms */
+};
+
 struct tc_netem_corr
 {
 	__u32	delay_corr;	/* delay correlation */
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 8298ea9..aee4bc6 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -232,6 +232,7 @@ config NET_SCH_DSMARK
 
 config NET_SCH_NETEM
 	tristate "Network emulator (NETEM)"
+	select CONFIGFS_FS
 	---help---
 	  Say Y if you want to emulate network delay, loss, and packet
 	  re-ordering. This is often useful to simulate networks when
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 45939ba..521b9e3 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -11,6 +11,9 @@
  *
  * Authors:	Stephen Hemminger <shemminger@xxxxxxxx>
  *		Catalin(ux aka Dino) BOIE <catab at umbrella dot ro>
+ *		netem trace enhancement: Ariane Keller <arkeller@xxxxxxxxxx> ETH Zurich
+ *					 Rainer Baumann <baumann@xxxxxxxxxx> ETH Zurich
+ *					 Ulrich Fiedler <fiedler@xxxxxxxxxxxxxx> ETH Zurich
  */
 
 #include <linux/module.h>
@@ -21,10 +24,16 @@ #include <linux/errno.h>
 #include <linux/netdevice.h>
 #include <linux/skbuff.h>
 #include <linux/rtnetlink.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/configfs.h>
+#include <linux/vmalloc.h>
 
 #include <net/pkt_sched.h>
 
-#define VERSION "1.2"
+#include "net/flowseed.h"
+
+#define VERSION "1.3"
 
 /*	Network Emulation Queuing algorithm.
 	====================================
@@ -50,6 +59,11 @@ #define VERSION "1.2"
 
 	 The simulator is limited by the Linux timer resolution
 	 and will create packet bursts on the HZ boundary (1ms).
+
+	 The trace option allows us to read the values for packet delay,
+	 duplication, loss and corruption from a tracefile. This permits
+	 the modulation of statistical properties such as long-range
+	 dependences. See http://tcn.hypert.net.
 */
 
 struct netem_sched_data {
@@ -65,6 +79,11 @@ struct netem_sched_data {
 	u32 duplicate;
 	u32 reorder;
 	u32 corrupt;
+	u32 tcnstop;
+	u32 trace;
+	u32 ticks;
+	u32 def;
+	u32 newdataneeded;
 
 	struct crndstate {
 		unsigned long last;
@@ -72,9 +91,13 @@ struct netem_sched_data {
 	} delay_cor, loss_cor, dup_cor, reorder_cor, corrupt_cor;
 
 	struct disttable {
-		u32  size;
+		u32 size;
 		s16 table[0];
 	} *delay_dist;
+
+	struct tcn_statistic *statistic;
+	struct tcn_control *flowbuffer;
+	wait_queue_head_t my_event;
 };
 
 /* Time stamp put into socket buffer control block */
@@ -82,6 +105,18 @@ struct netem_skb_cb {
 	psched_time_t	time_to_send;
 };
 
+
+struct confdata {
+	int fid;
+	struct netem_sched_data * sched_data;
+};
+
+static struct confdata map[MAX_FLOWS];
+
+#define MASK_BITS	29
+#define MASK_DELAY	((1<<MASK_BITS)-1)
+#define MASK_HEAD	~MASK_DELAY
+
 /* init_crandom - initialize correlated random number generator
  * Use entropy source for initial seed.
  */
@@ -139,6 +174,103 @@ static long tabledist(unsigned long mu,
 	return  x / NETEM_DIST_SCALE + (sigma / NETEM_DIST_SCALE) * t + mu;
 }
 
+/* don't call this function directly. It is called after
+ * a packet has been taken out of a buffer and it was the last.
+ */
+static int reload_flowbuffer (struct netem_sched_data *q)
+{
+	struct tcn_control *flow = q->flowbuffer;
+
+	if (flow->buffer_in_use == flow->buffer1) {
+		flow->buffer1_empty = flow->buffer1;
+		if (flow->buffer2_empty) {
+			q->statistic->switch_to_emptybuffer2++;
+			return -EFAULT;
+		}
+
+		q->statistic->tobuffer2_switch++;
+
+		flow->buffer_in_use = flow->buffer2;
+		flow->offsetpos = flow->buffer2;
+
+	} else {
+		flow->buffer2_empty = flow->buffer2;
+
+		if (flow->buffer1_empty) {
+			q->statistic->switch_to_emptybuffer1++;
+			return -EFAULT;
+		}
+
+		q->statistic->tobuffer1_switch++;
+
+		flow->buffer_in_use = flow->buffer1;
+		flow->offsetpos = flow->buffer1;
+
+	}
+	/*the flowseed process can send more data*/
+	q->tcnstop = 0;
+	q->newdataneeded = 1;
+	wake_up(&q->my_event);
+	return 0;
+}
+
+/* return pktdelay with delay and drop/dupl/corrupt option */
+static int get_next_delay(struct netem_sched_data *q, enum tcn_flow *head)
+{
+	struct tcn_control *flow = q->flowbuffer;
+	u32 variout;
+
+	/*choose whether to drop or 0 delay packets on default*/
+	*head = q->def;
+
+	if (!flow) {
+		printk(KERN_ERR "netem: read from an uninitialized flow.\n");
+		q->statistic->uninitialized++;
+		return 0;
+	}
+
+	q->statistic->packetcount++;
+
+	/* check if we have to reload a buffer */
+	if (flow->offsetpos - flow->buffer_in_use == DATA_PACKAGE)
+		reload_flowbuffer(q);
+
+	/* sanity checks */
+	if ((flow->buffer_in_use == flow->buffer1 && flow->validdataB1)
+	    || ( flow->buffer_in_use == flow->buffer2 && flow->validdataB2)) {
+
+		if (flow->buffer1_empty && flow->buffer2_empty) {
+			q->statistic->bufferunderrun++;
+			return 0;
+		}
+
+		if (flow->buffer1_empty == flow->buffer_in_use ||
+		    flow->buffer2_empty == flow->buffer_in_use) {
+			q->statistic->bufferinuseempty++;
+			return 0;
+		}
+
+		if (flow->offsetpos - flow->buffer_in_use >=
+		    DATA_PACKAGE) {
+			q->statistic->readbehindbuffer++;
+			return 0;
+		}
+		/*end of tracefile reached*/
+	} else {
+		q->statistic->novaliddata++;
+		return 0;
+	}
+
+	/* now it's safe to read */
+	variout = *flow->offsetpos++;
+	*head = (variout & MASK_HEAD) >> MASK_BITS;
+
+	(&q->statistic->normaldelay)[*head] += 1;
+	q->statistic->packetok++;
+
+	return ((variout & MASK_DELAY) * q->ticks) / 1000;
+}
+
 /*
  * Insert one skb into qdisc.
  * Note: parent depends on return value to account for queue length.
@@ -148,20 +280,25 @@ static long tabledist(unsigned long mu,
 static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 {
 	struct netem_sched_data *q = qdisc_priv(sch);
-	/* We don't fill cb now as skb_unshare() may invalidate it */
 	struct netem_skb_cb *cb;
 	struct sk_buff *skb2;
-	int ret;
-	int count = 1;
+	enum tcn_flow action = FLOW_NORMAL;
+	psched_tdiff_t delay;
+	int ret, count = 1;
 
 	pr_debug("netem_enqueue skb=%p\n", skb);
 
-	/* Random duplication */
-	if (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor))
+	if (q->trace)
+		action = get_next_delay(q, &delay);
+
+	/* Random duplication */
+	if (q->trace ? action == FLOW_DUP :
+	    (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor)))
 		++count;
 
 	/* Random packet drop 0 => none, ~0 => all */
-	if (q->loss && q->loss >= get_crandom(&q->loss_cor))
+	if (q->trace ? action == FLOW_DROP :
+	    (q->loss && q->loss >= get_crandom(&q->loss_cor)))
 		--count;
 
 	if (count == 0) {
@@ -190,7 +327,8 @@ static int netem_enqueue(struct sk_buff
 	 * If packet is going to be hardware checksummed, then
 	 * do it now in software before we mangle it.
 	 */
-	if (q->corrupt && q->corrupt >= get_crandom(&q->corrupt_cor)) {
+	if (q->trace ? action == FLOW_MANGLE :
+	    (q->corrupt && q->corrupt >= get_crandom(&q->corrupt_cor))) {
 		if (!(skb = skb_unshare(skb, GFP_ATOMIC))
 		    || (skb->ip_summed == CHECKSUM_PARTIAL
 			&& skb_checksum_help(skb))) {
@@ -206,10 +344,10 @@ static int netem_enqueue(struct sk_buff
 	    || q->counter < q->gap 	/* inside last reordering gap */
 	    || q->reorder < get_crandom(&q->reorder_cor)) {
 		psched_time_t now;
-		psched_tdiff_t delay;
 
-		delay = tabledist(q->latency, q->jitter,
-				  &q->delay_cor, q->delay_dist);
+		if (!q->trace)
+			delay = tabledist(q->latency, q->jitter,
+					  &q->delay_cor, q->delay_dist);
 
 		PSCHED_GET_TIME(now);
 		PSCHED_TADD2(now, delay, cb->time_to_send);
@@ -343,6 +481,65 @@ static int set_fifo_limit(struct Qdisc *
 	return ret;
 }
 
+static void reset_stats(struct netem_sched_data * q)
+{
+	memset(q->statistic, 0, sizeof(*(q->statistic)));
+	return;
+}
+
+static void free_flowbuffer(struct netem_sched_data * q)
+{
+	if (q->flowbuffer != NULL) {
+		q->tcnstop = 1;
+		q->newdataneeded = 1;
+		wake_up(&q->my_event);
+
+		if (q->flowbuffer->buffer1 != NULL) {
+			kfree(q->flowbuffer->buffer1);
+		}
+		if (q->flowbuffer->buffer2 != NULL) {
+			kfree(q->flowbuffer->buffer2);
+		}
+		kfree(q->flowbuffer);
+		kfree(q->statistic);
+		q->flowbuffer = NULL;
+		q->statistic = NULL;
+	}
+}
+
+static int init_flowbuffer(unsigned int fid, struct netem_sched_data * q)
+{
+	int i, flowid = -1;
+
+	q->statistic = kzalloc(sizeof(*(q->statistic)), GFP_KERNEL);
+	init_waitqueue_head(&q->my_event);
+
+	for(i = 0; i < MAX_FLOWS; i++) {
+		if(map[i].fid == 0) {
+			flowid = i;
+			map[i].fid = fid;
+			map[i].sched_data = q;
+			break;
+		}
+	}
+
+	if (flowid != -1) {
+		q->flowbuffer = kmalloc(sizeof(*(q->flowbuffer)), GFP_KERNEL);
+		q->flowbuffer->buffer1 = kmalloc(DATA_PACKAGE, GFP_KERNEL);
+		q->flowbuffer->buffer2 = kmalloc(DATA_PACKAGE, GFP_KERNEL);
+
+		q->flowbuffer->buffer_in_use = q->flowbuffer->buffer1;
+		q->flowbuffer->offsetpos = q->flowbuffer->buffer1;
+		q->flowbuffer->buffer1_empty = q->flowbuffer->buffer1;
+		q->flowbuffer->buffer2_empty = q->flowbuffer->buffer2;
+		q->flowbuffer->flowid = flowid;
+		q->flowbuffer->validdataB1 = 0;
+		q->flowbuffer->validdataB2 = 0;
+	}
+
+	return flowid;
+}
+
 /*
  * Distribution data is a variable size payload containing
  * signed 16 bit values.
@@ -414,6 +611,32 @@ static int get_corrupt(struct Qdisc *sch
 	return 0;
 }
 
+static int get_trace(struct Qdisc *sch, const struct rtattr *attr)
+{
+	struct netem_sched_data *q = qdisc_priv(sch);
+	const struct tc_netem_trace *traceopt = RTA_DATA(attr);
+
+	if (RTA_PAYLOAD(attr) != sizeof(*traceopt))
+		return -EINVAL;
+
+	if (traceopt->fid) {
+		/*correction us -> ticks*/
+		q->ticks = traceopt->ticks;
+		int ind;
+		ind = init_flowbuffer(traceopt->fid, q);
+		if(ind < 0) {
+			printk("netem: maximum number of traces:%d"
+			       " change in net/flowseedprocfs.h\n", MAX_FLOWS);
+			return -EINVAL;
+		}
+		q->trace = ind + 1;
+
+	} else
+		q->trace = 0;
+	q->def = traceopt->def;
+	return 0;
+}
+
 /* Parse netlink message to set options */
 static int netem_change(struct Qdisc *sch, struct rtattr *opt)
 {
@@ -431,6 +654,14 @@ static int netem_change(struct Qdisc *sc
 		return ret;
 	}
 
+	if (q->trace) {
+		int temp = q->trace - 1;
+		q->trace = 0;
+		map[temp].fid = 0;
+		reset_stats(q);
+		free_flowbuffer(q);
+	}
+
 	q->latency = qopt->latency;
 	q->jitter = qopt->jitter;
 	q->limit = qopt->limit;
@@ -477,6 +708,11 @@ static int netem_change(struct Qdisc *sc
 			if (ret)
 				return ret;
 		}
+		if (tb[TCA_NETEM_TRACE-1]) {
+			ret = get_trace(sch, tb[TCA_NETEM_TRACE-1]);
+			if (ret)
+				return ret;
+		}
 	}
 
 	return 0;
@@ -572,6 +808,7 @@ static int netem_init(struct Qdisc *sch,
 	q->timer.function = netem_watchdog;
 	q->timer.data = (unsigned long) sch;
 
+	q->trace = 0;
 	q->qdisc = qdisc_create_dflt(sch->dev, &tfifo_qdisc_ops);
 	if (!q->qdisc) {
 		pr_debug("netem: qdisc create failed\n");
@@ -590,6 +827,12 @@ static void netem_destroy(struct Qdisc *
 {
 	struct netem_sched_data *q = qdisc_priv(sch);
 
+	if (q->trace) {
+		int temp = q->trace - 1;
+		q->trace = 0;
+		map[temp].fid = 0;
+		free_flowbuffer(q);
+	}
 	del_timer_sync(&q->timer);
 	qdisc_destroy(q->qdisc);
 	kfree(q->delay_dist);
@@ -604,6 +847,7 @@ static int netem_dump(struct Qdisc *sch,
 	struct tc_netem_corr cor;
 	struct tc_netem_reorder reorder;
 	struct tc_netem_corrupt corrupt;
+	struct tc_netem_trace traceopt;
 
 	qopt.latency = q->latency;
 	qopt.jitter = q->jitter;
@@ -626,6 +870,35 @@ static int netem_dump(struct Qdisc *sch,
 	corrupt.correlation = q->corrupt_cor.rho;
 	RTA_PUT(skb, TCA_NETEM_CORRUPT, sizeof(corrupt), &corrupt);
 
+	traceopt.fid = q->trace;
+	traceopt.def = q->def;
+	traceopt.ticks = q->ticks;
+	RTA_PUT(skb, TCA_NETEM_TRACE, sizeof(traceopt), &traceopt);
+
+	if (q->trace) {
+		struct tc_netem_stats tstats;
+
+		tstats.packetcount = q->statistic->packetcount;
+		tstats.packetok = q->statistic->packetok;
+		tstats.normaldelay = q->statistic->normaldelay;
+		tstats.drops = q->statistic->drops;
+		tstats.dupl = q->statistic->dupl;
+		tstats.corrupt = q->statistic->corrupt;
+		tstats.novaliddata = q->statistic->novaliddata;
+		tstats.uninitialized = q->statistic->uninitialized;
+		tstats.bufferunderrun = q->statistic->bufferunderrun;
+		tstats.bufferinuseempty = q->statistic->bufferinuseempty;
+		tstats.noemptybuffer = q->statistic->noemptybuffer;
+		tstats.readbehindbuffer = q->statistic->readbehindbuffer;
+		tstats.buffer1_reloads = q->statistic->buffer1_reloads;
+		tstats.buffer2_reloads = q->statistic->buffer2_reloads;
+		tstats.tobuffer1_switch = q->statistic->tobuffer1_switch;
+		tstats.tobuffer2_switch = q->statistic->tobuffer2_switch;
+		tstats.switch_to_emptybuffer1 = q->statistic->switch_to_emptybuffer1;
+		tstats.switch_to_emptybuffer2 = q->statistic->switch_to_emptybuffer2;
+		RTA_PUT(skb, TCA_NETEM_STATS, sizeof(tstats), &tstats);
+	}
+
 	rta->rta_len = skb->tail - b;
 
 	return skb->len;
@@ -709,6 +982,173 @@ static struct tcf_proto **netem_find_tcf
 	return NULL;
 }
 
+/*configfs to read tcn delay values from userspace*/
+struct tcn_flow {
+	struct config_item item;
+};
+
+static struct tcn_flow *to_tcn_flow(struct config_item *item)
+{
+	return item ? container_of(item, struct tcn_flow, item) : NULL;
+}
+
+static struct configfs_attribute tcn_flow_attr_storeme = {
+	.ca_owner = THIS_MODULE,
+	.ca_name = "delayvalue",
+	.ca_mode = S_IRUGO | S_IWUSR,
+};
+
+static struct configfs_attribute *tcn_flow_attrs[] = {
+	&tcn_flow_attr_storeme,
+	NULL,
+};
+
+static ssize_t tcn_flow_attr_store(struct config_item *item,
+				   struct configfs_attribute *attr,
+				   const char *page, size_t count)
+{
+	char *p = (char *)page;
+	int fid, i, validData = 0;
+	int flowid = -1;
+	struct tcn_control *checkbuf;
+
+	if (count != DATA_PACKAGE_ID) {
+		printk("netem: Unexpected data received. %d\n", count);
+		return -EMSGSIZE;
+	}
+
+	memcpy(&fid, p + DATA_PACKAGE, sizeof(int));
+	memcpy(&validData, p + DATA_PACKAGE + sizeof(int), sizeof(int));
+
+	/* check whether this flow is registered */
+	for (i = 0; i < MAX_FLOWS; i++) {
+		if (map[i].fid == fid) {
+			flowid = i;
+			break;
+		}
+	}
+	/* exit if flow is not registered */
+	if (flowid < 0) {
+		printk("netem: Invalid FID received. Killing process.\n");
+		return -EINVAL;
+	}
+
+	checkbuf = map[flowid].sched_data->flowbuffer;
+	if (checkbuf == NULL) {
+		printk("netem: no flow registered");
+		return -ENOBUFS;
+	}
+
+	/* check if flowbuffer has empty buffer and copy data into it */
+	if (checkbuf->buffer1_empty != NULL) {
+		memcpy(checkbuf->buffer1, p, DATA_PACKAGE);
+		checkbuf->buffer1_empty = NULL;
+		checkbuf->validdataB1 = validData;
+		map[flowid].sched_data->statistic->buffer1_reloads++;
+
+	} else if (checkbuf->buffer2_empty != NULL) {
+		memcpy(checkbuf->buffer2, p, DATA_PACKAGE);
+		checkbuf->buffer2_empty = NULL;
+		checkbuf->validdataB2 = validData;
+		map[flowid].sched_data->statistic->buffer2_reloads++;
+
+	} else {
+		printk("netem: flow %d: no empty buffer. data loss.\n", flowid);
+		map[flowid].sched_data->statistic->noemptybuffer++;
+	}
+
+	if (validData) {
+		/* on initialization both buffers need data */
+		if (checkbuf->buffer2_empty != NULL) {
+			return DATA_PACKAGE_ID;
+		}
+		/* wait until new data is needed */
+		wait_event(map[flowid].sched_data->my_event,
+			   map[flowid].sched_data->newdataneeded);
+		map[flowid].sched_data->newdataneeded = 0;
+
+	}
+
+	if (map[flowid].sched_data->tcnstop) {
+		return -ECANCELED;
+	}
+
+	return DATA_PACKAGE_ID;
+
+}
+
+static void tcn_flow_release(struct config_item *item)
+{
+	kfree(to_tcn_flow(item));
+
+}
+
+static struct configfs_item_operations tcn_flow_item_ops = {
+	.release = tcn_flow_release,
+	.store_attribute = tcn_flow_attr_store,
+};
+
+static struct config_item_type tcn_flow_type = {
+	.ct_item_ops = &tcn_flow_item_ops,
+	.ct_attrs = tcn_flow_attrs,
+	.ct_owner = THIS_MODULE,
+};
+
+static struct config_item * tcn_make_item(struct config_group *group,
+					  const char *name)
+{
+	struct tcn_flow *tcn_flow;
+
+	tcn_flow = kmalloc(sizeof(struct tcn_flow), GFP_KERNEL);
+	if (!tcn_flow)
+		return NULL;
+
+	memset(tcn_flow, 0, sizeof(struct tcn_flow));
+
+	config_item_init_type_name(&tcn_flow->item, name,
+				   &tcn_flow_type);
+	return &tcn_flow->item;
+}
+
+static struct configfs_group_operations tcn_group_ops = {
+	.make_item = tcn_make_item,
+};
+
+static struct config_item_type tcn_type = {
+	.ct_group_ops = &tcn_group_ops,
+	.ct_owner = THIS_MODULE,
+};
+
+static struct configfs_subsystem tcn_subsys = {
+	.su_group = {
+		.cg_item = {
+			.ci_namebuf = "tcn",
+			.ci_type = &tcn_type,
+		},
+	},
+};
+
+static __init int configfs_init(void)
+{
+	int ret;
+	struct configfs_subsystem *subsys = &tcn_subsys;
+
+	config_group_init(&subsys->su_group);
+	init_MUTEX(&subsys->su_sem);
+	ret = configfs_register_subsystem(subsys);
+	if (ret) {
+		printk(KERN_ERR "Error %d while registering subsystem %s\n",
+		       ret, subsys->su_group.cg_item.ci_namebuf);
+		configfs_unregister_subsystem(&tcn_subsys);
+	}
+	return ret;
+}
+
+static void configfs_exit(void)
+{
+	configfs_unregister_subsystem(&tcn_subsys);
+}
+
 static struct Qdisc_class_ops netem_class_ops = {
 	.graft		=	netem_graft,
 	.leaf		=	netem_leaf,
@@ -740,11 +1180,17 @@ static struct Qdisc_ops netem_qdisc_ops
 
 static int __init netem_module_init(void)
 {
+	int err;
+
 	pr_info("netem: version " VERSION "\n");
+	err = configfs_init();
+	if (err)
+		return err;
 	return register_qdisc(&netem_qdisc_ops);
 }
 static void __exit netem_module_exit(void)
 {
+	configfs_exit();
 	unregister_qdisc(&netem_qdisc_ops);
 }
 module_init(netem_module_init)

From baumann at tik.ee.ethz.ch Tue Sep 26 13:17:57 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem: kernelspace
In-Reply-To: <20060925132800.09856e10@xxxxxxxxxxxxxxxxx>
References: <4514DC9A.2000505@xxxxxxxxxxxxxx>
	<20060925132800.09856e10@xxxxxxxxxxxxxxxxx>
Message-ID: <45198AF5.9090909@xxxxxxxxxxxxxx>

Hi Stephen,

We merged your changes into our patch
http://tcn.hypert.net/tcn_kernel_2_6_18.patch

Please let us know if we should make further adaptations to our
implementation and/or resubmit the adapted patch.

Cheers+thanx,
Rainer

Stephen Hemminger wrote:
> Some changes:
>
> 1. need to select CONFIGFS into configuration
> 2. don't add declarations after code.
> 3. use unsigned not int for counters and mask.
> 4. don't return a structure (ie pkt_delay)
> 5. use enum for magic values
> 6. don't use GFP_ATOMIC unless you have to
> 7. check error values on configfs_init
> 8. map initialization is unneeded. static's always init to zero.
>
> ------------------
> diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
> index d10f353..a51de64 100644
> --- a/include/linux/pkt_sched.h
> +++ b/include/linux/pkt_sched.h
> @@ -430,6 +430,8 @@ enum
>  	TCA_NETEM_DELAY_DIST,
>  	TCA_NETEM_REORDER,
>  	TCA_NETEM_CORRUPT,
> +	TCA_NETEM_TRACE,
> +	TCA_NETEM_STATS,
>  	__TCA_NETEM_MAX,
>  };
>
> @@ -445,6 +447,35 @@ struct tc_netem_qopt
>  	__u32	jitter;		/* random jitter in latency (us) */
>  };
>
> +struct tc_netem_stats
> +{
> +	int packetcount;
> +	int packetok;
> +	int normaldelay;
> +	int drops;
> +	int dupl;
> +	int corrupt;
> +	int novaliddata;
> +	int uninitialized;
> +	int bufferunderrun;
> +	int bufferinuseempty;
> +	int noemptybuffer;
> +	int readbehindbuffer;
> +	int buffer1_reloads;
> +	int buffer2_reloads;
> +	int tobuffer1_switch;
> +	int tobuffer2_switch;
> +	int switch_to_emptybuffer1;
> +	int switch_to_emptybuffer2;
> +};
> +
> +struct tc_netem_trace
> +{
> +	__u32	fid;		/*flowid */
> +	__u32	def;		/* default action 0 = no delay, 1 = drop*/
> +	__u32	ticks;		/* number of ticks corresponding to 1ms */
> +};
> +
>  struct tc_netem_corr
>  {
>  	__u32	delay_corr;	/* delay correlation */
> diff --git a/net/sched/Kconfig b/net/sched/Kconfig
> index 8298ea9..aee4bc6 100644
> --- a/net/sched/Kconfig
> +++ b/net/sched/Kconfig
> @@ -232,6 +232,7 @@ config NET_SCH_DSMARK
>
>  config NET_SCH_NETEM
>  	tristate "Network emulator (NETEM)"
> +	select CONFIGFS_FS
>  	---help---
>  	  Say Y if you want to emulate network delay, loss, and packet
>  	  re-ordering. This is often useful to simulate networks when
> diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
> index 45939ba..521b9e3 100644
> --- a/net/sched/sch_netem.c
> +++ b/net/sched/sch_netem.c
> @@ -11,6 +11,9 @@
>   *
>   * Authors:	Stephen Hemminger <shemminger@xxxxxxxx>
>   *		Catalin(ux aka Dino) BOIE <catab at umbrella dot ro>
> + *		netem trace enhancement: Ariane Keller <arkeller@xxxxxxxxxx> ETH Zurich
> + *					 Rainer Baumann <baumann@xxxxxxxxxx> ETH Zurich
> + *					 Ulrich Fiedler <fiedler@xxxxxxxxxxxxxx> ETH Zurich
>   */
>
>  #include <linux/module.h>
> @@ -21,10 +24,16 @@ #include <linux/errno.h>
>  #include <linux/netdevice.h>
>  #include <linux/skbuff.h>
>  #include <linux/rtnetlink.h>
> +#include <linux/init.h>
> +#include <linux/slab.h>
> +#include <linux/configfs.h>
> +#include <linux/vmalloc.h>
>
>  #include <net/pkt_sched.h>
>
> -#define VERSION "1.2"
> +#include "net/flowseed.h"
> +
> +#define VERSION "1.3"
>
>  /*	Network Emulation Queuing algorithm.
>  	====================================
> @@ -50,6 +59,11 @@ #define VERSION "1.2"
>
>  	 The simulator is limited by the Linux timer resolution
>  	 and will create packet bursts on the HZ boundary (1ms).
> +
> +	 The trace option allows us to read the values for packet delay,
> +	 duplication, loss and corruption from a tracefile. This permits
> +	 the modulation of statistical properties such as long-range
> +	 dependences. See http://tcn.hypert.net.
>  */
>
>  struct netem_sched_data {
> @@ -65,6 +79,11 @@ struct netem_sched_data {
>  	u32 duplicate;
>  	u32 reorder;
>  	u32 corrupt;
> +	u32 tcnstop;
> +	u32 trace;
> +	u32 ticks;
> +	u32 def;
> +	u32 newdataneeded;
>
>  	struct crndstate {
>  		unsigned long last;
> @@ -72,9 +91,13 @@ struct netem_sched_data {
>  	} delay_cor, loss_cor, dup_cor, reorder_cor, corrupt_cor;
>
>  	struct disttable {
> -		u32  size;
> +		u32 size;
>  		s16 table[0];
>  	} *delay_dist;
> +
> +	struct tcn_statistic *statistic;
> +	struct tcn_control *flowbuffer;
> +	wait_queue_head_t my_event;
>  };
>
>  /* Time stamp put into socket buffer control block */
> @@ -82,6 +105,18 @@ struct netem_skb_cb {
>  	psched_time_t	time_to_send;
>  };
>
> +
> +struct confdata {
> +	int fid;
> +	struct netem_sched_data * sched_data;
> +};
> +
> +static struct confdata map[MAX_FLOWS];
> +
> +#define MASK_BITS	29
> +#define MASK_DELAY	((1<<MASK_BITS)-1)
> +#define MASK_HEAD	~MASK_DELAY
> +
>  /* init_crandom - initialize correlated random number generator
>   * Use entropy source for initial seed.
>   */
> @@ -139,6 +174,103 @@ static long tabledist(unsigned long mu,
>  	return  x / NETEM_DIST_SCALE + (sigma / NETEM_DIST_SCALE) * t + mu;
>  }
>
> +/* don't call this function directly. It is called after
> + * a packet has been taken out of a buffer and it was the last.
> + */
> +static int reload_flowbuffer (struct netem_sched_data *q)
> +{
> +	struct tcn_control *flow = q->flowbuffer;
> +
> +	if (flow->buffer_in_use == flow->buffer1) {
> +		flow->buffer1_empty = flow->buffer1;
> +		if (flow->buffer2_empty) {
> +			q->statistic->switch_to_emptybuffer2++;
> +			return -EFAULT;
> +		}
> +
> +		q->statistic->tobuffer2_switch++;
> +
> +		flow->buffer_in_use = flow->buffer2;
> +		flow->offsetpos = flow->buffer2;
> +
> +	} else {
> +		flow->buffer2_empty = flow->buffer2;
> +
> +		if (flow->buffer1_empty) {
> +			q->statistic->switch_to_emptybuffer1++;
> +			return -EFAULT;
> +		}
> +
> +		q->statistic->tobuffer1_switch++;
> +
> +		flow->buffer_in_use = flow->buffer1;
> +		flow->offsetpos = flow->buffer1;
> +
> +	}
> +	/*the flowseed process can send more data*/
> +	q->tcnstop = 0;
> +	q->newdataneeded = 1;
> +	wake_up(&q->my_event);
> +	return 0;
> +}
> +
> +/* return pktdelay with delay and drop/dupl/corrupt option */
> +static int get_next_delay(struct netem_sched_data *q, enum tcn_flow *head)
> +{
> +	struct tcn_control *flow = q->flowbuffer;
> +	u32 variout;
> +
> +	/*choose whether to drop or 0 delay packets on default*/
> +	*head = q->def;
> +
> +	if (!flow) {
> +		printk(KERN_ERR "netem: read from an uninitialized flow.\n");
> +		q->statistic->uninitialized++;
> +		return 0;
> +	}
> +
> +	q->statistic->packetcount++;
> +
> +	/* check if we have to reload a buffer */
> +	if (flow->offsetpos - flow->buffer_in_use == DATA_PACKAGE)
> +		reload_flowbuffer(q);
> +
> +	/* sanity checks */
> +	if ((flow->buffer_in_use == flow->buffer1 && flow->validdataB1)
> +	    || ( flow->buffer_in_use == flow->buffer2 && flow->validdataB2)) {
> +
> +		if (flow->buffer1_empty && flow->buffer2_empty) {
> +			q->statistic->bufferunderrun++;
> +			return 0;
> +		}
> +
> +		if (flow->buffer1_empty == flow->buffer_in_use ||
> +		    flow->buffer2_empty == flow->buffer_in_use) {
> +			q->statistic->bufferinuseempty++;
> +			return 0;
> +		}
> +
> +		if (flow->offsetpos - flow->buffer_in_use >=
> +		    DATA_PACKAGE) {
> +			q->statistic->readbehindbuffer++;
> +			return 0;
> +		}
> +		/*end of tracefile reached*/
> +	} else {
> +		q->statistic->novaliddata++;
> +		return 0;
> +	}
> +
> +	/* now it's safe to read */
> +	variout = *flow->offsetpos++;
> +	*head = (variout & MASK_HEAD) >> MASK_BITS;
> +
> +	(&q->statistic->normaldelay)[*head] += 1;
> +	q->statistic->packetok++;
> +
> +	return ((variout & MASK_DELAY) * q->ticks) / 1000;
> +}
> +
>  /*
>   * Insert one skb into qdisc.
>   * Note: parent depends on return value to account for queue length.
> @@ -148,20 +280,25 @@ static long tabledist(unsigned long mu,
>  static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
>  {
>  	struct netem_sched_data *q = qdisc_priv(sch);
> -	/* We don't fill cb now as skb_unshare() may invalidate it */
>  	struct netem_skb_cb *cb;
>  	struct sk_buff *skb2;
> -	int ret;
> -	int count = 1;
> +	enum tcn_flow action = FLOW_NORMAL;
> +	psched_tdiff_t delay;
> +	int ret, count = 1;
>
>  	pr_debug("netem_enqueue skb=%p\n", skb);
>
> -	/* Random duplication */
> -	if (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor))
> +	if (q->trace)
> +		action = get_next_delay(q, &delay);
> +
> +	/* Random duplication */
> +	if (q->trace ? action == FLOW_DUP :
> +	    (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor)))
>  		++count;
>
>  	/* Random packet drop 0 => none, ~0 => all */
> -	if (q->loss && q->loss >= get_crandom(&q->loss_cor))
> +	if (q->trace ? action == FLOW_DROP :
> +	    (q->loss && q->loss >= get_crandom(&q->loss_cor)))
>  		--count;
>
>  	if (count == 0) {
> @@ -190,7 +327,8 @@ static int netem_enqueue(struct sk_buff
>  	 * If packet is going to be hardware checksummed, then
>  	 * do it now in software before we mangle it.
>  	 */
> -	if (q->corrupt && q->corrupt >= get_crandom(&q->corrupt_cor)) {
> +	if (q->trace ? action == FLOW_MANGLE :
> +	    (q->corrupt && q->corrupt >= get_crandom(&q->corrupt_cor))) {
>  		if (!(skb = skb_unshare(skb, GFP_ATOMIC))
>  		    || (skb->ip_summed == CHECKSUM_PARTIAL
>  			&& skb_checksum_help(skb))) {
> @@ -206,10 +344,10 @@ static int netem_enqueue(struct sk_buff
>  	    || q->counter < q->gap 	/* inside last reordering gap */
>  	    || q->reorder < get_crandom(&q->reorder_cor)) {
>  		psched_time_t now;
> -		psched_tdiff_t delay;
>
> -		delay = tabledist(q->latency, q->jitter,
> -				  &q->delay_cor, q->delay_dist);
> +		if (!q->trace)
> +			delay = tabledist(q->latency, q->jitter,
> +					  &q->delay_cor, q->delay_dist);
>
>  		PSCHED_GET_TIME(now);
>  		PSCHED_TADD2(now, delay, cb->time_to_send);
> @@ -343,6 +481,65 @@ static int set_fifo_limit(struct Qdisc *
>  	return ret;
>  }
>
> +static void reset_stats(struct netem_sched_data * q)
> +{
> +	memset(q->statistic, 0, sizeof(*(q->statistic)));
> +	return;
> +}
> +
> +static void free_flowbuffer(struct netem_sched_data * q)
> +{
> +	if (q->flowbuffer != NULL) {
> +		q->tcnstop = 1;
> +		q->newdataneeded = 1;
> +		wake_up(&q->my_event);
> +
> +		if (q->flowbuffer->buffer1 != NULL) {
> +			kfree(q->flowbuffer->buffer1);
> +		}
> +		if (q->flowbuffer->buffer2 != NULL) {
> +			kfree(q->flowbuffer->buffer2);
> +		}
> +		kfree(q->flowbuffer);
> +		kfree(q->statistic);
> +		q->flowbuffer = NULL;
> +		q->statistic = NULL;
> +	}
> +}
> +
> +static int init_flowbuffer(unsigned int fid, struct netem_sched_data * q)
> +{
> +	int i, flowid = -1;
> +
> +	q->statistic = kzalloc(sizeof(*(q->statistic)), GFP_KERNEL;
> +	init_waitqueue_head(&q->my_event);
> +
> +	for(i = 0; i < MAX_FLOWS; i++) {
> +		if(map[i].fid == 0) {
> +			flowid = i;
> +			map[i].fid = fid;
> +			map[i].sched_data = q;
> +			break;
> +		}
> +	}
> +
> +	if (flowid != -1) {
> +		q->flowbuffer = kmalloc(sizeof(*(q->flowbuffer)), GFP_KERNEL);
> +		q->flowbuffer->buffer1 = kmalloc(DATA_PACKAGE, GFP_KERNEL);
> +		q->flowbuffer->buffer2 = kmalloc(DATA_PACKAGE, GFP_KERNEL);
> +
> +		q->flowbuffer->buffer_in_use = q->flowbuffer->buffer1;
> +		q->flowbuffer->offsetpos = q->flowbuffer->buffer1;
> +		q->flowbuffer->buffer1_empty = q->flowbuffer->buffer1;
> +		q->flowbuffer->buffer2_empty = q->flowbuffer->buffer2;
> +		q->flowbuffer->flowid = flowid;
> +		q->flowbuffer->validdataB1 = 0;
> +		q->flowbuffer->validdataB2 = 0;
> +	}
> +
> +	return flowid;
> +}
> +
>  /*
>   * Distribution data is a variable size payload containing
>   * signed 16 bit values.
> @@ -414,6 +611,32 @@ static int get_corrupt(struct Qdisc *sch
>  	return 0;
>  }
>
> +static int get_trace(struct Qdisc *sch, const struct rtattr *attr)
> +{
> +	struct netem_sched_data *q = qdisc_priv(sch);
> +	const struct tc_netem_trace *traceopt = RTA_DATA(attr);
> +
> +	if (RTA_PAYLOAD(attr) != sizeof(*traceopt))
> +		return -EINVAL;
> +
> +	if (traceopt->fid) {
> +		/*correction us -> ticks*/
> +		q->ticks = traceopt->ticks;
> +		int ind;
> +		ind = init_flowbuffer(traceopt->fid, q);
> +		if(ind < 0) {
> +			printk("netem: maximum number of traces:%d"
> +			       " change in net/flowseedprocfs.h\n", MAX_FLOWS);
> +			return -EINVAL;
> +		}
> +		q->trace = ind + 1;
> +
> +	} else
> +		q->trace = 0;
> +	q->def = traceopt->def;
> +	return 0;
> +}
> +
>  /* Parse netlink message to set options */
>  static int netem_change(struct Qdisc *sch, struct rtattr *opt)
>  {
> @@ -431,6 +654,14 @@ static int netem_change(struct Qdisc *sc
>  		return ret;
>  	}
>
> +	if (q->trace) {
> +		int temp = q->trace - 1;
> +		q->trace = 0;
> +		map[temp].fid = 0;
> +		reset_stats(q);
> +		free_flowbuffer(q);
> +	}
> +
>  	q->latency = qopt->latency;
>  	q->jitter = qopt->jitter;
>  	q->limit = qopt->limit;
> @@ -477,6 +708,11 @@ static int netem_change(struct Qdisc *sc
>  			if (ret)
>  				return ret;
>  		}
> +		if (tb[TCA_NETEM_TRACE-1]) {
> +			ret = get_trace(sch, tb[TCA_NETEM_TRACE-1]);
> +			if (ret)
> +				return ret;
> +		}
>  	}
>
>  	return 0;
> @@ -572,6 +808,7 @@ static int netem_init(struct Qdisc *sch,
>  	q->timer.function = netem_watchdog;
>  	q->timer.data = (unsigned long) sch;
>
> +	q->trace = 0;
>  	q->qdisc = qdisc_create_dflt(sch->dev, &tfifo_qdisc_ops);
>  	if (!q->qdisc) {
>  		pr_debug("netem: qdisc create failed\n");
> @@ -590,6 +827,12 @@ static void netem_destroy(struct Qdisc *
>  {
>  	struct netem_sched_data *q = qdisc_priv(sch);
>
> +	if (q->trace) {
> +		int temp = q->trace - 1;
> +		q->trace = 0;
> +		map[temp].fid = 0;
> +		free_flowbuffer(q);
> +	}
>  	del_timer_sync(&q->timer);
>  	qdisc_destroy(q->qdisc);
>  	kfree(q->delay_dist);
> @@ -604,6 +847,7 @@ static int netem_dump(struct Qdisc *sch,
>  	struct tc_netem_corr cor;
>  	struct tc_netem_reorder reorder;
>  	struct tc_netem_corrupt corrupt;
> +	struct tc_netem_trace traceopt;
>
>  	qopt.latency = q->latency;
>  	qopt.jitter = q->jitter;
> @@ -626,6 +870,35 @@ static int netem_dump(struct Qdisc *sch,
>  	corrupt.correlation = q->corrupt_cor.rho;
>  	RTA_PUT(skb, TCA_NETEM_CORRUPT, sizeof(corrupt), &corrupt);
>
> +	traceopt.fid = q->trace;
> +	traceopt.def = q->def;
> +	traceopt.ticks = q->ticks;
> +	RTA_PUT(skb, TCA_NETEM_TRACE, sizeof(traceopt), &traceopt);
> +
> +	if (q->trace) {
> +		struct tc_netem_stats tstats;
> +
> +		tstats.packetcount = q->statistic->packetcount;
> +		tstats.packetok = q->statistic->packetok;
> +		tstats.normaldelay = q->statistic->normaldelay;
> +		tstats.drops = q->statistic->drops;
> +		tstats.dupl = q->statistic->dupl;
> +		tstats.corrupt = q->statistic->corrupt;
> +		tstats.novaliddata = q->statistic->novaliddata;
> +		tstats.uninitialized = q->statistic->uninitialized;
> +		tstats.bufferunderrun = q->statistic->bufferunderrun;
> +		tstats.bufferinuseempty = q->statistic->bufferinuseempty;
> +		tstats.noemptybuffer = q->statistic->noemptybuffer;
> +		tstats.readbehindbuffer = q->statistic->readbehindbuffer;
> +		tstats.buffer1_reloads = q->statistic->buffer1_reloads;
> +		tstats.buffer2_reloads = q->statistic->buffer2_reloads;
> +		tstats.tobuffer1_switch = q->statistic->tobuffer1_switch;
> +		tstats.tobuffer2_switch = q->statistic->tobuffer2_switch;
> +		tstats.switch_to_emptybuffer1 = q->statistic->switch_to_emptybuffer1;
> +		tstats.switch_to_emptybuffer2 = q->statistic->switch_to_emptybuffer2;
> +		RTA_PUT(skb, TCA_NETEM_STATS, sizeof(tstats), &tstats);
> +	}
> +
>  	rta->rta_len = skb->tail - b;
>
>  	return skb->len;
> @@ -709,6 +982,173 @@ static struct tcf_proto **netem_find_tcf
>  	return NULL;
>  }
>
> +/*configfs to read tcn delay values from userspace*/
> +struct tcn_flow {
> +	struct config_item item;
> +};
> +
> +static struct tcn_flow *to_tcn_flow(struct config_item *item)
> +{
> +	return item ? container_of(item, struct tcn_flow, item) : NULL;
> +}
> +
> +static struct configfs_attribute tcn_flow_attr_storeme = {
> +	.ca_owner = THIS_MODULE,
> +	.ca_name = "delayvalue",
> +	.ca_mode = S_IRUGO | S_IWUSR,
> +};
> +
> +static struct configfs_attribute *tcn_flow_attrs[] = {
> +	&tcn_flow_attr_storeme,
> +	NULL,
> +};
> +
> +static ssize_t tcn_flow_attr_store(struct config_item *item,
> +				   struct configfs_attribute *attr,
> +				   const char *page, size_t count)
> +{
> +	char *p = (char *)page;
> +	int fid, i, validData = 0;
> +	int flowid = -1;
> +	struct tcn_control *checkbuf;
> +
> +	if (count != DATA_PACKAGE_ID) {
> +		printk("netem: Unexpected data received. %d\n", count);
> +		return -EMSGSIZE;
> +	}
> +
> +	memcpy(&fid, p + DATA_PACKAGE, sizeof(int));
> +	memcpy(&validData, p + DATA_PACKAGE + sizeof(int), sizeof(int));
> +
> +	/* check whether this flow is registered */
> +	for (i = 0; i < MAX_FLOWS; i++) {
> +		if (map[i].fid == fid) {
> +			flowid = i;
> +			break;
> +		}
> +	}
> +	/* exit if flow is not registered */
> +	if (flowid < 0) {
> +		printk("netem: Invalid FID received. Killing process.\n");
> +		return -EINVAL;
> +	}
> +
> +	checkbuf = map[flowid].sched_data->flowbuffer;
> +	if (checkbuf == NULL) {
> +		printk("netem: no flow registered");
> +		return -ENOBUFS;
> +	}
> +
> +	/* check if flowbuffer has empty buffer and copy data into it */
> +	if (checkbuf->buffer1_empty != NULL) {
> +		memcpy(checkbuf->buffer1, p, DATA_PACKAGE);
> +		checkbuf->buffer1_empty = NULL;
> +		checkbuf->validdataB1 = validData;
> +		map[flowid].sched_data->statistic->buffer1_reloads++;
> +
> +	} else if (checkbuf->buffer2_empty != NULL) {
> +		memcpy(checkbuf->buffer2, p, DATA_PACKAGE);
> +		checkbuf->buffer2_empty = NULL;
> +		checkbuf->validdataB2 = validData;
> +		map[flowid].sched_data->statistic->buffer2_reloads++;
> +
> +	} else {
> +		printk("netem: flow %d: no empty buffer. data loss.\n", flowid);
> +		map[flowid].sched_data->statistic->noemptybuffer++;
> +	}
> +
> +	if (validData) {
> +		/* on initialization both buffers need data */
> +		if (checkbuf->buffer2_empty != NULL) {
> +			return DATA_PACKAGE_ID;
> +		}
> +		/* wait until new data is needed */
> +		wait_event(map[flowid].sched_data->my_event,
> +			   map[flowid].sched_data->newdataneeded);
> +		map[flowid].sched_data->newdataneeded = 0;
> +
> +	}
> +
> +	if (map[flowid].sched_data->tcnstop) {
> +		return -ECANCELED;
> +	}
> +
> +	return DATA_PACKAGE_ID;
> +
> +}
> +
> +static void tcn_flow_release(struct config_item *item)
> +{
> +	kfree(to_tcn_flow(item));
> +
> +}
> +
> +static struct configfs_item_operations tcn_flow_item_ops = {
> +	.release = tcn_flow_release,
> +	.store_attribute = tcn_flow_attr_store,
> +};
> +
> +static struct config_item_type tcn_flow_type = {
> +	.ct_item_ops = &tcn_flow_item_ops,
> +	.ct_attrs = tcn_flow_attrs,
> +	.ct_owner = THIS_MODULE,
> +};
> +
> +static struct config_item * tcn_make_item(struct config_group *group,
> +					   const char *name)
> +{
> +	struct tcn_flow *tcn_flow;
> +
> +	tcn_flow = kmalloc(sizeof(struct tcn_flow), GFP_KERNEL);
> +	if (!tcn_flow)
> +		return NULL;
> +
> +	memset(tcn_flow, 0, sizeof(struct tcn_flow));
> +
> +	config_item_init_type_name(&tcn_flow->item, name,
> +				   &tcn_flow_type);
> +	return &tcn_flow->item;
> +}
> +
> +static struct configfs_group_operations tcn_group_ops = {
> +	.make_item = tcn_make_item,
> +};
> +
> +static struct config_item_type tcn_type = {
> +	.ct_group_ops = &tcn_group_ops,
> +	.ct_owner = THIS_MODULE,
> +};
> +
> +static struct configfs_subsystem tcn_subsys = {
> +	.su_group = {
> +		.cg_item = {
> +			.ci_namebuf = "tcn",
> +			.ci_type = &tcn_type,
> +		},
> +	},
> +};
> +
> +static __init int configfs_init(void)
> +{
> +	int ret;
> +	struct configfs_subsystem *subsys = &tcn_subsys;
> +
> +	config_group_init(&subsys->su_group);
> +	init_MUTEX(&subsys->su_sem);
> +	ret = configfs_register_subsystem(subsys);
> +	if (ret) {
> +		printk(KERN_ERR "Error %d while registering subsystem %s\n",
> +		       ret, subsys->su_group.cg_item.ci_namebuf);
> +		configfs_unregister_subsystem(&tcn_subsys);
> +	}
> +	return ret;
> +}
> +
> +static void configfs_exit(void)
> +{
> +	configfs_unregister_subsystem(&tcn_subsys);
> +}
> +
>  static struct Qdisc_class_ops netem_class_ops = {
>  	.graft = netem_graft,
>  	.leaf = netem_leaf,
> @@ -740,11 +1180,17 @@ static struct Qdisc_ops netem_qdisc_ops
>
>  static int __init netem_module_init(void)
>  {
> +	int err;
> +
>  	pr_info("netem: version " VERSION "\n");
> +	err = configfs_init();
> +	if (err)
> +		return err;
>  	return register_qdisc(&netem_qdisc_ops);
>  }
>  static void __exit netem_module_exit(void)
>  {
> +	configfs_exit();
>  	unregister_qdisc(&netem_qdisc_ops);
>  }
>  module_init(netem_module_init)
>

From shemminger at osdl.org  Tue Sep 26 13:45:31 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem: kernelspace
In-Reply-To: <45198AF5.9090909@xxxxxxxxxxxxxx>
References: <4514DC9A.2000505@xxxxxxxxxxxxxx>
	<20060925132800.09856e10@xxxxxxxxxxxxxxxxx>
	<45198AF5.9090909@xxxxxxxxxxxxxx>
Message-ID: <20060926134531.3ec4991a@freekitty>

On Tue, 26 Sep 2006 22:17:57 +0200
Rainer Baumann <baumann@xxxxxxxxxxxxxx> wrote:

> Hi Stephens
>
> We merged your changes into our patch
> http://tcn.hypert.net/tcn_kernel_2_6_18.patch
> Please let us know if we should do further adoptions to our
> implementation and/or resubmit the adapted patch.
>
> Cheers+thanx,
> Rainer

I'll test it out, and send off to Dave for 2.6.20, 2.6.19 is so in flux
right now that adding more seems not like a good idea.

From davem at davemloft.net  Tue Sep 26 14:03:21 2006
From: davem at davemloft.net (David Miller)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem: kernelspace
In-Reply-To: <20060926134531.3ec4991a@freekitty>
References: <20060925132800.09856e10@xxxxxxxxxxxxxxxxx>
	<45198AF5.9090909@xxxxxxxxxxxxxx>
	<20060926134531.3ec4991a@freekitty>
Message-ID: <20060926.140321.70217341.davem@xxxxxxxxxxxxx>

From: Stephen Hemminger <shemminger@xxxxxxxx>
Date: Tue, 26 Sep 2006 13:45:31 -0700

> I'll test it out, and send off to Dave for 2.6.20, 2.6.19 is so in
> flux right now that adding more seems not like a good idea.

I'm willing to accept anything reasonable until approximately
this weekend.

From shemminger at osdl.org  Tue Sep 26 16:02:38 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 12:51:19 2007
Subject: status of phpnetemgui?
In-Reply-To: <p062309cac13f5951821f@[171.69.52.91]>
References: <p062309cac13f5951821f@[171.69.52.91]>
Message-ID: <20060926160238.04b1e8fc@freekitty>

On Tue, 26 Sep 2006 17:31:31 -0500
"Lawrence D. Dunn" <ldunn@xxxxxxxxx> wrote:

> Stephen,
>   Hi- I'm Larry Dunn (day job at Cisco),
> writing to see if phpnetemgui is still around,
> or has evolved/been_replaced.
> I'd be using it for a networking class
> I teach at University of Minnesota (night job).  ;-)
>
> From your LCA2005_netem paper, I checked:
>
> http://www.smyles.plus.com/phpnetemgui/
>
> but that page shows up as not-found,
> and a couple google searches don't show a new location for it.
> I'll have students setting delay and loss for a fairly
> easy experiment (and using web100 to see impact of buffer tuning).
> I can resort to using the tc-commands directly, but was wondering
> if you know the status of the GUI?
>

If someone has a copy, I'll host it at osdl and add a link in the Wiki.

--
Stephen Hemminger <shemminger@xxxxxxxx>

From shemminger at osdl.org  Fri Sep 29 10:35:26 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 12:51:19 2007
Subject: Netem and HRTimers ?
In-Reply-To: <20060929171541.GA5745@xxxxxxxxxxxxxxxxxxxxx>
References: <20060929165419.GA4803@xxxxxxxxxxxxxxxxxxxxx>
	<20060929101316.12e85a6f@freekitty>
	<20060929171541.GA5745@xxxxxxxxxxxxxxxxxxxxx>
Message-ID: <20060929103526.2530894b@freekitty>

On Fri, 29 Sep 2006 19:15:41 +0200
Lucas Nussbaum <lucas.nussbaum@xxxxxxx> wrote:

> On 29/09/06 at 10:13 -0700, Stephen Hemminger wrote:
> > On Fri, 29 Sep 2006 18:54:19 +0200
> > Lucas Nussbaum <lucas.nussbaum@xxxxxxx> wrote:
> >
> > > Hi,
> > >
> > > I am currently working on a paper comparing Dummynet, NISTNet and
> > > TC/Netem both regarding features and regarding precision/performance.
> > >
> > > My experiments show how important precise timing is when doing network
> > > emulation, and precision with HZ=1000 is not that good compared to
> > > NISTNet (which uses the RTC configured at 8192 Hz) or Dummynet (which
> > > can run on FreeBSD with HZ=10000). I understand that increasing HZ to
> > > e.g 10000 in Linux is not really an option, both because many parts of
> > > the kernel assume that HZ is "small", and because of the performance
> > > impact of such a setting.
> > >
> > > Another solution could be to use the high resolution timers
> > > infrastructure. Have you already considered that for netem ? Do you this
> > > it would be applicate to Netem ? If yes, are you planning to work on
> > > this ?
> >
> > I have a lightly tested version using hrtimers. If you want to play
> > with it, I'll send it.
>
> Hi,
>
> That would be great, thank you.

Here is where it was when I last left it...
--- rt-netem.orig/net/sched/sch_netem.c
+++ rt-netem/net/sched/sch_netem.c
@@ -25,7 +25,7 @@

 #include <net/pkt_sched.h>

-#define VERSION "1.2"
+#define VERSION "1.2-rt"

 /*	Network Emulation Queuing algorithm.
 	====================================
@@ -55,7 +55,7 @@

 struct netem_sched_data {
 	struct Qdisc	*qdisc;
-	struct timer_list timer;
+	struct hrtimer timer;

 	u32 latency;
 	u32 loss;
@@ -80,7 +80,7 @@ struct netem_sched_data {

 /* Time stamp put into socket buffer control block */
 struct netem_skb_cb {
-	psched_time_t	time_to_send;
+	ktime_t		due_time;
 };

 /* init_crandom - initialize correlated random number generator
@@ -204,14 +204,15 @@ static int netem_enqueue(struct sk_buff
 	if (q->gap == 0 		/* not doing reordering */
 	    || q->counter < q->gap 	/* inside last reordering gap */
 	    || q->reorder < get_crandom(&q->reorder_cor)) {
-		psched_time_t now;
-		psched_tdiff_t delay;
+		u32 us;

-		delay = tabledist(q->latency, q->jitter,
+		us = tabledist(q->latency, q->jitter,
 				  &q->delay_cor, q->delay_dist);

-		PSCHED_GET_TIME(now);
-		PSCHED_TADD2(now, delay, cb->time_to_send);
+
+		cb->due_time = ktime_add_ns(get_monotonic_clock(),
+					    (u64) us * 1000u);
+
 		++q->counter;
 		ret = q->qdisc->enqueue(skb, q->qdisc);
 	} else {
@@ -219,7 +220,7 @@ static int netem_enqueue(struct sk_buff
 		 * Do re-ordering by putting one out of N packets at the front
 		 * of the queue.
 		 */
-		PSCHED_GET_TIME(cb->time_to_send);
+		cb->due_time = get_monotonic_clock();
 		q->counter = 0;
 		ret = q->qdisc->ops->requeue(skb, q->qdisc);
 	}
@@ -270,44 +271,46 @@ static struct sk_buff *netem_dequeue(str
 	if (skb) {
 		const struct netem_skb_cb *cb
 			= (const struct netem_skb_cb *)skb->cb;
-		psched_time_t now;
+		ktime_t now = get_monotonic_clock();
+		s64 delta;

-		/* if more time remaining? */
-		PSCHED_GET_TIME(now);
+		delta = ktime_to_ns(ktime_sub(cb->due_time, now));

-		if (PSCHED_TLESS(cb->time_to_send, now)) {
+		/* if more time remaining? */
+		if (delta <= 0) {
 			pr_debug("netem_dequeue: return skb=%p\n", skb);
 			sch->q.qlen--;
 			sch->flags &= ~TCQ_F_THROTTLED;
 			return skb;
-		} else {
-			psched_tdiff_t delay = PSCHED_TDIFF(cb->time_to_send, now);
-
-			if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) {
-				sch->qstats.drops++;
+		}

-				/* After this qlen is confused */
-				printk(KERN_ERR "netem: queue discpline %s could not requeue\n",
-				       q->qdisc->ops->id);
+		if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) {
+			sch->qstats.drops++;

-				sch->q.qlen--;
-			}
+			/* After this qlen is confused */
+			printk(KERN_ERR "netem: queue discpline %s could not requeue\n",
+			       q->qdisc->ops->id);

-			mod_timer(&q->timer, jiffies + PSCHED_US2JIFFIE(delay));
-			sch->flags |= TCQ_F_THROTTLED;
+			sch->q.qlen--;
 		}
+
+		hrtimer_start(&q->timer, ktime_add_ns(now, delta), HRTIMER_ABS);
+		sch->flags |= TCQ_F_THROTTLED;
 	}

 	return NULL;
 }

-static void netem_watchdog(unsigned long arg)
+static int netem_watchdog(struct hrtimer *hrt)
 {
-	struct Qdisc *sch = (struct Qdisc *)arg;
+	struct netem_sched_data *q
+		= container_of(hrt, struct netem_sched_data, timer);
+	struct Qdisc *sch = q->qdisc;

 	pr_debug("netem_watchdog qlen=%d\n", sch->q.qlen);
 	sch->flags &= ~TCQ_F_THROTTLED;
 	netif_schedule(sch->dev);
+	return HRTIMER_NORESTART;
 }

 static void netem_reset(struct Qdisc *sch)
@@ -317,7 +320,7 @@ static void netem_reset(struct Qdisc *sc
 	qdisc_reset(q->qdisc);
 	sch->q.qlen = 0;
 	sch->flags &= ~TCQ_F_THROTTLED;
-	del_timer_sync(&q->timer);
+	hrtimer_cancel(&q->timer);
 }

 /* Pass size change message down to embedded FIFO */
@@ -430,8 +433,9 @@ static int netem_change(struct Qdisc *sc
 		return ret;
 	}

-	q->latency = qopt->latency;
-	q->jitter = qopt->jitter;
+	/* Note: we force PSCHED clock to use gettimeofday so these are in us. */
+	q->latency = psched_ticks2usecs(qopt->latency);
+	q->jitter = psched_ticks2usecs(qopt->jitter);
 	q->limit = qopt->limit;
 	q->gap = qopt->gap;
 	q->counter = 0;
@@ -502,7 +506,8 @@ static int tfifo_enqueue(struct sk_buff
 			const struct netem_skb_cb *cb
 				= (const struct netem_skb_cb *)skb->cb;

-			if (!PSCHED_TLESS(ncb->time_to_send, cb->time_to_send))
+			if (ktime_to_ns(ktime_sub(ncb->due_time,
+						  cb->due_time)) >= 0)
 				break;
 		}

@@ -567,9 +572,8 @@ static int netem_init(struct Qdisc *sch,
 	if (!opt)
 		return -EINVAL;

-	init_timer(&q->timer);
+	hrtimer_init(&q->timer, CLOCK_MONOTONIC, HRTIMER_ABS);
 	q->timer.function = netem_watchdog;
-	q->timer.data = (unsigned long) sch;

 	q->qdisc = qdisc_create_dflt(sch->dev, &tfifo_qdisc_ops);
 	if (!q->qdisc) {
@@ -589,7 +593,7 @@ static void netem_destroy(struct Qdisc *
 {
 	struct netem_sched_data *q = qdisc_priv(sch);

-	del_timer_sync(&q->timer);
+	hrtimer_cancel(&q->timer);
 	qdisc_destroy(q->qdisc);
 	kfree(q->delay_dist);
 }
@@ -604,8 +608,8 @@ static int netem_dump(struct Qdisc *sch,
 	struct tc_netem_reorder reorder;
 	struct tc_netem_corrupt corrupt;

-	qopt.latency = q->latency;
-	qopt.jitter = q->jitter;
+	qopt.latency = psched_usecs2ticks(q->latency);
+	qopt.jitter = psched_usecs2ticks(q->jitter);
 	qopt.limit = q->limit;
 	qopt.loss = q->loss;
 	qopt.gap = q->gap;
--- rt-netem.orig/include/net/pkt_sched.h
+++ rt-netem/include/net/pkt_sched.h
@@ -238,4 +238,7 @@ static inline unsigned psched_mtu(struct
 	return dev->hard_header ? mtu + dev->hard_header_len : mtu;
 }

+extern unsigned long psched_ticks2usec(unsigned long ticks);
+extern unsigned long psched_usec2ticks(unsigned long us);
+
 #endif
--- rt-netem.orig/net/sched/sch_api.c
+++ rt-netem/net/sched/sch_api.c
@@ -43,6 +43,7 @@
 #include <asm/processor.h>
 #include <asm/uaccess.h>
 #include <asm/system.h>
+#include <asm/div64.h>

 static int qdisc_notify(struct sk_buff *oskb, struct nlmsghdr *n, u32 clid,
 			struct Qdisc *old, struct Qdisc *new);
@@ -1154,6 +1155,28 @@ reclassify:
 static int psched_us_per_tick = 1;
 static int psched_tick_per_us = 1;

+/* Convert from scaled PSCHED ticks to real time usecs */
+unsigned long psched_ticks2usecs(unsigned long ticks)
+{
+	u64 t = ticks;
+
+	t *= psched_us_per_tick;
+	do_div(t, psched_tick_per_us);
+	return t;
+}
+EXPORT_SYMBOL(psched_ticks2usecs);
+
+/* Convert from usecs to scaled PSCHED ticks */
+unsigned long psched_usecs2ticks(unsigned long us)
+{
+	u64 t = us;
+
+	t *= psched_tick_per_us;
+	do_div(t, psched_us_per_tick);
+	return t;
+}
+EXPORT_SYMBOL(psched_usecs2ticks);
+
 #ifdef CONFIG_PROC_FS
 static int psched_show(struct seq_file *seq, void *v)
 {

From shemminger at osdl.org  Fri Sep 29 11:08:01 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 12:51:19 2007
Subject: Netem and HRTimers ?
In-Reply-To: <20060929171541.GA5745@xxxxxxxxxxxxxxxxxxxxx>
References: <20060929165419.GA4803@xxxxxxxxxxxxxxxxxxxxx>
	<20060929101316.12e85a6f@freekitty>
	<20060929171541.GA5745@xxxxxxxxxxxxxxxxxxxxx>
Message-ID: <20060929110801.0716df79@freekitty>

On Fri, 29 Sep 2006 19:15:41 +0200
Lucas Nussbaum <lucas.nussbaum@xxxxxxx> wrote:

> On 29/09/06 at 10:13 -0700, Stephen Hemminger wrote:
> > On Fri, 29 Sep 2006 18:54:19 +0200
> > Lucas Nussbaum <lucas.nussbaum@xxxxxxx> wrote:
> >
> > > Hi,
> > >
> > > I am currently working on a paper comparing Dummynet, NISTNet and
> > > TC/Netem both regarding features and regarding precision/performance.
> > >
> > > My experiments show how important precise timing is when doing network
> > > emulation, and precision with HZ=1000 is not that good compared to
> > > NISTNet (which uses the RTC configured at 8192 Hz) or Dummynet (which
> > > can run on FreeBSD with HZ=10000). I understand that increasing HZ to
> > > e.g. 10000 in Linux is not really an option, both because many parts of
> > > the kernel assume that HZ is "small", and because of the performance
> > > impact of such a setting.
> > >
> > > Another solution could be to use the high resolution timers
> > > infrastructure. Have you already considered that for netem ? Do you think
> > > it would be applicable to Netem ? If yes, are you planning to work on
> > > this ?
> >
> > I have a lightly tested version using hrtimers. If you want to play
> > with it, I'll send it.
>
> Hi,
>
> That would be great, thank you.
>
> Which kernel version do you target for inclusion ?

I fixed some typos and it builds against 2.6.18-rt5... NOT tested, but it
is a starting point.

---
 include/net/pkt_sched.h |    3 +
 kernel/hrtimer.c        |    1
 net/sched/sch_api.c     |   23 ++++++++++++++
 net/sched/sch_netem.c   |   77 ++++++++++++++++++++++++------------------------
 4 files changed, 67 insertions(+), 37 deletions(-)

--- linux-2.6.18-rt.orig/net/sched/sch_netem.c	2006-09-19 20:42:06.000000000 -0700
+++ linux-2.6.18-rt/net/sched/sch_netem.c	2006-09-29 11:06:11.000000000 -0700
@@ -24,7 +24,7 @@
 
 #include <net/pkt_sched.h>
 
-#define VERSION "1.2"
+#define VERSION "1.2-rt"
 
 /*	Network Emulation Queuing algorithm.
 	====================================
@@ -54,7 +54,7 @@ struct netem_sched_data {
 	struct Qdisc	*qdisc;
-	struct timer_list timer;
+	struct hrtimer timer;
 
 	u32 latency;
 	u32 loss;
@@ -79,7 +79,7 @@
 
 /* Time stamp put into socket buffer control block */
 struct netem_skb_cb {
-	psched_time_t	time_to_send;
+	ktime_t due_time;
 };
 
 /* init_crandom - initialize correlated random number generator
@@ -205,14 +205,14 @@
 	if (q->gap == 0 		/* not doing reordering */
 	    || q->counter < q->gap 	/* inside last reordering gap */
 	    || q->reorder < get_crandom(&q->reorder_cor)) {
-		psched_time_t now;
-		psched_tdiff_t delay;
+		u64 ns;
 
-		delay = tabledist(q->latency, q->jitter,
-				  &q->delay_cor, q->delay_dist);
+		ns = tabledist(q->latency, q->jitter,
+			       &q->delay_cor, q->delay_dist) * 1000ul;
+
+
+		cb->due_time = ktime_add_ns(ktime_get(), ns);
 
-		PSCHED_GET_TIME(now);
-		PSCHED_TADD2(now, delay, cb->time_to_send);
 		++q->counter;
 		ret = q->qdisc->enqueue(skb, q->qdisc);
 	} else {
@@ -220,7 +220,7 @@
 		 * Do re-ordering by putting one out of N packets at the front
 		 * of the queue.
 		 */
-		PSCHED_GET_TIME(cb->time_to_send);
+		cb->due_time = ktime_get();
 		q->counter = 0;
 		ret = q->qdisc->ops->requeue(skb, q->qdisc);
 	}
@@ -271,44 +271,46 @@
 	if (skb) {
 		const struct netem_skb_cb *cb
 			= (const struct netem_skb_cb *)skb->cb;
-		psched_time_t now;
+		ktime_t now = ktime_get();
+		s64 delta;
 
-		/* if more time remaining? */
-		PSCHED_GET_TIME(now);
+		delta = ktime_to_ns(ktime_sub(cb->due_time, now));
 
-		if (PSCHED_TLESS(cb->time_to_send, now)) {
+		/* if more time remaining? */
+		if (delta <= 0) {
 			pr_debug("netem_dequeue: return skb=%p\n", skb);
 			sch->q.qlen--;
 			sch->flags &= ~TCQ_F_THROTTLED;
 			return skb;
-		} else {
-			psched_tdiff_t delay = PSCHED_TDIFF(cb->time_to_send, now);
-
-			if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) {
-				sch->qstats.drops++;
+		}
 
-				/* After this qlen is confused */
-				printk(KERN_ERR "netem: queue discpline %s could not requeue\n",
-				       q->qdisc->ops->id);
+		if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) {
+			sch->qstats.drops++;
 
-				sch->q.qlen--;
-			}
+			/* After this qlen is confused */
+			printk(KERN_ERR "netem: queue discpline %s could not requeue\n",
+			       q->qdisc->ops->id);
 
-			mod_timer(&q->timer, jiffies + PSCHED_US2JIFFIE(delay));
-			sch->flags |= TCQ_F_THROTTLED;
+			sch->q.qlen--;
 		}
+
+		hrtimer_start(&q->timer, ktime_add_ns(now, delta), HRTIMER_ABS);
+		sch->flags |= TCQ_F_THROTTLED;
 	}
 
 	return NULL;
 }
 
-static void netem_watchdog(unsigned long arg)
+static int netem_watchdog(struct hrtimer *hrt)
 {
-	struct Qdisc *sch = (struct Qdisc *)arg;
+	struct netem_sched_data *q
+		= container_of(hrt, struct netem_sched_data, timer);
+	struct Qdisc *sch = q->qdisc;
 
 	pr_debug("netem_watchdog qlen=%d\n", sch->q.qlen);
 	sch->flags &= ~TCQ_F_THROTTLED;
 	netif_schedule(sch->dev);
+	return HRTIMER_NORESTART;
 }
 
 static void netem_reset(struct Qdisc *sch)
@@ -318,7 +320,7 @@
 	qdisc_reset(q->qdisc);
 	sch->q.qlen = 0;
 	sch->flags &= ~TCQ_F_THROTTLED;
-	del_timer_sync(&q->timer);
+	hrtimer_cancel(&q->timer);
 }
 
 /* Pass size change message down to embedded FIFO */
@@ -431,8 +433,9 @@
 		return ret;
 	}
 
-	q->latency = qopt->latency;
-	q->jitter = qopt->jitter;
+	/* Note: we force PSCHED clock to use gettimeofday so these are in us. */
+	q->latency = psched_ticks2usec(qopt->latency);
+	q->jitter = psched_ticks2usec(qopt->jitter);
 	q->limit = qopt->limit;
 	q->gap = qopt->gap;
 	q->counter = 0;
@@ -503,7 +506,8 @@
 			const struct netem_skb_cb *cb
 				= (const struct netem_skb_cb *)skb->cb;
 
-			if (!PSCHED_TLESS(ncb->time_to_send, cb->time_to_send))
+			if (ktime_to_ns(ktime_sub(ncb->due_time,
+						  cb->due_time)) >= 0)
 				break;
 		}
 
@@ -568,9 +572,8 @@
 	if (!opt)
 		return -EINVAL;
 
-	init_timer(&q->timer);
+	hrtimer_init(&q->timer, CLOCK_MONOTONIC, HRTIMER_ABS);
 	q->timer.function = netem_watchdog;
-	q->timer.data = (unsigned long) sch;
 
 	q->qdisc = qdisc_create_dflt(sch->dev, &tfifo_qdisc_ops);
 	if (!q->qdisc) {
@@ -590,7 +593,7 @@
 {
 	struct netem_sched_data *q = qdisc_priv(sch);
 
-	del_timer_sync(&q->timer);
+	hrtimer_cancel(&q->timer);
 	qdisc_destroy(q->qdisc);
 	kfree(q->delay_dist);
 }
@@ -605,8 +608,8 @@
 	struct tc_netem_reorder reorder;
 	struct tc_netem_corrupt corrupt;
 
-	qopt.latency = q->latency;
-	qopt.jitter = q->jitter;
+	qopt.latency = psched_usec2ticks(q->latency);
+	qopt.jitter = psched_usec2ticks(q->jitter);
 	qopt.limit = q->limit;
 	qopt.loss = q->loss;
 	qopt.gap = q->gap;
--- linux-2.6.18-rt.orig/include/net/pkt_sched.h	2006-09-19 20:42:06.000000000 -0700
+++ linux-2.6.18-rt/include/net/pkt_sched.h	2006-09-29 10:33:48.000000000 -0700
@@ -239,4 +239,7 @@
 	return dev->hard_header ? mtu + dev->hard_header_len : mtu;
 }
 
+extern unsigned long psched_ticks2usec(unsigned long ticks);
+extern unsigned long psched_usec2ticks(unsigned long us);
+
 #endif
--- linux-2.6.18-rt.orig/net/sched/sch_api.c	2006-09-19 20:42:06.000000000 -0700
+++ linux-2.6.18-rt/net/sched/sch_api.c	2006-09-29 10:33:48.000000000 -0700
@@ -42,6 +42,7 @@
 #include <asm/processor.h>
 #include <asm/uaccess.h>
 #include <asm/system.h>
+#include <asm/div64.h>
 
 static int qdisc_notify(struct sk_buff *oskb, struct nlmsghdr *n, u32 clid,
 			struct Qdisc *old, struct Qdisc *new);
@@ -1153,6 +1154,28 @@
 static int psched_us_per_tick = 1;
 static int psched_tick_per_us = 1;
 
+/* Convert from scaled PSCHED ticks to real time usecs */
+unsigned long psched_ticks2usecs(unsigned long ticks)
+{
+	u64 t = ticks;
+
+	t *= psched_us_per_tick;
+	do_div(t, psched_tick_per_us);
+	return t;
+}
+EXPORT_SYMBOL(psched_ticks2usecs);
+
+/* Convert from usecs to scaled PSCHED ticks */
+unsigned long psched_usecs2ticks(unsigned long us)
+{
+	u64 t = us;
+
+	t *= psched_tick_per_us;
+	do_div(t, psched_us_per_tick);
+	return t;
+}
+EXPORT_SYMBOL(psched_usecs2ticks);
+
 #ifdef CONFIG_PROC_FS
 static int psched_show(struct seq_file *seq, void *v)
 {
--- linux-2.6.18-rt.orig/kernel/hrtimer.c	2006-09-29 10:59:29.000000000 -0700
+++ linux-2.6.18-rt/kernel/hrtimer.c	2006-09-29 11:00:25.000000000 -0700
@@ -58,6 +58,7 @@
 
 	return timespec_to_ktime(now);
 }
+EXPORT_SYMBOL_GPL(ktime_get);
 
 /**
  * ktime_get_real - get the real (wall-) time in ktime_t format

From baumann at tik.ee.ethz.ch  Fri Sep 29 13:49:42 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 12:51:19 2007
Subject: status of phpnetemgui?
In-Reply-To: <20060926160238.04b1e8fc@freekitty>
References: <p062309cac13f5951821f@[171.69.52.91]>
	<20060926160238.04b1e8fc@freekitty>
Message-ID: <451D86E6.7000403@xxxxxxxxxxxxxx>

We provide a copy of phpnetemgui on our website:
* http://tcn.hypert.net/phpnetemgui-0.9.tar.bz2

An extended version that includes our trace control is available at:
* http://tcn.hypert.net/phpnetemgui-0.10.tar.gz

----------------------------------------------------------------------
Rainer Baumann
Master of Science ETH in Computer Science and Teaching
University Lecturer @ HSR
Computer Engineering and Network Laboratory
ETH Zentrum ETZ G60.1
Gloriastrasse 35
CH-8092 Zurich
Switzerland
Phone  +41 44 632 51 87
Mobile +41 79 263 81 40
Fax    +41 44 632 10 35
Email  baumann@xxxxxxxxxxxxxx

Stephen Hemminger wrote:
> On Tue, 26 Sep 2006 17:31:31 -0500
> "Lawrence D. Dunn" <ldunn@xxxxxxxxx> wrote:
>
>
>> Stephen,
>>   Hi - I'm Larry Dunn (day job at Cisco),
>>   writing to see if phpnetemgui is still around,
>>   or has evolved/been_replaced.
>>   I'd be using it for a networking class
>>   I teach at University of Minnesota (night job). ;-)
>>
>>   From your LCA2005_netem paper, I checked:
>>
>>   http://www.smyles.plus.com/phpnetemgui/
>>
>>   but that page shows up as not-found,
>>   and a couple of google searches don't show a new location for it.
>>   I'll have students setting delay and loss for a fairly
>>   easy experiment (and using web100 to see impact of buffer tuning).
>>   I can resort to using the tc-commands directly, but was wondering
>>   if you know the status of the GUI?
>>
>>
>
> If someone has a copy, I'll host it at osdl and add a link in the Wiki.
>
>
>

From d.miras at cs.ucl.ac.uk  Sat Sep 30 05:45:23 2006
From: d.miras at cs.ucl.ac.uk (Dimitrios Miras)
Date: Wed Apr 18 12:51:19 2007
Subject: Log netem queue statistics?
In-Reply-To: <451D86E6.7000403@xxxxxxxxxxxxxx>
References: <p062309cac13f5951821f@[171.69.52.91]>
	<20060926160238.04b1e8fc@freekitty>
	<451D86E6.7000403@xxxxxxxxxxxxxx>
Message-ID: <451E66E3.9060809@xxxxxxxxxxxx>

Hi,

I'm using netem with fifo queues to emulate a network, but I'd like to
gather info about the fifo queue dynamics (size over time, packet drops,
etc.). I haven't managed to get any relevant info on google or the netem
list, so any hints/help/pointers are much appreciated.

Thanks in advance,
Dimitrios Miras

From exairetos at tele2.it  Tue Sep 12 09:10:34 2006
From: exairetos at tele2.it (Ferdinando Formica)
Date: Wed Apr 18 17:37:49 2007
Subject: no loss on ping
Message-ID: <web-45273940@xxxxxxxxxxxxxxxxx>

An HTML attachment was scrubbed...
URL: http://lists.linux-foundation.org/pipermail/netem/attachments/20060912/92326901/attachment-0001.htm

From shemminger at osdl.org  Tue Sep 12 21:48:44 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 17:37:49 2007
Subject: no loss on ping
In-Reply-To: <web-45273940@xxxxxxxxxxxxxxxxx>
References: <web-45273940@xxxxxxxxxxxxxxxxx>
Message-ID: <20060913134844.4cfa191d@localhost.localdomain>

On Tue, 12 Sep 2006 18:10:34 +0200
"Ferdinando Formica" <exairetos@xxxxxxxx> wrote:

>
> Hi everybody,
> Some time ago I set up netem on my Gentoo laptop and it worked fine, now
> I'm trying to set it up on a SUSE box (kernel 2.6.16) and I'm facing a
> problem I don't really understand.
> The command I enter is:
>
> # tc qdisc add dev eth0 root netem delay 20ms loss 20%

Try:
	tc qdisc show dev eth0 root netem
to see if the kernel was ignoring a parameter it didn't understand (like loss).

>
> Then I try pinging my laptop, which is connected to eth0, and while I get
> a 24.1ms delay (on my laptop I got 21ms) there isn't any packet loss (on
> my laptop I got values between 18 and 22%). The weird thing is that if I
> try pinging the box from my laptop the packets get lost in the right
> percentage. How is this possible?

Perhaps the ping response isn't going through the normal queue disc path
and is going back directly to the device?

>
> As a side note, is the following command correct?
>
> # tc qdisc add dev eth0 root handle 1: netem delay 20ms
> # tc qdisc add dev eth0 parent 1:1 handle 10: netem loss 20%
>
> If I try running this, I get only the packet loss when pinged (still no
> packet loss when pinging), and less than 1ms of delay, but shouldn't it
> be the same as the above?
> A similar behaviour happens also on my laptop, where the first command works.
>
> Thanks in advance,
> Ferdinando Formica
>

From exairetos at tele2.it  Wed Sep 13 07:49:49 2006
From: exairetos at tele2.it (Ferdinando Formica)
Date: Wed Apr 18 17:37:49 2007
Subject: no loss on ping
In-Reply-To: <20060913134844.4cfa191d@localhost.localdomain>
References: <web-45273940@xxxxxxxxxxxxxxxxx>
	<20060913134844.4cfa191d@localhost.localdomain>
Message-ID: <web-48852534@xxxxxxxxxxxxxxxxx>

An HTML attachment was scrubbed...
URL: http://lists.linux-foundation.org/pipermail/netem/attachments/20060913/7e335022/attachment-0001.htm

From exairetos at tele2.it  Thu Sep 14 03:55:59 2006
From: exairetos at tele2.it (Ferdinando Formica)
Date: Wed Apr 18 17:37:49 2007
Subject: no loss on ping
In-Reply-To: <web-48852534@xxxxxxxxxxxxxxxxx>
References: <web-45273940@xxxxxxxxxxxxxxxxx>
	<20060913134844.4cfa191d@localhost.localdomain>
	<web-48852534@xxxxxxxxxxxxxxxxx>
Message-ID: <web-43174629@xxxxxxxxxxxxxxxxx>

An HTML attachment was scrubbed...
URL: http://lists.linux-foundation.org/pipermail/netem/attachments/20060914/0a19c302/attachment-0001.htm

From lyonnet at ipanematech.com  Thu Sep 14 08:44:55 2006
From: lyonnet at ipanematech.com (frank@xxxxxxxxxxx)
Date: Wed Apr 18 17:37:49 2007
Subject: Subtil variations in NetEm behavior as time goes by
Message-ID: <00a401c6d814$baf67f60$0202fea9@ipanema.local>

Hello,

I've set up WAN emulation on a Dell SC1425 (Xeon EMT64) with 4x1Gbps
Ethernet ports.

I have NetEm set up with 100ms delay and no other impairment on egress of
3 of my interfaces.

I'm using ping to check NetEm behaviour; it reports ~200ms RTT between
each of my branches.

However, when measuring the response time of some applications over this
setup, I'm seeing a changing behaviour after my router has been up for a
few days: the response time is improving significantly, but the ping
stays the same!

*Rebooting the router brings the response time back to what it was
originally.*

Well, I don't know if anybody can help with this.
My kernel is 2.6.17 on Fedora Core 5 - compiled in 32 bits with SMP
disabled (to minimize risks ..).

Cheers,

Frank
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.linux-foundation.org/pipermail/netem/attachments/20060914/6cb7bc61/attachment-0001.htm

From shemminger at osdl.org  Thu Sep 14 17:31:17 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 17:37:49 2007
Subject: no loss on ping
In-Reply-To: <web-43174629@xxxxxxxxxxxxxxxxx>
References: <web-45273940@xxxxxxxxxxxxxxxxx>
	<20060913134844.4cfa191d@localhost.localdomain>
	<web-48852534@xxxxxxxxxxxxxxxxx>
	<web-43174629@xxxxxxxxxxxxxxxxx>
Message-ID: <20060915093117.1a5269e1@localhost.localdomain>

On Thu, 14 Sep 2006 12:55:59 +0200
"Ferdinando Formica" <exairetos@xxxxxxxx> wrote:

> Update on the problem; surprisingly enough, it seems that the pings *are*
> dropped.
>
>
> # tc -s qdisc
> qdisc netem 1: dev eth0 limit 1000 delay 20.0ms
>  Sent 28826 bytes 301 pkt (dropped 85, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
> qdisc netem 10: dev eth0 parent 1:1 limit 1000 loss 20%
>  Sent 28826 bytes 301 pkt (dropped 85, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
> qdisc pfifo_fast 0: dev eth1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
>
> Now I'm starting to think it's a problem with ICMP; also, if I set the
> loss parameter to 90% it still acknowledges every packet as if it was
> correctly transmitted, but after a while I get messages like "no buffer
> space available" and "destination host unreachable".
>
> Maybe I'll try getting another box and going to bridge mode; would this
> solve anything?
>
> Thank you very much,
> Ferdinando Formica
>

There was a bug in older kernels where packets dropped with the loss
parameter were not being freed properly. It was fixed long ago in the
mainline kernel, but it may still be an issue with vendor kernels.
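
The `loss 20%` setting discussed in this thread comes down to one comparison per packet: netem scales the percentage to a 32-bit threshold and drops the packet when the threshold is at least as large as a random 32-bit draw. The following is a minimal user-space sketch of that idea, not the kernel code. All names (`crnd_state`, `uniform32`, `correlated_random`, `should_drop`, `percent_to_u32`) are made up for illustration, xorshift32 stands in for the kernel's random source, and the blending formula for the optional correlation is an assumption modelled on netem's correlated generator.

```c
#include <assert.h>
#include <stdint.h>

/* State of one correlated random source (illustrative names). */
struct crnd_state {
	uint32_t last;	/* previous output */
	uint32_t rho;	/* correlation, scaled so UINT32_MAX is ~100% */
};

/* Deterministic xorshift32 generator; a stand-in for the kernel's RNG. */
static uint32_t prng_state = 2463534242u;

static uint32_t uniform32(void)
{
	uint32_t x = prng_state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	prng_state = x;
	return x;
}

/* Scale a percentage to a 32-bit threshold (100% maps to UINT32_MAX). */
static uint32_t percent_to_u32(double percent)
{
	if (percent >= 100.0)
		return UINT32_MAX;
	if (percent <= 0.0)
		return 0;
	return (uint32_t)(percent / 100.0 * 4294967296.0);
}

/* Blend a fresh uniform value with the previous output (assumed formula). */
static uint32_t correlated_random(struct crnd_state *s)
{
	uint64_t value, rho;
	uint32_t answer;

	if (s->rho == 0)	/* no correlation requested */
		return uniform32();

	value = uniform32();
	rho = (uint64_t)s->rho + 1;
	answer = (uint32_t)((value * ((1ull << 32) - rho)
			     + (uint64_t)s->last * rho) >> 32);
	s->last = answer;
	return answer;
}

/* Drop decision: 0 => never drop, UINT32_MAX => always drop. */
static int should_drop(uint32_t loss, struct crnd_state *s)
{
	return loss && loss >= correlated_random(s);
}
```

Calling `should_drop(percent_to_u32(20.0), &state)` once per packet with a zeroed `state` should drop roughly one packet in five. With `rho` at its maximum the output never moves away from `last`, which illustrates how correlation can turn isolated drops into bursts.
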
From baumann at tik.ee.ethz.ch  Thu Sep 21 23:12:11 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.16.19 0/2] LARTC: trace control for netem
Message-ID: <45137EBB.2030707@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range
dependency and self-similarity of cross-traffic.

A new option (trace) has been added to the netem command. If the trace
option is used, the values for packet delay etc. are read from a
pregenerated trace file; afterwards the packets are processed by the
normal netem functions. The packet action values are read out from the
trace file in user space and sent to kernel space via configfs.

After our patches from the 2nd and 22nd of August we have integrated the
comments from Stephen and hope we are on the right way now.

We are looking forward to any comments, feedback and suggestions!

From baumann at tik.ee.ethz.ch  Thu Sep 21 23:15:13 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.16.19 2/2] LARTC: trace control for netem: kernelspace
Message-ID: <45137F71.2000404@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range
dependency and self-similarity of cross-traffic.

kernel space:
The delay, drop, duplication and corruption values are read out in user
space and sent to kernel space via configfs. The userspace process will
"hang on write" until the kernel needs new data.

In order to always have packet action values ready to apply, there are
two buffers that hold these values. Packet action values can be read from
one buffer and the other buffer can be refilled with new values
simultaneously. The synchronization of "need more delay values" and
"return from write" is done with the use of wait queues.

Having applied the delay value to a packet, the packet gets processed by
the original netem functions.
Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>

---

Patch for linux kernel 2.6.16.19: http://tcn.hypert.net/tcnKernel_procfs.patch

From baumann at tik.ee.ethz.ch  Thu Sep 21 23:13:54 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.16.19 1/2] LARTC: trace control for netem: userspace
Message-ID: <45137F22.4000304@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range
dependency and self-similarity of cross-traffic.

user space (iproute2):
The directory tc/netem was split in two parts, one containing the
original distribution tables and the other the tools to generate trace
files as well as the program responsible for reading the delay values
from the trace file and sending them to the kernel (called flowseed).

If the trace option is set, netem initializes the kernel and starts the
flowseed process. The flowseed process does not send data to the kernel
until the registration is completed. The data is sent to the kernel
module via configfs. For each qdisc applied, a new directory (in
/config/tcn/) is created. The write returns when the kernel needs new
data, or when the corresponding qdisc was deleted. In the first case new
data is sent and in the latter case the flowseed process terminates
itself.

Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>

---

Patch for iproute2-2.6.16-060323: http://tcn.hypert.net/tcn_iproute2.patch

From shemminger at osdl.org  Fri Sep 22 10:20:56 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.16.19 2/2] LARTC: trace control for netem: kernelspace
In-Reply-To: <45137F71.2000404@xxxxxxxxxxxxxx>
References: <45137F71.2000404@xxxxxxxxxxxxxx>
Message-ID: <20060922102056.0069f944@localhost.localdomain>

On Fri, 22 Sep 2006 08:15:13 +0200
Rainer Baumann <baumann@xxxxxxxxxxxxxx> wrote:

> Trace Control for Netem: Emulate network properties such as long range
> dependency and self-similarity of cross-traffic.
>
> kernel space:
> The delay, drop, duplication and corruption values are read out in user
> space and sent to kernel space via configfs. The userspace process will
> "hang on write" until the kernel needs new data.
>
> In order to always have packet action values ready to apply, there are
> two buffers that hold these values. Packet action values can be read from
> one buffer and the other buffer can be refilled with new values
> simultaneously. The synchronization of "need more delay values" and
> "return from write" is done with the use of wait queues.
>
> Having applied the delay value to a packet, the packet gets processed by
> the original netem functions.
>
> Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>
>
> ---
>
> Patch for linux kernel 2.6.16.19: http://tcn.hypert.net/tcnKernel_procfs.patch

I like the concept of the trace based delay stuff, it is just that the
implementation needs more work.

Style:
  * whitespace around operators, keywords etc
  * use /* for comments not //
  * indentation: scripts/Lindent may help
  * accidental blank line changes introduced in patch as well
  * You don't really change Makefile

Code:
  * now netem depends on CONFIG_PROC_FS
  * why not use a miscdevice (/dev/netem_trace?) instead of /proc
  * still has signal flow control to process. This is an awkward way to
    do flow control and I don't think it is safe.
  * hard coding MAX_FLOWS leads to scaling problems. Not all users will
    want to waste the memory, and what if there are more flows. Can't you
    figure out a way to allocate and scale flow buffers.
-- 
Stephen Hemminger <shemminger@xxxxxxxx>

From hagen at jauu.net  Fri Sep 22 08:19:06 2006
From: hagen at jauu.net (Hagen Paul Pfeifer)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.16.19 2/2] LARTC: trace control for netem: kernelspace
In-Reply-To: <45137F71.2000404@xxxxxxxxxxxxxx>
References: <45137F71.2000404@xxxxxxxxxxxxxx>
Message-ID: <20060922151906.GA25483@xxxxxxxxxxxxxx>

* Rainer Baumann | 2006-09-22 08:15:13 [+0200]:

>Patch for linux kernel 2.6.16.19: http://tcn.hypert.net/tcnKernel_procfs.patch

The coding style needs at least some work ...
Whitespace around operators and parentheses, useless parentheses, braces
for the else branch, mixed C99/C89 comments, indentation, ....

proc_read_stats() looks unclean (bzero), and maybe some other stuff too -
the code as a whole looks a little bit grubby.

HGN

-- 
43rd Law of Computing:
        Anything that can go wr
fortune: Segmentation violation -- Core dumped

From baumann at tik.ee.ethz.ch  Sat Sep 23 00:04:45 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.17.13 0/2] LARTC: trace control for netem
Message-ID: <4514DC8D.2010405@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range
dependency and self-similarity of cross-traffic.

A new option (trace) has been added to the netem command. If the trace
option is used, the values for packet delay etc. are read from a
pregenerated trace file; afterwards the packets are processed by the
normal netem functions. The packet action values are read out from the
trace file in user space and sent to kernel space via configfs.

Sorry, what I sent yesterday was the old version; this is now the new
version!

After our patches from the 2nd and 22nd of August we have integrated the
comments from Stephen and hope we are on the right way now.

We are looking forward to any comments, feedback and suggestions!
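
The two-buffer hand-off described in this thread (read packet action values from one buffer while the other is refilled) can be sketched in plain C. This is a single-threaded illustration under stated assumptions, not the code from the patch: the names `dbuf`, `dbuf_refill`, `dbuf_next` and the size `SLOTS` are hypothetical, and the wait-queue synchronization between the kernel and the flowseed process is reduced to a `need_refill` flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOTS 4	/* hypothetical buffer size, for illustration only */

struct dbuf {
	uint32_t buf[2][SLOTS];	/* the two value buffers */
	int filled[2];		/* 1 while a buffer holds unread values */
	int active;		/* index of the buffer being read */
	size_t pos;		/* next slot to read in the active buffer */
	int need_refill;	/* set whenever some buffer is empty */
};

/* Writer side: copy SLOTS values into an empty buffer.
 * Returns 0 on success, -1 if both buffers still hold unread data. */
static int dbuf_refill(struct dbuf *d, const uint32_t *vals)
{
	int b, i;

	for (b = 0; b < 2; b++) {
		int idx = (d->active + b) % 2;

		if (d->filled[idx])
			continue;
		for (i = 0; i < SLOTS; i++)
			d->buf[idx][i] = vals[i];
		d->filled[idx] = 1;
		if (idx == d->active)
			d->pos = 0;	/* restart reading after an underrun */
		d->need_refill = !d->filled[1 - idx];
		return 0;
	}
	return -1;
}

/* Reader side: fetch the next value.
 * Returns 0 on success, -1 on underrun (both buffers empty). */
static int dbuf_next(struct dbuf *d, uint32_t *out)
{
	if (!d->filled[d->active]) {
		int other = 1 - d->active;

		if (!d->filled[other])
			return -1;
		d->active = other;	/* switch to the refilled buffer */
		d->pos = 0;
	}
	*out = d->buf[d->active][d->pos++];
	if (d->pos == SLOTS) {
		/* drained: hand this buffer back to the writer */
		d->filled[d->active] = 0;
		d->need_refill = 1;
	}
	return 0;
}
```

In a real implementation the reader and writer run concurrently, so the flags would need locking and the writer would sleep until `need_refill` is set; the sketch only shows the ordering of the switch and refill steps.
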
From baumann at tik.ee.ethz.ch  Sat Sep 23 00:04:58 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem: kernelspace
Message-ID: <4514DC9A.2000505@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range
dependency and self-similarity of cross-traffic.

kernel space:
The delay, drop, duplication and corruption values are read out in user
space and sent to kernel space via configfs. The userspace process will
"hang on write" until the kernel needs new data.

In order to always have packet action values ready to apply, there are
two buffers that hold these values. Packet action values can be read from
one buffer and the other buffer can be refilled with new values
simultaneously. The synchronization of "need more delay values" and
"return from write" is done with the use of wait queues.

Having applied the delay value to a packet, the packet gets processed by
the original netem functions.

Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>

---

Patch for linux kernel 2.6.17.13: http://tcn.hypert.net/tcn_kernel_configfs.patch

From baumann at tik.ee.ethz.ch  Sat Sep 23 00:04:49 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.17.13 1/2] LARTC: trace control for netem: userspace
Message-ID: <4514DC91.2070507@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range
dependency and self-similarity of cross-traffic.

user space (iproute2):
The directory tc/netem was split in two parts, one containing the
original distribution tables and the other the tools to generate trace
files as well as the program responsible for reading the delay values
from the trace file and sending them to the kernel (called flowseed).

If the trace option is set, netem initializes the kernel and starts the
flowseed process. The flowseed process does not send data to the kernel
until the registration is completed.
The data is sent to the kernel module via configfs. For each qdisc
applied, a new directory (in /config/tcn/) is created. The write returns
when the kernel needs new data, or when the corresponding qdisc was
deleted. In the first case new data is sent and in the latter case the
flowseed process terminates itself.

Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>

---

Patch for iproute2-2.6.16-060323: http://tcn.hypert.net/tcn_iproute2.patch

From shemminger at osdl.org  Mon Sep 25 13:28:00 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem: kernelspace
In-Reply-To: <4514DC9A.2000505@xxxxxxxxxxxxxx>
References: <4514DC9A.2000505@xxxxxxxxxxxxxx>
Message-ID: <20060925132800.09856e10@xxxxxxxxxxxxxxxxx>

Some changes:

1. need to select CONFIGFS into configuration
2. don't add declarations after code.
3. use unsigned not int for counters and mask.
4. don't return a structure (ie pkt_delay)
5. use enum for magic values
6. don't use GFP_ATOMIC unless you have to
7. check error values on configfs_init
8. map initialization is unneeded. static's always init to zero.
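
Points 3 to 5 of the list above (unsigned masks, scalar return values, an enum instead of magic numbers) can be illustrated on the packed trace word that the patch in this message splits with `MASK_BITS`, `MASK_DELAY` and `MASK_HEAD`. This is only a sketch: the helper names `trace_action`, `trace_delay` and `trace_pack` are hypothetical, and the numeric values assigned to the `tcn_flow` enumerators are an assumption, since the header that defines them is not part of this message.

```c
#include <assert.h>
#include <stdint.h>

#define MASK_BITS	29
#define MASK_DELAY	((1u << MASK_BITS) - 1)	/* low 29 bits: delay value */
#define MASK_HEAD	(~MASK_DELAY)		/* top 3 bits: packet action */

/* Packet actions; the numeric values are assumed for illustration. */
enum tcn_flow {
	FLOW_NORMAL = 0,
	FLOW_DROP,
	FLOW_DUP,
	FLOW_CORRUPT
};

/* Extract the action from a packed trace word. */
static enum tcn_flow trace_action(uint32_t word)
{
	return (enum tcn_flow)((word & MASK_HEAD) >> MASK_BITS);
}

/* Extract the delay from a packed trace word. */
static uint32_t trace_delay(uint32_t word)
{
	return word & MASK_DELAY;
}

/* Build a packed trace word; delay bits above MASK_BITS are discarded. */
static uint32_t trace_pack(enum tcn_flow action, uint32_t delay)
{
	return ((uint32_t)action << MASK_BITS) | (delay & MASK_DELAY);
}
```

Keeping the mask unsigned (`1u`) avoids sign-extension surprises when the top bit of the word is set, and returning the action and the delay from two small accessors sidesteps returning a structure by value.
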
------------------
diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index d10f353..a51de64 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -430,6 +430,8 @@ enum
 	TCA_NETEM_DELAY_DIST,
 	TCA_NETEM_REORDER,
 	TCA_NETEM_CORRUPT,
+	TCA_NETEM_TRACE,
+	TCA_NETEM_STATS,
 	__TCA_NETEM_MAX,
 };
 
@@ -445,6 +447,35 @@ struct tc_netem_qopt
 	__u32	jitter;		/* random jitter in latency (us) */
 };
 
+struct tc_netem_stats
+{
+	int packetcount;
+	int packetok;
+	int normaldelay;
+	int drops;
+	int dupl;
+	int corrupt;
+	int novaliddata;
+	int uninitialized;
+	int bufferunderrun;
+	int bufferinuseempty;
+	int noemptybuffer;
+	int readbehindbuffer;
+	int buffer1_reloads;
+	int buffer2_reloads;
+	int tobuffer1_switch;
+	int tobuffer2_switch;
+	int switch_to_emptybuffer1;
+	int switch_to_emptybuffer2;
+};
+
+struct tc_netem_trace
+{
+	__u32   fid;             /*flowid */
+	__u32   def;          	 /* default action 0 = no delay, 1 = drop*/
+	__u32   ticks;	         /* number of ticks corresponding to 1ms */
+};
+
 struct tc_netem_corr
 {
 	__u32	delay_corr;	/* delay correlation */
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 8298ea9..aee4bc6 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -232,6 +232,7 @@ config NET_SCH_DSMARK
 
 config NET_SCH_NETEM
 	tristate "Network emulator (NETEM)"
+	select CONFIGFS_FS
 	---help---
 	  Say Y if you want to emulate network delay, loss, and packet
 	  re-ordering. This is often useful to simulate networks when
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 45939ba..521b9e3 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -11,6 +11,9 @@
  *
  * Authors:	Stephen Hemminger <shemminger@xxxxxxxx>
  *		Catalin(ux aka Dino) BOIE <catab at umbrella dot ro>
+ *              netem trace enhancement: Ariane Keller <arkeller@xxxxxxxxxx> ETH Zurich
+ *                                       Rainer Baumann <baumann@xxxxxxxxxx> ETH Zurich
+ *                                       Ulrich Fiedler <fiedler@xxxxxxxxxxxxxx> ETH Zurich
  */
 
 #include <linux/module.h>
@@ -21,10 +24,16 @@
 #include <linux/errno.h>
 #include <linux/netdevice.h>
 #include <linux/skbuff.h>
 #include <linux/rtnetlink.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/configfs.h>
+#include <linux/vmalloc.h>
 
 #include <net/pkt_sched.h>
 
-#define VERSION "1.2"
+#include "net/flowseed.h"
+
+#define VERSION "1.3"
 
 /*	Network Emulation Queuing algorithm.
 	====================================
@@ -50,6 +59,11 @@ #define VERSION "1.2"
 
 	 The simulator is limited by the Linux timer resolution
 	 and will create packet bursts on the HZ boundary (1ms).
+
+	 The trace option allows us to read the values for packet delay,
+	 duplication, loss and corruption from a tracefile. This permits
+	 the modulation of statistical properties such as long-range
+	 dependences. See http://tcn.hypert.net.
 */
 
 struct netem_sched_data {
@@ -65,6 +79,11 @@ struct netem_sched_data {
 	u32 duplicate;
 	u32 reorder;
 	u32 corrupt;
+	u32 tcnstop;
+	u32 trace;
+	u32 ticks;
+	u32 def;
+	u32 newdataneeded;
 
 	struct crndstate {
 		unsigned long last;
@@ -72,9 +91,13 @@ struct netem_sched_data {
 	} delay_cor, loss_cor, dup_cor, reorder_cor, corrupt_cor;
 
 	struct disttable {
-		u32 size;
+		u32 size;
 		s16 table[0];
 	} *delay_dist;
+
+	struct tcn_statistic *statistic;
+	struct tcn_control *flowbuffer;
+	wait_queue_head_t my_event;
 };
 
 /* Time stamp put into socket buffer control block */
@@ -82,6 +105,18 @@ struct netem_skb_cb {
 	psched_time_t	time_to_send;
 };
 
+
+struct confdata {
+	int fid;
+	struct netem_sched_data * sched_data;
+};
+
+static struct confdata map[MAX_FLOWS];
+
+#define MASK_BITS	29
+#define MASK_DELAY	((1<<MASK_BITS)-1)
+#define MASK_HEAD       ~MASK_DELAY
+
 /* init_crandom - initialize correlated random number generator
  * Use entropy source for initial seed.
  */
@@ -139,6 +174,103 @@ static long tabledist(unsigned long mu,
 	return  x / NETEM_DIST_SCALE + (sigma / NETEM_DIST_SCALE) * t + mu;
 }
 
+/* don't call this function directly. It is called after
+ * a packet has been taken out of a buffer and it was the last.
+ */
+static int reload_flowbuffer (struct netem_sched_data *q)
+{
+	struct tcn_control *flow = q->flowbuffer;
+
+	if (flow->buffer_in_use == flow->buffer1) {
+		flow->buffer1_empty = flow->buffer1;
+		if (flow->buffer2_empty) {
+			q->statistic->switch_to_emptybuffer2++;
+			return -EFAULT;
+		}
+
+		q->statistic->tobuffer2_switch++;
+
+		flow->buffer_in_use = flow->buffer2;
+		flow->offsetpos = flow->buffer2;
+
+	} else {
+		flow->buffer2_empty = flow->buffer2;
+
+		if (flow->buffer1_empty) {
+			q->statistic->switch_to_emptybuffer1++;
+			return -EFAULT;
+		}
+
+		q->statistic->tobuffer1_switch++;
+
+		flow->buffer_in_use = flow->buffer1;
+		flow->offsetpos = flow->buffer1;
+
+	}
+	/*the flowseed process can send more data*/
+	q->tcnstop = 0;
+	q->newdataneeded = 1;
+	wake_up(&q->my_event);
+	return 0;
+}
+
+/* return pktdelay with delay and drop/dupl/corrupt option */
+static int get_next_delay(struct netem_sched_data *q, enum tcn_flow *head)
+{
+	struct tcn_control *flow = q->flowbuffer;
+	u32 variout;
+
+	/*choose whether to drop or 0 delay packets on default*/
+	*head = q->def;
+
+	if (!flow) {
+		printk(KERN_ERR "netem: read from an uninitialized flow.\n");
+		q->statistic->uninitialized++;
+		return 0;
+	}
+
+	q->statistic->packetcount++;
+
+	/* check if we have to reload a buffer */
+	if (flow->offsetpos - flow->buffer_in_use == DATA_PACKAGE)
+		reload_flowbuffer(q);
+
+	/* sanity checks */
+	if ((flow->buffer_in_use == flow->buffer1 && flow->validdataB1)
+	    || ( flow->buffer_in_use == flow->buffer2 && flow->validdataB2)) {
+
+		if (flow->buffer1_empty && flow->buffer2_empty) {
+			q->statistic->bufferunderrun++;
+			return 0;
+		}
+
+		if (flow->buffer1_empty == flow->buffer_in_use ||
+		    flow->buffer2_empty == flow->buffer_in_use) {
+			q->statistic->bufferinuseempty++;
+			return 0;
+		}
+
+		if (flow->offsetpos - flow->buffer_in_use >=
+		    DATA_PACKAGE) {
+			q->statistic->readbehindbuffer++;
+			return 0;
+		}
+		/*end of tracefile reached*/
+	} else {
+		q->statistic->novaliddata++;
+		return 0;
+	}
+
+	/* now it's safe to read */
+	variout = *flow->offsetpos++;
+	*head = (variout & MASK_HEAD) >> MASK_BITS;
+
+	(&q->statistic->normaldelay)[*head] += 1;
+	q->statistic->packetok++;
+
+	return ((variout & MASK_DELAY) * q->ticks) / 1000;
+}
+
 /*
  * Insert one skb into qdisc.
  * Note: parent depends on return value to account for queue length.
@@ -148,20 +280,25 @@ static long tabledist(unsigned long mu,
 static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 {
 	struct netem_sched_data *q = qdisc_priv(sch);
-	/* We don't fill cb now as skb_unshare() may invalidate it */
 	struct netem_skb_cb *cb;
 	struct sk_buff *skb2;
-	int ret;
-	int count = 1;
+	enum tcn_flow action = FLOW_NORMAL;
+	psched_tdiff_t delay;
+	int ret, count = 1;

 	pr_debug("netem_enqueue skb=%p\n", skb);

-	/* Random duplication */
-	if (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor))
+	if (q->trace)
+		delay = get_next_delay(q, &action);
+
+	/* Random duplication */
+	if (q->trace ? action == FLOW_DUP :
+	    (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor)))
 		++count;

 	/* Random packet drop 0 => none, ~0 => all */
-	if (q->loss && q->loss >= get_crandom(&q->loss_cor))
+	if (q->trace ? action == FLOW_DROP :
+	    (q->loss && q->loss >= get_crandom(&q->loss_cor)))
 		--count;

 	if (count == 0) {
@@ -190,7 +327,8 @@ static int netem_enqueue(struct sk_buff
 	 * If packet is going to be hardware checksummed, then
 	 * do it now in software before we mangle it.
 	 */
-	if (q->corrupt && q->corrupt >= get_crandom(&q->corrupt_cor)) {
+	if (q->trace ? action == FLOW_MANGLE :
+	    (q->corrupt && q->corrupt >= get_crandom(&q->corrupt_cor))) {
 		if (!(skb = skb_unshare(skb, GFP_ATOMIC))
 		    || (skb->ip_summed == CHECKSUM_PARTIAL
 			&& skb_checksum_help(skb))) {
@@ -206,10 +344,10 @@ static int netem_enqueue(struct sk_buff
 	    || q->counter < q->gap 	/* inside last reordering gap */
 	    || q->reorder < get_crandom(&q->reorder_cor)) {
 		psched_time_t now;
-		psched_tdiff_t delay;

-		delay = tabledist(q->latency, q->jitter,
-				  &q->delay_cor, q->delay_dist);
+		if (!q->trace)
+			delay = tabledist(q->latency, q->jitter,
+					  &q->delay_cor, q->delay_dist);

 		PSCHED_GET_TIME(now);
 		PSCHED_TADD2(now, delay, cb->time_to_send);
@@ -343,6 +481,65 @@ static int set_fifo_limit(struct Qdisc *
 	return ret;
 }

+static void reset_stats(struct netem_sched_data * q)
+{
+	memset(q->statistic, 0, sizeof(*(q->statistic)));
+	return;
+}
+
+static void free_flowbuffer(struct netem_sched_data * q)
+{
+	if (q->flowbuffer != NULL) {
+		q->tcnstop = 1;
+		q->newdataneeded = 1;
+		wake_up(&q->my_event);
+
+		if (q->flowbuffer->buffer1 != NULL) {
+			kfree(q->flowbuffer->buffer1);
+		}
+		if (q->flowbuffer->buffer2 != NULL) {
+			kfree(q->flowbuffer->buffer2);
+		}
+		kfree(q->flowbuffer);
+		kfree(q->statistic);
+		q->flowbuffer = NULL;
+		q->statistic = NULL;
+	}
+}
+
+static int init_flowbuffer(unsigned int fid, struct netem_sched_data * q)
+{
+	int i, flowid = -1;
+
+	q->statistic = kzalloc(sizeof(*(q->statistic)), GFP_KERNEL);
+	init_waitqueue_head(&q->my_event);
+
+	for(i = 0; i < MAX_FLOWS; i++) {
+		if(map[i].fid == 0) {
+			flowid = i;
+			map[i].fid = fid;
+			map[i].sched_data = q;
+			break;
+		}
+	}
+
+	if (flowid != -1) {
+		q->flowbuffer = kmalloc(sizeof(*(q->flowbuffer)), GFP_KERNEL);
+		q->flowbuffer->buffer1 = kmalloc(DATA_PACKAGE, GFP_KERNEL);
+		q->flowbuffer->buffer2 = kmalloc(DATA_PACKAGE, GFP_KERNEL);
+
+		q->flowbuffer->buffer_in_use = q->flowbuffer->buffer1;
+		q->flowbuffer->offsetpos = q->flowbuffer->buffer1;
+		q->flowbuffer->buffer1_empty = q->flowbuffer->buffer1;
+		q->flowbuffer->buffer2_empty = q->flowbuffer->buffer2;
+		q->flowbuffer->flowid = flowid;
+		q->flowbuffer->validdataB1 = 0;
+		q->flowbuffer->validdataB2 = 0;
+	}
+
+	return flowid;
+}
+
 /*
  * Distribution data is a variable size payload containing
  * signed 16 bit values.
@@ -414,6 +611,32 @@ static int get_corrupt(struct Qdisc *sch
 	return 0;
 }

+static int get_trace(struct Qdisc *sch, const struct rtattr *attr)
+{
+	struct netem_sched_data *q = qdisc_priv(sch);
+	const struct tc_netem_trace *traceopt = RTA_DATA(attr);
+
+	if (RTA_PAYLOAD(attr) != sizeof(*traceopt))
+		return -EINVAL;
+
+	if (traceopt->fid) {
+		/*correction us -> ticks*/
+		q->ticks = traceopt->ticks;
+		int ind;
+		ind = init_flowbuffer(traceopt->fid, q);
+		if(ind < 0) {
+			printk("netem: maximum number of traces:%d"
+			       " change in net/flowseedprocfs.h\n", MAX_FLOWS);
+			return -EINVAL;
+		}
+		q->trace = ind + 1;
+
+	} else
+		q->trace = 0;
+	q->def = traceopt->def;
+	return 0;
+}
+
 /* Parse netlink message to set options */
 static int netem_change(struct Qdisc *sch, struct rtattr *opt)
 {
@@ -431,6 +654,14 @@ static int netem_change(struct Qdisc *sc
 		return ret;
 	}

+	if (q->trace) {
+		int temp = q->trace - 1;
+		q->trace = 0;
+		map[temp].fid = 0;
+		reset_stats(q);
+		free_flowbuffer(q);
+	}
+
 	q->latency = qopt->latency;
 	q->jitter = qopt->jitter;
 	q->limit = qopt->limit;
@@ -477,6 +708,11 @@ static int netem_change(struct Qdisc *sc
 			if (ret)
 				return ret;
 		}
+		if (tb[TCA_NETEM_TRACE-1]) {
+			ret = get_trace(sch, tb[TCA_NETEM_TRACE-1]);
+			if (ret)
+				return ret;
+		}
 	}

 	return 0;
@@ -572,6 +808,7 @@ static int netem_init(struct Qdisc *sch,
 	q->timer.function = netem_watchdog;
 	q->timer.data = (unsigned long) sch;

+	q->trace = 0;
 	q->qdisc = qdisc_create_dflt(sch->dev, &tfifo_qdisc_ops);
 	if (!q->qdisc) {
 		pr_debug("netem: qdisc create failed\n");
@@ -590,6 +827,12 @@ static void netem_destroy(struct Qdisc *
 {
 	struct netem_sched_data *q = qdisc_priv(sch);

+	if (q->trace) {
+		int temp = q->trace - 1;
+		q->trace = 0;
+		map[temp].fid = 0;
+		free_flowbuffer(q);
+	}
 	del_timer_sync(&q->timer);
 	qdisc_destroy(q->qdisc);
 	kfree(q->delay_dist);
@@ -604,6 +847,7 @@ static int netem_dump(struct Qdisc *sch,
 	struct tc_netem_corr cor;
 	struct tc_netem_reorder reorder;
 	struct tc_netem_corrupt corrupt;
+	struct tc_netem_trace traceopt;

 	qopt.latency = q->latency;
 	qopt.jitter = q->jitter;
@@ -626,6 +870,35 @@ static int netem_dump(struct Qdisc *sch,
 	corrupt.correlation = q->corrupt_cor.rho;
 	RTA_PUT(skb, TCA_NETEM_CORRUPT, sizeof(corrupt), &corrupt);

+	traceopt.fid = q->trace;
+	traceopt.def = q->def;
+	traceopt.ticks = q->ticks;
+	RTA_PUT(skb, TCA_NETEM_TRACE, sizeof(traceopt), &traceopt);
+
+	if (q->trace) {
+		struct tc_netem_stats tstats;
+
+		tstats.packetcount = q->statistic->packetcount;
+		tstats.packetok = q->statistic->packetok;
+		tstats.normaldelay = q->statistic->normaldelay;
+		tstats.drops = q->statistic->drops;
+		tstats.dupl = q->statistic->dupl;
+		tstats.corrupt = q->statistic->corrupt;
+		tstats.novaliddata = q->statistic->novaliddata;
+		tstats.uninitialized = q->statistic->uninitialized;
+		tstats.bufferunderrun = q->statistic->bufferunderrun;
+		tstats.bufferinuseempty = q->statistic->bufferinuseempty;
+		tstats.noemptybuffer = q->statistic->noemptybuffer;
+		tstats.readbehindbuffer = q->statistic->readbehindbuffer;
+		tstats.buffer1_reloads = q->statistic->buffer1_reloads;
+		tstats.buffer2_reloads = q->statistic->buffer2_reloads;
+		tstats.tobuffer1_switch = q->statistic->tobuffer1_switch;
+		tstats.tobuffer2_switch = q->statistic->tobuffer2_switch;
+		tstats.switch_to_emptybuffer1 = q->statistic->switch_to_emptybuffer1;
+		tstats.switch_to_emptybuffer2 = q->statistic->switch_to_emptybuffer2;
+		RTA_PUT(skb, TCA_NETEM_STATS, sizeof(tstats), &tstats);
+	}
+
 	rta->rta_len = skb->tail - b;

 	return skb->len;
@@ -709,6 +982,173 @@ static struct tcf_proto **netem_find_tcf
 	return NULL;
 }

+/*configfs to read tcn delay values from userspace*/
+struct tcn_flow {
+	struct config_item item;
+};
+
+static struct tcn_flow *to_tcn_flow(struct config_item *item)
+{
+	return item ? container_of(item, struct tcn_flow, item) : NULL;
+}
+
+static struct configfs_attribute tcn_flow_attr_storeme = {
+	.ca_owner = THIS_MODULE,
+	.ca_name = "delayvalue",
+	.ca_mode = S_IRUGO | S_IWUSR,
+};
+
+static struct configfs_attribute *tcn_flow_attrs[] = {
+	&tcn_flow_attr_storeme,
+	NULL,
+};
+
+static ssize_t tcn_flow_attr_store(struct config_item *item,
+				   struct configfs_attribute *attr,
+				   const char *page, size_t count)
+{
+	char *p = (char *)page;
+	int fid, i, validData = 0;
+	int flowid = -1;
+	struct tcn_control *checkbuf;
+
+	if (count != DATA_PACKAGE_ID) {
+		printk("netem: Unexpected data received. %d\n", count);
+		return -EMSGSIZE;
+	}
+
+	memcpy(&fid, p + DATA_PACKAGE, sizeof(int));
+	memcpy(&validData, p + DATA_PACKAGE + sizeof(int), sizeof(int));
+
+	/* check whether this flow is registered */
+	for (i = 0; i < MAX_FLOWS; i++) {
+		if (map[i].fid == fid) {
+			flowid = i;
+			break;
+		}
+	}
+	/* exit if flow is not registered */
+	if (flowid < 0) {
+		printk("netem: Invalid FID received. Killing process.\n");
+		return -EINVAL;
+	}
+
+	checkbuf = map[flowid].sched_data->flowbuffer;
+	if (checkbuf == NULL) {
+		printk("netem: no flow registered");
+		return -ENOBUFS;
+	}
+
+	/* check if flowbuffer has empty buffer and copy data into it */
+	if (checkbuf->buffer1_empty != NULL) {
+		memcpy(checkbuf->buffer1, p, DATA_PACKAGE);
+		checkbuf->buffer1_empty = NULL;
+		checkbuf->validdataB1 = validData;
+		map[flowid].sched_data->statistic->buffer1_reloads++;
+
+	} else if (checkbuf->buffer2_empty != NULL) {
+		memcpy(checkbuf->buffer2, p, DATA_PACKAGE);
+		checkbuf->buffer2_empty = NULL;
+		checkbuf->validdataB2 = validData;
+		map[flowid].sched_data->statistic->buffer2_reloads++;
+
+	} else {
+		printk("netem: flow %d: no empty buffer. data loss.\n", flowid);
+		map[flowid].sched_data->statistic->noemptybuffer++;
+	}
+
+	if (validData) {
+		/* on initialization both buffers need data */
+		if (checkbuf->buffer2_empty != NULL) {
+			return DATA_PACKAGE_ID;
+		}
+		/* wait until new data is needed */
+		wait_event(map[flowid].sched_data->my_event,
+			   map[flowid].sched_data->newdataneeded);
+		map[flowid].sched_data->newdataneeded = 0;
+
+	}
+
+	if (map[flowid].sched_data->tcnstop) {
+		return -ECANCELED;
+	}
+
+	return DATA_PACKAGE_ID;
+
+}
+
+static void tcn_flow_release(struct config_item *item)
+{
+	kfree(to_tcn_flow(item));
+
+}
+
+static struct configfs_item_operations tcn_flow_item_ops = {
+	.release = tcn_flow_release,
+	.store_attribute = tcn_flow_attr_store,
+};
+
+static struct config_item_type tcn_flow_type = {
+	.ct_item_ops = &tcn_flow_item_ops,
+	.ct_attrs = tcn_flow_attrs,
+	.ct_owner = THIS_MODULE,
+};
+
+static struct config_item * tcn_make_item(struct config_group *group,
+					  const char *name)
+{
+	struct tcn_flow *tcn_flow;
+
+	tcn_flow = kmalloc(sizeof(struct tcn_flow), GFP_KERNEL);
+	if (!tcn_flow)
+		return NULL;
+
+	memset(tcn_flow, 0, sizeof(struct tcn_flow));
+
+	config_item_init_type_name(&tcn_flow->item, name,
+				   &tcn_flow_type);
+	return &tcn_flow->item;
+}
+
+static struct configfs_group_operations tcn_group_ops = {
+	.make_item = tcn_make_item,
+};
+
+static struct config_item_type tcn_type = {
+	.ct_group_ops = &tcn_group_ops,
+	.ct_owner = THIS_MODULE,
+};
+
+static struct configfs_subsystem tcn_subsys = {
+	.su_group = {
+		.cg_item = {
+			.ci_namebuf = "tcn",
+			.ci_type = &tcn_type,
+		},
+	},
+};
+
+static __init int configfs_init(void)
+{
+	int ret;
+	struct configfs_subsystem *subsys = &tcn_subsys;
+
+	config_group_init(&subsys->su_group);
+	init_MUTEX(&subsys->su_sem);
+	ret = configfs_register_subsystem(subsys);
+	if (ret) {
+		printk(KERN_ERR "Error %d while registering subsystem %s\n",
+		       ret, subsys->su_group.cg_item.ci_namebuf);
+		configfs_unregister_subsystem(&tcn_subsys);
+	}
+	return ret;
+}
+
+static void configfs_exit(void)
+{
+	configfs_unregister_subsystem(&tcn_subsys);
+}
+
 static struct Qdisc_class_ops netem_class_ops = {
 	.graft		=	netem_graft,
 	.leaf		=	netem_leaf,
@@ -740,11 +1180,17 @@ static struct Qdisc_ops netem_qdisc_ops

 static int __init netem_module_init(void)
 {
+	int err;
+
 	pr_info("netem: version " VERSION "\n");
+	err = configfs_init();
+	if (err)
+		return err;
 	return register_qdisc(&netem_qdisc_ops);
 }
 static void __exit netem_module_exit(void)
 {
+	configfs_exit();
 	unregister_qdisc(&netem_qdisc_ops);
 }
 module_init(netem_module_init)

From baumann at tik.ee.ethz.ch  Tue Sep 26 13:17:57 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem: kernelspace
In-Reply-To: <20060925132800.09856e10@xxxxxxxxxxxxxxxxx>
References: <4514DC9A.2000505@xxxxxxxxxxxxxx>
	<20060925132800.09856e10@xxxxxxxxxxxxxxxxx>
Message-ID: <45198AF5.9090909@xxxxxxxxxxxxxx>

Hi Stephen

We merged your changes into our patch
http://tcn.hypert.net/tcn_kernel_2_6_18.patch
Please let us know if we should do further adaptations to our
implementation and/or resubmit the adapted patch.

Cheers+thanx,
Rainer

Stephen Hemminger wrote:
> Some changes:
>
> 1. need to select CONFIGFS into configuration
> 2. don't add declarations after code.
> 3. use unsigned not int for counters and mask.
> 4. don't return a structure (ie pkt_delay)
> 5. use enum for magic values
> 6. don't use GFP_ATOMIC unless you have to
> 7. check error values on configfs_init
> 8. map initialization is unneeded. static's always init to zero.
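
Points 3 and 5 of the list above (unsigned masks, an enum instead of magic values) can be illustrated with a small userspace sketch of how one 32-bit trace word splits into an action and a delay. The bit layout (MASK_BITS, MASK_DELAY, MASK_HEAD) and the ticks / 1000 scaling follow the patch; the enum numbering, the function names and the 64-bit intermediate are assumptions made for illustration, since net/flowseed.h is not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Bit layout as defined in the patch: the low 29 bits of each trace word
 * carry the delay, the top 3 bits carry the per-packet action. */
#define MASK_BITS  29u
#define MASK_DELAY ((UINT32_C(1) << MASK_BITS) - 1u)
#define MASK_HEAD  (~MASK_DELAY)

static_assert((MASK_DELAY & MASK_HEAD) == 0, "fields must not overlap");

/* Assumed numbering, for illustration only. The patch refers to FLOW_NORMAL,
 * FLOW_DROP, FLOW_DUP and FLOW_MANGLE, but their actual values are defined
 * in net/flowseed.h, which is not part of this thread. */
enum flow_action {
	FLOW_NORMAL = 0,
	FLOW_DROP   = 1,
	FLOW_DUP    = 2,
	FLOW_MANGLE = 3
};

/* Split one trace word into an action (written to *action) and a delay.
 * The delay field is scaled by ticks / 1000, as in get_next_delay(). */
uint32_t decode_trace_word(uint32_t word, uint32_t ticks,
			   enum flow_action *action)
{
	*action = (enum flow_action)((word & MASK_HEAD) >> MASK_BITS);
	/* 64-bit intermediate so the multiplication cannot wrap
	 * (a deviation from the patch, which multiplies in 32 bits). */
	return (uint32_t)(((uint64_t)(word & MASK_DELAY) * ticks) / 1000u);
}

/* Hypothetical inverse helper: pack an action and a raw delay value into
 * one trace word. The real encoder is the userspace flowseed tool. */
uint32_t encode_trace_word(enum flow_action action, uint32_t delay)
{
	return ((uint32_t)action << MASK_BITS) | (delay & MASK_DELAY);
}
```

Under these assumptions, a word built with encode_trace_word(FLOW_DROP, 2500) decodes back to the action FLOW_DROP and, with ticks set to 1000, to a delay of 2500.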
>
> ------------------
> diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
> index d10f353..a51de64 100644
> --- a/include/linux/pkt_sched.h
> +++ b/include/linux/pkt_sched.h
> @@ -430,6 +430,8 @@ enum
>  	TCA_NETEM_DELAY_DIST,
>  	TCA_NETEM_REORDER,
>  	TCA_NETEM_CORRUPT,
> +	TCA_NETEM_TRACE,
> +	TCA_NETEM_STATS,
>  	__TCA_NETEM_MAX,
>  };
>
> @@ -445,6 +447,35 @@ struct tc_netem_qopt
>  	__u32	jitter;		/* random jitter in latency (us) */
>  };
>
> +struct tc_netem_stats
> +{
> +	int packetcount;
> +	int packetok;
> +	int normaldelay;
> +	int drops;
> +	int dupl;
> +	int corrupt;
> +	int novaliddata;
> +	int uninitialized;
> +	int bufferunderrun;
> +	int bufferinuseempty;
> +	int noemptybuffer;
> +	int readbehindbuffer;
> +	int buffer1_reloads;
> +	int buffer2_reloads;
> +	int tobuffer1_switch;
> +	int tobuffer2_switch;
> +	int switch_to_emptybuffer1;
> +	int switch_to_emptybuffer2;
> +};
> +
> +struct tc_netem_trace
> +{
> +	__u32	fid;	/*flowid */
> +	__u32	def;	/* default action 0 = no delay, 1 = drop*/
> +	__u32	ticks;	/* number of ticks corresponding to 1ms */
> +};
> +
>  struct tc_netem_corr
>  {
>  	__u32	delay_corr;	/* delay correlation */
> diff --git a/net/sched/Kconfig b/net/sched/Kconfig
> index 8298ea9..aee4bc6 100644
> --- a/net/sched/Kconfig
> +++ b/net/sched/Kconfig
> @@ -232,6 +232,7 @@ config NET_SCH_DSMARK
>
>  config NET_SCH_NETEM
>  	tristate "Network emulator (NETEM)"
> +	select CONFIGFS_FS
>  	---help---
>  	  Say Y if you want to emulate network delay, loss, and packet
>  	  re-ordering.  This is often useful to simulate networks when

From shemminger at osdl.org  Tue Sep 26 13:45:31 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem: kernelspace
In-Reply-To: <45198AF5.9090909@xxxxxxxxxxxxxx>
References: <4514DC9A.2000505@xxxxxxxxxxxxxx>
	<20060925132800.09856e10@xxxxxxxxxxxxxxxxx>
	<45198AF5.9090909@xxxxxxxxxxxxxx>
Message-ID: <20060926134531.3ec4991a@freekitty>

On Tue, 26 Sep 2006 22:17:57 +0200
Rainer Baumann <baumann@xxxxxxxxxxxxxx>
wrote:

> Hi Stephen
>
> We merged your changes into our patch
> http://tcn.hypert.net/tcn_kernel_2_6_18.patch
> Please let us know if we should do further adaptations to our
> implementation and/or resubmit the adapted patch.
>
> Cheers+thanx,
> Rainer

I'll test it out, and send off to Dave for 2.6.20, 2.6.19 is so in flux
right now that adding more seems not like a good idea.

From davem at davemloft.net  Tue Sep 26 14:03:21 2006
From: davem at davemloft.net (David Miller)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem: kernelspace
In-Reply-To: <20060926134531.3ec4991a@freekitty>
References: <20060925132800.09856e10@xxxxxxxxxxxxxxxxx>
	<45198AF5.9090909@xxxxxxxxxxxxxx>
	<20060926134531.3ec4991a@freekitty>
Message-ID: <20060926.140321.70217341.davem@xxxxxxxxxxxxx>

From: Stephen Hemminger <shemminger@xxxxxxxx>
Date: Tue, 26 Sep 2006 13:45:31 -0700

> I'll test it out, and send off to Dave for 2.6.20, 2.6.19 is so in
> flux right now that adding more seems not like a good idea.

I'm willing to accept anything reasonable until approximately
this weekend.

From shemminger at osdl.org  Tue Sep 26 16:02:38 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 17:37:49 2007
Subject: status of phpnetemgui?
In-Reply-To: <p062309cac13f5951821f@[171.69.52.91]>
References: <p062309cac13f5951821f@[171.69.52.91]>
Message-ID: <20060926160238.04b1e8fc@freekitty>

On Tue, 26 Sep 2006 17:31:31 -0500
"Lawrence D. Dunn" <ldunn@xxxxxxxxx> wrote:

> Stephen,
>   Hi- I'm Larry Dunn (day job at Cisco),
>   writing to see if phpnetemgui is still around,
>   or has evolved/been_replaced.
>   I'd be using it for a networking class
>   I teach at University of Minnesota (night job).  ;-)
>
>   From your LCA2005_netem paper, I checked:
>
> http://www.smyles.plus.com/phpnetemgui/
>
>   but that page shows up as not-found,
>   and a couple google searches don't show a new location for it.
>   I'll have students setting delay and loss for a fairly
>   easy experiment (and using web100 to see impact of buffer tuning).
>   I can resort to using the tc-commands directly, but was wondering
>   if you know the status of the GUI?
>

If someone has a copy, I'll host it at osdl and add a link in the Wiki.

-- 
Stephen Hemminger <shemminger@xxxxxxxx>

From shemminger at osdl.org  Fri Sep 29 10:35:26 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 17:37:50 2007
Subject: Netem and HRTimers ?
In-Reply-To: <20060929171541.GA5745@xxxxxxxxxxxxxxxxxxxxx>
References: <20060929165419.GA4803@xxxxxxxxxxxxxxxxxxxxx>
	<20060929101316.12e85a6f@freekitty>
	<20060929171541.GA5745@xxxxxxxxxxxxxxxxxxxxx>
Message-ID: <20060929103526.2530894b@freekitty>

On Fri, 29 Sep 2006 19:15:41 +0200
Lucas Nussbaum <lucas.nussbaum@xxxxxxx> wrote:

> On 29/09/06 at 10:13 -0700, Stephen Hemminger wrote:
> > On Fri, 29 Sep 2006 18:54:19 +0200
> > Lucas Nussbaum <lucas.nussbaum@xxxxxxx> wrote:
> >
> > > Hi,
> > >
> > > I am currently working on a paper comparing Dummynet, NISTNet and
> > > TC/Netem both regarding features and regarding precision/performance.
> > >
> > > My experiments show how important precise timing is when doing network
> > > emulation, and precision with HZ=1000 is not that good compared to
> > > NISTNet (which uses the RTC configured at 8192 Hz) or Dummynet (which
> > > can run on FreeBSD with HZ=10000). I understand that increasing HZ to
> > > e.g. 10000 in Linux is not really an option, both because many parts of
> > > the kernel assume that HZ is "small", and because of the performance
> > > impact of such a setting.
> > >
> > > Another solution could be to use the high resolution timers
> > > infrastructure. Have you already considered that for netem ? Do you think
> > > it would be applicable to Netem ? If yes, are you planning to work on
> > > this ?
> >
> > I have a lightly tested version using hrtimers. If you want to play
> > with it, I'll send it.
>
> Hi,
>
> That would be great, thank you.

Here is where it was when I last left it...

--- rt-netem.orig/net/sched/sch_netem.c
+++ rt-netem/net/sched/sch_netem.c
@@ -25,7 +25,7 @@
 
 #include <net/pkt_sched.h>
 
-#define VERSION "1.2"
+#define VERSION "1.2-rt"
 
 /*	Network Emulation Queuing algorithm.
 	====================================
@@ -55,7 +55,7 @@
 
 struct netem_sched_data {
 	struct Qdisc	*qdisc;
-	struct timer_list timer;
+	struct hrtimer timer;
 
 	u32 latency;
 	u32 loss;
@@ -80,7 +80,7 @@ struct netem_sched_data {
 
 /* Time stamp put into socket buffer control block */
 struct netem_skb_cb {
-	psched_time_t	time_to_send;
+	ktime_t		due_time;
 };
 
 /* init_crandom - initialize correlated random number generator
@@ -204,14 +204,15 @@ static int netem_enqueue(struct sk_buff
 	if (q->gap == 0 		/* not doing reordering */
 	    || q->counter < q->gap 	/* inside last reordering gap */
 	    || q->reorder < get_crandom(&q->reorder_cor)) {
-		psched_time_t now;
-		psched_tdiff_t delay;
+		u32 us;
 
-		delay = tabledist(q->latency, q->jitter,
+		us = tabledist(q->latency, q->jitter,
 				  &q->delay_cor, q->delay_dist);
 
-		PSCHED_GET_TIME(now);
-		PSCHED_TADD2(now, delay, cb->time_to_send);
+
+		cb->due_time = ktime_add_ns(get_monotonic_clock(),
+					    (u64) us * 1000u);
+
 		++q->counter;
 		ret = q->qdisc->enqueue(skb, q->qdisc);
 	} else {
@@ -219,7 +220,7 @@ static int netem_enqueue(struct sk_buff
 		 * Do re-ordering by putting one out of N packets at the front
 		 * of the queue.
 		 */
-		PSCHED_GET_TIME(cb->time_to_send);
+		cb->due_time = get_monotonic_clock();
 		q->counter = 0;
 		ret = q->qdisc->ops->requeue(skb, q->qdisc);
 	}
@@ -270,44 +271,46 @@ static struct sk_buff *netem_dequeue(str
 	if (skb) {
 		const struct netem_skb_cb *cb
 			= (const struct netem_skb_cb *)skb->cb;
-		psched_time_t now;
+		ktime_t now = get_monotonic_clock();
+		s64 delta;
 
-		/* if more time remaining? */
-		PSCHED_GET_TIME(now);
+		delta = ktime_to_ns(ktime_sub(cb->due_time, now));
 
-		if (PSCHED_TLESS(cb->time_to_send, now)) {
+		/* if more time remaining? */
+		if (delta <= 0) {
 			pr_debug("netem_dequeue: return skb=%p\n", skb);
 			sch->q.qlen--;
 			sch->flags &= ~TCQ_F_THROTTLED;
 			return skb;
-		} else {
-			psched_tdiff_t delay = PSCHED_TDIFF(cb->time_to_send, now);
-
-			if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) {
-				sch->qstats.drops++;
+		}
 
-				/* After this qlen is confused */
-				printk(KERN_ERR "netem: queue discpline %s could not requeue\n",
-				       q->qdisc->ops->id);
+		if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) {
+			sch->qstats.drops++;
 
-				sch->q.qlen--;
-			}
+			/* After this qlen is confused */
+			printk(KERN_ERR "netem: queue discpline %s could not requeue\n",
+			       q->qdisc->ops->id);
 
-			mod_timer(&q->timer, jiffies + PSCHED_US2JIFFIE(delay));
-			sch->flags |= TCQ_F_THROTTLED;
+			sch->q.qlen--;
 		}
+
+		hrtimer_start(&q->timer, ktime_add_ns(now, delta), HRTIMER_ABS);
+		sch->flags |= TCQ_F_THROTTLED;
 	}
 
 	return NULL;
 }
 
-static void netem_watchdog(unsigned long arg)
+static int netem_watchdog(struct hrtimer *hrt)
 {
-	struct Qdisc *sch = (struct Qdisc *)arg;
+	struct netem_sched_data *q
+		= container_of(hrt, struct netem_sched_data, timer);
+	struct Qdisc *sch = q->qdisc;
 
 	pr_debug("netem_watchdog qlen=%d\n", sch->q.qlen);
 	sch->flags &= ~TCQ_F_THROTTLED;
 	netif_schedule(sch->dev);
+	return HRTIMER_NORESTART;
 }
 
 static void netem_reset(struct Qdisc *sch)
@@ -317,7 +320,7 @@ static void netem_reset(struct Qdisc *sc
 	qdisc_reset(q->qdisc);
 	sch->q.qlen = 0;
 	sch->flags &= ~TCQ_F_THROTTLED;
-	del_timer_sync(&q->timer);
+	hrtimer_cancel(&q->timer);
 }
 
 /* Pass size change message down to embedded FIFO */
@@ -430,8 +433,9 @@ static int netem_change(struct Qdisc *sc
 		return ret;
 	}
 
-	q->latency = qopt->latency;
-	q->jitter = qopt->jitter;
+	/* Note: we force PSCHED clock to use gettimeofday so these are in us. */
+	q->latency = psched_ticks2usecs(qopt->latency);
+	q->jitter = psched_ticks2usecs(qopt->jitter);
 	q->limit = qopt->limit;
 	q->gap = qopt->gap;
 	q->counter = 0;
@@ -502,7 +506,8 @@ static int tfifo_enqueue(struct sk_buff
 			const struct netem_skb_cb *cb
 				= (const struct netem_skb_cb *)skb->cb;
 
-			if (!PSCHED_TLESS(ncb->time_to_send, cb->time_to_send))
+			if (ktime_to_ns(ktime_sub(ncb->due_time,
+						  cb->due_time)) >= 0)
 				break;
 		}
 
@@ -567,9 +572,8 @@ static int netem_init(struct Qdisc *sch,
 	if (!opt)
 		return -EINVAL;
 
-	init_timer(&q->timer);
+	hrtimer_init(&q->timer, CLOCK_MONOTONIC, HRTIMER_ABS);
 	q->timer.function = netem_watchdog;
-	q->timer.data = (unsigned long) sch;
 
 	q->qdisc = qdisc_create_dflt(sch->dev, &tfifo_qdisc_ops);
 	if (!q->qdisc) {
@@ -589,7 +593,7 @@ static void netem_destroy(struct Qdisc *
 {
 	struct netem_sched_data *q = qdisc_priv(sch);
 
-	del_timer_sync(&q->timer);
+	hrtimer_cancel(&q->timer);
 	qdisc_destroy(q->qdisc);
 	kfree(q->delay_dist);
 }
@@ -604,8 +608,8 @@ static int netem_dump(struct Qdisc *sch,
 	struct tc_netem_reorder reorder;
 	struct tc_netem_corrupt corrupt;
 
-	qopt.latency = q->latency;
-	qopt.jitter = q->jitter;
+	qopt.latency = psched_usecs2ticks(q->latency);
+	qopt.jitter = psched_usecs2ticks(q->jitter);
 	qopt.limit = q->limit;
 	qopt.loss = q->loss;
 	qopt.gap = q->gap;
--- rt-netem.orig/include/net/pkt_sched.h
+++ rt-netem/include/net/pkt_sched.h
@@ -238,4 +238,7 @@ static inline unsigned psched_mtu(struct
 	return dev->hard_header ? mtu + dev->hard_header_len : mtu;
 }
 
+extern unsigned long psched_ticks2usec(unsigned long ticks);
+extern unsigned long psched_usec2ticks(unsigned long us);
+
 #endif
--- rt-netem.orig/net/sched/sch_api.c
+++ rt-netem/net/sched/sch_api.c
@@ -43,6 +43,7 @@
 #include <asm/processor.h>
 #include <asm/uaccess.h>
 #include <asm/system.h>
+#include <asm/div64.h>
 
 static int qdisc_notify(struct sk_buff *oskb, struct nlmsghdr *n, u32 clid,
 			struct Qdisc *old, struct Qdisc *new);
@@ -1154,6 +1155,28 @@ reclassify:
 static int psched_us_per_tick = 1;
 static int psched_tick_per_us = 1;
 
+/* Convert from scaled PSCHED ticks to real time usecs */
+unsigned long psched_ticks2usecs(unsigned long ticks)
+{
+	u64 t = ticks;
+
+	t *= psched_us_per_tick;
+	do_div(t, psched_tick_per_us);
+	return t;
+}
+EXPORT_SYMBOL(psched_ticks2usecs);
+
+/* Convert from usecs to scaled PSCHED ticks */
+unsigned long psched_usecs2ticks(unsigned long us)
+{
+	u64 t = us;
+
+	t *= psched_tick_per_us;
+	do_div(t, psched_us_per_tick);
+	return t;
+}
+EXPORT_SYMBOL(psched_usecs2ticks);
+
 #ifdef CONFIG_PROC_FS
 static int psched_show(struct seq_file *seq, void *v)
 {

From shemminger at osdl.org  Fri Sep 29 11:08:01 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 17:37:50 2007
Subject: Netem and HRTimers ?
In-Reply-To: <20060929171541.GA5745@xxxxxxxxxxxxxxxxxxxxx>
References: <20060929165419.GA4803@xxxxxxxxxxxxxxxxxxxxx>
	<20060929101316.12e85a6f@freekitty>
	<20060929171541.GA5745@xxxxxxxxxxxxxxxxxxxxx>
Message-ID: <20060929110801.0716df79@freekitty>

On Fri, 29 Sep 2006 19:15:41 +0200
Lucas Nussbaum <lucas.nussbaum@xxxxxxx> wrote:

> On 29/09/06 at 10:13 -0700, Stephen Hemminger wrote:
> > On Fri, 29 Sep 2006 18:54:19 +0200
> > Lucas Nussbaum <lucas.nussbaum@xxxxxxx> wrote:
> >
> > > Hi,
> > >
> > > I am currently working on a paper comparing Dummynet, NISTNet and
> > > TC/Netem both regarding features and regarding precision/performance.
> > >
> > > My experiments show how important precise timing is when doing network
> > > emulation, and precision with HZ=1000 is not that good compared to
> > > NISTNet (which uses the RTC configured at 8192 Hz) or Dummynet (which
> > > can run on FreeBSD with HZ=10000). I understand that increasing HZ to
> > > e.g. 10000 in Linux is not really an option, both because many parts of
> > > the kernel assume that HZ is "small", and because of the performance
> > > impact of such a setting.
> > >
> > > Another solution could be to use the high resolution timers
> > > infrastructure. Have you already considered that for netem ? Do you think
> > > it would be applicable to Netem ? If yes, are you planning to work on
> > > this ?
> >
> > I have a lightly tested version using hrtimers. If you want to play
> > with it, I'll send it.
>
> Hi,
>
> That would be great, thank you.
>
> Which kernel version do you target for inclusion ?

I fixed some typos and it builds against 2.6.18-rt5... NOT tested, but
it is a starting point.

---
 include/net/pkt_sched.h |    3 +
 kernel/hrtimer.c        |    1
 net/sched/sch_api.c     |   23 ++++++++++++++
 net/sched/sch_netem.c   |   77 ++++++++++++++++++++++++------------------------
 4 files changed, 67 insertions(+), 37 deletions(-)

--- linux-2.6.18-rt.orig/net/sched/sch_netem.c	2006-09-19 20:42:06.000000000 -0700
+++ linux-2.6.18-rt/net/sched/sch_netem.c	2006-09-29 11:06:11.000000000 -0700
@@ -24,7 +24,7 @@
 
 #include <net/pkt_sched.h>
 
-#define VERSION "1.2"
+#define VERSION "1.2-rt"
 
 /*	Network Emulation Queuing algorithm.
 	====================================
@@ -54,7 +54,7 @@
 
 struct netem_sched_data {
 	struct Qdisc	*qdisc;
-	struct timer_list timer;
+	struct hrtimer timer;
 
 	u32 latency;
 	u32 loss;
@@ -79,7 +79,7 @@
 
 /* Time stamp put into socket buffer control block */
 struct netem_skb_cb {
-	psched_time_t	time_to_send;
+	ktime_t		due_time;
 };
 
 /* init_crandom - initialize correlated random number generator
@@ -205,14 +205,14 @@
 	if (q->gap == 0 		/* not doing reordering */
 	    || q->counter < q->gap 	/* inside last reordering gap */
 	    || q->reorder < get_crandom(&q->reorder_cor)) {
-		psched_time_t now;
-		psched_tdiff_t delay;
+		u64 ns;
 
-		delay = tabledist(q->latency, q->jitter,
-				  &q->delay_cor, q->delay_dist);
+		ns = tabledist(q->latency, q->jitter,
+			       &q->delay_cor, q->delay_dist) * 1000ul;
+
+
+		cb->due_time = ktime_add_ns(ktime_get(), ns);
 
-		PSCHED_GET_TIME(now);
-		PSCHED_TADD2(now, delay, cb->time_to_send);
 		++q->counter;
 		ret = q->qdisc->enqueue(skb, q->qdisc);
 	} else {
@@ -220,7 +220,7 @@
 		 * Do re-ordering by putting one out of N packets at the front
 		 * of the queue.
 		 */
-		PSCHED_GET_TIME(cb->time_to_send);
+		cb->due_time = ktime_get();
 		q->counter = 0;
 		ret = q->qdisc->ops->requeue(skb, q->qdisc);
 	}
@@ -271,44 +271,46 @@
 	if (skb) {
 		const struct netem_skb_cb *cb
 			= (const struct netem_skb_cb *)skb->cb;
-		psched_time_t now;
+		ktime_t now = ktime_get();
+		s64 delta;
 
-		/* if more time remaining? */
-		PSCHED_GET_TIME(now);
+		delta = ktime_to_ns(ktime_sub(cb->due_time, now));
 
-		if (PSCHED_TLESS(cb->time_to_send, now)) {
+		/* if more time remaining? */
+		if (delta <= 0) {
 			pr_debug("netem_dequeue: return skb=%p\n", skb);
 			sch->q.qlen--;
 			sch->flags &= ~TCQ_F_THROTTLED;
 			return skb;
-		} else {
-			psched_tdiff_t delay = PSCHED_TDIFF(cb->time_to_send, now);
-
-			if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) {
-				sch->qstats.drops++;
+		}
 
-				/* After this qlen is confused */
-				printk(KERN_ERR "netem: queue discpline %s could not requeue\n",
-				       q->qdisc->ops->id);
+		if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) {
+			sch->qstats.drops++;
 
-				sch->q.qlen--;
-			}
+			/* After this qlen is confused */
+			printk(KERN_ERR "netem: queue discpline %s could not requeue\n",
+			       q->qdisc->ops->id);
 
-			mod_timer(&q->timer, jiffies + PSCHED_US2JIFFIE(delay));
-			sch->flags |= TCQ_F_THROTTLED;
+			sch->q.qlen--;
 		}
+
+		hrtimer_start(&q->timer, ktime_add_ns(now, delta), HRTIMER_ABS);
+		sch->flags |= TCQ_F_THROTTLED;
 	}
 
 	return NULL;
 }
 
-static void netem_watchdog(unsigned long arg)
+static int netem_watchdog(struct hrtimer *hrt)
 {
-	struct Qdisc *sch = (struct Qdisc *)arg;
+	struct netem_sched_data *q
+		= container_of(hrt, struct netem_sched_data, timer);
+	struct Qdisc *sch = q->qdisc;
 
 	pr_debug("netem_watchdog qlen=%d\n", sch->q.qlen);
 	sch->flags &= ~TCQ_F_THROTTLED;
 	netif_schedule(sch->dev);
+	return HRTIMER_NORESTART;
 }
 
 static void netem_reset(struct Qdisc *sch)
@@ -318,7 +320,7 @@
 	qdisc_reset(q->qdisc);
 	sch->q.qlen = 0;
 	sch->flags &= ~TCQ_F_THROTTLED;
-	del_timer_sync(&q->timer);
+	hrtimer_cancel(&q->timer);
 }
 
 /* Pass size change message down to embedded FIFO */
@@ -431,8 +433,9 @@
 		return ret;
 	}
 
-	q->latency = qopt->latency;
-	q->jitter = qopt->jitter;
+	/* Note: we force PSCHED clock to use gettimeofday so these are in us. */
+	q->latency = psched_ticks2usec(qopt->latency);
+	q->jitter = psched_ticks2usec(qopt->jitter);
 	q->limit = qopt->limit;
 	q->gap = qopt->gap;
 	q->counter = 0;
@@ -503,7 +506,8 @@
 			const struct netem_skb_cb *cb
 				= (const struct netem_skb_cb *)skb->cb;
 
-			if (!PSCHED_TLESS(ncb->time_to_send, cb->time_to_send))
+			if (ktime_to_ns(ktime_sub(ncb->due_time,
+						  cb->due_time)) >= 0)
 				break;
 		}
 
@@ -568,9 +572,8 @@
 	if (!opt)
 		return -EINVAL;
 
-	init_timer(&q->timer);
+	hrtimer_init(&q->timer, CLOCK_MONOTONIC, HRTIMER_ABS);
 	q->timer.function = netem_watchdog;
-	q->timer.data = (unsigned long) sch;
 
 	q->qdisc = qdisc_create_dflt(sch->dev, &tfifo_qdisc_ops);
 	if (!q->qdisc) {
@@ -590,7 +593,7 @@
 {
 	struct netem_sched_data *q = qdisc_priv(sch);
 
-	del_timer_sync(&q->timer);
+	hrtimer_cancel(&q->timer);
 	qdisc_destroy(q->qdisc);
 	kfree(q->delay_dist);
 }
@@ -605,8 +608,8 @@
 	struct tc_netem_reorder reorder;
 	struct tc_netem_corrupt corrupt;
 
-	qopt.latency = q->latency;
-	qopt.jitter = q->jitter;
+	qopt.latency = psched_usec2ticks(q->latency);
+	qopt.jitter = psched_usec2ticks(q->jitter);
 	qopt.limit = q->limit;
 	qopt.loss = q->loss;
 	qopt.gap = q->gap;
--- linux-2.6.18-rt.orig/include/net/pkt_sched.h	2006-09-19 20:42:06.000000000 -0700
+++ linux-2.6.18-rt/include/net/pkt_sched.h	2006-09-29 10:33:48.000000000 -0700
@@ -239,4 +239,7 @@
 	return dev->hard_header ? mtu + dev->hard_header_len : mtu;
 }
 
+extern unsigned long psched_ticks2usec(unsigned long ticks);
+extern unsigned long psched_usec2ticks(unsigned long us);
+
 #endif
--- linux-2.6.18-rt.orig/net/sched/sch_api.c	2006-09-19 20:42:06.000000000 -0700
+++ linux-2.6.18-rt/net/sched/sch_api.c	2006-09-29 10:33:48.000000000 -0700
@@ -42,6 +42,7 @@
 #include <asm/processor.h>
 #include <asm/uaccess.h>
 #include <asm/system.h>
+#include <asm/div64.h>
 
 static int qdisc_notify(struct sk_buff *oskb, struct nlmsghdr *n, u32 clid,
 			struct Qdisc *old, struct Qdisc *new);
@@ -1153,6 +1154,28 @@
 static int psched_us_per_tick = 1;
 static int psched_tick_per_us = 1;
 
+/* Convert from scaled PSCHED ticks to real time usecs */
+unsigned long psched_ticks2usecs(unsigned long ticks)
+{
+	u64 t = ticks;
+
+	t *= psched_us_per_tick;
+	do_div(t, psched_tick_per_us);
+	return t;
+}
+EXPORT_SYMBOL(psched_ticks2usecs);
+
+/* Convert from usecs to scaled PSCHED ticks */
+unsigned long psched_usecs2ticks(unsigned long us)
+{
+	u64 t = us;
+
+	t *= psched_tick_per_us;
+	do_div(t, psched_us_per_tick);
+	return t;
+}
+EXPORT_SYMBOL(psched_usecs2ticks);
+
 #ifdef CONFIG_PROC_FS
 static int psched_show(struct seq_file *seq, void *v)
 {
--- linux-2.6.18-rt.orig/kernel/hrtimer.c	2006-09-29 10:59:29.000000000 -0700
+++ linux-2.6.18-rt/kernel/hrtimer.c	2006-09-29 11:00:25.000000000 -0700
@@ -58,6 +58,7 @@
 
 	return timespec_to_ktime(now);
 }
+EXPORT_SYMBOL_GPL(ktime_get);
 
 /**
  * ktime_get_real - get the real (wall-) time in ktime_t format

From baumann at tik.ee.ethz.ch  Fri Sep 29 13:49:42 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 17:37:50 2007
Subject: status of phpnetemgui?
In-Reply-To: <20060926160238.04b1e8fc@freekitty>
References: <p062309cac13f5951821f@[171.69.52.91]>
	<20060926160238.04b1e8fc@freekitty>
Message-ID: <451D86E6.7000403@xxxxxxxxxxxxxx>

We provide a copy of phpnetemgui on our website
* http://tcn.hypert.net/phpnetemgui-0.9.tar.bz2

An extended version including our trace control is under
* http://tcn.hypert.net/phpnetemgui-0.10.tar.gz

----------------------------------------------------------------------
Rainer Baumann
Master of Science ETH in Computer Science and Teaching
University Lecturer @ HSR

Computer Engineering and Network Laboratory
ETH Zentrum
ETZ G60.1
Gloriastrasse 35
CH-8092 Zurich
Switzerland

Phone   +41 44 632 51 87
Mobile  +41 79 263 81 40
Fax     +41 44 632 10 35
Email   baumann@xxxxxxxxxxxxxx

Stephen Hemminger wrote:
> On Tue, 26 Sep 2006 17:31:31 -0500
> "Lawrence D. Dunn" <ldunn@xxxxxxxxx> wrote:
>
>
>> Stephen,
>>   Hi- I'm Larry Dunn (day job at Cisco),
>>   writing to see if phpnetemgui is still around,
>>   or has evolved/been_replaced.
>>   I'd be using it for a networking class
>>   I teach at University of Minnesota (night job).  ;-)
>>
>>   From your LCA2005_netem paper, I checked:
>>
>> http://www.smyles.plus.com/phpnetemgui/
>>
>>   but that page shows up as not-found,
>>   and a couple google searches don't show a new location for it.
>>   I'll have students setting delay and loss for a fairly
>>   easy experiment (and using web100 to see impact of buffer tuning).
>>   I can resort to using the tc-commands directly, but was wondering
>>   if you know the status of the GUI?
>>
>>
>
> If someone has a copy, I'll host it at osdl and add a link in the Wiki.
>
>
>

From d.miras at cs.ucl.ac.uk  Sat Sep 30 05:45:23 2006
From: d.miras at cs.ucl.ac.uk (Dimitrios Miras)
Date: Wed Apr 18 17:37:50 2007
Subject: Log netem queue statistics?
In-Reply-To: <451D86E6.7000403@xxxxxxxxxxxxxx>
References: <p062309cac13f5951821f@[171.69.52.91]>
	<20060926160238.04b1e8fc@freekitty>
	<451D86E6.7000403@xxxxxxxxxxxxxx>
Message-ID: <451E66E3.9060809@xxxxxxxxxxxx>

Hi,

I'm using netem with fifo queues to emulate a network, but I'd like to
gather info about the fifo queue dynamics (size over time, packet drops,
etc.). I haven't managed to get any relevant info on google or the netem
list, so any hints/help/pointers are much appreciated.

Thanks in advance,
Dimitrios Miras
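
One possible starting point for the question above, offered as a sketch
rather than as an answer from the list: "tc -s qdisc show dev DEV" prints
per-qdisc counters (bytes and packets sent, drops, overlimits, and backlog
where the installed tc reports it), so sampling it at a fixed interval gives
a rough view of queue behaviour over time. The function names, the TC
override variable and the one-line log format below are assumptions made
for illustration only.

```shell
#!/bin/sh
# Sketch: periodically sample qdisc statistics with `tc -s qdisc show`.
# TC can be overridden (e.g. for a dry run); it defaults to the real tc.
TC="${TC:-tc}"

# Print one sample: epoch seconds followed by the statistics for device $1,
# with the multi-line tc output folded onto a single line.
sample_qdisc() {
    dev="$1"
    stats=$("$TC" -s qdisc show dev "$dev" | tr '\n' ' ')
    printf '%s %s\n' "$(date +%s)" "$stats"
}

# Print $3 samples (default 10) for device $1, $2 seconds apart (default 1).
log_qdisc() {
    dev="$1"
    interval="${2:-1}"
    count="${3:-10}"
    i=0
    while [ "$i" -lt "$count" ]; do
        sample_qdisc "$dev"
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, "log_qdisc eth0 1 60 >> queue.log" would append one sample per
second for a minute. The resolution is limited by the polling interval, so
short bursts between samples will not be visible in such a log.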