Concerning last changes on the web site

Hi all,

I noticed that some explanations about packet loss correlation have been
added on the web site (http://linux-net.osdl.org/index.php/Netem). But
it seems that a mistake has been made. Correct me if I'm wrong, but
shouldn't it be as follows:

*Packet loss*

Random packet loss is specified in the 'tc' command in percent. The
smallest possible non-zero value is:

\fig{
1/2^{32} = 0.0000000232%
}

# tc qdisc change dev eth0 root netem loss 0.1%

This causes 1/10th of a percent (i.e. 1 out of 1000) packets to be
randomly dropped.
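
The mapping from a percentage to the 32-bit value behind the 1/2^{32} granularity can be sketched as follows. This is only an illustration: loss_percent_to_u32 and should_drop are hypothetical names, not functions of tc or the kernel, and the drop test simply mirrors the "threshold >= random sample" comparison that appears in the netem patch later in this archive.

```c
#include <stdint.h>

/* Hypothetical helper (not part of tc): map a loss percentage to a
 * 32-bit threshold, where 0 means "never drop" and UINT32_MAX means
 * "always drop". One step of the threshold is 1/2^32 of the range. */
static uint32_t loss_percent_to_u32(double percent)
{
	if (percent <= 0.0)
		return 0;
	if (percent >= 100.0)
		return UINT32_MAX;
	/* Scale to the 32-bit range and round to the nearest step. */
	return (uint32_t)(percent / 100.0 * (double)UINT32_MAX + 0.5);
}

/* Hypothetical drop test: a packet is dropped when the threshold is
 * non-zero and a fresh 32-bit random sample does not exceed it. */
static int should_drop(uint32_t threshold, uint32_t sample)
{
	return threshold != 0 && threshold >= sample;
}
```

With this scaling, "loss 0.1%" becomes a threshold of roughly 4.29 million out of 2^32, so about one sample in a thousand falls at or below it.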

An optional correlation may also be added. This causes the random number
generator to be less random and can be used to emulate packet burst
losses.

# tc qdisc change dev eth0 root netem loss 0.3% 33.33%

This will cause 0.3% of packets to be lost, and each successive
probability depends by about a third on the last one.

\fig{
Prob_n = [Prob_{n-1} * 33.33/100] + [Rand() * (1-(0.3/100))]
}

The first term in brackets represents the correlation between two
successive packets and the second one represents the effective packet
loss probability of one packet.
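
A minimal sketch of how a correlated random source can blend the previous output with a fresh sample, assuming a weighted average in 32-bit fixed point. The names crnd_state and crnd_blend are illustrative, and the weighting is an assumption modelled on the general idea, not something confirmed in this thread. Note that in this sketch the fresh sample is weighted by one minus the correlation.

```c
#include <stdint.h>

/* State of a correlated random source (names are illustrative). */
struct crnd_state {
	uint32_t last;	/* previous output */
	uint32_t rho;	/* correlation scaled to 0..UINT32_MAX */
};

/* Blend a fresh 32-bit sample with the previous output:
 *   out = fresh * (1 - rho) + last * rho
 * computed in 64-bit fixed point so nothing overflows. */
static uint32_t crnd_blend(struct crnd_state *s, uint32_t fresh)
{
	uint64_t rho, answer;

	if (s->rho == 0)
		return fresh;	/* no correlation: pass the sample through */

	rho = (uint64_t)s->rho + 1;
	answer = ((uint64_t)fresh * ((1ULL << 32) - rho)
		  + (uint64_t)s->last * rho) >> 32;
	s->last = (uint32_t)answer;
	return s->last;
}
```

The fresh sample is passed in as a parameter so the blending can be checked in isolation; a real generator would draw it from a random source.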

Once again, tell me if I'm wrong. Thanking you in advance :)

H
-- 
Hugues Van Peteghem
PhD Student
Computer Science Institute
FUNDP - The University of Namur
Belgium
http://www.info.fundp.ac.be/~hvp/
From shemminger at osdl.org  Tue Sep  5 09:25:06 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 12:51:19 2007
Subject: Concerning last changes on the web site
In-Reply-To: <1157361002.16618.163.camel@xxxxxxxxxxxxxxxxxxxxxxxxx>
References: <1157361002.16618.163.camel@xxxxxxxxxxxxxxxxxxxxxxxxx>
Message-ID: <20060905092506.5aebab4f@localhost.localdomain>

On Mon, 04 Sep 2006 11:10:02 +0200
Hugues Van Peteghem <hvp@xxxxxxxxxxxxxxxx> wrote:

> Hi all,
> 
> I noticed that some explanations about packet loss correlation have been
> added on the web site (http://linux-net.osdl.org/index.php/Netem). But
> it seems that a mistake has been made. Correct me if I'm wrong, but
> shouldn't it be as follows:
> 
> *Packet loss*
> 
> Random packet loss is specified in the 'tc' command in percent. The
> smallest possible non-zero value is:
> 
> \fig{
> 1/2^{32} = 0.0000000232%
> }
> 
> # tc qdisc change dev eth0 root netem loss 0.1%
> 
> This causes 1/10th of a percent (i.e. 1 out of 1000) packets to be
> randomly dropped.
> 
> An optional correlation may also be added. This causes the random number
> generator to be less random and can be used to emulate packet burst
> losses.
> 
> # tc qdisc change dev eth0 root netem loss 0.3% 33.33%
> 
> This will cause 0.3% of packets to be lost, and each successive
> probability depends by about a third on the last one.
> 
> \fig{
> Prob_n = [Prob_{n-1} * 33.33/100] + [Rand() * (1-(0.3/100))]
> }
> 
> The first term in brackets represents the correlation between two
> successive packets and the second one represents the effective packet
> loss probability of one packet.
> 
> Once again, tell me if I'm wrong. Thanking you in advance :)
> 
> H

Looks right. Feel free to fix errors in wiki any time :-)

-- 
Stephen Hemminger <shemminger@xxxxxxxx>

From exairetos at tele2.it  Tue Sep 12 09:10:34 2006
From: exairetos at tele2.it (Ferdinando Formica)
Date: Wed Apr 18 12:51:19 2007
Subject: no loss on ping
Message-ID: <web-45273940@xxxxxxxxxxxxxxxxx>

An HTML attachment was scrubbed...
URL: http://lists.linux-foundation.org/pipermail/netem/attachments/20060912/92326901/attachment.htm
From shemminger at osdl.org  Tue Sep 12 21:48:44 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 12:51:19 2007
Subject: no loss on ping
In-Reply-To: <web-45273940@xxxxxxxxxxxxxxxxx>
References: <web-45273940@xxxxxxxxxxxxxxxxx>
Message-ID: <20060913134844.4cfa191d@localhost.localdomain>

On Tue, 12 Sep 2006 18:10:34 +0200
"Ferdinando Formica" <exairetos@xxxxxxxx> wrote:

> 
> Hi everybody,
> Some time ago I set up netem on my Gentoo laptop and it worked fine, now I'm trying to set it up on a SUSE box (kernel 2.6.16) and I'm facing a problem I don't really understand.
> The command I enter is:
>  
> # tc qdisc add dev eth0 root netem delay 20ms loss 20%

Try:
	tc qdisc show dev eth0 root netem
To see if the kernel was ignoring a parameter it didn't understand (like loss).


>  
> Then I try pinging my laptop, which is connected to eth0, and while I get a 24.1ms delay (on my laptop I got 21ms) there isn't any packet loss (on my laptop I got values between 18 and 22%). The weird thing is that if I try pinging the box from my laptop the packets get lost in the right percentage. How is this possible?

Perhaps the ping response isn't going through the normal queue disc path
and is going back directly to the device?

>  
> As a side note, is the following command correct?
>  
> # tc qdisc add dev eth0 root handle 1: netem delay 20ms
> # tc qdisc add dev eth0 parent 1:1 handle 10: netem loss 20%
>  
> If I try running this, I get only the packet loss when pinged (still no packet loss when pinging), and less than 1ms of delay, but shouldn't it be the same as the above? A similar behaviour also happens on my laptop, where the first command works.
>  
> Thanks in advance,
> Ferdinando Formica
>  

From exairetos at tele2.it  Wed Sep 13 07:49:49 2006
From: exairetos at tele2.it (Ferdinando Formica)
Date: Wed Apr 18 12:51:19 2007
Subject: no loss on ping
In-Reply-To: <20060913134844.4cfa191d@localhost.localdomain>
References: <web-45273940@xxxxxxxxxxxxxxxxx>
	<20060913134844.4cfa191d@localhost.localdomain>
Message-ID: <web-48852534@xxxxxxxxxxxxxxxxx>

An HTML attachment was scrubbed...
URL: http://lists.linux-foundation.org/pipermail/netem/attachments/20060913/7e335022/attachment.htm
From exairetos at tele2.it  Thu Sep 14 03:55:59 2006
From: exairetos at tele2.it (Ferdinando Formica)
Date: Wed Apr 18 12:51:19 2007
Subject: no loss on ping
In-Reply-To: <web-48852534@xxxxxxxxxxxxxxxxx>
References: <web-45273940@xxxxxxxxxxxxxxxxx>
	<20060913134844.4cfa191d@localhost.localdomain>
	<web-48852534@xxxxxxxxxxxxxxxxx>
Message-ID: <web-43174629@xxxxxxxxxxxxxxxxx>

An HTML attachment was scrubbed...
URL: http://lists.linux-foundation.org/pipermail/netem/attachments/20060914/0a19c302/attachment.htm
From lyonnet at ipanematech.com  Thu Sep 14 08:44:55 2006
From: lyonnet at ipanematech.com (frank@xxxxxxxxxxx)
Date: Wed Apr 18 12:51:19 2007
Subject:  Subtle variations in NetEm behavior as time goes by
Message-ID: <00a401c6d814$baf67f60$0202fea9@ipanema.local>

Hello,

 

I've set up WAN emulation on a 4x1Gbps Ethernet port Dell SC1425 with
Xeon EMT64.

I have NetEm set up with 100ms delay, and no other impairment, on egress of
3 of my interfaces.

I'm using ping to check NetEm behaviour; it reports ~200ms RTT between each
of my branches.

However, when measuring the response time of some applications over this setup,
I'm seeing a changing behaviour after my router has been up for a few days: the
response time improves significantly ... but the ping stays the same!
*Rebooting the router brings the response time back to what it was originally.*

Well ... I don't know if anybody can help with this.

My kernel is 2.6.17 on Fedora Core 5 - compiled in 32 bits with SMP disabled
(to minimize risks ...).

Cheers,

Frank

 

 

From shemminger at osdl.org  Thu Sep 14 17:31:17 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 12:51:19 2007
Subject: no loss on ping
In-Reply-To: <web-43174629@xxxxxxxxxxxxxxxxx>
References: <web-45273940@xxxxxxxxxxxxxxxxx>
	<20060913134844.4cfa191d@localhost.localdomain>
	<web-48852534@xxxxxxxxxxxxxxxxx> <web-43174629@xxxxxxxxxxxxxxxxx>
Message-ID: <20060915093117.1a5269e1@localhost.localdomain>

On Thu, 14 Sep 2006 12:55:59 +0200
"Ferdinando Formica" <exairetos@xxxxxxxx> wrote:

> Update on the problem; surprisingly enough, it seems that the pings *are* dropped.
>  
>  
> # tc -s qdisc
> qdisc netem 1: dev eth0 limit 1000 delay 20.0ms
>  Sent 28826 bytes 301 pkt (dropped 85, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
> qdisc netem 10: dev eth0 parent 1:1 limit 1000 loss 20%
>  Sent 28826 bytes 301 pkt (dropped 85, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
> qdisc pfifo_fast 0: dev eth1 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
>  
> Now I'm starting to think it's a problem with ICMP; also, if I set the loss parameter to 90% it still acknowledges every packet as if it was correctly transmitted, but after a while I get messages like "no buffer space available" and "destination host unreachable".
>  
> Maybe I'll try getting another box and going to bridge mode; would this solve anything?
>  
> Thank you very much,
> Ferdinando Formica
>  

There was a bug in older kernels where packets dropped with the loss parameter
were not being freed properly. It was fixed long ago in the mainline kernel,
but it may still be an issue with vendor kernels.

From baumann at tik.ee.ethz.ch  Thu Sep 21 23:12:11 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.16.19 0/2] LARTC: trace control for netem
Message-ID: <45137EBB.2030707@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range dependency and self-similarity of cross-traffic.

A new option (trace) has been added to the netem command. If the trace option is used, the values for packet delay etc. are read from a pregenerated trace file; afterwards the packets are processed by the normal netem functions. The packet action values are read out from the trace file in user space and sent to kernel space via configfs.
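
As an illustration of what one "packet action value" can look like, here is a small sketch of the 32-bit layout implied by the MASK_BITS, MASK_DELAY and MASK_HEAD macros in the revised patch later in this thread: the top 3 bits carry the action and the low 29 bits carry the delay. The helper names are hypothetical, the packing side is not shown anywhere in the thread, and treating the delay field as microseconds is an assumption based on the patch's "us -> ticks" comment.

```c
#include <stdint.h>

/* Layout taken from the macros in the patch: 29 delay bits, 3 action bits. */
#define MASK_BITS  29
#define MASK_DELAY ((1u << MASK_BITS) - 1)
#define MASK_HEAD  (~MASK_DELAY)

/* Hypothetical: pack an action code and a delay into one trace word. */
static uint32_t trace_pack(uint32_t action, uint32_t delay_us)
{
	return ((action << MASK_BITS) & MASK_HEAD) | (delay_us & MASK_DELAY);
}

/* Extract the action code from the top bits. */
static uint32_t trace_action(uint32_t word)
{
	return (word & MASK_HEAD) >> MASK_BITS;
}

/* Convert the delay field to ticks, given the number of ticks per
 * millisecond. A 64-bit intermediate is used here to avoid overflow;
 * the patch itself does this arithmetic in 32 bits. */
static uint32_t trace_delay_ticks(uint32_t word, uint32_t ticks_per_ms)
{
	return (uint32_t)(((uint64_t)(word & MASK_DELAY) * ticks_per_ms) / 1000);
}
```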

After our patches from the 2nd and 22nd of August we have integrated the comments from Stephen and hope we are on the right way now.

We are looking forward to any comments, feedback and suggestions!




From baumann at tik.ee.ethz.ch  Thu Sep 21 23:15:13 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.16.19 2/2] LARTC: trace control for netem:
	kernelspace
Message-ID: <45137F71.2000404@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range dependency and self-similarity of cross-traffic.

kernel space:
The delay, drop, duplication and corruption values are read out in user space and sent to kernel space via configfs. The userspace process will "hang on write" until the kernel needs new data.

In order to always have packet action values ready to apply, there are two buffers that hold these values. Packet action values can be read from one buffer and the other buffer can be refilled with new values simultaneously. The synchronization of "need more delay values" and "return from write" is done with the use of wait queues.
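
The two-buffer scheme described above can be sketched in plain user-space C as follows. This is a single-threaded illustration only: the struct and function names are made up for the example, BUF_WORDS is an arbitrary size rather than the patch's DATA_PACKAGE, and the wait-queue synchronization between kernel and writer is reduced to a need_data flag.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_WORDS 4	/* illustrative size, not the patch's DATA_PACKAGE */

struct dbuf {
	uint32_t buf[2][BUF_WORDS];
	int empty[2];	/* 1 when that buffer needs a refill */
	int in_use;	/* index of the buffer being read */
	size_t pos;	/* next word to read in buf[in_use] */
	int need_data;	/* 1 when the producer should refill */
};

static void dbuf_init(struct dbuf *d)
{
	memset(d, 0, sizeof(*d));
	d->empty[0] = d->empty[1] = 1;
	d->need_data = 1;
}

/* Producer side: fill one empty buffer from src (BUF_WORDS words).
 * Returns 0 on success, -1 if neither buffer is empty. */
static int dbuf_refill(struct dbuf *d, const uint32_t *src)
{
	int idx;

	if (d->empty[d->in_use])
		idx = d->in_use;
	else if (d->empty[!d->in_use])
		idx = !d->in_use;
	else
		return -1;

	memcpy(d->buf[idx], src, sizeof(d->buf[idx]));
	d->empty[idx] = 0;
	if (idx == d->in_use)
		d->pos = 0;
	d->need_data = d->empty[0] || d->empty[1];
	return 0;
}

/* Consumer side: fetch the next value. Returns 0 on success,
 * -1 on underrun (the buffer in use has no valid data). */
static int dbuf_next(struct dbuf *d, uint32_t *out)
{
	if (d->empty[d->in_use])
		return -1;

	*out = d->buf[d->in_use][d->pos++];
	if (d->pos == BUF_WORDS) {
		/* Buffer exhausted: mark it for refill and switch over
		 * if the other buffer already holds data. */
		d->empty[d->in_use] = 1;
		d->need_data = 1;
		if (!d->empty[!d->in_use]) {
			d->in_use = !d->in_use;
			d->pos = 0;
		}
	}
	return 0;
}
```

In the patch, the point where need_data becomes 1 corresponds to waking up the blocked writer; a kernel version would also need proper locking around the shared state.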

Having applied the delay value to a packet, the packet gets processed by the original netem functions.

Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>

---

Patch for linux kernel 2.6.16.19: http://tcn.hypert.net/tcnKernel_procfs.patch




From baumann at tik.ee.ethz.ch  Thu Sep 21 23:13:54 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.16.19 1/2] LARTC: trace control for netem:
	userspace
Message-ID: <45137F22.4000304@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range dependency and self-similarity of cross-traffic.

user space (iproute2):
The directory tc/netem was split in two parts, one containing the original distribution tables and the other the tools to generate trace files as well as the program responsible for reading the delay values from the trace file and sending them to the kernel (called flowseed).

If the trace option is set, netem initializes the kernel and starts the flowseed process. The flowseed process does not send data to the kernel until the registration is completed. The data is sent to the kernel module via configfs. For each qdisc applied, a new directory (in /config/tcn/) is created. The write returns when the kernel needs new data, or when the corresponding qdisc was deleted. In the first case new data is sent and in the latter case the flowseed process terminates itself.

Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>

---

Patch for iproute2-2.6.16-060323: http://tcn.hypert.net/tcn_iproute2.patch


From shemminger at osdl.org  Fri Sep 22 10:20:56 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.16.19 2/2] LARTC: trace control for netem:
 kernelspace
In-Reply-To: <45137F71.2000404@xxxxxxxxxxxxxx>
References: <45137F71.2000404@xxxxxxxxxxxxxx>
Message-ID: <20060922102056.0069f944@localhost.localdomain>

On Fri, 22 Sep 2006 08:15:13 +0200
Rainer Baumann <baumann@xxxxxxxxxxxxxx> wrote:

> Trace Control for Netem: Emulate network properties such as long range dependency and self-similarity of cross-traffic.
> 
> kernel space:
> The delay, drop, duplication and corruption values are read out in user space and sent to kernel space via configfs. The userspace process will "hang on write" until the kernel needs new data.
> 
> In order to always have packet action values ready to apply, there are two buffers that hold these values. Packet action values can be read from one buffer and the other buffer can be refilled with new values simultaneously. The synchronization of "need more delay values" and "return from write" is done with the use of wait queues.
> 
> Having applied the delay value to a packet, the packet gets processed by the original netem functions.
> 
> Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>
> 
> ---
> 
> Patch for linux kernel 2.6.16.19: http://tcn.hypert.net/tcnKernel_procfs.patch

I like the concept of the trace based delay stuff, it is just that the implementation
needs more work.

Style:
	* whitespace around operators, keywords etc
	* use /* for comments, not //
	* indentation
	  scripts/Lindent may help
	* accidental blank line changes introduced in patch as well
	* You don't really change Makefile

Code:
	* now netem depends on CONFIG_PROC_FS

	* why not use a miscdevice (/dev/netem_trace?) instead of /proc

	* still has signal flow control to process. This is an awkward way
	  to do flow control and I don't think it is safe.

	* hard coding MAX_FLOWS leads to scaling problems. Not all users will
	  want to waste the memory, and what if there are more flows. Can't you
	  figure out a way to allocate and scale flow buffers.
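
One possible shape for the last point, sketched in user-space C: keep flow registrations in a list that grows on demand instead of a fixed array of MAX_FLOWS entries. This is only an illustration of the idea, not what the patch does; all names are hypothetical, and a kernel version would use the kernel allocator and list helpers and would need locking.

```c
#include <stdint.h>
#include <stdlib.h>

/* One registered flow (illustrative names). */
struct flow_entry {
	uint32_t fid;		/* flow id chosen by user space */
	void *priv;		/* per-flow data, e.g. the qdisc's private state */
	struct flow_entry *next;
};

struct flow_table {
	struct flow_entry *head;
};

/* Look up a flow; returns its private data or NULL if not registered. */
static void *flow_lookup(const struct flow_table *t, uint32_t fid)
{
	const struct flow_entry *e;

	for (e = t->head; e; e = e->next)
		if (e->fid == fid)
			return e->priv;
	return NULL;
}

/* Register a flow; memory is only used for flows that exist.
 * Returns 0 on success, -1 on duplicate id or allocation failure. */
static int flow_register(struct flow_table *t, uint32_t fid, void *priv)
{
	struct flow_entry *e;

	for (e = t->head; e; e = e->next)
		if (e->fid == fid)
			return -1;

	e = malloc(sizeof(*e));
	if (!e)
		return -1;
	e->fid = fid;
	e->priv = priv;
	e->next = t->head;
	t->head = e;
	return 0;
}

/* Remove a flow and free its entry. Returns 0, or -1 if not found. */
static int flow_unregister(struct flow_table *t, uint32_t fid)
{
	struct flow_entry **pp;

	for (pp = &t->head; *pp; pp = &(*pp)->next) {
		if ((*pp)->fid == fid) {
			struct flow_entry *e = *pp;

			*pp = e->next;
			free(e);
			return 0;
		}
	}
	return -1;
}
```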


	



-- 
Stephen Hemminger <shemminger@xxxxxxxx>

From hagen at jauu.net  Fri Sep 22 08:19:06 2006
From: hagen at jauu.net (Hagen Paul Pfeifer)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.16.19 2/2] LARTC: trace control for netem:
	kernelspace
In-Reply-To: <45137F71.2000404@xxxxxxxxxxxxxx>
References: <45137F71.2000404@xxxxxxxxxxxxxx>
Message-ID: <20060922151906.GA25483@xxxxxxxxxxxxxx>

* Rainer Baumann | 2006-09-22 08:15:13 [+0200]:

>Patch for linux kernel 2.6.16.19: http://tcn.hypert.net/tcnKernel_procfs.patch

Coding style needs at least some work ...

Whitespace around operators and parentheses, useless parentheses, braces for
the else branch, mixed C99/C89 comments, indentation, ....

proc_read_stats() looks unclean (bzero) and maybe some other stuff too - the
code as a whole looks a little bit grubby.

HGN



-- 
43rd Law of Computing:
        Anything that can go wr
fortune: Segmentation violation -- Core dumped

From baumann at tik.ee.ethz.ch  Sat Sep 23 00:04:45 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.17.13 0/2] LARTC: trace control for netem
Message-ID: <4514DC8D.2010405@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range dependency and self-similarity of cross-traffic.

A new option (trace) has been added to the netem command. If the trace option is used, the values for packet delay etc. are read from a pregenerated trace file; afterwards the packets are processed by the normal netem functions. The packet action values are read out from the trace file in user space and sent to kernel space via configfs.

Sorry, what I sent yesterday was the old version; this is now the new version!

After our patches from the 2nd and 22nd of August we have integrated the comments from Stephen and hope we are on the right way now.

We are looking forward to any comments, feedback and suggestions!







From baumann at tik.ee.ethz.ch  Sat Sep 23 00:04:58 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem:
	kernelspace
Message-ID: <4514DC9A.2000505@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range dependency and self-similarity of cross-traffic.

kernel space:
The delay, drop, duplication and corruption values are read out in user space and sent to kernel space via configfs. The userspace process will "hang on write" until the kernel needs new data.

In order to always have packet action values ready to apply, there are two buffers that hold these values. Packet action values can be read from one buffer and the other buffer can be refilled with new values simultaneously. The synchronization of "need more delay values" and "return from write" is done with the use of wait queues.

Having applied the delay value to a packet, the packet gets processed by the original netem functions.

Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>

---

Patch for linux kernel 2.6.17.13: http://tcn.hypert.net/tcn_kernel_configfs.patch








From baumann at tik.ee.ethz.ch  Sat Sep 23 00:04:49 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.17.13 1/2] LARTC: trace control for netem:
	userspace
Message-ID: <4514DC91.2070507@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range dependency and self-similarity of cross-traffic.

user space (iproute2):
The directory tc/netem was split in two parts, one containing the original distribution tables and the other the tools to generate trace files as well as the program responsible for reading the delay values from the trace file and sending them to the kernel (called flowseed).

If the trace option is set, netem initializes the kernel and starts the flowseed process. The flowseed process does not send data to the kernel until the registration is completed. The data is sent to the kernel module via configfs. For each qdisc applied, a new directory (in /config/tcn/) is created. The write returns when the kernel needs new data, or when the corresponding qdisc was deleted. In the first case new data is sent and in the latter case the flowseed process terminates itself.

Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>

---

Patch for iproute2-2.6.16-060323: http://tcn.hypert.net/tcn_iproute2.patch



From shemminger at osdl.org  Mon Sep 25 13:28:00 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem:
 kernelspace
In-Reply-To: <4514DC9A.2000505@xxxxxxxxxxxxxx>
References: <4514DC9A.2000505@xxxxxxxxxxxxxx>
Message-ID: <20060925132800.09856e10@xxxxxxxxxxxxxxxxx>

Some changes:

1. need to select CONFIGFS into configuration
2. don't add declarations after code.
3. use unsigned not int for counters and mask.
4. don't return a structure (ie pkt_delay)
5. use enum for magic values
6. don't use GFP_ATOMIC unless you have to
7. check error values on configfs_init
8. map initialization is unneeded. static's always init to zero.
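
Points 3 to 5 can be illustrated with a small stand-alone sketch: an enum names the action codes, counters are unsigned and indexed by that enum, and the action comes back through an out parameter instead of a returned structure. The ordering of the enum values is an assumption (only "0 = no delay, 1 = drop" is stated in the patch comments), and the function and struct names are made up for the example.

```c
#include <stdint.h>

/* Named action codes instead of magic numbers. The order is assumed;
 * the patch only documents 0 = no delay and 1 = drop. */
enum flow_action {
	FLOW_NORMAL,
	FLOW_DROP,
	FLOW_DUP,
	FLOW_MANGLE,
	FLOW_ACTION_MAX
};

/* Unsigned counters, indexed by action. */
struct trace_stats {
	unsigned int per_action[FLOW_ACTION_MAX];
	unsigned int invalid;
};

/* Decode one trace word: the delay is the return value and the action
 * is stored through an out parameter, so no struct is returned. */
static uint32_t next_delay(uint32_t word, enum flow_action *action,
			   struct trace_stats *st)
{
	uint32_t code = word >> 29;	/* top 3 bits hold the action */

	if (code >= FLOW_ACTION_MAX) {
		/* Unknown code: count it and fall back to no delay. */
		st->invalid++;
		*action = FLOW_NORMAL;
		return 0;
	}

	*action = (enum flow_action)code;
	st->per_action[code]++;
	return word & ((1u << 29) - 1);	/* low 29 bits hold the delay */
}
```

Indexing an array by the enum also avoids stepping through adjacent struct members by pointer arithmetic when updating per-action counters.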

------------------
diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index d10f353..a51de64 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -430,6 +430,8 @@ enum
 	TCA_NETEM_DELAY_DIST,
 	TCA_NETEM_REORDER,
 	TCA_NETEM_CORRUPT,
+	TCA_NETEM_TRACE,
+	TCA_NETEM_STATS,
 	__TCA_NETEM_MAX,
 };
 
@@ -445,6 +447,35 @@ struct tc_netem_qopt
 	__u32	jitter;		/* random jitter in latency (us) */
 };
 
+struct tc_netem_stats
+{
+	int packetcount;
+	int packetok;
+	int normaldelay;
+	int drops;
+	int dupl;
+	int corrupt;
+	int novaliddata;
+	int uninitialized;
+	int bufferunderrun;
+	int bufferinuseempty;
+	int noemptybuffer;
+	int readbehindbuffer;
+	int buffer1_reloads;
+	int buffer2_reloads;
+	int tobuffer1_switch;
+	int tobuffer2_switch;
+	int switch_to_emptybuffer1;
+	int switch_to_emptybuffer2;
+};
+
+struct tc_netem_trace
+{
+	__u32   fid;             /*flowid */
+	__u32   def;             /* default action 0 = no delay, 1 = drop*/
+	__u32   ticks;	         /* number of ticks corresponding to 1ms */
+};
+
 struct tc_netem_corr
 {
 	__u32	delay_corr;	/* delay correlation */
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 8298ea9..aee4bc6 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -232,6 +232,7 @@ config NET_SCH_DSMARK
 
 config NET_SCH_NETEM
 	tristate "Network emulator (NETEM)"
+	select CONFIGFS_FS
 	---help---
 	  Say Y if you want to emulate network delay, loss, and packet
 	  re-ordering. This is often useful to simulate networks when
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 45939ba..521b9e3 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -11,6 +11,9 @@
  *
  * Authors:	Stephen Hemminger <shemminger@xxxxxxxx>
  *		Catalin(ux aka Dino) BOIE <catab at umbrella dot ro>
+ *              netem trace enhancement: Ariane Keller <arkeller@xxxxxxxxxx> ETH Zurich
+ *                                       Rainer Baumann <baumann@xxxxxxxxxx> ETH Zurich
+ *                                       Ulrich Fiedler <fiedler@xxxxxxxxxxxxxx> ETH Zurich
  */
 
 #include <linux/module.h>
@@ -21,10 +24,16 @@ #include <linux/errno.h>
 #include <linux/netdevice.h>
 #include <linux/skbuff.h>
 #include <linux/rtnetlink.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/configfs.h>
+#include <linux/vmalloc.h>
 
 #include <net/pkt_sched.h>
 
-#define VERSION "1.2"
+#include "net/flowseed.h"
+
+#define VERSION "1.3"
 
 /*	Network Emulation Queuing algorithm.
 	====================================
@@ -50,6 +59,11 @@ #define VERSION "1.2"
 
 	 The simulator is limited by the Linux timer resolution
 	 and will create packet bursts on the HZ boundary (1ms).
+
+	 The trace option allows us to read the values for packet delay,
+	 duplication, loss and corruption from a tracefile. This permits
+	 the modulation of statistical properties such as long-range
+	 dependences. See http://tcn.hypert.net.
 */
 
 struct netem_sched_data {
@@ -65,6 +79,11 @@ struct netem_sched_data {
 	u32 duplicate;
 	u32 reorder;
 	u32 corrupt;
+	u32 tcnstop;
+	u32 trace;
+	u32 ticks;
+	u32 def;
+	u32 newdataneeded;
 
 	struct crndstate {
 		unsigned long last;
@@ -72,9 +91,13 @@ struct netem_sched_data {
 	} delay_cor, loss_cor, dup_cor, reorder_cor, corrupt_cor;
 
 	struct disttable {
-		u32  size;
+		u32 size;
 		s16 table[0];
 	} *delay_dist;
+
+	struct tcn_statistic *statistic;
+	struct tcn_control *flowbuffer;
+	wait_queue_head_t my_event;
 };
 
 /* Time stamp put into socket buffer control block */
@@ -82,6 +105,18 @@ struct netem_skb_cb {
 	psched_time_t	time_to_send;
 };
 
+
+struct confdata {
+	int fid;
+	struct netem_sched_data * sched_data;
+};
+
+static struct confdata map[MAX_FLOWS];
+
+#define MASK_BITS	29
+#define MASK_DELAY	((1<<MASK_BITS)-1)
+#define MASK_HEAD       ~MASK_DELAY
+
 /* init_crandom - initialize correlated random number generator
  * Use entropy source for initial seed.
  */
@@ -139,6 +174,103 @@ static long tabledist(unsigned long mu, 
 	return  x / NETEM_DIST_SCALE + (sigma / NETEM_DIST_SCALE) * t + mu;
 }
 
+/* don't call this function directly. It is called after
+ * a packet has been taken out of a buffer and it was the last.
+ */
+static int reload_flowbuffer (struct netem_sched_data *q)
+{
+	struct tcn_control *flow = q->flowbuffer;
+
+	if (flow->buffer_in_use == flow->buffer1) {
+		flow->buffer1_empty = flow->buffer1;
+		if (flow->buffer2_empty) {
+			q->statistic->switch_to_emptybuffer2++;
+			return -EFAULT;
+		}
+
+		q->statistic->tobuffer2_switch++;
+
+		flow->buffer_in_use = flow->buffer2;
+		flow->offsetpos = flow->buffer2;
+
+	} else {
+		flow->buffer2_empty = flow->buffer2;
+
+		if (flow->buffer1_empty) {
+			q->statistic->switch_to_emptybuffer1++;
+			return -EFAULT;
+		}
+
+		q->statistic->tobuffer1_switch++;
+
+		flow->buffer_in_use = flow->buffer1;
+		flow->offsetpos = flow->buffer1;
+
+	}
+	/*the flowseed process can send more data*/
+	q->tcnstop = 0;
+	q->newdataneeded = 1;
+	wake_up(&q->my_event);
+	return 0;
+}
+
+/* return pktdelay with delay and drop/dupl/corrupt option */
+static int get_next_delay(struct netem_sched_data *q, enum tcn_flow *head)
+{
+	struct tcn_control *flow = q->flowbuffer;
+	u32 variout;
+
+	/*choose whether to drop or 0 delay packets on default*/
+	*head = q->def;
+
+	if (!flow) {
+		printk(KERN_ERR "netem: read from an uninitialized flow.\n");
+		q->statistic->uninitialized++;
+		return 0;
+	}
+
+	q->statistic->packetcount++;
+
+	/* check if we have to reload a buffer */
+	if (flow->offsetpos - flow->buffer_in_use == DATA_PACKAGE)
+		reload_flowbuffer(q);
+
+	/* sanity checks */
+	if ((flow->buffer_in_use == flow->buffer1 && flow->validdataB1)
+	    || ( flow->buffer_in_use == flow->buffer2 && flow->validdataB2)) {
+
+		if (flow->buffer1_empty && flow->buffer2_empty) {
+			q->statistic->bufferunderrun++;
+			return 0;
+		}
+
+		if (flow->buffer1_empty == flow->buffer_in_use ||
+		    flow->buffer2_empty == flow->buffer_in_use) {
+			q->statistic->bufferinuseempty++;
+			return 0;
+		}
+
+		if (flow->offsetpos - flow->buffer_in_use >=
+		    DATA_PACKAGE) {
+			q->statistic->readbehindbuffer++;
+			return 0;
+		}
+		/*end of tracefile reached*/
+	} else {
+		q->statistic->novaliddata++;
+		return 0;
+	}
+
+	/* now it's safe to read */
+	variout = *flow->offsetpos++;
+	*head = (variout & MASK_HEAD) >> MASK_BITS;
+
+	(&q->statistic->normaldelay)[*head] += 1;
+	q->statistic->packetok++;
+
+	return ((variout & MASK_DELAY) * q->ticks) / 1000;
+}
+
 /*
  * Insert one skb into qdisc.
  * Note: parent depends on return value to account for queue length.
@@ -148,20 +280,25 @@ static long tabledist(unsigned long mu, 
 static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 {
 	struct netem_sched_data *q = qdisc_priv(sch);
-	/* We don't fill cb now as skb_unshare() may invalidate it */
 	struct netem_skb_cb *cb;
 	struct sk_buff *skb2;
-	int ret;
-	int count = 1;
+	enum tcn_flow action = FLOW_NORMAL;
+	psched_tdiff_t delay;
+	int ret, count = 1;
 
 	pr_debug("netem_enqueue skb=%p\n", skb);
 
-	/* Random duplication */
-	if (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor))
+	if (q->trace)
+		action = get_next_delay(q, &delay);
+
+	/* Random duplication */
+	if (q->trace ? action == FLOW_DUP :
+	    (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor)))
 		++count;
 
 	/* Random packet drop 0 => none, ~0 => all */
-	if (q->loss && q->loss >= get_crandom(&q->loss_cor))
+	if (q->trace ? action == FLOW_DROP :
+	    (q->loss && q->loss >= get_crandom(&q->loss_cor)))
 		--count;
 
 	if (count == 0) {
@@ -190,7 +327,8 @@ static int netem_enqueue(struct sk_buff 
 	 * If packet is going to be hardware checksummed, then
 	 * do it now in software before we mangle it.
 	 */
-	if (q->corrupt && q->corrupt >= get_crandom(&q->corrupt_cor)) {
+	if (q->trace ? action == FLOW_MANGLE :
+	    (q->corrupt && q->corrupt >= get_crandom(&q->corrupt_cor))) {
 		if (!(skb = skb_unshare(skb, GFP_ATOMIC))
 		    || (skb->ip_summed == CHECKSUM_PARTIAL
 			&& skb_checksum_help(skb))) {
@@ -206,10 +344,10 @@ static int netem_enqueue(struct sk_buff 
 	    || q->counter < q->gap 	/* inside last reordering gap */
 	    || q->reorder < get_crandom(&q->reorder_cor)) {
 		psched_time_t now;
-		psched_tdiff_t delay;
 
-		delay = tabledist(q->latency, q->jitter,
-				  &q->delay_cor, q->delay_dist);
+		if (!q->trace)
+			delay = tabledist(q->latency, q->jitter,
+					  &q->delay_cor, q->delay_dist);
 
 		PSCHED_GET_TIME(now);
 		PSCHED_TADD2(now, delay, cb->time_to_send);
@@ -343,6 +481,65 @@ static int set_fifo_limit(struct Qdisc *
 	return ret;
 }
 
+static void reset_stats(struct netem_sched_data * q)
+{
+	memset(q->statistic, 0, sizeof(*(q->statistic)));
+	return;
+}
+
+static void free_flowbuffer(struct netem_sched_data * q)
+{
+	if (q->flowbuffer != NULL) {
+		q->tcnstop = 1;
+		q->newdataneeded = 1;
+		wake_up(&q->my_event);
+
+		if (q->flowbuffer->buffer1 != NULL) {
+			kfree(q->flowbuffer->buffer1);
+		}
+		if (q->flowbuffer->buffer2 != NULL) {
+			kfree(q->flowbuffer->buffer2);
+		}
+		kfree(q->flowbuffer);
+		kfree(q->statistic);
+		q->flowbuffer = NULL;
+		q->statistic = NULL;
+	}
+}
+
+static int init_flowbuffer(unsigned int fid, struct netem_sched_data * q)
+{
+	int i, flowid = -1;
+
+	q->statistic = kzalloc(sizeof(*(q->statistic)), GFP_KERNEL);
+	init_waitqueue_head(&q->my_event);
+
+	for(i = 0; i < MAX_FLOWS; i++) {
+		if(map[i].fid == 0) {
+			flowid = i;
+			map[i].fid = fid;
+			map[i].sched_data = q;
+			break;
+		}
+	}
+
+	if (flowid != -1) {
+		q->flowbuffer = kmalloc(sizeof(*(q->flowbuffer)), GFP_KERNEL);
+		q->flowbuffer->buffer1 = kmalloc(DATA_PACKAGE, GFP_KERNEL);
+		q->flowbuffer->buffer2 = kmalloc(DATA_PACKAGE, GFP_KERNEL);
+
+		q->flowbuffer->buffer_in_use = q->flowbuffer->buffer1;
+		q->flowbuffer->offsetpos = q->flowbuffer->buffer1;
+		q->flowbuffer->buffer1_empty = q->flowbuffer->buffer1;
+		q->flowbuffer->buffer2_empty = q->flowbuffer->buffer2;
+		q->flowbuffer->flowid = flowid; 
+		q->flowbuffer->validdataB1 = 0;
+		q->flowbuffer->validdataB2 = 0;
+	}
+
+	return flowid;
+}
+
 /*
  * Distribution data is a variable size payload containing
  * signed 16 bit values.
@@ -414,6 +611,32 @@ static int get_corrupt(struct Qdisc *sch
 	return 0;
 }
 
+static int get_trace(struct Qdisc *sch, const struct rtattr *attr)
+{
+	struct netem_sched_data *q = qdisc_priv(sch);
+	const struct tc_netem_trace *traceopt = RTA_DATA(attr);
+
+	if (RTA_PAYLOAD(attr) != sizeof(*traceopt))
+		return -EINVAL;
+
+	if (traceopt->fid) {
+		/*correction us -> ticks*/
+		q->ticks = traceopt->ticks;
+		int ind;
+		ind = init_flowbuffer(traceopt->fid, q);
+		if(ind < 0) {
+			printk("netem: maximum number of traces:%d"
+			       " change in net/flowseedprocfs.h\n", MAX_FLOWS);
+			return -EINVAL;
+		}
+		q->trace = ind + 1;
+
+	} else
+		q->trace = 0;
+	q->def = traceopt->def;
+	return 0;
+}
+
 /* Parse netlink message to set options */
 static int netem_change(struct Qdisc *sch, struct rtattr *opt)
 {
@@ -431,6 +654,14 @@ static int netem_change(struct Qdisc *sc
 		return ret;
 	}
 	
+	if (q->trace) {
+		int temp = q->trace - 1;
+		q->trace = 0;
+		map[temp].fid = 0;
+		reset_stats(q);
+		free_flowbuffer(q);
+	}
+
 	q->latency = qopt->latency;
 	q->jitter = qopt->jitter;
 	q->limit = qopt->limit;
@@ -477,6 +708,11 @@ static int netem_change(struct Qdisc *sc
 			if (ret)
 				return ret;
 		}
+		if (tb[TCA_NETEM_TRACE-1]) {
+			ret = get_trace(sch, tb[TCA_NETEM_TRACE-1]);
+			if (ret)
+				return ret;
+		}
 	}
 
 	return 0;
@@ -572,6 +808,7 @@ static int netem_init(struct Qdisc *sch,
 	q->timer.function = netem_watchdog;
 	q->timer.data = (unsigned long) sch;
 
+	q->trace = 0;
 	q->qdisc = qdisc_create_dflt(sch->dev, &tfifo_qdisc_ops);
 	if (!q->qdisc) {
 		pr_debug("netem: qdisc create failed\n");
@@ -590,6 +827,12 @@ static void netem_destroy(struct Qdisc *
 {
 	struct netem_sched_data *q = qdisc_priv(sch);
 
+	if (q->trace) {
+		int temp = q->trace - 1;
+		q->trace = 0;
+		map[temp].fid = 0;
+		free_flowbuffer(q);
+	}
 	del_timer_sync(&q->timer);
 	qdisc_destroy(q->qdisc);
 	kfree(q->delay_dist);
@@ -604,6 +847,7 @@ static int netem_dump(struct Qdisc *sch,
 	struct tc_netem_corr cor;
 	struct tc_netem_reorder reorder;
 	struct tc_netem_corrupt corrupt;
+	struct tc_netem_trace traceopt;
 
 	qopt.latency = q->latency;
 	qopt.jitter = q->jitter;
@@ -626,6 +870,35 @@ static int netem_dump(struct Qdisc *sch,
 	corrupt.correlation = q->corrupt_cor.rho;
 	RTA_PUT(skb, TCA_NETEM_CORRUPT, sizeof(corrupt), &corrupt);
 
+	traceopt.fid = q->trace;
+	traceopt.def = q->def;
+	traceopt.ticks = q->ticks;
+	RTA_PUT(skb, TCA_NETEM_TRACE, sizeof(traceopt), &traceopt);
+
+	if (q->trace) {
+		struct tc_netem_stats tstats;
+
+		tstats.packetcount = q->statistic->packetcount;
+		tstats.packetok = q->statistic->packetok;
+		tstats.normaldelay = q->statistic->normaldelay;
+		tstats.drops = q->statistic->drops;
+		tstats.dupl = q->statistic->dupl;
+		tstats.corrupt = q->statistic->corrupt;
+		tstats.novaliddata = q->statistic->novaliddata;
+		tstats.uninitialized = q->statistic->uninitialized;
+		tstats.bufferunderrun = q->statistic->bufferunderrun;
+		tstats.bufferinuseempty = q->statistic->bufferinuseempty;
+		tstats.noemptybuffer = q->statistic->noemptybuffer;
+		tstats.readbehindbuffer = q->statistic->readbehindbuffer;
+		tstats.buffer1_reloads = q->statistic->buffer1_reloads;
+		tstats.buffer2_reloads = q->statistic->buffer2_reloads;
+		tstats.tobuffer1_switch = q->statistic->tobuffer1_switch;
+		tstats.tobuffer2_switch = q->statistic->tobuffer2_switch;
+		tstats.switch_to_emptybuffer1 = q->statistic->switch_to_emptybuffer1;
+		tstats.switch_to_emptybuffer2 = q->statistic->switch_to_emptybuffer2;
+		RTA_PUT(skb, TCA_NETEM_STATS, sizeof(tstats), &tstats);
+	}
+
 	rta->rta_len = skb->tail - b;
 
 	return skb->len;
@@ -709,6 +982,173 @@ static struct tcf_proto **netem_find_tcf
 	return NULL;
 }
 
+/*configfs to read tcn delay values from userspace*/
+struct tcn_flow {
+	struct config_item item;
+};
+
+static struct tcn_flow *to_tcn_flow(struct config_item *item)
+{
+	return item ? container_of(item, struct tcn_flow, item) : NULL;
+}
+
+static struct configfs_attribute tcn_flow_attr_storeme = {
+	.ca_owner = THIS_MODULE,
+	.ca_name = "delayvalue",
+	.ca_mode = S_IRUGO | S_IWUSR,
+};
+
+static struct configfs_attribute *tcn_flow_attrs[] = {
+	&tcn_flow_attr_storeme,
+	NULL,
+};
+
+static ssize_t tcn_flow_attr_store(struct config_item *item,
+				       struct configfs_attribute *attr,
+				       const char *page, size_t count)
+{
+	char *p = (char *)page;
+	int fid, i, validData = 0;
+	int flowid = -1;
+	struct tcn_control *checkbuf;
+
+	if (count != DATA_PACKAGE_ID) {
+		printk("netem: Unexpected data received. %d\n", count);
+		return -EMSGSIZE;
+	}
+
+	memcpy(&fid, p + DATA_PACKAGE, sizeof(int));
+	memcpy(&validData, p + DATA_PACKAGE + sizeof(int), sizeof(int));
+
+	/* check whether this flow is registered */
+	for (i = 0; i < MAX_FLOWS; i++) {
+		if (map[i].fid == fid) {
+			flowid = i;
+			break;
+		}
+	}
+	/* exit if flow is not registered */
+	if (flowid < 0) {
+		printk("netem: Invalid FID received. Killing process.\n");
+		return -EINVAL;
+	}
+
+	checkbuf = map[flowid].sched_data->flowbuffer;
+	if (checkbuf == NULL) {
+		printk("netem: no flow registered");
+		return -ENOBUFS;
+	}
+
+	/* check if flowbuffer has empty buffer and copy data into it */
+	if (checkbuf->buffer1_empty != NULL) {
+		memcpy(checkbuf->buffer1, p, DATA_PACKAGE);
+		checkbuf->buffer1_empty = NULL;
+		checkbuf->validdataB1 = validData;
+		map[flowid].sched_data->statistic->buffer1_reloads++;
+
+	} else if (checkbuf->buffer2_empty != NULL) {
+		memcpy(checkbuf->buffer2, p, DATA_PACKAGE);
+		checkbuf->buffer2_empty = NULL;
+		checkbuf->validdataB2 = validData;
+		map[flowid].sched_data->statistic->buffer2_reloads++;
+
+	} else {
+		printk("netem: flow %d: no empty buffer. data loss.\n", flowid);
+		map[flowid].sched_data->statistic->noemptybuffer++;
+	}
+
+	if (validData) {
+		/* on initialization both buffers need data */
+		if (checkbuf->buffer2_empty != NULL) {
+			return DATA_PACKAGE_ID;
+		}
+		/* wait until new data is needed */
+		wait_event(map[flowid].sched_data->my_event,
+			   map[flowid].sched_data->newdataneeded);
+		map[flowid].sched_data->newdataneeded = 0;
+
+	}
+
+	if (map[flowid].sched_data->tcnstop) {
+		return -ECANCELED;
+	}
+
+	return DATA_PACKAGE_ID;
+
+}
+
+static void tcn_flow_release(struct config_item *item)
+{
+	kfree(to_tcn_flow(item));
+
+}
+
+static struct configfs_item_operations tcn_flow_item_ops = {
+	.release = tcn_flow_release,
+	.store_attribute = tcn_flow_attr_store,
+};
+
+static struct config_item_type tcn_flow_type = {
+	.ct_item_ops = &tcn_flow_item_ops,
+	.ct_attrs = tcn_flow_attrs,
+	.ct_owner = THIS_MODULE,
+};
+
+static struct config_item * tcn_make_item(struct config_group *group,
+						     const char *name)
+{
+	struct tcn_flow *tcn_flow;
+
+	tcn_flow = kmalloc(sizeof(struct tcn_flow), GFP_KERNEL);
+	if (!tcn_flow)
+		return NULL;
+
+	memset(tcn_flow, 0, sizeof(struct tcn_flow));
+
+	config_item_init_type_name(&tcn_flow->item, name,
+				   &tcn_flow_type);
+	return &tcn_flow->item;
+}
+
+static struct configfs_group_operations tcn_group_ops = {
+	.make_item = tcn_make_item,
+};
+
+static struct config_item_type tcn_type = {
+	.ct_group_ops = &tcn_group_ops,
+	.ct_owner = THIS_MODULE,
+};
+
+static struct configfs_subsystem tcn_subsys = {
+	.su_group = {
+		     .cg_item = {
+				 .ci_namebuf = "tcn",
+				 .ci_type = &tcn_type,
+				 },
+		     },
+};
+
+static __init int configfs_init(void)
+{
+	int ret;
+	struct configfs_subsystem *subsys = &tcn_subsys;
+
+	config_group_init(&subsys->su_group);
+	init_MUTEX(&subsys->su_sem);
+	ret = configfs_register_subsystem(subsys);
+	if (ret) {
+		printk(KERN_ERR "Error %d while registering subsystem %s\n",
+		       ret, subsys->su_group.cg_item.ci_namebuf);
+		configfs_unregister_subsystem(&tcn_subsys);
+	}
+	return ret;
+}
+
+static void configfs_exit(void)
+{
+	configfs_unregister_subsystem(&tcn_subsys);
+}
+
 static struct Qdisc_class_ops netem_class_ops = {
 	.graft		=	netem_graft,
 	.leaf		=	netem_leaf,
@@ -740,11 +1180,17 @@ static struct Qdisc_ops netem_qdisc_ops 
 
 static int __init netem_module_init(void)
 {
+	int err;
+
 	pr_info("netem: version " VERSION "\n");
+	err = configfs_init();
+	if (err)
+		return err;
 	return register_qdisc(&netem_qdisc_ops);
 }
 static void __exit netem_module_exit(void)
 {
+	configfs_exit();
 	unregister_qdisc(&netem_qdisc_ops);
 }
 module_init(netem_module_init)
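
[Editorial sketch, not part of the patch.] The patch tracks active traces in a static map[MAX_FLOWS] table: a slot whose fid is 0 is free, init_flowbuffer() claims the first free slot, q->trace stores the slot index plus one (so 0 can mean "no trace"), and netem_change()/netem_destroy() clear the slot again. The user-space sketch below mirrors that bookkeeping only. The names flow_slot, flow_map, flow_register, flow_release and flow_lookup are hypothetical, and MAX_FLOWS = 4 is an assumed value, since the real definition is in a header that is not shown here.

```c
#include <stddef.h>

#define MAX_FLOWS 4	/* assumed value; the real one is defined elsewhere */

struct flow_slot {
	unsigned int fid;	/* 0 means "slot is free" */
	void *owner;		/* stands in for struct netem_sched_data * */
};

/* Statics are zero-initialized, so every slot starts out free. */
static struct flow_slot flow_map[MAX_FLOWS];

/* Claim the first free slot; returns index + 1, or 0 if fid is 0 or the table is full. */
static unsigned int flow_register(unsigned int fid, void *owner)
{
	int i;

	if (fid == 0)
		return 0;
	for (i = 0; i < MAX_FLOWS; i++) {
		if (flow_map[i].fid == 0) {
			flow_map[i].fid = fid;
			flow_map[i].owner = owner;
			return (unsigned int)i + 1;
		}
	}
	return 0;
}

/* Free the slot behind a handle returned by flow_register(); 0 is a no-op. */
static void flow_release(unsigned int handle)
{
	if (handle == 0 || handle > MAX_FLOWS)
		return;
	flow_map[handle - 1].fid = 0;
	flow_map[handle - 1].owner = NULL;
}

/* Find the slot index registered for fid, or -1 if there is none. */
static int flow_lookup(unsigned int fid)
{
	int i;

	if (fid == 0)
		return -1;	/* never match a free slot */
	for (i = 0; i < MAX_FLOWS; i++) {
		if (flow_map[i].fid == fid)
			return i;
	}
	return -1;
}
```

Unlike the lookup loop in tcn_flow_attr_store(), this sketch refuses fid 0 explicitly so that a lookup can never match a free slot; that guard is an addition of the sketch, not something the patch does.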

From baumann at tik.ee.ethz.ch  Tue Sep 26 13:17:57 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem:
	kernelspace
In-Reply-To: <20060925132800.09856e10@xxxxxxxxxxxxxxxxx>
References: <4514DC9A.2000505@xxxxxxxxxxxxxx>
	<20060925132800.09856e10@xxxxxxxxxxxxxxxxx>
Message-ID: <45198AF5.9090909@xxxxxxxxxxxxxx>

Hi Stephen,

We merged your changes into our patch
http://tcn.hypert.net/tcn_kernel_2_6_18.patch
Please let us know if we should make further adaptations to our
implementation and/or resubmit the adapted patch.

Cheers+thanx,
Rainer

Stephen Hemminger wrote:
> Some changes:
>
> 1. need to select CONFIGFS into configuration
> 2. don't add declarations after code.
> 3. use unsigned not int for counters and mask.
> 4. don't return a structure (ie pkt_delay)
> 5. use enum for magic values
> 6. don't use GFP_ATOMIC unless you have to
> 7. check error values on configfs_init
> 8. map initialization is unneeded. static's always init to zero.
>
> ------------------
> diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
> index d10f353..a51de64 100644
> --- a/include/linux/pkt_sched.h
> +++ b/include/linux/pkt_sched.h
> @@ -430,6 +430,8 @@ enum
>  	TCA_NETEM_DELAY_DIST,
>  	TCA_NETEM_REORDER,
>  	TCA_NETEM_CORRUPT,
> +	TCA_NETEM_TRACE,
> +	TCA_NETEM_STATS,
>  	__TCA_NETEM_MAX,
>  };
>  
> @@ -445,6 +447,35 @@ struct tc_netem_qopt
>  	__u32	jitter;		/* random jitter in latency (us) */
>  };
>  
> +struct tc_netem_stats
> +{
> +	int packetcount;
> +	int packetok;
> +	int normaldelay;
> +	int drops;
> +	int dupl;
> +	int corrupt;
> +	int novaliddata;
> +	int uninitialized;
> +	int bufferunderrun;
> +	int bufferinuseempty;
> +	int noemptybuffer;
> +	int readbehindbuffer;
> +	int buffer1_reloads;
> +	int buffer2_reloads;
> +	int tobuffer1_switch;
> +	int tobuffer2_switch;
> +	int switch_to_emptybuffer1;
> +	int switch_to_emptybuffer2;
> +};
> +
> +struct tc_netem_trace
> +{
> +	__u32   fid;             /*flowid */
> +	__u32   def;          	 /* default action 0 = no delay, 1 = drop*/
> +	__u32   ticks;	         /* number of ticks corresponding to 1ms */
> +};
> +
>  struct tc_netem_corr
>  {
>  	__u32	delay_corr;	/* delay correlation */
> diff --git a/net/sched/Kconfig b/net/sched/Kconfig
> index 8298ea9..aee4bc6 100644
> --- a/net/sched/Kconfig
> +++ b/net/sched/Kconfig
> @@ -232,6 +232,7 @@ config NET_SCH_DSMARK
>  
>  config NET_SCH_NETEM
>  	tristate "Network emulator (NETEM)"
> +	select CONFIGFS_FS
>  	---help---
>  	  Say Y if you want to emulate network delay, loss, and packet
>  	  re-ordering. This is often useful to simulate networks when
> diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
> index 45939ba..521b9e3 100644
> --- a/net/sched/sch_netem.c
> +++ b/net/sched/sch_netem.c
> @@ -11,6 +11,9 @@
>   *
>   * Authors:	Stephen Hemminger <shemminger@xxxxxxxx>
>   *		Catalin(ux aka Dino) BOIE <catab at umbrella dot ro>
> + *              netem trace enhancement: Ariane Keller <arkeller@xxxxxxxxxx> ETH Zurich
> + *                                       Rainer Baumann <baumann@xxxxxxxxxx> ETH Zurich
> + *                                       Ulrich Fiedler <fiedler@xxxxxxxxxxxxxx> ETH Zurich
>   */
>  
>  #include <linux/module.h>
> @@ -21,10 +24,16 @@ #include <linux/errno.h>
>  #include <linux/netdevice.h>
>  #include <linux/skbuff.h>
>  #include <linux/rtnetlink.h>
> +#include <linux/init.h>
> +#include <linux/slab.h>
> +#include <linux/configfs.h>
> +#include <linux/vmalloc.h>
>  
>  #include <net/pkt_sched.h>
>  
> -#define VERSION "1.2"
> +#include "net/flowseed.h"
> +
> +#define VERSION "1.3"
>  
>  /*	Network Emulation Queuing algorithm.
>  	====================================
> @@ -50,6 +59,11 @@ #define VERSION "1.2"
>  
>  	 The simulator is limited by the Linux timer resolution
>  	 and will create packet bursts on the HZ boundary (1ms).
> +
> +	 The trace option allows us to read the values for packet delay,
> +	 duplication, loss and corruption from a tracefile. This permits
> +	 the modulation of statistical properties such as long-range
> +	 dependences. See http://tcn.hypert.net.
>  */
>  
>  struct netem_sched_data {
> @@ -65,6 +79,11 @@ struct netem_sched_data {
>  	u32 duplicate;
>  	u32 reorder;
>  	u32 corrupt;
> +	u32 tcnstop;
> +	u32 trace;
> +	u32 ticks;
> +	u32 def;
> +	u32 newdataneeded;
>  
>  	struct crndstate {
>  		unsigned long last;
> @@ -72,9 +91,13 @@ struct netem_sched_data {
>  	} delay_cor, loss_cor, dup_cor, reorder_cor, corrupt_cor;
>  
>  	struct disttable {
> -		u32  size;
> +		u32 size;
>  		s16 table[0];
>  	} *delay_dist;
> +
> +	struct tcn_statistic *statistic;
> +	struct tcn_control *flowbuffer;
> +	wait_queue_head_t my_event;
>  };
>  
>  /* Time stamp put into socket buffer control block */
> @@ -82,6 +105,18 @@ struct netem_skb_cb {
>  	psched_time_t	time_to_send;
>  };
>  
> +
> +struct confdata {
> +	int fid;
> +	struct netem_sched_data * sched_data;
> +};
> +
> +static struct confdata map[MAX_FLOWS];
> +
> +#define MASK_BITS	29
> +#define MASK_DELAY	((1<<MASK_BITS)-1)
> +#define MASK_HEAD       ~MASK_DELAY
> +
>  /* init_crandom - initialize correlated random number generator
>   * Use entropy source for initial seed.
>   */
> @@ -139,6 +174,103 @@ static long tabledist(unsigned long mu, 
>  	return  x / NETEM_DIST_SCALE + (sigma / NETEM_DIST_SCALE) * t + mu;
>  }
>  
> +/* don't call this function directly. It is called after 
> + * a packet has been taken out of a buffer and it was the last. 
> + */
> +static int reload_flowbuffer (struct netem_sched_data *q)
> +{
> +	struct tcn_control *flow = q->flowbuffer;
> +
> +	if (flow->buffer_in_use == flow->buffer1) {
> +		flow->buffer1_empty = flow->buffer1;
> +		if (flow->buffer2_empty) {
> +			q->statistic->switch_to_emptybuffer2++;
> +			return -EFAULT;
> +		}
> +
> +		q->statistic->tobuffer2_switch++;
> +
> +		flow->buffer_in_use = flow->buffer2;
> +		flow->offsetpos = flow->buffer2;
> +
> +	} else {
> +		flow->buffer2_empty = flow->buffer2;
> +
> +		if (flow->buffer1_empty) {
> +		 	q->statistic->switch_to_emptybuffer1++;
> +			return -EFAULT;
> +		} 
> +
> +		q->statistic->tobuffer1_switch++;
> +
> +		flow->buffer_in_use = flow->buffer1;
> +		flow->offsetpos = flow->buffer1;
> +
> +	}
> +	/*the flowseed process can send more data*/
> +	q->tcnstop = 0;
> +	q->newdataneeded = 1;
> +	wake_up(&q->my_event);
> +	return 0;
> +}
> +
> +/* return pktdelay with delay and drop/dupl/corrupt option */
> +static int get_next_delay(struct netem_sched_data *q, enum tcn_flow *head)
> +{
> +	struct tcn_control *flow = q->flowbuffer;
> +	u32 variout;
> +
> +	/*choose whether to drop or 0 delay packets on default*/
> +	*head = q->def;
> +
> +	if (!flow) {
> +		printk(KERN_ERR "netem: read from an uninitialized flow.\n");
> +		q->statistic->uninitialized++;
> +		return 0;
> +	}
> +
> +	q->statistic->packetcount++;
> +
> +	/* check if we have to reload a buffer */
> +	if (flow->offsetpos - flow->buffer_in_use == DATA_PACKAGE)
> +		reload_flowbuffer(q);
> +
> +	/* sanity checks */
> +	if ((flow->buffer_in_use == flow->buffer1 && flow->validdataB1) 
> +	    || ( flow->buffer_in_use == flow->buffer2 && flow->validdataB2)) {
> +
> +		if (flow->buffer1_empty && flow->buffer2_empty) {
> +			q->statistic->bufferunderrun++;
> +			return 0;
> +		}
> +
> +		if (flow->buffer1_empty == flow->buffer_in_use ||
> +		    flow->buffer2_empty == flow->buffer_in_use) {
> +			q->statistic->bufferinuseempty++;
> +			return 0;
> +		}
> +
> +		if (flow->offsetpos - flow->buffer_in_use >=
> +		    DATA_PACKAGE) {
> +			q->statistic->readbehindbuffer++;
> +			return 0;
> +		}
> +		/*end of tracefile reached*/	
> +	} else {
> +		q->statistic->novaliddata++;
> +		return 0;
> +	}
> +
> +	/* now it's safe to read */
> +	variout = *flow->offsetpos++;
> +	*head = (variout & MASK_HEAD) >> MASK_BITS;
> +
> +	(&q->statistic->normaldelay)[*head] += 1;
> +	q->statistic->packetok++;
> +
> +	return ((variout & MASK_DELAY) * q->ticks) / 1000;
> +}
> +
>  /*
>   * Insert one skb into qdisc.
>   * Note: parent depends on return value to account for queue length.
> @@ -148,20 +280,25 @@ static long tabledist(unsigned long mu, 
>  static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
>  {
>  	struct netem_sched_data *q = qdisc_priv(sch);
> -	/* We don't fill cb now as skb_unshare() may invalidate it */
>  	struct netem_skb_cb *cb;
>  	struct sk_buff *skb2;
> -	int ret;
> -	int count = 1;
> +	enum tcn_flow action = FLOW_NORMAL;
> +	psched_tdiff_t delay;
> +	int ret, count = 1;
>  
>  	pr_debug("netem_enqueue skb=%p\n", skb);
>  
> -	/* Random duplication */
> -	if (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor))
> +	if (q->trace) 
> +		action = get_next_delay(q, &delay);
> +
> + 	/* Random duplication */
> +	if (q->trace ? action == FLOW_DUP :
> +	    (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor)))
>  		++count;
>  
>  	/* Random packet drop 0 => none, ~0 => all */
> -	if (q->loss && q->loss >= get_crandom(&q->loss_cor))
> +	if (q->trace ? action == FLOW_DROP :
> +	    (q->loss && q->loss >= get_crandom(&q->loss_cor)))
>  		--count;
>  
>  	if (count == 0) {
> @@ -190,7 +327,8 @@ static int netem_enqueue(struct sk_buff 
>  	 * If packet is going to be hardware checksummed, then
>  	 * do it now in software before we mangle it.
>  	 */
> -	if (q->corrupt && q->corrupt >= get_crandom(&q->corrupt_cor)) {
> +	if (q->trace ? action == FLOW_MANGLE :
> +	    (q->corrupt && q->corrupt >= get_crandom(&q->corrupt_cor))) {
>  		if (!(skb = skb_unshare(skb, GFP_ATOMIC))
>  		    || (skb->ip_summed == CHECKSUM_PARTIAL
>  			&& skb_checksum_help(skb))) {
> @@ -206,10 +344,10 @@ static int netem_enqueue(struct sk_buff 
>  	    || q->counter < q->gap 	/* inside last reordering gap */
>  	    || q->reorder < get_crandom(&q->reorder_cor)) {
>  		psched_time_t now;
> -		psched_tdiff_t delay;
>  
> -		delay = tabledist(q->latency, q->jitter,
> -				  &q->delay_cor, q->delay_dist);
> +		if (!q->trace)
> +			delay = tabledist(q->latency, q->jitter,
> +					  &q->delay_cor, q->delay_dist);
>  
>  		PSCHED_GET_TIME(now);
>  		PSCHED_TADD2(now, delay, cb->time_to_send);
> @@ -343,6 +481,65 @@ static int set_fifo_limit(struct Qdisc *
>  	return ret;
>  }
>  
> +static void reset_stats(struct netem_sched_data * q)
> +{
> +	memset(q->statistic, 0, sizeof(*(q->statistic)));
> +	return;
> +}
> +
> +static void free_flowbuffer(struct netem_sched_data * q)
> +{
> +	if (q->flowbuffer != NULL) {
> +		q->tcnstop = 1;
> +		q->newdataneeded = 1;
> +		wake_up(&q->my_event);
> +
> +		if (q->flowbuffer->buffer1 != NULL) {
> +			kfree(q->flowbuffer->buffer1);
> +		}
> +		if (q->flowbuffer->buffer2 != NULL) {
> +			kfree(q->flowbuffer->buffer2);
> +		}
> +		kfree(q->flowbuffer);
> +		kfree(q->statistic);
> +		q->flowbuffer = NULL;
> +		q->statistic = NULL;
> +	}
> +}
> +
> +static int init_flowbuffer(unsigned int fid, struct netem_sched_data * q)
> +{
> +	int i, flowid = -1;
> +
> +	q->statistic = kzalloc(sizeof(*(q->statistic)), GFP_KERNEL;
> +	init_waitqueue_head(&q->my_event);
> +
> +	for(i = 0; i < MAX_FLOWS; i++) {
> +		if(map[i].fid == 0) {
> +			flowid = i;
> +			map[i].fid = fid;
> +			map[i].sched_data = q;
> +			break;
> +		}
> +	}
> +
> +	if (flowid != -1) {
> +		q->flowbuffer = kmalloc(sizeof(*(q->flowbuffer)), GFP_KERNEL);
> +		q->flowbuffer->buffer1 = kmalloc(DATA_PACKAGE, GFP_KERNEL);
> +		q->flowbuffer->buffer2 = kmalloc(DATA_PACKAGE, GFP_KERNEL);
> +
> +		q->flowbuffer->buffer_in_use = q->flowbuffer->buffer1;
> +		q->flowbuffer->offsetpos = q->flowbuffer->buffer1;
> +		q->flowbuffer->buffer1_empty = q->flowbuffer->buffer1;
> +		q->flowbuffer->buffer2_empty = q->flowbuffer->buffer2;
> +		q->flowbuffer->flowid = flowid; 
> +		q->flowbuffer->validdataB1 = 0;
> +		q->flowbuffer->validdataB2 = 0;
> +	}
> +
> +	return flowid;
> +}
> +
>  /*
>   * Distribution data is a variable size payload containing
>   * signed 16 bit values.
> @@ -414,6 +611,32 @@ static int get_corrupt(struct Qdisc *sch
>  	return 0;
>  }
>  
> +static int get_trace(struct Qdisc *sch, const struct rtattr *attr)
> +{
> +	struct netem_sched_data *q = qdisc_priv(sch);
> +	const struct tc_netem_trace *traceopt = RTA_DATA(attr);
> +
> +	if (RTA_PAYLOAD(attr) != sizeof(*traceopt))
> +		return -EINVAL;
> +
> +	if (traceopt->fid) {
> +		/*correction us -> ticks*/
> +		q->ticks = traceopt->ticks;
> +		int ind;
> +		ind = init_flowbuffer(traceopt->fid, q);
> +		if(ind < 0) {
> +			printk("netem: maximum number of traces:%d"
> +			       " change in net/flowseedprocfs.h\n", MAX_FLOWS);
> +			return -EINVAL;
> +		}
> +		q->trace = ind + 1;
> +
> +	} else
> +		q->trace = 0;
> +	q->def = traceopt->def;
> +	return 0;
> +}
> +
>  /* Parse netlink message to set options */
>  static int netem_change(struct Qdisc *sch, struct rtattr *opt)
>  {
> @@ -431,6 +654,14 @@ static int netem_change(struct Qdisc *sc
>  		return ret;
>  	}
>  	
> +	if (q->trace) {
> +		int temp = q->trace - 1;
> +		q->trace = 0;
> +		map[temp].fid = 0;
> +		reset_stats(q);
> +		free_flowbuffer(q);
> +	}
> +
>  	q->latency = qopt->latency;
>  	q->jitter = qopt->jitter;
>  	q->limit = qopt->limit;
> @@ -477,6 +708,11 @@ static int netem_change(struct Qdisc *sc
>  			if (ret)
>  				return ret;
>  		}
> +		if (tb[TCA_NETEM_TRACE-1]) {
> +			ret = get_trace(sch, tb[TCA_NETEM_TRACE-1]);
> +			if (ret)
> +				return ret;
> +		}
>  	}
>  
>  	return 0;
> @@ -572,6 +808,7 @@ static int netem_init(struct Qdisc *sch,
>  	q->timer.function = netem_watchdog;
>  	q->timer.data = (unsigned long) sch;
>  
> +	q->trace = 0;
>  	q->qdisc = qdisc_create_dflt(sch->dev, &tfifo_qdisc_ops);
>  	if (!q->qdisc) {
>  		pr_debug("netem: qdisc create failed\n");
> @@ -590,6 +827,12 @@ static void netem_destroy(struct Qdisc *
>  {
>  	struct netem_sched_data *q = qdisc_priv(sch);
>  
> +	if (q->trace) {
> +		int temp = q->trace - 1;
> +		q->trace = 0;
> +		map[temp].fid = 0;
> +		free_flowbuffer(q);
> +	}
>  	del_timer_sync(&q->timer);
>  	qdisc_destroy(q->qdisc);
>  	kfree(q->delay_dist);
> @@ -604,6 +847,7 @@ static int netem_dump(struct Qdisc *sch,
>  	struct tc_netem_corr cor;
>  	struct tc_netem_reorder reorder;
>  	struct tc_netem_corrupt corrupt;
> +	struct tc_netem_trace traceopt;
>  
>  	qopt.latency = q->latency;
>  	qopt.jitter = q->jitter;
> @@ -626,6 +870,35 @@ static int netem_dump(struct Qdisc *sch,
>  	corrupt.correlation = q->corrupt_cor.rho;
>  	RTA_PUT(skb, TCA_NETEM_CORRUPT, sizeof(corrupt), &corrupt);
>  
> +	traceopt.fid = q->trace;
> +	traceopt.def = q->def;
> +	traceopt.ticks = q->ticks;
> +	RTA_PUT(skb, TCA_NETEM_TRACE, sizeof(traceopt), &traceopt);
> +
> +	if (q->trace) {
> +		struct tc_netem_stats tstats;
> +
> +		tstats.packetcount = q->statistic->packetcount;
> +		tstats.packetok = q->statistic->packetok;
> +		tstats.normaldelay = q->statistic->normaldelay;
> +		tstats.drops = q->statistic->drops;
> +		tstats.dupl = q->statistic->dupl;
> +		tstats.corrupt = q->statistic->corrupt;
> +		tstats.novaliddata = q->statistic->novaliddata;
> +		tstats.uninitialized = q->statistic->uninitialized;
> +		tstats.bufferunderrun = q->statistic->bufferunderrun;
> +		tstats.bufferinuseempty = q->statistic->bufferinuseempty;
> +		tstats.noemptybuffer = q->statistic->noemptybuffer;
> +		tstats.readbehindbuffer = q->statistic->readbehindbuffer;
> +		tstats.buffer1_reloads = q->statistic->buffer1_reloads;
> +		tstats.buffer2_reloads = q->statistic->buffer2_reloads;
> +		tstats.tobuffer1_switch = q->statistic->tobuffer1_switch;
> +		tstats.tobuffer2_switch = q->statistic->tobuffer2_switch;
> +		tstats.switch_to_emptybuffer1 = q->statistic->switch_to_emptybuffer1;
> +		tstats.switch_to_emptybuffer2 = q->statistic->switch_to_emptybuffer2;
> +		RTA_PUT(skb, TCA_NETEM_STATS, sizeof(tstats), &tstats);
> +	}
> +
>  	rta->rta_len = skb->tail - b;
>  
>  	return skb->len;
> @@ -709,6 +982,173 @@ static struct tcf_proto **netem_find_tcf
>  	return NULL;
>  }
>  
> +/*configfs to read tcn delay values from userspace*/
> +struct tcn_flow {
> +	struct config_item item;
> +};
> +
> +static struct tcn_flow *to_tcn_flow(struct config_item *item)
> +{
> +	return item ? container_of(item, struct tcn_flow, item) : NULL;
> +}
> +
> +static struct configfs_attribute tcn_flow_attr_storeme = {
> +	.ca_owner = THIS_MODULE,
> +	.ca_name = "delayvalue",
> +	.ca_mode = S_IRUGO | S_IWUSR,
> +};
> +
> +static struct configfs_attribute *tcn_flow_attrs[] = {
> +	&tcn_flow_attr_storeme,
> +	NULL,
> +};
> +
> +static ssize_t tcn_flow_attr_store(struct config_item *item,
> +				       struct configfs_attribute *attr,
> +				       const char *page, size_t count)
> +{
> +	char *p = (char *)page;
> +	int fid, i, validData = 0;
> +	int flowid = -1;
> +	struct tcn_control *checkbuf;
> +
> +	if (count != DATA_PACKAGE_ID) {
> +		printk("netem: Unexpected data received. %d\n", count);
> +		return -EMSGSIZE;
> +	}
> +
> +	memcpy(&fid, p + DATA_PACKAGE, sizeof(int));
> +	memcpy(&validData, p + DATA_PACKAGE + sizeof(int), sizeof(int));
> +
> +	/* check whether this flow is registered */
> +	for (i = 0; i < MAX_FLOWS; i++) {
> +		if (map[i].fid == fid) {
> +			flowid = i;
> +			break;
> +		}
> +	}
> +	/* exit if flow is not registered */
> +	if (flowid < 0) {
> +		printk("netem: Invalid FID received. Killing process.\n");
> +		return -EINVAL;
> +	}
> +
> +	checkbuf = map[flowid].sched_data->flowbuffer;
> +	if (checkbuf == NULL) {
> +		printk("netem: no flow registered");
> +		return -ENOBUFS;
> +	}
> +
> +	/* check if flowbuffer has empty buffer and copy data into it */
> +	if (checkbuf->buffer1_empty != NULL) {
> +		memcpy(checkbuf->buffer1, p, DATA_PACKAGE);
> +		checkbuf->buffer1_empty = NULL;
> +		checkbuf->validdataB1 = validData;
> +		map[flowid].sched_data->statistic->buffer1_reloads++;
> +
> +	} else if (checkbuf->buffer2_empty != NULL) {
> +		memcpy(checkbuf->buffer2, p, DATA_PACKAGE);
> +		checkbuf->buffer2_empty = NULL;
> +		checkbuf->validdataB2 = validData;
> +		map[flowid].sched_data->statistic->buffer2_reloads++;
> +
> +	} else {
> +		printk("netem: flow %d: no empty buffer. data loss.\n", flowid);
> +		map[flowid].sched_data->statistic->noemptybuffer++;
> +	}
> +
> +	if (validData) {
> +		/* on initialization both buffers need data */
> +		if (checkbuf->buffer2_empty != NULL) {
> +			return DATA_PACKAGE_ID;
> +		}
> +		/* wait until new data is needed */
> +		wait_event(map[flowid].sched_data->my_event,
> +			   map[flowid].sched_data->newdataneeded);
> +		map[flowid].sched_data->newdataneeded = 0;
> +
> +	}
> +
> +	if (map[flowid].sched_data->tcnstop) {
> +		return -ECANCELED;
> +	}
> +
> +	return DATA_PACKAGE_ID;
> +
> +}
> +
> +static void tcn_flow_release(struct config_item *item)
> +{
> +	kfree(to_tcn_flow(item));
> +
> +}
> +
> +static struct configfs_item_operations tcn_flow_item_ops = {
> +	.release = tcn_flow_release,
> +	.store_attribute = tcn_flow_attr_store,
> +};
> +
> +static struct config_item_type tcn_flow_type = {
> +	.ct_item_ops = &tcn_flow_item_ops,
> +	.ct_attrs = tcn_flow_attrs,
> +	.ct_owner = THIS_MODULE,
> +};
> +
> +static struct config_item * tcn_make_item(struct config_group *group,
> +						     const char *name)
> +{
> +	struct tcn_flow *tcn_flow;
> +
> +	tcn_flow = kmalloc(sizeof(struct tcn_flow), GFP_KERNEL);
> +	if (!tcn_flow)
> +		return NULL;
> +
> +	memset(tcn_flow, 0, sizeof(struct tcn_flow));
> +
> +	config_item_init_type_name(&tcn_flow->item, name,
> +				   &tcn_flow_type);
> +	return &tcn_flow->item;
> +}
> +
> +static struct configfs_group_operations tcn_group_ops = {
> +	.make_item = tcn_make_item,
> +};
> +
> +static struct config_item_type tcn_type = {
> +	.ct_group_ops = &tcn_group_ops,
> +	.ct_owner = THIS_MODULE,
> +};
> +
> +static struct configfs_subsystem tcn_subsys = {
> +	.su_group = {
> +		     .cg_item = {
> +				 .ci_namebuf = "tcn",
> +				 .ci_type = &tcn_type,
> +				 },
> +		     },
> +};
> +
> +static __init int configfs_init(void)
> +{
> +	int ret;
> +	struct configfs_subsystem *subsys = &tcn_subsys;
> +
> +	config_group_init(&subsys->su_group);
> +	init_MUTEX(&subsys->su_sem);
> +	ret = configfs_register_subsystem(subsys);
> +	if (ret) {
> +		printk(KERN_ERR "Error %d while registering subsystem %s\n",
> +		       ret, subsys->su_group.cg_item.ci_namebuf);
> +		configfs_unregister_subsystem(&tcn_subsys);
> +	}
> +	return ret;
> +}
> +
> +static void configfs_exit(void)
> +{
> +	configfs_unregister_subsystem(&tcn_subsys);
> +}
> +
>  static struct Qdisc_class_ops netem_class_ops = {
>  	.graft		=	netem_graft,
>  	.leaf		=	netem_leaf,
> @@ -740,11 +1180,17 @@ static struct Qdisc_ops netem_qdisc_ops 
>  
>  static int __init netem_module_init(void)
>  {
> +	int err;
> +
>  	pr_info("netem: version " VERSION "\n");
> +	err = configfs_init();
> +	if (err)
> +		return err;
>  	return register_qdisc(&netem_qdisc_ops);
>  }
>  static void __exit netem_module_exit(void)
>  {
> +	configfs_exit();
>  	unregister_qdisc(&netem_qdisc_ops);
>  }
>  module_init(netem_module_init)
>   
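
[Editorial sketch, not part of the patch.] get_next_delay() in the quoted patch reads one 32-bit word per packet from the trace buffer: the low MASK_BITS (29) bits carry the delay, the top three bits carry the action, and the delay is scaled by q->ticks / 1000. The stand-alone sketch below shows that packing and unpacking. The helper names and the numeric action codes (0 = normal, 1 = drop, 2 = duplicate, 3 = corrupt) are assumptions; the codes are inferred from the order of the statistics counters, since the real enum tcn_flow lives in net/flowseed.h, which is not shown. The sketch also widens the multiplication to 64 bits, whereas the patch multiplies in 32 bits.

```c
#include <stdint.h>

#define MASK_BITS	29
#define MASK_DELAY	((UINT32_C(1) << MASK_BITS) - 1)	/* low 29 bits */
#define MASK_HEAD	(~MASK_DELAY)				/* top 3 bits */

/* Assumed action codes; the real enum tcn_flow is defined elsewhere. */
enum trace_action {
	TRACE_NORMAL = 0,
	TRACE_DROP = 1,
	TRACE_DUP = 2,
	TRACE_MANGLE = 3
};

/* Build one trace word from an action and a delay (truncated to 29 bits). */
static uint32_t trace_pack(enum trace_action action, uint32_t delay_us)
{
	return ((uint32_t)action << MASK_BITS) | (delay_us & MASK_DELAY);
}

/* Extract the action stored in the top three bits. */
static enum trace_action trace_action_of(uint32_t word)
{
	return (enum trace_action)((word & MASK_HEAD) >> MASK_BITS);
}

/*
 * Convert the stored delay into scheduler ticks, where ticks_per_ms is the
 * number of ticks corresponding to 1 ms (the "ticks" field of the patch).
 * A 64-bit intermediate avoids overflow for large delays.
 */
static uint32_t trace_delay_ticks(uint32_t word, uint32_t ticks_per_ms)
{
	return (uint32_t)(((uint64_t)(word & MASK_DELAY) * ticks_per_ms) / 1000);
}
```

With 29 bits the largest delay a single trace word can hold is 2^29 - 1 units, roughly 537 seconds if the unit is microseconds.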



From shemminger at osdl.org  Tue Sep 26 13:45:31 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem:
 kernelspace
In-Reply-To: <45198AF5.9090909@xxxxxxxxxxxxxx>
References: <4514DC9A.2000505@xxxxxxxxxxxxxx>
	<20060925132800.09856e10@xxxxxxxxxxxxxxxxx>
	<45198AF5.9090909@xxxxxxxxxxxxxx>
Message-ID: <20060926134531.3ec4991a@freekitty>

On Tue, 26 Sep 2006 22:17:57 +0200
Rainer Baumann <baumann@xxxxxxxxxxxxxx> wrote:

> Hi Stephen,
> 
> We merged your changes into our patch
> http://tcn.hypert.net/tcn_kernel_2_6_18.patch
> Please let us know if we should make further adaptations to our
> implementation and/or resubmit the adapted patch.
> 
> Cheers+thanx,
> Rainer

I'll test it out and send it off to Dave for 2.6.20. 2.6.19 is so in
flux right now that adding more does not seem like a good idea.

From davem at davemloft.net  Tue Sep 26 14:03:21 2006
From: davem at davemloft.net (David Miller)
Date: Wed Apr 18 12:51:19 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem:
 kernelspace
In-Reply-To: <20060926134531.3ec4991a@freekitty>
References: <20060925132800.09856e10@xxxxxxxxxxxxxxxxx>
	<45198AF5.9090909@xxxxxxxxxxxxxx>
	<20060926134531.3ec4991a@freekitty>
Message-ID: <20060926.140321.70217341.davem@xxxxxxxxxxxxx>

From: Stephen Hemminger <shemminger@xxxxxxxx>
Date: Tue, 26 Sep 2006 13:45:31 -0700

> I'll test it out and send it off to Dave for 2.6.20. 2.6.19 is so in
> flux right now that adding more does not seem like a good idea.

I'm willing to accept anything reasonable until approximately
this weekend.

From shemminger at osdl.org  Tue Sep 26 16:02:38 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 12:51:19 2007
Subject: status of  phpnetemgui?
In-Reply-To: <p062309cac13f5951821f@[171.69.52.91]>
References: <p062309cac13f5951821f@[171.69.52.91]>
Message-ID: <20060926160238.04b1e8fc@freekitty>

On Tue, 26 Sep 2006 17:31:31 -0500
"Lawrence D. Dunn" <ldunn@xxxxxxxxx> wrote:

> Stephen,
>    Hi- I'm Larry Dunn (day job at Cisco),
>    writing to see if phpnetemgui is still around,
>    or has evolved/been_replaced.
>    I'd be using it for a networking class
>    I teach at University of Minnesota (night job). ;-)
> 
>    From your LCA2005_netem paper, I checked:
> 
>    http://www.smyles.plus.com/phpnetemgui/
> 
>    but that page shows up as not-found,
>    and a couple of google searches don't show a new location for it.
>    I'll have students setting delay and loss for a fairly
>    easy experiment (and using web100 to see the impact of buffer tuning).
>    I can resort to using the tc-commands directly, but was wondering
>    if you know the status of the GUI?
> 


If someone has a copy, I'll host it at osdl and add a link in the Wiki.


-- 
Stephen Hemminger <shemminger@xxxxxxxx>

From shemminger at osdl.org  Fri Sep 29 10:35:26 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 12:51:19 2007
Subject: Netem and HRTimers ?
In-Reply-To: <20060929171541.GA5745@xxxxxxxxxxxxxxxxxxxxx>
References: <20060929165419.GA4803@xxxxxxxxxxxxxxxxxxxxx>
	<20060929101316.12e85a6f@freekitty>
	<20060929171541.GA5745@xxxxxxxxxxxxxxxxxxxxx>
Message-ID: <20060929103526.2530894b@freekitty>

On Fri, 29 Sep 2006 19:15:41 +0200
Lucas Nussbaum <lucas.nussbaum@xxxxxxx> wrote:

> On 29/09/06 at 10:13 -0700, Stephen Hemminger wrote:
> > On Fri, 29 Sep 2006 18:54:19 +0200
> > Lucas Nussbaum <lucas.nussbaum@xxxxxxx> wrote:
> > 
> > > Hi,
> > > 
> > > I am currently working on a paper comparing Dummynet, NISTNet and
> > > TC/Netem, both regarding features and regarding precision/performance.
> > > 
> > > My experiments show how important precise timing is when doing network
> > > emulation, and precision with HZ=1000 is not that good compared to
> > > NISTNet (which uses the RTC configured at 8192 Hz) or Dummynet (which
> > > can run on FreeBSD with HZ=10000). I understand that increasing HZ to
> > > e.g. 10000 in Linux is not really an option, both because many parts of
> > > the kernel assume that HZ is "small", and because of the performance
> > > impact of such a setting.
> > > 
> > > Another solution could be to use the high resolution timers
> > > infrastructure. Have you already considered that for netem? Do you think
> > > it would be applicable to Netem? If yes, are you planning to work on
> > > this?
> > 
> > I have a lightly tested version using hrtimers. If you want to play
> > with it, I'll send it.
>  
> Hi,
> 
> That would be great, thank you.
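
[Editorial sketch, not part of the thread.] The precision argument above can be made concrete with a little arithmetic: a timer that only fires on tick boundaries turns a requested delay into a whole number of ticks, so at HZ=1000 the granularity is 1 ms, while a nanosecond-based timer can keep the value as requested. The helpers below are hypothetical and assume the delay is rounded up to the next tick and that hz is non-zero; they are not taken from the kernel.

```c
#include <stdint.h>

/*
 * Round a requested delay up to a whole number of timer ticks at frequency
 * hz and return the delay that would actually be applied, in nanoseconds.
 * Assumes hz > 0 and round-up behaviour.
 */
static uint64_t tick_quantized_ns(uint64_t delay_ns, uint32_t hz)
{
	uint64_t tick_ns = UINT64_C(1000000000) / hz;
	uint64_t ticks = (delay_ns + tick_ns - 1) / tick_ns;

	return ticks * tick_ns;
}

/* Extra delay introduced by the tick rounding, in nanoseconds. */
static uint64_t quantization_error_ns(uint64_t delay_ns, uint32_t hz)
{
	return tick_quantized_ns(delay_ns, hz) - delay_ns;
}
```

Under these assumptions a requested 1.5 ms delay becomes 2 ms at HZ=1000 (0.5 ms of error), but stays at 1.5 ms at HZ=10000.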

Here is where it was when I last left it...

--- rt-netem.orig/net/sched/sch_netem.c
+++ rt-netem/net/sched/sch_netem.c
@@ -25,7 +25,7 @@
 
 #include <net/pkt_sched.h>
 
-#define VERSION "1.2"
+#define VERSION "1.2-rt"
 
 /*	Network Emulation Queuing algorithm.
 	====================================
@@ -55,7 +55,7 @@
 
 struct netem_sched_data {
 	struct Qdisc	*qdisc;
-	struct timer_list timer;
+	struct hrtimer   timer;
 
 	u32 latency;
 	u32 loss;
@@ -80,7 +80,7 @@ struct netem_sched_data {
 
 /* Time stamp put into socket buffer control block */
 struct netem_skb_cb {
-	psched_time_t	time_to_send;
+	ktime_t	due_time;
 };
 
 /* init_crandom - initialize correlated random number generator
@@ -204,14 +204,15 @@ static int netem_enqueue(struct sk_buff 
 	if (q->gap == 0 		/* not doing reordering */
 	    || q->counter < q->gap 	/* inside last reordering gap */
 	    || q->reorder < get_crandom(&q->reorder_cor)) {
-		psched_time_t now;
-		psched_tdiff_t delay;
+		u32 us;
 
-		delay = tabledist(q->latency, q->jitter,
+		us = tabledist(q->latency, q->jitter,
 				  &q->delay_cor, q->delay_dist);
 
-		PSCHED_GET_TIME(now);
-		PSCHED_TADD2(now, delay, cb->time_to_send);
+
+		cb->due_time = ktime_add_ns(get_monotonic_clock(),
+					    (u64) us * 1000u);
+
 		++q->counter;
 		ret = q->qdisc->enqueue(skb, q->qdisc);
 	} else {
@@ -219,7 +220,7 @@ static int netem_enqueue(struct sk_buff 
 		 * Do re-ordering by putting one out of N packets at the front
 		 * of the queue.
 		 */
-		PSCHED_GET_TIME(cb->time_to_send);
+		cb->due_time = get_monotonic_clock();
 		q->counter = 0;
 		ret = q->qdisc->ops->requeue(skb, q->qdisc);
 	}
@@ -270,44 +271,46 @@ static struct sk_buff *netem_dequeue(str
 	if (skb) {
 		const struct netem_skb_cb *cb
 			= (const struct netem_skb_cb *)skb->cb;
-		psched_time_t now;
+		ktime_t now = get_monotonic_clock();
+		s64 delta;
 
-		/* if more time remaining? */
-		PSCHED_GET_TIME(now);
+		delta = ktime_to_ns(ktime_sub(cb->due_time, now));
 
-		if (PSCHED_TLESS(cb->time_to_send, now)) {
+		/* if more time remaining? */
+		if (delta <= 0) {
 			pr_debug("netem_dequeue: return skb=%p\n", skb);
 			sch->q.qlen--;
 			sch->flags &= ~TCQ_F_THROTTLED;
 			return skb;
-		} else {
-			psched_tdiff_t delay = PSCHED_TDIFF(cb->time_to_send, now);
-
-			if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) {
-				sch->qstats.drops++;
+		}
 
-				/* After this qlen is confused */
-				printk(KERN_ERR "netem: queue discpline %s could not requeue\n",
-				       q->qdisc->ops->id);
+		if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) {
+			sch->qstats.drops++;
 
-				sch->q.qlen--;
-			}
+			/* After this qlen is confused */
+			printk(KERN_ERR "netem: queue discpline %s could not requeue\n",
+			       q->qdisc->ops->id);
 
-			mod_timer(&q->timer, jiffies + PSCHED_US2JIFFIE(delay));
-			sch->flags |= TCQ_F_THROTTLED;
+			sch->q.qlen--;
 		}
+
+		hrtimer_start(&q->timer, ktime_add_ns(now, delta), HRTIMER_ABS);
+		sch->flags |= TCQ_F_THROTTLED;
 	}
 
 	returNULL;
 }
 
-static void netem_watchdog(unsigned long arg)
+static innetem_watchdog(struchrtimer *hrt)
 {
-	strucQdisc *sch = (strucQdisc *)arg;
+	strucnetem_sched_data *q
+		= container_of(hrt, strucnetem_sched_data, timer);
+	strucQdisc *sch = q->qdisc;
 
 	pr_debug("netem_watchdog qlen=%d\n", sch->q.qlen);
 	sch->flags &= ~TCQ_F_THROTTLED;
 	netif_schedule(sch->dev);
+	returHRTIMER_NORESTART;
 }
 
 static void netem_reset(strucQdisc *sch)
@@ -317,7 +320,7 @@ static void netem_reset(strucQdisc *sc
 	qdisc_reset(q->qdisc);
 	sch->q.qle= 0;
 	sch->flags &= ~TCQ_F_THROTTLED;
-	del_timer_sync(&q->timer);
+	hrtimer_cancel(&q->timer);
 }
 
 /* Pass size change message down to embedded FIFO */
@@ -430,8 +433,9 @@ static int netem_change(struct Qdisc *sc
 		return ret;
 	}
 	
-	q->latency = qopt->latency;
-	q->jitter = qopt->jitter;
+	/* Note: we force PSCHED clock to use gettimeofday so these are in us. */
+	q->latency = psched_ticks2usecs(qopt->latency);
+	q->jitter = psched_ticks2usecs(qopt->jitter);
 	q->limit = qopt->limit;
 	q->gap = qopt->gap;
 	q->counter = 0;
@@ -502,7 +506,8 @@ static int tfifo_enqueue(struct sk_buff
 			const struct netem_skb_cb *cb
 				= (const struct netem_skb_cb *)skb->cb;
 
-			if (!PSCHED_TLESS(ncb->time_to_send, cb->time_to_send))
+			if (ktime_to_ns(ktime_sub(ncb->due_time,
+						  cb->due_time)) >= 0)
 				break;
 		}
 
@@ -567,9 +572,8 @@ static int netem_init(struct Qdisc *sch,
 	if (!opt)
 		return -EINVAL;
 
-	init_timer(&q->timer);
+	hrtimer_init(&q->timer, CLOCK_MONOTONIC, HRTIMER_ABS);
 	q->timer.function = netem_watchdog;
-	q->timer.data = (unsigned long) sch;
 
 	q->qdisc = qdisc_create_dflt(sch->dev, &tfifo_qdisc_ops);
 	if (!q->qdisc) {
@@ -589,7 +593,7 @@ static void netem_destroy(struct Qdisc *
 {
 	struct netem_sched_data *q = qdisc_priv(sch);
 
-	del_timer_sync(&q->timer);
+	hrtimer_cancel(&q->timer);
 	qdisc_destroy(q->qdisc);
 	kfree(q->delay_dist);
 }
@@ -604,8 +608,8 @@ static int netem_dump(struct Qdisc *sch,
 	struct tc_netem_reorder reorder;
 	struct tc_netem_corrupt corrupt;
 
-	qopt.latency = q->latency;
-	qopt.jitter = q->jitter;
+	qopt.latency = psched_usecs2ticks(q->latency);
+	qopt.jitter = psched_usecs2ticks(q->jitter);
 	qopt.limit = q->limit;
 	qopt.loss = q->loss;
 	qopt.gap = q->gap;
--- rt-netem.orig/include/net/pkt_sched.h
+++ rt-netem/include/net/pkt_sched.h
@@ -238,4 +238,7 @@ static inline unsigned psched_mtu(struct
 	return dev->hard_header ? mtu + dev->hard_header_len : mtu;
 }
 
+extern unsigned long psched_ticks2usec(unsigned long ticks);
+extern unsigned long psched_usec2ticks(unsigned long us);
+
 #endif
--- rt-netem.orig/net/sched/sch_api.c
+++ rt-netem/net/sched/sch_api.c
@@ -43,6 +43,7 @@
 #include <asm/processor.h>
 #include <asm/uaccess.h>
 #include <asm/system.h>
+#include <asm/div64.h>
 
 static int qdisc_notify(struct sk_buff *oskb, struct nlmsghdr *n, u32 clid,
 			struct Qdisc *old, struct Qdisc *new);
@@ -1154,6 +1155,28 @@ reclassify:
 static int psched_us_per_tick = 1;
 static int psched_tick_per_us = 1;
 
+/* Convert from scaled PSCHED ticks to real time usecs */
+unsigned long psched_ticks2usecs(unsigned long ticks)
+{
+	u64 t = ticks;
+
+	t *= psched_us_per_tick;
+	do_div(t, psched_tick_per_us);
+	return t;
+}
+EXPORT_SYMBOL(psched_ticks2usecs);
+
+/* Convert from usecs to scaled PSCHED ticks */
+unsigned long psched_usecs2ticks(unsigned long us)
+{
+	u64 t = us;
+
+	t *= psched_tick_per_us;
+	do_div(t, psched_us_per_tick);
+	return t;
+}
+EXPORT_SYMBOL(psched_usecs2ticks);
+
 #ifdef CONFIG_PROC_FS
 static int psched_show(struct seq_file *seq, void *v)
 {
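
For readers following the tick/microsecond helpers added to sch_api.c in
the patch above, here is a small userspace sketch of the same scaling. It
is only an illustration: struct psched_scale and the function names are
hypothetical stand-ins for the kernel's psched_us_per_tick and
psched_tick_per_us, and plain 64-bit division replaces do_div(). The point
it shows is that the product is widened to 64 bits before dividing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's two scaling factors. */
struct psched_scale {
	uint32_t us_per_tick;
	uint32_t tick_per_us;
};

/* ticks -> microseconds: widen to 64 bits, multiply, then divide. */
static uint64_t ticks_to_usecs(const struct psched_scale *s, uint32_t ticks)
{
	assert(s->tick_per_us != 0);
	return (uint64_t)ticks * s->us_per_tick / s->tick_per_us;
}

/* microseconds -> ticks: the inverse scaling, same widening. */
static uint64_t usecs_to_ticks(const struct psched_scale *s, uint32_t us)
{
	assert(s->us_per_tick != 0);
	return (uint64_t)us * s->tick_per_us / s->us_per_tick;
}
```

With both factors equal to 1, as in the context lines of the patch, the
conversion is the identity, so a 20000 tick latency is 20000 us.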

From shemminger at osdl.org  Fri Sep 29 11:08:01 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 12:51:19 2007
Subject: Netem and HRTimers ?
In-Reply-To: <20060929171541.GA5745@xxxxxxxxxxxxxxxxxxxxx>
References: <20060929165419.GA4803@xxxxxxxxxxxxxxxxxxxxx>
	<20060929101316.12e85a6f@freekitty>
	<20060929171541.GA5745@xxxxxxxxxxxxxxxxxxxxx>
Message-ID: <20060929110801.0716df79@freekitty>

On Fri, 29 Sep 2006 19:15:41 +0200
Lucas Nussbaum <lucas.nussbaum@xxxxxxx> wrote:

> On 29/09/06 at 10:13 -0700, Stephen Hemminger wrote:
> > On Fri, 29 Sep 2006 18:54:19 +0200
> > Lucas Nussbaum <lucas.nussbaum@xxxxxxx> wrote:
> > 
> > > Hi,
> > > 
> > > I am currently working on a paper comparing Dummynet, NISTNet and
> > > TC/Netem, both regarding features and regarding precision/performance.
> > > 
> > > My experiments show how important precise timing is when doing network
> > > emulation, and precision with HZ=1000 is not that good compared to
> > > NISTNet (which uses the RTC configured at 8192 Hz) or Dummynet (which
> > > can run on FreeBSD with HZ=10000). I understand that increasing HZ to
> > > e.g. 10000 in Linux is not really an option, both because many parts of
> > > the kernel assume that HZ is "small", and because of the performance
> > > impact of such a setting.
> > > 
> > > Another solution could be to use the high resolution timers
> > > infrastructure. Have you already considered that for netem? Do you think
> > > it would be applicable to Netem? If yes, are you planning to work on
> > > this?
> > 
> > I have a lightly tested version using hrtimers. If you want to play
> > with it, I'll send it.
>  
> Hi,
> 
> That would be great, thank you.
> 
> Which kernel version do you target for inclusion?

I fixed some typos and it builds against 2.6.18-rt5...
NOT tested, but it is a starting point.
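
To illustrate the precision concern raised in the quoted message, here is
a rough userspace sketch (not kernel code) of how a requested delay gets
stretched when a timer can only fire on a tick boundary. It assumes the
delay is rounded up to a whole number of 1/HZ ticks; the function names
are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Number of whole timer ticks needed to cover delay_us at a given HZ,
 * assuming round-up to the next tick boundary. */
static uint64_t ticks_for_delay(uint64_t delay_us, uint32_t hz)
{
	uint64_t us_per_tick;

	assert(hz > 0 && hz <= 1000000);
	us_per_tick = 1000000u / hz;
	return (delay_us + us_per_tick - 1) / us_per_tick;
}

/* Delay actually obtained once the request is quantized to ticks. */
static uint64_t effective_delay_us(uint64_t delay_us, uint32_t hz)
{
	return ticks_for_delay(delay_us, hz) * (1000000u / hz);
}
```

Under that assumption, a 1500 us request becomes 2000 us at HZ=1000 but
stays 1500 us at HZ=10000, which is the kind of error a nanosecond
resolution timer is meant to avoid.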

---
 include/net/pkt_sched.h |    3 +
 kernel/hrtimer.c        |    1 
 net/sched/sch_api.c     |   23 ++++++++++++++
 net/sched/sch_netem.c   |   77 ++++++++++++++++++++++++------------------------
 4 files changed, 67 insertions(+), 37 deletions(-)

--- linux-2.6.18-rt.orig/net/sched/sch_netem.c	2006-09-19 20:42:06.000000000 -0700
+++ linux-2.6.18-rt/net/sched/sch_netem.c	2006-09-29 11:06:11.000000000 -0700
@@ -24,7 +24,7 @@
 
 #include <net/pkt_sched.h>
 
-#define VERSION "1.2"
+#define VERSION "1.2-rt"
 
 /*	Network Emulation Queuing algorithm.
 	====================================
@@ -54,7 +54,7 @@
 
 struct netem_sched_data {
 	struct Qdisc	*qdisc;
-	struct timer_list timer;
+	struct hrtimer   timer;
 
 	u32 latency;
 	u32 loss;
@@ -79,7 +79,7 @@
 
 /* Time stamp put into socket buffer control block */
 struct netem_skb_cb {
-	psched_time_t	time_to_send;
+	ktime_t	due_time;
 };
 
 /* init_crandom - initialize correlated random number generator
@@ -205,14 +205,14 @@
 	if (q->gap == 0 		/* not doing reordering */
 	    || q->counter < q->gap 	/* inside last reordering gap */
 	    || q->reorder < get_crandom(&q->reorder_cor)) {
-		psched_time_t now;
-		psched_tdiff_t delay;
+		u64 ns;
 
-		delay = tabledist(q->latency, q->jitter,
-				  &q->delay_cor, q->delay_dist);
+		ns = tabledist(q->latency, q->jitter,
+			       &q->delay_cor, q->delay_dist) * 1000ul;
+
+
+		cb->due_time = ktime_add_ns(ktime_get(), ns);
 
-		PSCHED_GET_TIME(now);
-		PSCHED_TADD2(now, delay, cb->time_to_send);
 		++q->counter;
 		ret = q->qdisc->enqueue(skb, q->qdisc);
 	} else {
@@ -220,7 +220,7 @@
 		 * Do re-ordering by putting one out of N packets at the front
 		 * of the queue.
 		 */
-		PSCHED_GET_TIME(cb->time_to_send);
+		cb->due_time = ktime_get();
 		q->counter = 0;
 		ret = q->qdisc->ops->requeue(skb, q->qdisc);
 	}
@@ -271,44 +271,46 @@
 	if (skb) {
 		const struct netem_skb_cb *cb
 			= (const struct netem_skb_cb *)skb->cb;
-		psched_time_t now;
+		ktime_t now = ktime_get();
+		s64 delta;
 
-		/* if more time remaining? */
-		PSCHED_GET_TIME(now);
+		delta = ktime_to_ns(ktime_sub(cb->due_time, now));
 
-		if (PSCHED_TLESS(cb->time_to_send, now)) {
+		/* if more time remaining? */
+		if (delta <= 0) {
 			pr_debug("netem_dequeue: return skb=%p\n", skb);
 			sch->q.qlen--;
 			sch->flags &= ~TCQ_F_THROTTLED;
 			return skb;
-		} else {
-			psched_tdiff_t delay = PSCHED_TDIFF(cb->time_to_send, now);
-
-			if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) {
-				sch->qstats.drops++;
+		}
 
-				/* After this qlen is confused */
-				printk(KERN_ERR "netem: queue discpline %s could not requeue\n",
-				       q->qdisc->ops->id);
+		if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) {
+			sch->qstats.drops++;
 
-				sch->q.qlen--;
-			}
+			/* After this qlen is confused */
+			printk(KERN_ERR "netem: queue discpline %s could not requeue\n",
+			       q->qdisc->ops->id);
 
-			mod_timer(&q->timer, jiffies + PSCHED_US2JIFFIE(delay));
-			sch->flags |= TCQ_F_THROTTLED;
+			sch->q.qlen--;
 		}
+
+		hrtimer_start(&q->timer, ktime_add_ns(now, delta), HRTIMER_ABS);
+		sch->flags |= TCQ_F_THROTTLED;
 	}
 
 	return NULL;
 }
 
-static void netem_watchdog(unsigned long arg)
+static int netem_watchdog(struct hrtimer *hrt)
 {
-	struct Qdisc *sch = (struct Qdisc *)arg;
+	struct netem_sched_data *q
+		= container_of(hrt, struct netem_sched_data, timer);
+	struct Qdisc *sch = q->qdisc;
 
 	pr_debug("netem_watchdog qlen=%d\n", sch->q.qlen);
 	sch->flags &= ~TCQ_F_THROTTLED;
 	netif_schedule(sch->dev);
+	return HRTIMER_NORESTART;
 }
 
 static void netem_reset(struct Qdisc *sch)
@@ -318,7 +320,7 @@
 	qdisc_reset(q->qdisc);
 	sch->q.qlen = 0;
 	sch->flags &= ~TCQ_F_THROTTLED;
-	del_timer_sync(&q->timer);
+	hrtimer_cancel(&q->timer);
 }
 
 /* Pass size change message down to embedded FIFO */
@@ -431,8 +433,9 @@
 		return ret;
 	}
 	
-	q->latency = qopt->latency;
-	q->jitter = qopt->jitter;
+	/* Note: we force PSCHED clock to use gettimeofday so these are in us. */
+	q->latency = psched_ticks2usec(qopt->latency);
+	q->jitter = psched_ticks2usec(qopt->jitter);
 	q->limit = qopt->limit;
 	q->gap = qopt->gap;
 	q->counter = 0;
@@ -503,7 +506,8 @@
 			const struct netem_skb_cb *cb
 				= (const struct netem_skb_cb *)skb->cb;
 
-			if (!PSCHED_TLESS(ncb->time_to_send, cb->time_to_send))
+			if (ktime_to_ns(ktime_sub(ncb->due_time,
+						  cb->due_time)) >= 0)
 				break;
 		}
 
@@ -568,9 +572,8 @@
 	if (!opt)
 		return -EINVAL;
 
-	init_timer(&q->timer);
+	hrtimer_init(&q->timer, CLOCK_MONOTONIC, HRTIMER_ABS);
 	q->timer.function = netem_watchdog;
-	q->timer.data = (unsigned long) sch;
 
 	q->qdisc = qdisc_create_dflt(sch->dev, &tfifo_qdisc_ops);
 	if (!q->qdisc) {
@@ -590,7 +593,7 @@
 {
 	struct netem_sched_data *q = qdisc_priv(sch);
 
-	del_timer_sync(&q->timer);
+	hrtimer_cancel(&q->timer);
 	qdisc_destroy(q->qdisc);
 	kfree(q->delay_dist);
 }
@@ -605,8 +608,8 @@
 	struct tc_netem_reorder reorder;
 	struct tc_netem_corrupt corrupt;
 
-	qopt.latency = q->latency;
-	qopt.jitter = q->jitter;
+	qopt.latency = psched_usec2ticks(q->latency);
+	qopt.jitter = psched_usec2ticks(q->jitter);
 	qopt.limit = q->limit;
 	qopt.loss = q->loss;
 	qopt.gap = q->gap;
--- linux-2.6.18-rt.orig/include/net/pkt_sched.h	2006-09-19 20:42:06.000000000 -0700
+++ linux-2.6.18-rt/include/net/pkt_sched.h	2006-09-29 10:33:48.000000000 -0700
@@ -239,4 +239,7 @@
 	return dev->hard_header ? mtu + dev->hard_header_len : mtu;
 }
 
+extern unsigned long psched_ticks2usec(unsigned long ticks);
+extern unsigned long psched_usec2ticks(unsigned long us);
+
 #endif
--- linux-2.6.18-rt.orig/net/sched/sch_api.c	2006-09-19 20:42:06.000000000 -0700
+++ linux-2.6.18-rt/net/sched/sch_api.c	2006-09-29 10:33:48.000000000 -0700
@@ -42,6 +42,7 @@
 #include <asm/processor.h>
 #include <asm/uaccess.h>
 #include <asm/system.h>
+#include <asm/div64.h>
 
 static int qdisc_notify(struct sk_buff *oskb, struct nlmsghdr *n, u32 clid,
 			struct Qdisc *old, struct Qdisc *new);
@@ -1153,6 +1154,28 @@
 static int psched_us_per_tick = 1;
 static int psched_tick_per_us = 1;
 
+/* Convert from scaled PSCHED ticks to real time usecs */
+unsigned long psched_ticks2usecs(unsigned long ticks)
+{
+	u64 t = ticks;
+
+	t *= psched_us_per_tick;
+	do_div(t, psched_tick_per_us);
+	return t;
+}
+EXPORT_SYMBOL(psched_ticks2usecs);
+
+/* Convert from usecs to scaled PSCHED ticks */
+unsigned long psched_usecs2ticks(unsigned long us)
+{
+	u64 t = us;
+
+	t *= psched_tick_per_us;
+	do_div(t, psched_us_per_tick);
+	return t;
+}
+EXPORT_SYMBOL(psched_usecs2ticks);
+
 #ifdef CONFIG_PROC_FS
 static int psched_show(struct seq_file *seq, void *v)
 {
--- linux-2.6.18-rt.orig/kernel/hrtimer.c	2006-09-29 10:59:29.000000000 -0700
+++ linux-2.6.18-rt/kernel/hrtimer.c	2006-09-29 11:00:25.000000000 -0700
@@ -58,6 +58,7 @@
 
 	return timespec_to_ktime(now);
 }
+EXPORT_SYMBOL_GPL(ktime_get);
 
 /**
  * ktime_get_real - get the real (wall-) time in ktime_t format
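
The dequeue logic in the sch_netem.c part of the patch above can be
summarized with a small userspace sketch. It is a simplification under
stated assumptions: times are plain signed nanosecond counts instead of
ktime_t, and struct dequeue_decision and decide() are hypothetical names
used only here. It shows the two outcomes: release the packet if its due
time has passed, otherwise arm a timer for the absolute due time.

```c
#include <assert.h>
#include <stdint.h>

struct dequeue_decision {
	int     send_now;    /* 1: packet is due, hand it on immediately */
	int64_t expires_ns;  /* absolute timer expiry, meaningful if !send_now */
};

/* Compare a packet's due time with the current time, both in ns. */
static struct dequeue_decision decide(int64_t due_ns, int64_t now_ns)
{
	struct dequeue_decision d = { 0, 0 };
	int64_t delta = due_ns - now_ns;	/* time still remaining */

	assert(now_ns >= 0);
	if (delta <= 0) {
		d.send_now = 1;
		return d;
	}
	/* now + delta is the due time itself, used as an absolute expiry */
	d.expires_ns = now_ns + delta;
	return d;
}
```

Because the expiry is absolute, the time spent between reading the clock
and starting the timer does not add to the emulated delay.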

From baumann at tik.ee.ethz.ch  Fri Sep 29 13:49:42 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 12:51:19 2007
Subject: status of  phpnetemgui?
In-Reply-To: <20060926160238.04b1e8fc@freekitty>
References: <p062309cac13f5951821f@[171.69.52.91]>
	<20060926160238.04b1e8fc@freekitty>
Message-ID: <451D86E6.7000403@xxxxxxxxxxxxxx>

We provide a copy of phpnetemgui on our website:
* http://tcn.hypert.net/phpnetemgui-0.9.tar.bz2
An extended version including our trace control is at:
* http://tcn.hypert.net/phpnetemgui-0.10.tar.gz

----------------------------------------------------------------------

Rainer Baumann
Master of Science ETH in Computer Science and Teaching
University Lecturer @ HSR

Computer Engineering and Network Laboratory
ETH Zentrum ETZ G60.1
Gloriastrasse 35
CH-8092 Zurich
Switzerland

Phone  +41 44 632 51 87
Mobile +41 79 263 81 40
Fax    +41 44 632 10 35
Email  baumann@xxxxxxxxxxxxxx 



Stephen Hemminger wrote:
> On Tue, 26 Sep 2006 17:31:31 -0500
> "Lawrence D. Dunn" <ldunn@xxxxxxxxx> wrote:
>
>   
>> Stephen,
>>    Hi- I'm Larry Dunn (day job at Cisco),
>>    writing to see if phpnetemgui is still around,
>>    or has evolved/been_replaced.
>>    I'd be using it for a networking class
>>    I teach at University of Minnesota (night job). ;-)
>>
>>    From your LCA2005_netem paper, I checked:
>>
>>    http://www.smyles.plus.com/phpnetemgui/
>>
>>    but that page shows up as not-found,
>>    and a couple of google searches don't show a new location for it.
>>    I'll have students setting delay and loss for a fairly
>>    easy experiment (and using web100 to see impact of buffer tuning).
>>    I can resort to using the tc-commands directly, but was wondering
>>    if you know the status of the GUI?
>>
>>     
>
> If someone has a copy, I'll host it at osdl and add a link in the Wiki.
>
>
>   



From d.miras at cs.ucl.ac.uk  Sat Sep 30 05:45:23 2006
From: d.miras at cs.ucl.ac.uk (Dimitrios Miras)
Date: Wed Apr 18 12:51:19 2007
Subject: Log netem queue statistics?
In-Reply-To: <451D86E6.7000403@xxxxxxxxxxxxxx>
References: <p062309cac13f5951821f@[171.69.52.91]>
	<20060926160238.04b1e8fc@freekitty>
	<451D86E6.7000403@xxxxxxxxxxxxxx>
Message-ID: <451E66E3.9060809@xxxxxxxxxxxx>

Hi,

I'm using netem with fifo queues to emulate a network, but I'd like to 
gather info about the fifo queue dynamics (size over time, packet drops, 
etc.). I haven't managed to get any relevant info on google or the 
netem list, so any hints/help/pointers are much appreciated.

Thanks in advance,
Dimitrios Miras

From exairetos at tele2.it  Tue Sep 12 09:10:34 2006
From: exairetos at tele2.it (Ferdinando Formica)
Date: Wed Apr 18 17:37:49 2007
Subject: no loss on ping
Message-ID: <web-45273940@xxxxxxxxxxxxxxxxx>

An HTML attachment was scrubbed...
URL: http://lists.linux-foundation.org/pipermail/netem/attachments/20060912/92326901/attachment-0001.htm
From shemminger at osdl.org  Tue Sep 12 21:48:44 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 17:37:49 2007
Subject: no loss on ping
In-Reply-To: <web-45273940@xxxxxxxxxxxxxxxxx>
References: <web-45273940@xxxxxxxxxxxxxxxxx>
Message-ID: <20060913134844.4cfa191d@localhost.localdomain>

On Tue, 12 Sep 2006 18:10:34 +0200
"Ferdinando Formica" <exairetos@xxxxxxxx> wrote:

> 
> Hi everybody,
> Some time ago I set up netem on my Gentoo laptop and it worked fine, now I'm trying to set it up on a SUSE box (kernel 2.6.16) and I'm facing a problem I don't really understand.
> The command I enter is:
>  
> # tc qdisc add dev eth0 root netem delay 20ms loss 20%

Try:
	tc qdisc show dev eth0 root netem
to see if the kernel was ignoring a parameter it didn't understand (like loss).


>  
> Then I try pinging my laptop, which is connected to eth0, and while I get a 24.1ms delay (on my laptop I got 21ms) there isn't any packet loss (on my laptop I got values between 18 and 22%). The weird thing is that if I try pinging the box from my laptop the packets get lost in the right percentage. How is this possible?

Perhaps the ping response isn't going through the normal queue disc path
and is going back directly to the device?

>  
> As a side note, is the following command correct?
>  
> # tc qdisc add dev eth0 root handle 1: netem delay 20ms
> # tc qdisc add dev eth0 parent 1:1 handle 10: netem loss 20%
>  
> If I try running this, I get only the packet loss when pinged (still no packet loss when pinging), and less than 1ms of delay, but shouldn't it be the same as the above? A similar behaviour happens also on my laptop, where the first command works.
>  
> Thanks in advance,
> Ferdinando Formica
>  

From exairetos at tele2.it  Wed Sep 13 07:49:49 2006
From: exairetos at tele2.it (Ferdinando Formica)
Date: Wed Apr 18 17:37:49 2007
Subject: no loss on ping
In-Reply-To: <20060913134844.4cfa191d@localhost.localdomain>
References: <web-45273940@xxxxxxxxxxxxxxxxx>
	<20060913134844.4cfa191d@localhost.localdomain>
Message-ID: <web-48852534@xxxxxxxxxxxxxxxxx>

An HTML attachment was scrubbed...
URL: http://lists.linux-foundation.org/pipermail/netem/attachments/20060913/7e335022/attachment-0001.htm
From exairetos at tele2.it  Thu Sep 14 03:55:59 2006
From: exairetos at tele2.it (Ferdinando Formica)
Date: Wed Apr 18 17:37:49 2007
Subject: no loss on ping
In-Reply-To: <web-48852534@xxxxxxxxxxxxxxxxx>
References: <web-45273940@xxxxxxxxxxxxxxxxx>
	<20060913134844.4cfa191d@localhost.localdomain>
	<web-48852534@xxxxxxxxxxxxxxxxx>
Message-ID: <web-43174629@xxxxxxxxxxxxxxxxx>

An HTML attachment was scrubbed...
URL: http://lists.linux-foundation.org/pipermail/netem/attachments/20060914/0a19c302/attachment-0001.htm
From lyonnet at ipanematech.com  Thu Sep 14 08:44:55 2006
From: lyonnet at ipanematech.com (frank@xxxxxxxxxxx)
Date: Wed Apr 18 17:37:49 2007
Subject:  Subtle variations in NetEm behavior as time goes by
Message-ID: <00a401c6d814$baf67f60$0202fea9@ipanema.local>

Hello,

I've set up WAN emulation on a 4x1Gbps Ethernet port Dell SC1425 with
Xeon EMT64.

I have NetEm set up with 100ms delay, no other impairment, on egress of 3 of
my interfaces.

I'm using ping to check NetEm behaviour; it reports ~200ms RTT between each
of my branches.

However, when measuring the response time of some applications over this
setup, I'm seeing a changing behaviour after my router has been up for a few
days: the response time improves significantly... but the ping stays the same!
*Rebooting the router brings the response time back to what it was originally.*

Well... I don't know if anybody can help with this.

My kernel is 2.6.17 on Fedora Core 5, compiled in 32 bits with SMP disabled
(to minimize risks...).

Cheers,

Frank

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.linux-foundation.org/pipermail/netem/attachments/20060914/6cb7bc61/attachment-0001.htm
From shemminger at osdl.org  Thu Sep 14 17:31:17 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 17:37:49 2007
Subject: no loss on ping
In-Reply-To: <web-43174629@xxxxxxxxxxxxxxxxx>
References: <web-45273940@xxxxxxxxxxxxxxxxx>
	<20060913134844.4cfa191d@localhost.localdomain>
	<web-48852534@xxxxxxxxxxxxxxxxx> <web-43174629@xxxxxxxxxxxxxxxxx>
Message-ID: <20060915093117.1a5269e1@localhost.localdomain>

On Thu, 14 Sep 2006 12:55:59 +0200
"Ferdinando Formica" <exairetos@xxxxxxxx> wrote:

> Update on the problem; surprisingly enough, it seems that the pings *are* dropped.
>  
>  
> # tc -s qdisc
> qdisc netem 1: dev eth0 limit 1000 delay 20.0ms
>  Sent 28826 bytes 301 pkt (dropped 85, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
> qdisc netem 10: dev eth0 parent 1:1 limit 1000 loss 20%
>  Sent 28826 bytes 301 pkt (dropped 85, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
> qdisc pfifo_fast 0: dev eth1 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
>  
> Now I'm starting to think it's a problem with ICMP; also, if I set the loss parameter to 90% it still acknowledges every packet as if it was correctly transmitted, but after a while I get messages like "no buffer space available" and "destination host unreachable".
>  
> Maybe I'll try getting another box and going to bridge mode; would this solve anything?
>  
> Thank you very much,
> Ferdinando Formica
>  

There was a bug in older kernels where packets dropped with the loss parameter
were not being freed properly. It was fixed long ago in the mainline kernel,
but it may still be an issue with vendor kernels.
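
As a quick cross-check of the counters quoted above, the observed loss
rate can be worked out from the "Sent" and "dropped" fields. This is only
a sketch: it assumes that "Sent" does not include the dropped packets, so
the offered load is their sum, and observed_loss_percent() is a made-up
helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Observed loss in percent, assuming offered = sent + dropped. */
static double observed_loss_percent(uint64_t sent, uint64_t dropped)
{
	uint64_t offered = sent + dropped;

	assert(offered > 0);
	return 100.0 * (double)dropped / (double)offered;
}
```

With the quoted figures (301 sent, 85 dropped) this gives roughly 22%,
which is in the range expected for a configured loss of 20% on a sample
of a few hundred packets.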

From baumann at tik.ee.ethz.ch  Thu Sep 21 23:12:11 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.16.19 0/2] LARTC: trace control for netem
Message-ID: <45137EBB.2030707@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range dependency and self-similarity of cross-traffic.

A new option (trace) has been added to the netem command. If the trace option is used, the values for packet delay etc. are read from a pregenerated trace file; afterwards the packets are processed by the normal netem functions. The packet action values are read out from the trace file in user space and sent to kernel space via configfs.

After our patches from the 2nd and 22nd of August, we have integrated the comments from Stephen and hope we are on the right track now.

We are looking forward to any comments, feedback and suggestions!




From baumann at tik.ee.ethz.ch  Thu Sep 21 23:15:13 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.16.19 2/2] LARTC: trace control for netem:
	kernelspace
Message-ID: <45137F71.2000404@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range dependency and self-similarity of cross-traffic.

kernel space:
The delay, drop, duplication and corruption values are read out in user space and sent to kernel space via configfs. The userspace process will "hang on write" until the kernel needs new data.

In order to always have packet action values ready to apply, there are two buffers that hold these values. Packet action values can be read from one buffer while the other buffer is refilled with new values. The synchronization of "need more delay values" and "return from write" is done with wait queues.
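
The two-buffer scheme described above might be sketched as follows. This
is a single-threaded illustration only, not the code from the patch: the
wait-queue synchronization is left out, and struct trace_buffers,
BUF_VALUES, refill_inactive() and next_value() are all hypothetical names
chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_VALUES 4	/* hypothetical buffer size, for illustration */

struct trace_buffers {
	uint32_t values[2][BUF_VALUES];
	size_t   filled[2];	/* valid entries in each buffer */
	int      active;	/* index of the buffer being read */
	size_t   pos;		/* read position in the active buffer */
};

/* Writer side: copy new values into the buffer not being read. */
static size_t refill_inactive(struct trace_buffers *tb,
			      const uint32_t *src, size_t n)
{
	int idle = 1 - tb->active;
	size_t i;

	if (n > BUF_VALUES)
		n = BUF_VALUES;
	for (i = 0; i < n; i++)
		tb->values[idle][i] = src[i];
	tb->filled[idle] = n;
	return n;
}

/* Reader side: fetch the next value, swapping buffers when the active
 * one is drained. Returns 0 on success, -1 if no data is available. */
static int next_value(struct trace_buffers *tb, uint32_t *out)
{
	assert(tb->active == 0 || tb->active == 1);
	if (tb->pos >= tb->filled[tb->active]) {
		int idle = 1 - tb->active;

		if (tb->filled[idle] == 0)
			return -1;
		tb->filled[tb->active] = 0;	/* drained: needs a refill */
		tb->active = idle;
		tb->pos = 0;
	}
	*out = tb->values[tb->active][tb->pos++];
	return 0;
}
```

In the real design the point where next_value() swaps buffers would be
where the blocked writer is woken up to supply the next batch.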

Having applied the delay value to a packet, the packet gets processed by the original netem functions.

Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>

---

Patch for linux kernel 2.6.16.19: http://tcn.hypert.net/tcnKernel_procfs.patch




From baumann at tik.ee.ethz.ch  Thu Sep 21 23:13:54 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.16.19 1/2] LARTC: trace control for netem:
	userspace
Message-ID: <45137F22.4000304@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range dependency and self-similarity of cross-traffic.

user space (iproute2):
The directory tc/netem was split in two parts, one containing the original distribution tables and the other the tools to generate trace files as well as the program responsible for reading the delay values from the trace file and sending them to the kernel (called flowseed).

If the trace option is set, netem initializes the kernel and starts the flowseed process. The flowseed process does not send data to the kernel until the registration is completed. The data is sent to the kernel module via configfs. For each qdisc applied, a new directory (in /config/tcn/) is created. The write returns when the kernel needs new data, or when the corresponding qdisc was deleted. In the first case new data is sent and in the latter case the flowseed process terminates itself.

Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>

---

Patch for iproute2-2.6.16-060323: http://tcn.hypert.net/tcn_iproute2.patch


From shemminger at osdl.org  Fri Sep 22 10:20:56 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.16.19 2/2] LARTC: trace control for netem:
 kernelspace
In-Reply-To: <45137F71.2000404@xxxxxxxxxxxxxx>
References: <45137F71.2000404@xxxxxxxxxxxxxx>
Message-ID: <20060922102056.0069f944@localhost.localdomain>

On Fri, 22 Sep 2006 08:15:13 +0200
Rainer Baumann <baumann@xxxxxxxxxxxxxx> wrote:

> Trace Control for Netem: Emulate network properties such as long range dependency and self-similarity of cross-traffic.
> 
> kernel space:
> The delay, drop, duplication and corruption values are read out in user space and sent to kernel space via configfs. The userspace process will "hang on write" until the kernel needs new data.
> 
> In order to always have packet action values ready to apply, there are two buffers that hold these values. Packet action values can be read from one buffer while the other buffer is refilled with new values. The synchronization of "need more delay values" and "return from write" is done with wait queues.
> 
> Having applied the delay value to a packet, the packet gets processed by the original netem functions.
> 
> Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>
> 
> ---
> 
> Patch for linux kernel 2.6.16.19: http://tcn.hypert.net/tcnKernel_procfs.patch

I like the concept of the trace based delay stuff, it is just that the implementation
needs more work.

Style:
	* whitespace around operators, keywords etc
	* use /* for comments not //
	* indentation
	scripts/Lindent may help
	* accidental blank line changes introduced in patch as well

	* You don't really change Makefile
Code:
	* now netem depends on CONFIG_PROC_FS

	* why not use a miscdevice (/dev/netem_trace?) instead of /proc
	  
	* still has signal flow control to process. This is an awkward way
	  to do flow control and I don't think it is safe.
	
	* hard coding MAX_FLOWS leads to scaling problems. Not all users will
	  want to waste the memory, and what if there are more flows? Can't you
	  figure out a way to allocate and scale flow buffers?


	



-- 
Stephen Hemminger <shemminger@xxxxxxxx>

From hagen at jauu.net  Fri Sep 22 08:19:06 2006
From: hagen at jauu.net (Hagen Paul Pfeifer)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.16.19 2/2] LARTC: trace control for netem:
	kernelspace
In-Reply-To: <45137F71.2000404@xxxxxxxxxxxxxx>
References: <45137F71.2000404@xxxxxxxxxxxxxx>
Message-ID: <20060922151906.GA25483@xxxxxxxxxxxxxx>

* Rainer Baumann | 2006-09-22 08:15:13 [+0200]:

>Patch for linux kernel 2.6.16.19: http://tcn.hypert.net/tcnKernel_procfs.patch

Coding style needs at least some work ...

Whitespace around operators and parentheses, useless parentheses, braces for
the else branch, mixed C99/C89 comments, indentation, ...

proc_read_stats() looks unclean (bzero) and maybe some other stuff too - the
code as a whole looks a little bit grubby.

HGN



-- 
43rd Law of Computing:
        Anything that can go wr
fortune: Segmentation violation -- Core dumped

From baumann at tik.ee.ethz.ch  Sat Sep 23 00:04:45 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.17.13 0/2] LARTC: trace control for netem
Message-ID: <4514DC8D.2010405@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range dependency and self-similarity of cross-traffic.

A new option (trace) has been added to the netem command. If the trace option is used, the values for packet delay etc. are read from a pregenerated trace file; afterwards the packets are processed by the normal netem functions. The packet action values are read out from the trace file in user space and sent to kernel space via configfs.

Sorry, what I sent yesterday was the old version; this is now the new version!

After our patches from the 2nd and 22nd of August, we have integrated the comments from Stephen and hope we are on the right track now.

We are looking forward to any comments, feedback and suggestions!







From baumann at tik.ee.ethz.ch  Sat Sep 23 00:04:58 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem:
	kernelspace
Message-ID: <4514DC9A.2000505@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range dependency and self-similarity of cross-traffic.

kernel space:
The delay, drop, duplication and corruption values are read in user space and sent to kernel space via configfs. The userspace process will "hang on write" until the kernel needs new data.

In order to always have packet action values ready to apply, there are two buffers that hold these values. Packet action values can be read from one buffer while the other buffer is refilled with new values. The synchronization of "need more delay values" and "return from write" is done with wait queues.
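
The two-buffer scheme described above can be sketched in plain user-space C.
This is only an illustration under stated assumptions, not the code from the
patch: the names dbuf, dbuf_fill and dbuf_next and the size SLOTS are
hypothetical, and the wait-queue wake-up is reduced to a need_data flag that
a writer would poll.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOTS 4			/* hypothetical number of values per buffer */

struct dbuf {
	uint32_t buf[2][SLOTS];	/* the two value buffers */
	int filled[2];		/* 1 while buf[i] holds unread values */
	int in_use;		/* index of the buffer currently being read */
	size_t pos;		/* next slot to read in buf[in_use] */
	int need_data;		/* stands in for waking the blocked writer */
};

/* Writer side: store one block in an empty buffer.
 * The buffer in use is refilled first if it is empty, so values are
 * always consumed in the order they were written.
 * Returns 0 on success, -1 if both buffers are still full. */
static int dbuf_fill(struct dbuf *d, const uint32_t *src)
{
	int target = d->in_use;

	if (d->filled[target])
		target = 1 - target;
	if (d->filled[target])
		return -1;

	memcpy(d->buf[target], src, sizeof(d->buf[target]));
	d->filled[target] = 1;
	if (target == d->in_use)
		d->pos = 0;
	/* keep asking for data while the other buffer is empty */
	d->need_data = !d->filled[1 - target];
	return 0;
}

/* Reader side: fetch the next value.
 * Returns 0 on success, -1 on underrun (no unread values available). */
static int dbuf_next(struct dbuf *d, uint32_t *out)
{
	assert(d != NULL && out != NULL);

	if (!d->filled[d->in_use])
		return -1;

	*out = d->buf[d->in_use][d->pos++];

	if (d->pos == SLOTS) {
		/* buffer exhausted: release it and ask for a refill */
		d->filled[d->in_use] = 0;
		d->need_data = 1;
		if (d->filled[1 - d->in_use]) {
			d->in_use = 1 - d->in_use;
			d->pos = 0;
		}
	}
	return 0;
}
```

In this sketch the reader never blocks; it reports an underrun instead, which
corresponds to the case where a default action has to be applied to the packet.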

Once the delay value has been applied to a packet, the packet is processed by the original netem functions.

Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>

---

Patch for linux kernel 2.6.17.13: http://tcn.hypert.net/tcn_kernel_configfs.patch








From baumann at tik.ee.ethz.ch  Sat Sep 23 00:04:49 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.17.13 1/2] LARTC: trace control for netem:
	userspace
Message-ID: <4514DC91.2070507@xxxxxxxxxxxxxx>

Trace Control for Netem: Emulate network properties such as long range dependency and self-similarity of cross-traffic.

user space (iproute2):
The directory tc/netem was split in two parts, one containing the original distribution tables and the other the tools to generate trace files as well as the program responsible for reading the delay values from the trace file and sending them to the kernel (called flowseed).

If the trace option is set, netem initializes the kernel and starts the flowseed process. The flowseed process does not
send data to the kernel until the registration is completed. The data is sent to the kernel module via configfs. For each qdisc applied, a new directory (in /config/tcn/) is created. The write returns when the kernel needs new data, or when the corresponding qdisc was deleted. In the first case new data is sent and in the latter case the flowseed process terminates itself.
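
As a rough illustration of what one such write carries: the store handler in
the kernel-space patch expects a block of DATA_PACKAGE bytes followed by the
flow id and a "valid data" flag, both copied as int. The sketch below packs
and unpacks that layout. The value 4000 for DATA_PACKAGE, the definition of
DATA_PACKAGE_ID as payload plus two ints, and the helper names build_package
and parse_trailer are assumptions made for this example; the real constants
live in the flowseed header, which is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed sizes; the real values are defined in the flowseed header. */
#define DATA_PACKAGE	4000
#define DATA_PACKAGE_ID	(DATA_PACKAGE + 2 * sizeof(int))

/* Build one write buffer: payload first, then fid, then the valid flag.
 * Returns the number of bytes to write. */
static size_t build_package(char *out, const char *payload,
			    int fid, int valid_data)
{
	assert(out != NULL && payload != NULL);

	memcpy(out, payload, DATA_PACKAGE);
	memcpy(out + DATA_PACKAGE, &fid, sizeof(int));
	memcpy(out + DATA_PACKAGE + sizeof(int), &valid_data, sizeof(int));
	return DATA_PACKAGE_ID;
}

/* Read the trailer back the way the kernel-side store handler does.
 * Returns 0 on success, -1 if the size does not match. */
static int parse_trailer(const char *page, size_t count,
			 int *fid, int *valid_data)
{
	if (count != DATA_PACKAGE_ID)
		return -1;

	memcpy(fid, page + DATA_PACKAGE, sizeof(int));
	memcpy(valid_data, page + DATA_PACKAGE + sizeof(int), sizeof(int));
	return 0;
}
```

A flowseed-like process would write the buffer produced by build_package() to
the attribute file of its flow and block until the write returns.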

Signed-off-by: Rainer Baumann <baumann@xxxxxxxxxxxxxx>

---

Patch for iproute2-2.6.16-060323: http://tcn.hypert.net/tcn_iproute2.patch



From shemminger at osdl.org  Mon Sep 25 13:28:00 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem:
 kernelspace
In-Reply-To: <4514DC9A.2000505@xxxxxxxxxxxxxx>
References: <4514DC9A.2000505@xxxxxxxxxxxxxx>
Message-ID: <20060925132800.09856e10@xxxxxxxxxxxxxxxxx>

Some changes:

1. need to select CONFIGFS into configuration
2. don't add declarations after code.
3. use unsigned not int for counters and mask.
4. don't return a structure (ie pkt_delay)
5. use enum for magic values
6. don't use GFP_ATOMIC unless you have to
7. check error values on configfs_init
8. map initialization is unneeded. static's always init to zero.
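
Points 3 to 5 can be illustrated with a small stand-alone sketch of how one
32-bit trace value might be split into an action and a delay. It is a sketch
under assumptions rather than the patch itself: the enum values are inferred
from the order of the statistics counters, the helper names encode_word and
decode_word are hypothetical, and a 64-bit intermediate is used for the tick
conversion, which the patch does not do.

```c
#include <assert.h>
#include <stdint.h>

#define MASK_BITS	29
#define MASK_DELAY	((1u << MASK_BITS) - 1)	/* low 29 bits: delay */
#define MASK_HEAD	(~MASK_DELAY)		/* top 3 bits: action */

/* Named actions instead of magic numbers (values assumed). */
enum tcn_flow {
	FLOW_NORMAL = 0,
	FLOW_DROP   = 1,
	FLOW_DUP    = 2,
	FLOW_MANGLE = 3
};

/* Pack an action and a delay in microseconds into one trace word. */
static uint32_t encode_word(enum tcn_flow action, uint32_t delay_us)
{
	assert(delay_us <= MASK_DELAY);
	return ((uint32_t)action << MASK_BITS) | delay_us;
}

/* Return the delay in ticks and pass the action back through a pointer,
 * rather than returning a structure by value.
 * ticks_per_ms is the number of ticks corresponding to 1 ms. */
static unsigned int decode_word(uint32_t word, uint32_t ticks_per_ms,
				enum tcn_flow *action)
{
	*action = (enum tcn_flow)((word & MASK_HEAD) >> MASK_BITS);
	return (unsigned int)(((uint64_t)(word & MASK_DELAY) * ticks_per_ms)
			      / 1000);
}
```

With 1000 ticks per millisecond a delay of 2500 us decodes to 2500 ticks; with
100 ticks per millisecond a delay of 1500 us decodes to 150 ticks.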

------------------
diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index d10f353..a51de64 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -430,6 +430,8 @@ enum
 	TCA_NETEM_DELAY_DIST,
 	TCA_NETEM_REORDER,
 	TCA_NETEM_CORRUPT,
+	TCA_NETEM_TRACE,
+	TCA_NETEM_STATS,
 	__TCA_NETEM_MAX,
 };
 
@@ -445,6 +447,35 @@ struct tc_netem_qopt
 	__u32	jitter;		/* random jitter in latency (us) */
 };
 
+struct tc_netem_stats
+{
+	int packetcount;
+	int packetok;
+	int normaldelay;
+	int drops;
+	int dupl;
+	int corrupt;
+	int novaliddata;
+	int uninitialized;
+	int bufferunderrun;
+	int bufferinuseempty;
+	int noemptybuffer;
+	int readbehindbuffer;
+	int buffer1_reloads;
+	int buffer2_reloads;
+	int tobuffer1_switch;
+	int tobuffer2_switch;
+	int switch_to_emptybuffer1;
+	int switch_to_emptybuffer2;
+};
+
+struct tc_netem_trace
+{
+	__u32   fid;             /* flowid */
+	__u32   def;          	 /* default action 0 = no delay, 1 = drop */
+	__u32   ticks;	         /* number of ticks corresponding to 1ms */
+};
+
 struct tc_netem_corr
 {
 	__u32	delay_corr;	/* delay correlation */
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 8298ea9..aee4bc6 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -232,6 +232,7 @@ config NET_SCH_DSMARK
 
 config NET_SCH_NETEM
 	tristate "Network emulator (NETEM)"
+	select CONFIGFS_FS
 	---help---
 	  Say Y if you want to emulate network delay, loss, and packet
 	  re-ordering. This is often useful to simulate networks when
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 45939ba..521b9e3 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -11,6 +11,9 @@
  *
  * Authors:	Stephen Hemminger <shemminger@xxxxxxxx>
  *		Catalin(ux aka Dino) BOIE <catab at umbrella dot ro>
+ *              netem trace enhancement: Ariane Keller <arkeller@xxxxxxxxxx> ETH Zurich
+ *                                       Rainer Baumann <baumann@xxxxxxxxxx> ETH Zurich
+ *                                       Ulrich Fiedler <fiedler@xxxxxxxxxxxxxx> ETH Zurich
  */
 
 #include <linux/module.h>
@@ -21,10 +24,16 @@ #include <linux/errno.h>
 #include <linux/netdevice.h>
 #include <linux/skbuff.h>
 #include <linux/rtnetlink.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/configfs.h>
+#include <linux/vmalloc.h>
 
 #include <net/pkt_sched.h>
 
-#define VERSION "1.2"
+#include "net/flowseed.h"
+
+#define VERSION "1.3"
 
 /*	Network Emulation Queuing algorithm.
 	====================================
@@ -50,6 +59,11 @@ #define VERSION "1.2"
 
 	 The simulator is limited by the Linux timer resolution
 	 and will create packet bursts on the HZ boundary (1ms).
+
+	 The trace option allows us to read the values for packet delay,
+	 duplication, loss and corruption from a tracefile. This permits
+	 the modulation of statistical properties such as long-range
+	 dependences. See http://tcn.hypert.net.
 */
 
 struct netem_sched_data {
@@ -65,6 +79,11 @@ struct netem_sched_data {
 	u32 duplicate;
 	u32 reorder;
 	u32 corrupt;
+	u32 tcnstop;
+	u32 trace;
+	u32 ticks;
+	u32 def;
+	u32 newdataneeded;
 
 	struct crndstate {
 		unsigned long last;
@@ -72,9 +91,13 @@ struct netem_sched_data {
 	} delay_cor, loss_cor, dup_cor, reorder_cor, corrupt_cor;
 
 	struct disttable {
-		u32  size;
+		u32 size;
 		s16 table[0];
 	} *delay_dist;
+
+	struct tcn_statistic *statistic;
+	struct tcn_control *flowbuffer;
+	wait_queue_head_t my_event;
 };
 
 /* Time stamp put into socket buffer control block */
@@ -82,6 +105,18 @@ struct netem_skb_cb {
 	psched_time_t	time_to_send;
 };
 
+
+struct confdata {
+	int fid;
+	struct netem_sched_data * sched_data;
+};
+
+static struct confdata map[MAX_FLOWS];
+
+#define MASK_BITS	29
+#define MASK_DELAY	((1<<MASK_BITS)-1)
+#define MASK_HEAD       ~MASK_DELAY
+
 /* init_crandom - initialize correlated random number generator
  * Use entropy source for initial seed.
  */
@@ -139,6 +174,103 @@ static long tabledist(unsigned long mu, 
 	return  x / NETEM_DIST_SCALE + (sigma / NETEM_DIST_SCALE) * t + mu;
 }
 
+/* don't call this function directly. It is called after
+ * a packet has been taken out of a buffer and it was the last.
+ */
+static int reload_flowbuffer (struct netem_sched_data *q)
+{
+	struct tcn_control *flow = q->flowbuffer;
+
+	if (flow->buffer_in_use == flow->buffer1) {
+		flow->buffer1_empty = flow->buffer1;
+		if (flow->buffer2_empty) {
+			q->statistic->switch_to_emptybuffer2++;
+			return -EFAULT;
+		}
+
+		q->statistic->tobuffer2_switch++;
+
+		flow->buffer_in_use = flow->buffer2;
+		flow->offsetpos = flow->buffer2;
+
+	} else {
+		flow->buffer2_empty = flow->buffer2;
+
+		if (flow->buffer1_empty) {
+		 	q->statistic->switch_to_emptybuffer1++;
+			return -EFAULT;
+		}
+
+		q->statistic->tobuffer1_switch++;
+
+		flow->buffer_in_use = flow->buffer1;
+		flow->offsetpos = flow->buffer1;
+
+	}
+	/* the flowseed process can send more data */
+	q->tcnstop = 0;
+	q->newdataneeded = 1;
+	wake_up(&q->my_event);
+	return 0;
+}
+
+/* return pktdelay with delay and drop/dupl/corrupt option */
+static int get_next_delay(struct netem_sched_data *q, enum tcn_flow *head)
+{
+	struct tcn_control *flow = q->flowbuffer;
+	u32 variout;
+
+	/* choose whether to drop or 0 delay packets on default */
+	*head = q->def;
+
+	if (!flow) {
+		printk(KERN_ERR "netem: read from an uninitialized flow.\n");
+		q->statistic->uninitialized++;
+		return 0;
+	}
+
+	q->statistic->packetcount++;
+
+	/* check if we have to reload a buffer */
+	if (flow->offsetpos - flow->buffer_in_use == DATA_PACKAGE)
+		reload_flowbuffer(q);
+
+	/* sanity checks */
+	if ((flow->buffer_in_use == flow->buffer1 && flow->validdataB1)
+	    || ( flow->buffer_in_use == flow->buffer2 && flow->validdataB2)) {
+
+		if (flow->buffer1_empty && flow->buffer2_empty) {
+			q->statistic->bufferunderrun++;
+			return 0;
+		}
+
+		if (flow->buffer1_empty == flow->buffer_in_use ||
+		    flow->buffer2_empty == flow->buffer_in_use) {
+			q->statistic->bufferinuseempty++;
+			return 0;
+		}
+
+		if (flow->offsetpos - flow->buffer_in_use >=
+		    DATA_PACKAGE) {
+			q->statistic->readbehindbuffer++;
+			return 0;
+		}
+		/* end of tracefile reached */
+	} else {
+		q->statistic->novaliddata++;
+		return 0;
+	}
+
+	/* now it's safe to read */
+	variout = *flow->offsetpos++;
+	*head = (variout & MASK_HEAD) >> MASK_BITS;
+
+	(&q->statistic->normaldelay)[*head] += 1;
+	q->statistic->packetok++;
+
+	return ((variout & MASK_DELAY) * q->ticks) / 1000;
+}
+
 /*
  * Insert one skb into qdisc.
  * Note: parent depends on return value to account for queue length.
@@ -148,20 +280,25 @@ static long tabledist(unsigned long mu, 
 static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 {
 	struct netem_sched_data *q = qdisc_priv(sch);
-	/* We don't fill cb now as skb_unshare() may invalidate it */
 	struct netem_skb_cb *cb;
 	struct sk_buff *skb2;
-	int ret;
-	int count = 1;
+	enum tcn_flow action = FLOW_NORMAL;
+	psched_tdiff_t delay;
+	int ret, count = 1;
 
 	pr_debug("netem_enqueue skb=%p\n", skb);
 
-	/* Random duplication */
-	if (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor))
+	if (q->trace)
+		action = get_next_delay(q, &delay);
+
+ 	/* Random duplication */
+	if (q->trace ? action == FLOW_DUP :
+	    (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor)))
 		++count;
 
 	/* Random packet drop 0 => none, ~0 => all */
-	if (q->loss && q->loss >= get_crandom(&q->loss_cor))
+	if (q->trace ? action == FLOW_DROP :
+	    (q->loss && q->loss >= get_crandom(&q->loss_cor)))
 		--count;
 
 	if (count == 0) {
@@ -190,7 +327,8 @@ static int netem_enqueue(struct sk_buff 
 	 * If packet is going to be hardware checksummed, then
 	 * do it now in software before we mangle it.
 	 */
-	if (q->corrupt && q->corrupt >= get_crandom(&q->corrupt_cor)) {
+	if (q->trace ? action == FLOW_MANGLE :
+	    (q->corrupt && q->corrupt >= get_crandom(&q->corrupt_cor))) {
 		if (!(skb = skb_unshare(skb, GFP_ATOMIC))
 		    || (skb->ip_summed == CHECKSUM_PARTIAL
 			&& skb_checksum_help(skb))) {
@@ -206,10 +344,10 @@ static int netem_enqueue(struct sk_buff 
 	    || q->counter < q->gap 	/* inside last reordering gap */
 	    || q->reorder < get_crandom(&q->reorder_cor)) {
 		psched_time_t now;
-		psched_tdiff_t delay;
 
-		delay = tabledist(q->latency, q->jitter,
-				  &q->delay_cor, q->delay_dist);
+		if (!q->trace)
+			delay = tabledist(q->latency, q->jitter,
+					  &q->delay_cor, q->delay_dist);
 
 		PSCHED_GET_TIME(now);
 		PSCHED_TADD2(now, delay, cb->time_to_send);
@@ -343,6 +481,65 @@ static int set_fifo_limit(struct Qdisc *
 	return ret;
 }
 
+static void reset_stats(struct netem_sched_data * q)
+{
+	memset(q->statistic, 0, sizeof(*(q->statistic)));
+	return;
+}
+
+static void free_flowbuffer(struct netem_sched_data * q)
+{
+	if (q->flowbuffer != NULL) {
+		q->tcnstop = 1;
+		q->newdataneeded = 1;
+		wake_up(&q->my_event);
+
+		if (q->flowbuffer->buffer1 != NULL) {
+			kfree(q->flowbuffer->buffer1);
+		}
+		if (q->flowbuffer->buffer2 != NULL) {
+			kfree(q->flowbuffer->buffer2);
+		}
+		kfree(q->flowbuffer);
+		kfree(q->statistic);
+		q->flowbuffer = NULL;
+		q->statistic = NULL;
+	}
+}
+
+static int init_flowbuffer(unsigned int fid, struct netem_sched_data * q)
+{
+	int i, flowid = -1;
+
+	q->statistic = kzalloc(sizeof(*(q->statistic)), GFP_KERNEL);
+	init_waitqueue_head(&q->my_event);
+
+	for(i = 0; i < MAX_FLOWS; i++) {
+		if(map[i].fid == 0) {
+			flowid = i;
+			map[i].fid = fid;
+			map[i].sched_data = q;
+			break;
+		}
+	}
+
+	if (flowid != -1) {
+		q->flowbuffer = kmalloc(sizeof(*(q->flowbuffer)), GFP_KERNEL);
+		q->flowbuffer->buffer1 = kmalloc(DATA_PACKAGE, GFP_KERNEL);
+		q->flowbuffer->buffer2 = kmalloc(DATA_PACKAGE, GFP_KERNEL);
+
+		q->flowbuffer->buffer_in_use = q->flowbuffer->buffer1;
+		q->flowbuffer->offsetpos = q->flowbuffer->buffer1;
+		q->flowbuffer->buffer1_empty = q->flowbuffer->buffer1;
+		q->flowbuffer->buffer2_empty = q->flowbuffer->buffer2;
+		q->flowbuffer->flowid = flowid; 
+		q->flowbuffer->validdataB1 = 0;
+		q->flowbuffer->validdataB2 = 0;
+	}
+
+	return flowid;
+}
+
 /*
  * Distribution data is a variable size payload containing
  * signed 16 bit values.
@@ -414,6 +611,32 @@ static int get_corrupt(struct Qdisc *sch
 	return 0;
 }
 
+static int get_trace(struct Qdisc *sch, const struct rtattr *attr)
+{
+	struct netem_sched_data *q = qdisc_priv(sch);
+	const struct tc_netem_trace *traceopt = RTA_DATA(attr);
+
+	if (RTA_PAYLOAD(attr) != sizeof(*traceopt))
+		return -EINVAL;
+
+	if (traceopt->fid) {
+		/* correction us -> ticks */
+		q->ticks = traceopt->ticks;
+		int ind;
+		ind = init_flowbuffer(traceopt->fid, q);
+		if(ind < 0) {
+			printk("netem: maximum number of traces:%d"
+			       " change in net/flowseedprocfs.h\n", MAX_FLOWS);
+			return -EINVAL;
+		}
+		q->trace = ind + 1;
+
+	} else
+		q->trace = 0;
+	q->def = traceopt->def;
+	return 0;
+}
+
 /* Parse netlink message to set options */
 static int netem_change(struct Qdisc *sch, struct rtattr *opt)
 {
@@ -431,6 +654,14 @@ static int netem_change(struct Qdisc *sc
 		return ret;
 	}
 	
+	if (q->trace) {
+		int temp = q->trace - 1;
+		q->trace = 0;
+		map[temp].fid = 0;
+		reset_stats(q);
+		free_flowbuffer(q);
+	}
+
 	q->latency = qopt->latency;
 	q->jitter = qopt->jitter;
 	q->limit = qopt->limit;
@@ -477,6 +708,11 @@ static int netem_change(struct Qdisc *sc
 			if (ret)
 				return ret;
 		}
+		if (tb[TCA_NETEM_TRACE-1]) {
+			ret = get_trace(sch, tb[TCA_NETEM_TRACE-1]);
+			if (ret)
+				return ret;
+		}
 	}
 
 	return 0;
@@ -572,6 +808,7 @@ static int netem_init(struct Qdisc *sch,
 	q->timer.function = netem_watchdog;
 	q->timer.data = (unsigned long) sch;
 
+	q->trace = 0;
 	q->qdisc = qdisc_create_dflt(sch->dev, &tfifo_qdisc_ops);
 	if (!q->qdisc) {
 		pr_debug("netem: qdisc create failed\n");
@@ -590,6 +827,12 @@ static void netem_destroy(struct Qdisc *
 {
 	struct netem_sched_data *q = qdisc_priv(sch);
 
+	if (q->trace) {
+		int temp = q->trace - 1;
+		q->trace = 0;
+		map[temp].fid = 0;
+		free_flowbuffer(q);
+	}
 	del_timer_sync(&q->timer);
 	qdisc_destroy(q->qdisc);
 	kfree(q->delay_dist);
@@ -604,6 +847,7 @@ static int netem_dump(struct Qdisc *sch,
 	struct tc_netem_corr cor;
 	struct tc_netem_reorder reorder;
 	struct tc_netem_corrupt corrupt;
+	struct tc_netem_trace traceopt;
 
 	qopt.latency = q->latency;
 	qopt.jitter = q->jitter;
@@ -626,6 +870,35 @@ static int netem_dump(struct Qdisc *sch,
 	corrupt.correlation = q->corrupt_cor.rho;
 	RTA_PUT(skb, TCA_NETEM_CORRUPT, sizeof(corrupt), &corrupt);
 
+	traceopt.fid = q->trace;
+	traceopt.def = q->def;
+	traceopt.ticks = q->ticks;
+	RTA_PUT(skb, TCA_NETEM_TRACE, sizeof(traceopt), &traceopt);
+
+	if (q->trace) {
+		struct tc_netem_stats tstats;
+
+		tstats.packetcount = q->statistic->packetcount;
+		tstats.packetok = q->statistic->packetok;
+		tstats.normaldelay = q->statistic->normaldelay;
+		tstats.drops = q->statistic->drops;
+		tstats.dupl = q->statistic->dupl;
+		tstats.corrupt = q->statistic->corrupt;
+		tstats.novaliddata = q->statistic->novaliddata;
+		tstats.uninitialized = q->statistic->uninitialized;
+		tstats.bufferunderrun = q->statistic->bufferunderrun;
+		tstats.bufferinuseempty = q->statistic->bufferinuseempty;
+		tstats.noemptybuffer = q->statistic->noemptybuffer;
+		tstats.readbehindbuffer = q->statistic->readbehindbuffer;
+		tstats.buffer1_reloads = q->statistic->buffer1_reloads;
+		tstats.buffer2_reloads = q->statistic->buffer2_reloads;
+		tstats.tobuffer1_switch = q->statistic->tobuffer1_switch;
+		tstats.tobuffer2_switch = q->statistic->tobuffer2_switch;
+		tstats.switch_to_emptybuffer1 = q->statistic->switch_to_emptybuffer1;
+		tstats.switch_to_emptybuffer2 = q->statistic->switch_to_emptybuffer2;
+		RTA_PUT(skb, TCA_NETEM_STATS, sizeof(tstats), &tstats);
+	}
+
 	rta->rta_len = skb->tail - b;
 
 	return skb->len;
@@ -709,6 +982,173 @@ static struct tcf_proto **netem_find_tcf
 	return NULL;
 }
 
+/* configfs to read tcn delay values from userspace */
+struct tcn_flow {
+	struct config_item item;
+};
+
+static struct tcn_flow *to_tcn_flow(struct config_item *item)
+{
+	return item ? container_of(item, struct tcn_flow, item) : NULL;
+}
+
+static struct configfs_attribute tcn_flow_attr_storeme = {
+	.ca_owner = THIS_MODULE,
+	.ca_name = "delayvalue",
+	.ca_mode = S_IRUGO | S_IWUSR,
+};
+
+static struct configfs_attribute *tcn_flow_attrs[] = {
+	&tcn_flow_attr_storeme,
+	NULL,
+};
+
+static ssize_t tcn_flow_attr_store(struct config_item *item,
+				       struct configfs_attribute *attr,
+				       const char *page, size_t count)
+{
+	char *p = (char *)page;
+	int fid, i, validData = 0;
+	int flowid = -1;
+	struct tcn_control *checkbuf;
+
+	if (count != DATA_PACKAGE_ID) {
+		printk("netem: Unexpected data received. %d\n", count);
+		return -EMSGSIZE;
+	}
+
+	memcpy(&fid, p + DATA_PACKAGE, sizeof(int));
+	memcpy(&validData, p + DATA_PACKAGE + sizeof(int), sizeof(int));
+
+	/* check whether this flow is registered */
+	for (i = 0; i < MAX_FLOWS; i++) {
+		if (map[i].fid == fid) {
+			flowid = i;
+			break;
+		}
+	}
+	/* exit if flow is not registered */
+	if (flowid < 0) {
+		printk("netem: Invalid FID received. Killing process.\n");
+		return -EINVAL;
+	}
+
+	checkbuf = map[flowid].sched_data->flowbuffer;
+	if (checkbuf == NULL) {
+		printk("netem: no flow registered");
+		return -ENOBUFS;
+	}
+
+	/* check if flowbuffer has empty buffer and copy data into it */
+	if (checkbuf->buffer1_empty != NULL) {
+		memcpy(checkbuf->buffer1, p, DATA_PACKAGE);
+		checkbuf->buffer1_empty = NULL;
+		checkbuf->validdataB1 = validData;
+		map[flowid].sched_data->statistic->buffer1_reloads++;
+
+	} else if (checkbuf->buffer2_empty != NULL) {
+		memcpy(checkbuf->buffer2, p, DATA_PACKAGE);
+		checkbuf->buffer2_empty = NULL;
+		checkbuf->validdataB2 = validData;
+		map[flowid].sched_data->statistic->buffer2_reloads++;
+
+	} else {
+		printk("netem: flow %d: no empty buffer. data loss.\n", flowid);
+		map[flowid].sched_data->statistic->noemptybuffer++;
+	}
+
+	if (validData) {
+		/* on initialization both buffers need data */
+		if (checkbuf->buffer2_empty != NULL) {
+			return DATA_PACKAGE_ID;
+		}
+		/* wait until new data is needed */
+		wait_event(map[flowid].sched_data->my_event,
+			   map[flowid].sched_data->newdataneeded);
+		map[flowid].sched_data->newdataneeded = 0;
+
+	}
+
+	if (map[flowid].sched_data->tcnstop) {
+		return -ECANCELED;
+	}
+
+	return DATA_PACKAGE_ID;
+
+}
+
+static void tcn_flow_release(struct config_item *item)
+{
+	kfree(to_tcn_flow(item));
+
+}
+
+static struct configfs_item_operations tcn_flow_item_ops = {
+	.release = tcn_flow_release,
+	.store_attribute = tcn_flow_attr_store,
+};
+
+static struct config_item_type tcn_flow_type = {
+	.ct_item_ops = &tcn_flow_item_ops,
+	.ct_attrs = tcn_flow_attrs,
+	.ct_owner = THIS_MODULE,
+};
+
+static struct config_item * tcn_make_item(struct config_group *group,
+						     const char *name)
+{
+	struct tcn_flow *tcn_flow;
+
+	tcn_flow = kmalloc(sizeof(struct tcn_flow), GFP_KERNEL);
+	if (!tcn_flow)
+		return NULL;
+
+	memset(tcn_flow, 0, sizeof(struct tcn_flow));
+
+	config_item_init_type_name(&tcn_flow->item, name,
+				   &tcn_flow_type);
+	return &tcn_flow->item;
+}
+
+static struct configfs_group_operations tcn_group_ops = {
+	.make_item = tcn_make_item,
+};
+
+static struct config_item_type tcn_type = {
+	.ct_group_ops = &tcn_group_ops,
+	.ct_owner = THIS_MODULE,
+};
+
+static struct configfs_subsystem tcn_subsys = {
+	.su_group = {
+		     .cg_item = {
+				 .ci_namebuf = "tcn",
+				 .ci_type = &tcn_type,
+				 },
+		     },
+};
+
+static __init int configfs_init(void)
+{
+	int ret;
+	struct configfs_subsystem *subsys = &tcn_subsys;
+
+	config_group_init(&subsys->su_group);
+	init_MUTEX(&subsys->su_sem);
+	ret = configfs_register_subsystem(subsys);
+	if (ret) {
+		printk(KERN_ERR "Error %d while registering subsystem %s\n",
+		       ret, subsys->su_group.cg_item.ci_namebuf);
+		configfs_unregister_subsystem(&tcn_subsys);
+	}
+	return ret;
+}
+
+static void configfs_exit(void)
+{
+	configfs_unregister_subsystem(&tcn_subsys);
+}
+
 static struct Qdisc_class_ops netem_class_ops = {
 	.graft		=	netem_graft,
 	.leaf		=	netem_leaf,
@@ -740,11 +1180,17 @@ static struct Qdisc_ops netem_qdisc_ops 
 
 static int __init netem_module_init(void)
 {
+	int err;
+
 	pr_info("netem: version " VERSION "\n");
+	err = configfs_init();
+	if (err)
+		return err;
 	return register_qdisc(&netem_qdisc_ops);
 }
 static void __exit netem_module_exit(void)
 {
+	configfs_exit();
 	unregister_qdisc(&netem_qdisc_ops);
 }
 module_init(netem_module_init)

From baumann at tik.ee.ethz.ch  Tue Sep 26 13:17:57 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem:
	kernelspace
In-Reply-To: <20060925132800.09856e10@xxxxxxxxxxxxxxxxx>
References: <4514DC9A.2000505@xxxxxxxxxxxxxx>
	<20060925132800.09856e10@xxxxxxxxxxxxxxxxx>
Message-ID: <45198AF5.9090909@xxxxxxxxxxxxxx>

Hi Stephen,

We merged your changes into our patch
http://tcn.hypert.net/tcn_kernel_2_6_18.patch
Please let us know if we should make further adaptations to our
implementation and/or resubmit the adapted patch.

Cheers+thanx,
Rainer

Stephen Hemminger wrote:
> Some changes:
>
> 1. need to select CONFIGFS into configuration
> 2. don't add declarations after code.
> 3. use unsigned not int for counters and mask.
> 4. don't return a structure (ie pkt_delay)
> 5. use enum for magic values
> 6. don't use GFP_ATOMIC unless you have to
> 7. check error values on configfs_init
> 8. map initialization is unneeded. static's always init to zero.
>
> ------------------
> diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
> index d10f353..a51de64 100644
> --- a/include/linux/pkt_sched.h
> +++ b/include/linux/pkt_sched.h
> @@ -430,6 +430,8 @@ enum
>  	TCA_NETEM_DELAY_DIST,
>  	TCA_NETEM_REORDER,
>  	TCA_NETEM_CORRUPT,
> +	TCA_NETEM_TRACE,
> +	TCA_NETEM_STATS,
>  	__TCA_NETEM_MAX,
>  };
>  
> @@ -445,6 +447,35 @@ struct tc_netem_qopt
>  	__u32	jitter;		/* random jitter in latency (us) */
>  };
>  
> +struct tc_netem_stats
> +{
> +	int packetcount;
> +	int packetok;
> +	int normaldelay;
> +	int drops;
> +	int dupl;
> +	int corrupt;
> +	int novaliddata;
> +	int uninitialized;
> +	int bufferunderrun;
> +	int bufferinuseempty;
> +	int noemptybuffer;
> +	int readbehindbuffer;
> +	int buffer1_reloads;
> +	int buffer2_reloads;
> +	int tobuffer1_switch;
> +	int tobuffer2_switch;
> +	int switch_to_emptybuffer1;
> +	int switch_to_emptybuffer2;
> +};
> +
> +struct tc_netem_trace
> +{
> +	__u32   fid;             /* flowid */
> +	__u32   def;          	 /* default action 0 = no delay, 1 = drop */
> +	__u32   ticks;	         /* number of ticks corresponding to 1ms */
> +};
> +
>  struct tc_netem_corr
>  {
>  	__u32	delay_corr;	/* delay correlation */
> diff --git a/net/sched/Kconfig b/net/sched/Kconfig
> index 8298ea9..aee4bc6 100644
> --- a/net/sched/Kconfig
> +++ b/net/sched/Kconfig
> @@ -232,6 +232,7 @@ config NET_SCH_DSMARK
>  
>  config NET_SCH_NETEM
>  	tristate "Network emulator (NETEM)"
> +	select CONFIGFS_FS
>  	---help---
>  	  Say Y if you want to emulate network delay, loss, and packet
>  	  re-ordering. This is often useful to simulate networks when
> diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
> index 45939ba..521b9e3 100644
> --- a/net/sched/sch_netem.c
> +++ b/net/sched/sch_netem.c
> @@ -11,6 +11,9 @@
>   *
>   * Authors:	Stephen Hemminger <shemminger@xxxxxxxx>
>   *		Catalin(ux aka Dino) BOIE <catab at umbrella dot ro>
> + *              netem trace enhancement: Ariane Keller <arkeller@xxxxxxxxxx> ETH Zurich
> + *                                       Rainer Baumann <baumann@xxxxxxxxxx> ETH Zurich
> + *                                       Ulrich Fiedler <fiedler@xxxxxxxxxxxxxx> ETH Zurich
>   */
>  
>  #include <linux/module.h>
> @@ -21,10 +24,16 @@ #include <linux/errno.h>
>  #include <linux/netdevice.h>
>  #include <linux/skbuff.h>
>  #include <linux/rtnetlink.h>
> +#include <linux/init.h>
> +#include <linux/slab.h>
> +#include <linux/configfs.h>
> +#include <linux/vmalloc.h>
>  
>  #include <net/pkt_sched.h>
>  
> -#define VERSION "1.2"
> +#include "net/flowseed.h"
> +
> +#define VERSION "1.3"
>  
>  /*	Network Emulation Queuing algorithm.
>  	====================================
> @@ -50,6 +59,11 @@ #define VERSION "1.2"
>  
>  	 The simulator is limited by the Linux timer resolution
>  	 and will create packet bursts on the HZ boundary (1ms).
> +
> +	 The trace option allows us to read the values for packet delay,
> +	 duplication, loss and corruption from a tracefile. This permits
> +	 the modulation of statistical properties such as long-range
> +	 dependences. See http://tcn.hypert.net.
>  */
>  
>  struct netem_sched_data {
> @@ -65,6 +79,11 @@ struct netem_sched_data {
>  	u32 duplicate;
>  	u32 reorder;
>  	u32 corrupt;
> +	u32 tcnstop;
> +	u32 trace;
> +	u32 ticks;
> +	u32 def;
> +	u32 newdataneeded;
>  
>  	struct crndstate {
>  		unsigned long last;
> @@ -72,9 +91,13 @@ struct netem_sched_data {
>  	} delay_cor, loss_cor, dup_cor, reorder_cor, corrupt_cor;
>  
>  	struct disttable {
> -		u32  size;
> +		u32 size;
>  		s16 table[0];
>  	} *delay_dist;
> +
> +	struct tcn_statistic *statistic;
> +	struct tcn_control *flowbuffer;
> +	wait_queue_head_t my_event;
>  };
>  
>  /* Time stamp put into socket buffer control block */
> @@ -82,6 +105,18 @@ struct netem_skb_cb {
>  	psched_time_t	time_to_send;
>  };
>  
> +
> +struct confdata {
> +	int fid;
> +	struct netem_sched_data * sched_data;
> +};
> +
> +static struct confdata map[MAX_FLOWS];
> +
> +#define MASK_BITS	29
> +#define MASK_DELAY	((1<<MASK_BITS)-1)
> +#define MASK_HEAD       ~MASK_DELAY
> +
>  /* init_crandom - initialize correlated random number generator
>   * Use entropy source for initial seed.
>   */
> @@ -139,6 +174,103 @@ static long tabledist(unsigned long mu, 
>  	return  x / NETEM_DIST_SCALE + (sigma / NETEM_DIST_SCALE) * t + mu;
>  }
>  
> +/* don't call this function directly. It is called after
> + * a packet has been taken out of a buffer and it was the last.
> + */
> +static int reload_flowbuffer (struct netem_sched_data *q)
> +{
> +	struct tcn_control *flow = q->flowbuffer;
> +
> +	if (flow->buffer_in_use == flow->buffer1) {
> +		flow->buffer1_empty = flow->buffer1;
> +		if (flow->buffer2_empty) {
> +			q->statistic->switch_to_emptybuffer2++;
> +			return -EFAULT;
> +		}
> +
> +		q->statistic->tobuffer2_switch++;
> +
> +		flow->buffer_in_use = flow->buffer2;
> +		flow->offsetpos = flow->buffer2;
> +
> +	} else {
> +		flow->buffer2_empty = flow->buffer2;
> +
> +		if (flow->buffer1_empty) {
> +		 	q->statistic->switch_to_emptybuffer1++;
> +			return -EFAULT;
> +		}
> +
> +		q->statistic->tobuffer1_switch++;
> +
> +		flow->buffer_in_use = flow->buffer1;
> +		flow->offsetpos = flow->buffer1;
> +
> +	}
> +	/* the flowseed process can send more data */
> +	q->tcnstop = 0;
> +	q->newdataneeded = 1;
> +	wake_up(&q->my_event);
> +	return 0;
> +}
> +
> +/* return pktdelay with delay and drop/dupl/corrupt option */
> +static int get_next_delay(struct netem_sched_data *q, enum tcn_flow *head)
> +{
> +	struct tcn_control *flow = q->flowbuffer;
> +	u32 variout;
> +
> +	/* choose whether to drop or 0 delay packets on default */
> +	*head = q->def;
> +
> +	if (!flow) {
> +		printk(KERN_ERR "netem: read from an uninitialized flow.\n");
> +		q->statistic->uninitialized++;
> +		return 0;
> +	}
> +
> +	q->statistic->packetcount++;
> +
> +	/* check if we have to reload a buffer */
> +	if (flow->offsetpos - flow->buffer_in_use == DATA_PACKAGE)
> +		reload_flowbuffer(q);
> +
> +	/* sanity checks */
> +	if ((flow->buffer_in_use == flow->buffer1 && flow->validdataB1)
> +	    || ( flow->buffer_in_use == flow->buffer2 && flow->validdataB2)) {
> +
> +		if (flow->buffer1_empty && flow->buffer2_empty) {
> +			q->statistic->bufferunderrun++;
> +			return 0;
> +		}
> +
> +		if (flow->buffer1_empty == flow->buffer_in_use ||
> +		    flow->buffer2_empty == flow->buffer_in_use) {
> +			q->statistic->bufferinuseempty++;
> +			return 0;
> +		}
> +
> +		if (flow->offsetpos - flow->buffer_in_use >=
> +		    DATA_PACKAGE) {
> +			q->statistic->readbehindbuffer++;
> +			return 0;
> +		}
> +		/* end of tracefile reached */
> +	} else {
> +		q->statistic->novaliddata++;
> +		return 0;
> +	}
> +
> +	/* now it's safe to read */
> +	variout = *flow->offsetpos++;
> +	*head = (variout & MASK_HEAD) >> MASK_BITS;
> +
> +	(&q->statistic->normaldelay)[*head] += 1;
> +	q->statistic->packetok++;
> +
> +	return ((variout & MASK_DELAY) * q->ticks) / 1000;
> +}
> +
>  /*
>   * Insert one skb into qdisc.
>   * Note: parent depends on return value to account for queue length.
> @@ -148,20 +280,25 @@ static long tabledist(unsigned long mu, 
>  static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
>  {
>  	struct netem_sched_data *q = qdisc_priv(sch);
> -	/* We don't fill cb now as skb_unshare() may invalidate it */
>  	struct netem_skb_cb *cb;
>  	struct sk_buff *skb2;
> -	int ret;
> -	int count = 1;
> +	enum tcn_flow action = FLOW_NORMAL;
> +	psched_tdiff_t delay;
> +	int ret, count = 1;
>  
>  	pr_debug("netem_enqueue skb=%p\n", skb);
>  
> -	/* Random duplication */
> -	if (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor))
> +	if (q->trace)
> +		action = get_next_delay(q, &delay);
> +
> + 	/* Random duplication */
> +	if (q->trace ? action == FLOW_DUP :
> +	    (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor)))
>  		++count;
>  
>  	/* Random packet drop 0 => none, ~0 => all */
> -	if (q->loss && q->loss >= get_crandom(&q->loss_cor))
> +	if (q->trace ? action == FLOW_DROP :
> +	    (q->loss && q->loss >= get_crandom(&q->loss_cor)))
>  		--count;
>  
>  	if (count == 0) {
> @@ -190,7 +327,8 @@ static int netem_enqueue(struct sk_buff 
>  	 * If packet is going to be hardware checksummed, then
>  	 * do it now in software before we mangle it.
>  	 */
> -	if (q->corrupt && q->corrupt >= get_crandom(&q->corrupt_cor)) {
> +	if (q->trace ? action == FLOW_MANGLE :
> +	    (q->corrupt && q->corrupt >= get_crandom(&q->corrupt_cor))) {
>  		if (!(skb = skb_unshare(skb, GFP_ATOMIC))
>  		    || (skb->ip_summed == CHECKSUM_PARTIAL
>  			&& skb_checksum_help(skb))) {
> @@ -206,10 +344,10 @@ static int netem_enqueue(struct sk_buff 
>  	    || q->counter < q->gap 	/* inside last reordering gap */
>  	    || q->reorder < get_crandom(&q->reorder_cor)) {
>  		psched_time_t now;
> -		psched_tdiff_t delay;
>  
> -		delay = tabledist(q->latency, q->jitter,
> -				  &q->delay_cor, q->delay_dist);
> +		if (!q->trace)
> +			delay = tabledist(q->latency, q->jitter,
> +					  &q->delay_cor, q->delay_dist);
>  
>  		PSCHED_GET_TIME(now);
>  		PSCHED_TADD2(now, delay, cb->time_to_send);
> @@ -343,6 +481,65 @@ static int set_fifo_limit(struct Qdisc *
>  	return ret;
>  }
>  
> +static void reset_stats(struct netem_sched_data * q)
> +{
> +	memset(q->statistic, 0, sizeof(*(q->statistic)));
> +	return;
> +}
> +
> +static void free_flowbuffer(struct netem_sched_data * q)
> +{
> +	if (q->flowbuffer != NULL) {
> +		q->tcnstop = 1;
> +		q->newdataneeded = 1;
> +		wake_up(&q->my_event);
> +
> +		if (q->flowbuffer->buffer1 != NULL) {
> +			kfree(q->flowbuffer->buffer1);
> +		}
> +		if (q->flowbuffer->buffer2 != NULL) {
> +			kfree(q->flowbuffer->buffer2);
> +		}
> +		kfree(q->flowbuffer);
> +		kfree(q->statistic);
> +		q->flowbuffer = NULL;
> +		q->statistic = NULL;
> +	}
> +}
> +
> +static int init_flowbuffer(unsigned int fid, struct netem_sched_data * q)
> +{
> +	int i, flowid = -1;
> +
> +	q->statistic = kzalloc(sizeof(*(q->statistic)), GFP_KERNEL);
> +	init_waitqueue_head(&q->my_event);
> +
> +	for(i = 0; i < MAX_FLOWS; i++) {
> +		if(map[i].fid == 0) {
> +			flowid = i;
> +			map[i].fid = fid;
> +			map[i].sched_data = q;
> +			break;
> +		}
> +	}
> +
> +	if (flowid != -1) {
> +		q->flowbuffer = kmalloc(sizeof(*(q->flowbuffer)), GFP_KERNEL);
> +		q->flowbuffer->buffer1 = kmalloc(DATA_PACKAGE, GFP_KERNEL);
> +		q->flowbuffer->buffer2 = kmalloc(DATA_PACKAGE, GFP_KERNEL);
> +
> +		q->flowbuffer->buffer_in_use = q->flowbuffer->buffer1;
> +		q->flowbuffer->offsetpos = q->flowbuffer->buffer1;
> +		q->flowbuffer->buffer1_empty = q->flowbuffer->buffer1;
> +		q->flowbuffer->buffer2_empty = q->flowbuffer->buffer2;
> +		q->flowbuffer->flowid = flowid; 
> +		q->flowbuffer->validdataB1 = 0;
> +		q->flowbuffer->validdataB2 = 0;
> +	}
> +
> +	return flowid;
> +}
> +
>  /*
>   * Distribution data is a variable size payload containing
>   * signed 16 bit values.
> @@ -414,6 +611,32 @@ static int get_corrupt(struct Qdisc *sch
>  	return 0;
>  }
>  
> +static int get_trace(struct Qdisc *sch, const struct rtattr *attr)
> +{
> +	struct netem_sched_data *q = qdisc_priv(sch);
> +	const struct tc_netem_trace *traceopt = RTA_DATA(attr);
> +
> +	if (RTA_PAYLOAD(attr) != sizeof(*traceopt))
> +		return -EINVAL;
> +
> +	if (traceopt->fid) {
> +		/*correction us -> ticks*/
> +		q->ticks = traceopt->ticks;
> +		int ind;
> +		ind = init_flowbuffer(traceopt->fid, q);
> +		if(ind < 0) {
> +			printk("netem: maximum number of traces:%d"
> +			       " change in net/flowseedprocfs.h\n", MAX_FLOWS);
> +			return -EINVAL;
> +		}
> +		q->trace = ind + 1;
> +
> +	} else
> +		q->trace = 0;
> +	q->def = traceopt->def;
> +	return 0;
> +}
> +
>  /* Parse netlink message to set options */
>  static int netem_change(struct Qdisc *sch, struct rtattr *opt)
>  {
> @@ -431,6 +654,14 @@ static int netem_change(struct Qdisc *sc
>  		return ret;
>  	}
>  	
> +	if (q->trace) {
> +		int temp = q->trace - 1;
> +		q->trace = 0;
> +		map[temp].fid = 0;
> +		reset_stats(q);
> +		free_flowbuffer(q);
> +	}
> +
>  	q->latency = qopt->latency;
>  	q->jitter = qopt->jitter;
>  	q->limit = qopt->limit;
> @@ -477,6 +708,11 @@ static int netem_change(struct Qdisc *sc
>  			if (ret)
>  				return ret;
>  		}
> +		if (tb[TCA_NETEM_TRACE-1]) {
> +			ret = get_trace(sch, tb[TCA_NETEM_TRACE-1]);
> +			if (ret)
> +				return ret;
> +		}
>  	}
>  
>  	return 0;
> @@ -572,6 +808,7 @@ static int netem_init(struct Qdisc *sch,
>  	q->timer.function = netem_watchdog;
>  	q->timer.data = (unsigned long) sch;
>  
> +	q->trace = 0;
>  	q->qdisc = qdisc_create_dflt(sch->dev, &tfifo_qdisc_ops);
>  	if (!q->qdisc) {
>  		pr_debug("netem: qdisc create failed\n");
> @@ -590,6 +827,12 @@ static void netem_destroy(struct Qdisc *
>  {
>  	struct netem_sched_data *q = qdisc_priv(sch);
>  
> +	if (q->trace) {
> +		int temp = q->trace - 1;
> +		q->trace = 0;
> +		map[temp].fid = 0;
> +		free_flowbuffer(q);
> +	}
>  	del_timer_sync(&q->timer);
>  	qdisc_destroy(q->qdisc);
>  	kfree(q->delay_dist);
> @@ -604,6 +847,7 @@ static int netem_dump(struct Qdisc *sch,
>  	struct tc_netem_corr cor;
>  	struct tc_netem_reorder reorder;
>  	struct tc_netem_corrupt corrupt;
> +	struct tc_netem_trace traceopt;
>  
>  	qopt.latency = q->latency;
>  	qopt.jitter = q->jitter;
> @@ -626,6 +870,35 @@ static int netem_dump(struct Qdisc *sch,
>  	corrupt.correlation = q->corrupt_cor.rho;
>  	RTA_PUT(skb, TCA_NETEM_CORRUPT, sizeof(corrupt), &corrupt);
>  
> +	traceopt.fid = q->trace;
> +	traceopt.def = q->def;
> +	traceopt.ticks = q->ticks;
> +	RTA_PUT(skb, TCA_NETEM_TRACE, sizeof(traceopt), &traceopt);
> +
> +	if (q->trace) {
> +		struct tc_netem_stats tstats;
> +
> +		tstats.packetcount = q->statistic->packetcount;
> +		tstats.packetok = q->statistic->packetok;
> +		tstats.normaldelay = q->statistic->normaldelay;
> +		tstats.drops = q->statistic->drops;
> +		tstats.dupl = q->statistic->dupl;
> +		tstats.corrupt = q->statistic->corrupt;
> +		tstats.novaliddata = q->statistic->novaliddata;
> +		tstats.uninitialized = q->statistic->uninitialized;
> +		tstats.bufferunderrun = q->statistic->bufferunderrun;
> +		tstats.bufferinuseempty = q->statistic->bufferinuseempty;
> +		tstats.noemptybuffer = q->statistic->noemptybuffer;
> +		tstats.readbehindbuffer = q->statistic->readbehindbuffer;
> +		tstats.buffer1_reloads = q->statistic->buffer1_reloads;
> +		tstats.buffer2_reloads = q->statistic->buffer2_reloads;
> +		tstats.tobuffer1_switch = q->statistic->tobuffer1_switch;
> +		tstats.tobuffer2_switch = q->statistic->tobuffer2_switch;
> +		tstats.switch_to_emptybuffer1 = q->statistic->switch_to_emptybuffer1;
> +		tstats.switch_to_emptybuffer2 = q->statistic->switch_to_emptybuffer2;
> +		RTA_PUT(skb, TCA_NETEM_STATS, sizeof(tstats), &tstats);
> +	}
> +
>  	rta->rta_len = skb->tail - b;
>  
>  	return skb->len;
> @@ -709,6 +982,173 @@ static struct tcf_proto **netem_find_tcf
>  	return NULL;
>  }
>  
> +/*configfs to read tcn delay values from userspace*/
> +struct tcn_flow {
> +	struct config_item item;
> +};
> +
> +static struct tcn_flow *to_tcn_flow(struct config_item *item)
> +{
> +	return item ? container_of(item, struct tcn_flow, item) : NULL;
> +}
> +
> +static struct configfs_attribute tcn_flow_attr_storeme = {
> +	.ca_owner = THIS_MODULE,
> +	.ca_name = "delayvalue",
> +	.ca_mode = S_IRUGO | S_IWUSR,
> +};
> +
> +static struct configfs_attribute *tcn_flow_attrs[] = {
> +	&tcn_flow_attr_storeme,
> +	NULL,
> +};
> +
> +static ssize_t tcn_flow_attr_store(struct config_item *item,
> +				       struct configfs_attribute *attr,
> +				       const char *page, size_t count)
> +{
> +	char *p = (char *)page;
> +	int fid, i, validData = 0;
> +	int flowid = -1;
> +	struct tcn_control *checkbuf;
> +
> +	if (count != DATA_PACKAGE_ID) {
> +		printk("netem: Unexpected data received. %d\n", count);
> +		return -EMSGSIZE;
> +	}
> +
> +	memcpy(&fid, p + DATA_PACKAGE, sizeof(int));
> +	memcpy(&validData, p + DATA_PACKAGE + sizeof(int), sizeof(int));
> +
> +	/* check whether this flow is registered */
> +	for (i = 0; i < MAX_FLOWS; i++) {
> +		if (map[i].fid == fid) {
> +			flowid = i;
> +			break;
> +		}
> +	}
> +	/* exit if flow is not registered */
> +	if (flowid < 0) {
> +		printk("netem: Invalid FID received. Killing process.\n");
> +		return -EINVAL;
> +	}
> +
> +	checkbuf = map[flowid].sched_data->flowbuffer;
> +	if (checkbuf == NULL) {
> +		printk("netem: no flow registered");
> +		return -ENOBUFS;
> +	}
> +
> +	/* check if flowbuffer has empty buffer and copy data into it */
> +	if (checkbuf->buffer1_empty != NULL) {
> +		memcpy(checkbuf->buffer1, p, DATA_PACKAGE);
> +		checkbuf->buffer1_empty = NULL;
> +		checkbuf->validdataB1 = validData;
> +		map[flowid].sched_data->statistic->buffer1_reloads++;
> +
> +	} else if (checkbuf->buffer2_empty != NULL) {
> +		memcpy(checkbuf->buffer2, p, DATA_PACKAGE);
> +		checkbuf->buffer2_empty = NULL;
> +		checkbuf->validdataB2 = validData;
> +		map[flowid].sched_data->statistic->buffer2_reloads++;
> +
> +	} else {
> +		printk("netem: flow %d: no empty buffer. data loss.\n", flowid);
> +		map[flowid].sched_data->statistic->noemptybuffer++;
> +	}
> +
> +	if (validData) {
> +		/* on initialization both buffers need data */
> +		if (checkbuf->buffer2_empty != NULL) {
> +			return DATA_PACKAGE_ID;
> +		}
> +		/* wait until new data is needed */
> +		wait_event(map[flowid].sched_data->my_event,
> +			   map[flowid].sched_data->newdataneeded);
> +		map[flowid].sched_data->newdataneeded = 0;
> +
> +	}
> +
> +	if (map[flowid].sched_data->tcnstop) {
> +		return -ECANCELED;
> +	}
> +
> +	return DATA_PACKAGE_ID;
> +
> +}
> +
> +static void tcn_flow_release(struct config_item *item)
> +{
> +	kfree(to_tcn_flow(item));
> +
> +}
> +
> +static struct configfs_item_operations tcn_flow_item_ops = {
> +	.release = tcn_flow_release,
> +	.store_attribute = tcn_flow_attr_store,
> +};
> +
> +static struct config_item_type tcn_flow_type = {
> +	.ct_item_ops = &tcn_flow_item_ops,
> +	.ct_attrs = tcn_flow_attrs,
> +	.ct_owner = THIS_MODULE,
> +};
> +
> +static struct config_item * tcn_make_item(struct config_group *group,
> +						     const char *name)
> +{
> +	struct tcn_flow *tcn_flow;
> +
> +	tcn_flow = kmalloc(sizeof(struct tcn_flow), GFP_KERNEL);
> +	if (!tcn_flow)
> +		return NULL;
> +
> +	memset(tcn_flow, 0, sizeof(struct tcn_flow));
> +
> +	config_item_init_type_name(&tcn_flow->item, name,
> +				   &tcn_flow_type);
> +	return &tcn_flow->item;
> +}
> +
> +static struct configfs_group_operations tcn_group_ops = {
> +	.make_item = tcn_make_item,
> +};
> +
> +static struct config_item_type tcn_type = {
> +	.ct_group_ops = &tcn_group_ops,
> +	.ct_owner = THIS_MODULE,
> +};
> +
> +static struct configfs_subsystem tcn_subsys = {
> +	.su_group = {
> +		     .cg_item = {
> +				 .ci_namebuf = "tcn",
> +				 .ci_type = &tcn_type,
> +				 },
> +		     },
> +};
> +
> +static __init int configfs_init(void)
> +{
> +	int ret;
> +	struct configfs_subsystem *subsys = &tcn_subsys;
> +
> +	config_group_init(&subsys->su_group);
> +	init_MUTEX(&subsys->su_sem);
> +	ret = configfs_register_subsystem(subsys);
> +	if (ret) {
> +		printk(KERN_ERR "Error %d while registering subsystem %s\n",
> +		       ret, subsys->su_group.cg_item.ci_namebuf);
> +		configfs_unregister_subsystem(&tcn_subsys);
> +	}
> +	return ret;
> +}
> +
> +static void configfs_exit(void)
> +{
> +	configfs_unregister_subsystem(&tcn_subsys);
> +}
> +
>  static struct Qdisc_class_ops netem_class_ops = {
>  	.graft		=	netem_graft,
>  	.leaf		=	netem_leaf,
> @@ -740,11 +1180,17 @@ static struct Qdisc_ops netem_qdisc_ops 
>  
>  static int __init netem_module_init(void)
>  {
> +	int err;
> +
>  	pr_info("netem: version " VERSION "\n");
> +	err = configfs_init();
> +	if (err)
> +		return err;
>  	return register_qdisc(&netem_qdisc_ops);
>  }
>  static void __exit netem_module_exit(void)
>  {
> +	configfs_exit();
>  	unregister_qdisc(&netem_qdisc_ops);
>  }
>  module_init(netem_module_init)
>   
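
The store path in tcn_flow_attr_store() above fills whichever of the two trace buffers is currently empty and counts a loss when neither is. A minimal userspace sketch of that double-buffer rule follows. All names here (struct flow_buffer, flow_buffer_store(), PKG_SIZE) are hypothetical stand-ins for illustration, not identifiers from the patch, and the wait queue and statistics of the real code are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PKG_SIZE 16	/* hypothetical; the patch uses DATA_PACKAGE */

/* Hypothetical, simplified stand-in for the patch's flow buffer. */
struct flow_buffer {
	char buffer1[PKG_SIZE];
	char buffer2[PKG_SIZE];
	int buffer1_empty;	/* non-zero while buffer1 may be refilled */
	int buffer2_empty;	/* non-zero while buffer2 may be refilled */
	unsigned int lost;	/* packages dropped because both were full */
};

static void flow_buffer_init(struct flow_buffer *fb)
{
	memset(fb, 0, sizeof(*fb));
	fb->buffer1_empty = 1;
	fb->buffer2_empty = 1;
}

/* Copy one package into the first empty buffer.
 * Returns 1 or 2 for the buffer that was filled, 0 if both were full. */
static int flow_buffer_store(struct flow_buffer *fb, const char *data)
{
	assert(fb != NULL && data != NULL);

	if (fb->buffer1_empty) {
		memcpy(fb->buffer1, data, PKG_SIZE);
		fb->buffer1_empty = 0;
		return 1;
	}
	if (fb->buffer2_empty) {
		memcpy(fb->buffer2, data, PKG_SIZE);
		fb->buffer2_empty = 0;
		return 2;
	}
	fb->lost++;	/* no empty buffer: the package is dropped */
	return 0;
}
```

In this sketch the consumer side would set buffer1_empty or buffer2_empty again once it has read a buffer to the end, which is the point where the patch wakes the writer blocked in wait_event().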



From shemminger at osdl.org  Tue Sep 26 13:45:31 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem:
 kernelspace
In-Reply-To: <45198AF5.9090909@xxxxxxxxxxxxxx>
References: <4514DC9A.2000505@xxxxxxxxxxxxxx>
	<20060925132800.09856e10@xxxxxxxxxxxxxxxxx>
	<45198AF5.9090909@xxxxxxxxxxxxxx>
Message-ID: <20060926134531.3ec4991a@freekitty>

On Tue, 26 Sep 2006 22:17:57 +0200
Rainer Baumann <baumann@xxxxxxxxxxxxxx> wrote:

> Hi Stephen
> 
> We merged your changes into our patch
> http://tcn.hypert.net/tcn_kernel_2_6_18.patch
> Please let us know if we should make further adaptations to our
> implementation and/or resubmit the adapted patch.
> 
> Cheers+thanx,
> Rainer

I'll test it out and send it off to Dave for 2.6.20; 2.6.19 is so much in
flux right now that adding more does not seem like a good idea.

From davem at davemloft.net  Tue Sep 26 14:03:21 2006
From: davem at davemloft.net (David Miller)
Date: Wed Apr 18 17:37:49 2007
Subject: [PATCH 2.6.17.13 2/2] LARTC: trace control for netem:
 kernelspace
In-Reply-To: <20060926134531.3ec4991a@freekitty>
References: <20060925132800.09856e10@xxxxxxxxxxxxxxxxx>
	<45198AF5.9090909@xxxxxxxxxxxxxx>
	<20060926134531.3ec4991a@freekitty>
Message-ID: <20060926.140321.70217341.davem@xxxxxxxxxxxxx>

From: Stephen Hemminger <shemminger@xxxxxxxx>
Date: Tue, 26 Sep 2006 13:45:31 -0700

> I'll test it out, and send off to Dave for 2.6.20, 2.6.19 is so in
> flux right now that adding more seems not like a good idea.

I'm willing to accept anything reasonable until approximately
this weekend.

From shemminger at osdl.org  Tue Sep 26 16:02:38 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 17:37:49 2007
Subject: status of  phpnetemgui?
In-Reply-To: <p062309cac13f5951821f@[171.69.52.91]>
References: <p062309cac13f5951821f@[171.69.52.91]>
Message-ID: <20060926160238.04b1e8fc@freekitty>

On Tue, 26 Sep 2006 17:31:31 -0500
"Lawrence D. Dunn" <ldunn@xxxxxxxxx> wrote:

> Stephen,
>    Hi- I'm Larry Dunn (day job at Cisco),
>    writing to see if phpnetemgui is still around,
>    or has evolved/been_replaced.
>    I'd be using it for a networking class
>    I teach at University of Minnesota (night job). ;-)
> 
>    From your LCA2005_netem paper, I checked:
> 
>    http://www.smyles.plus.com/phpnetemgui/
> 
>    but that page shows up as not-found,
>    and a couple google searches don't show a new location for it.
>    I'll have students setting delay and loss for a fairly
>    easy experiment (and using web100 to see impact of buffer tuning).
>    I can resort to using the tc-commands directly, but was wondering
>    if you know the status of the GUI?
> 

If someone has a copy, I'll host it at osdl and add a link in the Wiki.


-- 
Stephen Hemminger <shemminger@xxxxxxxx>

From shemminger at osdl.org  Fri Sep 29 10:35:26 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 17:37:50 2007
Subject: Netem and HRTimers ?
In-Reply-To: <20060929171541.GA5745@xxxxxxxxxxxxxxxxxxxxx>
References: <20060929165419.GA4803@xxxxxxxxxxxxxxxxxxxxx>
	<20060929101316.12e85a6f@freekitty>
	<20060929171541.GA5745@xxxxxxxxxxxxxxxxxxxxx>
Message-ID: <20060929103526.2530894b@freekitty>

On Fri, 29 Sep 2006 19:15:41 +0200
Lucas Nussbaum <lucas.nussbaum@xxxxxxx> wrote:

> On 29/09/06 at 10:13 -0700, Stephen Hemminger wrote:
> > On Fri, 29 Sep 2006 18:54:19 +0200
> > Lucas Nussbaum <lucas.nussbaum@xxxxxxx> wrote:
> > 
> > > Hi,
> > > 
> > > I am currently working on a paper comparing Dummynet, NISTNet and
> > > TC/Netem both regarding features and regarding precision/performance.
> > > 
> > > My experiments show how important precise timing is when doing network
> > > emulation, and precision with HZ=1000 is not that good compared to
> > > NISTNet (which uses the RTC configured at 8192 Hz) or Dummynet (which
> > > can run on FreeBSD with HZ=10000). I understand that increasing HZ to
> > > e.g. 10000 in Linux is not really an option, both because many parts of
> > > the kernel assume that HZ is "small", and because of the performance
> > > impact of such a setting.
> > > 
> > > Another solution could be to use the high resolution timers
> > > infrastructure. Have you already considered that for netem ? Do you think
> > > it would be applicable to Netem ? If yes, are you planning to work on
> > > this ?
> > 
> > I have a lightly tested version using hrtimers. If you want to play
> > with it, I'll send it.
>  
> Hi,
> 
> That would be great, thank you.

Here is where it was when I last left it...

--- rt-netem.orig/net/sched/sch_netem.c
+++ rt-netem/net/sched/sch_netem.c
@@ -25,7 +25,7 @@
 
 #include <net/pkt_sched.h>
 
-#define VERSION "1.2"
+#define VERSION "1.2-rt"
 
 /*	Network Emulation Queuing algorithm.
 	====================================
@@ -55,7 +55,7 @@
 
 struct netem_sched_data {
 	struct Qdisc	*qdisc;
-	struct timer_list timer;
+	struct hrtimer   timer;
 
 	u32 latency;
 	u32 loss;
@@ -80,7 +80,7 @@ struct netem_sched_data {
 
 /* Time stamp put into socket buffer control block */
 struct netem_skb_cb {
-	psched_time_t	time_to_send;
+	ktime_t	due_time;
 };
 
 /* init_crandom - initialize correlated random number generator
@@ -204,14 +204,15 @@ static int netem_enqueue(struct sk_buff 
 	if (q->gap == 0 		/* not doing reordering */
 	    || q->counter < q->gap 	/* inside last reordering gap */
 	    || q->reorder < get_crandom(&q->reorder_cor)) {
-		psched_time_t now;
-		psched_tdiff_t delay;
+		u32 us;
 
-		delay = tabledist(q->latency, q->jitter,
+		us = tabledist(q->latency, q->jitter,
 				  &q->delay_cor, q->delay_dist);
 
-		PSCHED_GET_TIME(now);
-		PSCHED_TADD2(now, delay, cb->time_to_send);
+
+		cb->due_time = ktime_add_ns(get_monotonic_clock(),
+					    (u64) us * 1000u);
+
 		++q->counter;
 		ret = q->qdisc->enqueue(skb, q->qdisc);
 	} else {
@@ -219,7 +220,7 @@ static int netem_enqueue(struct sk_buff 
 		 * Do re-ordering by putting one out of N packets at the front
 		 * of the queue.
 		 */
-		PSCHED_GET_TIME(cb->time_to_send);
+		cb->due_time = get_monotonic_clock();
 		q->counter = 0;
 		ret = q->qdisc->ops->requeue(skb, q->qdisc);
 	}
@@ -270,44 +271,46 @@ static struct sk_buff *netem_dequeue(str
 	if (skb) {
 		const struct netem_skb_cb *cb
 			= (const struct netem_skb_cb *)skb->cb;
-		psched_time_t now;
+		ktime_t now = get_monotonic_clock();
+		s64 delta;
 
-		/* if more time remaining? */
-		PSCHED_GET_TIME(now);
+		delta = ktime_to_ns(ktime_sub(cb->due_time, now));
 
-		if (PSCHED_TLESS(cb->time_to_send, now)) {
+		/* if more time remaining? */
+		if (delta <= 0) {
 			pr_debug("netem_dequeue: return skb=%p\n", skb);
 			sch->q.qlen--;
 			sch->flags &= ~TCQ_F_THROTTLED;
 			return skb;
-		} else {
-			psched_tdiff_delay = PSCHED_TDIFF(cb->time_to_send, now);
-
-			if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) {
-				sch->qstats.drops++;
+		}
 
-				/* After this qlen is confused */
-				printk(KERN_ERR "netem: queue discpline %s could not requeue\n",
-				       q->qdisc->ops->id);
+		if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) {
+			sch->qstats.drops++;
 
-				sch->q.qlen--;
-			}
+			/* After this qlen is confused */
+			printk(KERN_ERR "netem: queue discpline %s could not requeue\n",
+			       q->qdisc->ops->id);
 
-			mod_timer(&q->timer, jiffies + PSCHED_US2JIFFIE(delay));
-			sch->flags |= TCQ_F_THROTTLED;
+			sch->q.qlen--;
 		}
+
+		hrtimer_start(&q->timer, ktime_add_ns(now, delta), HRTIMER_ABS);
+		sch->flags |= TCQ_F_THROTTLED;
 	}
 
 	return NULL;
 }
 
-static void netem_watchdog(unsigned long arg)
+static int netem_watchdog(struct hrtimer *hrt)
 {
-	struct Qdisc *sch = (struct Qdisc *)arg;
+	struct netem_sched_data *q
+		= container_of(hrt, struct netem_sched_data, timer);
+	struct Qdisc *sch = q->qdisc;
 
 	pr_debug("netem_watchdog qlen=%d\n", sch->q.qlen);
 	sch->flags &= ~TCQ_F_THROTTLED;
 	netif_schedule(sch->dev);
+	return HRTIMER_NORESTART;
 }
 
 static void netem_reset(struct Qdisc *sch)
@@ -317,7 +320,7 @@ static void netem_reset(struct Qdisc *sc
 	qdisc_reset(q->qdisc);
 	sch->q.qlen = 0;
 	sch->flags &= ~TCQ_F_THROTTLED;
-	del_timer_sync(&q->timer);
+	hrtimer_cancel(&q->timer);
 }
 
 /* Pass size change message down to embedded FIFO */
@@ -430,8 +433,9 @@ static int netem_change(struct Qdisc *sc
 		return ret;
 	}
 	
-	q->latency = qopt->latency;
-	q->jitter = qopt->jitter;
+	/* Note: we force PSCHED clock to use gettimeofday so these are in us. */
+	q->latency = psched_ticks2usecs(qopt->latency);
+	q->jitter = psched_ticks2usecs(qopt->jitter);
 	q->limit = qopt->limit;
 	q->gap = qopt->gap;
 	q->counter = 0;
@@ -502,7 +506,8 @@ static int tfifo_enqueue(struct sk_buff 
 			const struct netem_skb_cb *cb
 				= (const struct netem_skb_cb *)skb->cb;
 
-			if (!PSCHED_TLESS(ncb->time_to_send, cb->time_to_send))
+			if (ktime_to_ns(ktime_sub(ncb->due_time,
+						  cb->due_time)) >= 0)
 				break;
 		}
 
@@ -567,9 +572,8 @@ static int netem_init(struct Qdisc *sch,
 	if (!opt)
 		return -EINVAL;
 
-	init_timer(&q->timer);
+	hrtimer_init(&q->timer, CLOCK_MONOTONIC, HRTIMER_ABS);
 	q->timer.function = netem_watchdog;
-	q->timer.data = (unsigned long) sch;
 
 	q->qdisc = qdisc_create_dflt(sch->dev, &tfifo_qdisc_ops);
 	if (!q->qdisc) {
@@ -589,7 +593,7 @@ static void netem_destroy(struct Qdisc *
 {
 	struct netem_sched_data *q = qdisc_priv(sch);
 
-	del_timer_sync(&q->timer);
+	hrtimer_cancel(&q->timer);
 	qdisc_destroy(q->qdisc);
 	kfree(q->delay_dist);
 }
@@ -604,8 +608,8 @@ static int netem_dump(struct Qdisc *sch,
 	struct tc_netem_reorder reorder;
 	struct tc_netem_corrupt corrupt;
 
-	qopt.latency = q->latency;
-	qopt.jitter = q->jitter;
+	qopt.latency = psched_usecs2ticks(q->latency);
+	qopt.jitter = psched_usecs2ticks(q->jitter);
 	qopt.limit = q->limit;
 	qopt.loss = q->loss;
 	qopt.gap = q->gap;
--- rt-netem.orig/include/net/pkt_sched.h
+++ rt-netem/include/net/pkt_sched.h
@@ -238,4 +238,7 @@ static inline unsigned psched_mtu(struct
 	return dev->hard_header ? mtu + dev->hard_header_len : mtu;
 }
 
+extern unsigned long psched_ticks2usec(unsigned long ticks);
+extern unsigned long psched_usec2ticks(unsigned long us);
+
 #endif
--- rt-netem.orig/net/sched/sch_api.c
+++ rt-netem/net/sched/sch_api.c
@@ -43,6 +43,7 @@
 #include <asm/processor.h>
 #include <asm/uaccess.h>
 #include <asm/system.h>
+#include <asm/div64.h>
 
 static int qdisc_notify(struct sk_buff *oskb, struct nlmsghdr *n, u32 clid,
 			struct Qdisc *old, struct Qdisc *new);
@@ -1154,6 +1155,28 @@ reclassify:
 static int psched_us_per_tick = 1;
 static int psched_tick_per_us = 1;
 
+/* Convert from scaled PSCHED ticks to real time usecs */
+unsigned long psched_ticks2usecs(unsigned long ticks)
+{
+	u64 t = ticks;
+
+	t *= psched_us_per_tick;
+	do_div(t, psched_tick_per_us);
+	return t;
+}
+EXPORT_SYMBOL(psched_ticks2usecs);
+
+/* Convert from usecs to scaled PSCHED ticks */
+unsigned long psched_usecs2ticks(unsigned long us)
+{
+	u64 t = us;
+
+	t *= psched_tick_per_us;
+	do_div(t, psched_us_per_tick);
+	return t;
+}
+EXPORT_SYMBOL(psched_usecs2ticks);
+
 #ifdef CONFIG_PROC_FS
 static int psched_show(struct seq_file *seq, void *v)
 {
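
The two helpers added to sch_api.c widen the value to 64 bits before multiplying, so the intermediate product cannot wrap where unsigned long is 32 bits. A minimal userspace sketch of the same arithmetic is given here. The function names and the explicit scale parameters are assumptions made for illustration: the kernel code reads the static psched_us_per_tick and psched_tick_per_us variables and divides with do_div().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace analogue of psched_ticks2usecs(): widen to
 * 64 bits, multiply, then divide. */
static uint64_t ticks_to_usecs(uint32_t ticks, uint32_t us_per_tick,
			       uint32_t tick_per_us)
{
	uint64_t t = ticks;

	assert(tick_per_us != 0);	/* divisor must be non-zero */
	t *= us_per_tick;
	return t / tick_per_us;
}

/* Hypothetical userspace analogue of psched_usecs2ticks(). */
static uint64_t usecs_to_ticks(uint32_t us, uint32_t us_per_tick,
			       uint32_t tick_per_us)
{
	uint64_t t = us;

	assert(us_per_tick != 0);	/* divisor must be non-zero */
	t *= tick_per_us;
	return t / us_per_tick;
}
```

With both scale factors equal to 1, as in the initializers shown in the hunk, each function returns its input unchanged. Integer division truncates, so a round trip is exact only when the intermediate value divides evenly.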

From shemminger at osdl.org  Fri Sep 29 11:08:01 2006
From: shemminger at osdl.org (Stephen Hemminger)
Date: Wed Apr 18 17:37:50 2007
Subject: Netem and HRTimers ?
In-Reply-To: <20060929171541.GA5745@xxxxxxxxxxxxxxxxxxxxx>
References: <20060929165419.GA4803@xxxxxxxxxxxxxxxxxxxxx>
	<20060929101316.12e85a6f@freekitty>
	<20060929171541.GA5745@xxxxxxxxxxxxxxxxxxxxx>
Message-ID: <20060929110801.0716df79@freekitty>

On Fri, 29 Sep 2006 19:15:41 +0200
Lucas Nussbaum <lucas.nussbaum@xxxxxxx> wrote:

> On 29/09/06 at 10:13 -0700, Stephen Hemminger wrote:
> > On Fri, 29 Sep 2006 18:54:19 +0200
> > Lucas Nussbaum <lucas.nussbaum@xxxxxxx> wrote:
> > 
> > > Hi,
> > > 
> > > I am currently working on a paper comparing Dummynet, NISTNet and
> > > TC/Netem both regarding features and regarding precision/performance.
> > > 
> > > My experiments show how important precise timing is when doing network
> > > emulation, and precision with HZ=1000 is not that good compared to
> > > NISTNet (which uses the RTC configured at 8192 Hz) or Dummynet (which
> > > can run on FreeBSD with HZ=10000). I understand that increasing HZ to
> > > e.g. 10000 in Linux is not really an option, both because many parts of
> > > the kernel assume that HZ is "small", and because of the performance
> > > impact of such a setting.
> > > 
> > > Another solution could be to use the high resolution timers
> > > infrastructure. Have you already considered that for netem ? Do you think
> > > it would be applicable to Netem ? If yes, are you planning to work on
> > > this ?
> > 
> > I have a lightly tested version using hrtimers. If you want to play
> > with it, I'll send it.
>  
> Hi,
> 
> That would be great, thank you.
> 
> Which kernel version do you target for inclusion ?

I fixed some typos and it builds against 2.6.18-rt5...
NOT tested, but it is a starting point.

---
 include/net/pkt_sched.h |    3 +
 kernel/hrtimer.c        |    1 
 net/sched/sch_api.c     |   23 ++++++++++++++
 net/sched/sch_netem.c   |   77 ++++++++++++++++++++++++------------------------
 4 files changed, 67 insertions(+), 37 deletions(-)

--- linux-2.6.18-rt.orig/net/sched/sch_netem.c	2006-09-19 20:42:06.000000000 -0700
+++ linux-2.6.18-rt/net/sched/sch_netem.c	2006-09-29 11:06:11.000000000 -0700
@@ -24,7 +24,7 @@
 
 #include <net/pkt_sched.h>
 
-#define VERSION "1.2"
+#define VERSION "1.2-rt"
 
 /*	Network Emulation Queuing algorithm.
 	====================================
@@ -54,7 +54,7 @@
 
 struct netem_sched_data {
 	struct Qdisc	*qdisc;
-	struct timer_list timer;
+	struct hrtimer   timer;
 
 	u32 latency;
 	u32 loss;
@@ -79,7 +79,7 @@
 
 /* Time stamp put into socket buffer control block */
 struct netem_skb_cb {
-	psched_time_t	time_to_send;
+	ktime_t	due_time;
 };
 
 /* init_crandom - initialize correlated random number generator
@@ -205,14 +205,14 @@
 	if (q->gap == 0 		/* not doing reordering */
 	    || q->counter < q->gap 	/* inside last reordering gap */
 	    || q->reorder < get_crandom(&q->reorder_cor)) {
-		psched_time_t now;
-		psched_tdiff_t delay;
+		u64 ns;
 
-		delay = tabledist(q->latency, q->jitter,
-				  &q->delay_cor, q->delay_dist);
+		ns = tabledist(q->latency, q->jitter,
+			       &q->delay_cor, q->delay_dist) * 1000ul;
+
+
+		cb->due_time = ktime_add_ns(ktime_get(), ns);
 
-		PSCHED_GET_TIME(now);
-		PSCHED_TADD2(now, delay, cb->time_to_send);
 		++q->counter;
 		ret = q->qdisc->enqueue(skb, q->qdisc);
 	} else {
@@ -220,7 +220,7 @@
 		 * Do re-ordering by putting one out of N packets at the front
 		 * of the queue.
 		 */
-		PSCHED_GET_TIME(cb->time_to_send);
+		cb->due_time = ktime_get();
 		q->counter = 0;
 		ret = q->qdisc->ops->requeue(skb, q->qdisc);
 	}
@@ -271,44 +271,46 @@
 	if (skb) {
 		const struct netem_skb_cb *cb
 			= (const struct netem_skb_cb *)skb->cb;
-		psched_time_t now;
+		ktime_t now = ktime_get();
+		s64 delta;
 
-		/* if more time remaining? */
-		PSCHED_GET_TIME(now);
+		delta = ktime_to_ns(ktime_sub(cb->due_time, now));
 
-		if (PSCHED_TLESS(cb->time_to_send, now)) {
+		/* if more time remaining? */
+		if (delta <= 0) {
 			pr_debug("netem_dequeue: return skb=%p\n", skb);
 			sch->q.qlen--;
 			sch->flags &= ~TCQ_F_THROTTLED;
 			return skb;
-		} else {
-			psched_tdiff_delay = PSCHED_TDIFF(cb->time_to_send, now);
-
-			if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) {
-				sch->qstats.drops++;
+		}
 
-				/* After this qlen is confused */
-				printk(KERN_ERR "netem: queue discpline %s could not requeue\n",
-				       q->qdisc->ops->id);
+		if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) {
+			sch->qstats.drops++;
 
-				sch->q.qlen--;
-			}
+			/* After this qlen is confused */
+			printk(KERN_ERR "netem: queue discpline %s could not requeue\n",
+			       q->qdisc->ops->id);
 
-			mod_timer(&q->timer, jiffies + PSCHED_US2JIFFIE(delay));
-			sch->flags |= TCQ_F_THROTTLED;
+			sch->q.qlen--;
 		}
+
+		hrtimer_start(&q->timer, ktime_add_ns(now, delta), HRTIMER_ABS);
+		sch->flags |= TCQ_F_THROTTLED;
 	}
 
 	return NULL;
 }
 
-static void netem_watchdog(unsigned long arg)
+static int netem_watchdog(struct hrtimer *hrt)
 {
-	struct Qdisc *sch = (struct Qdisc *)arg;
+	struct netem_sched_data *q
+		= container_of(hrt, struct netem_sched_data, timer);
+	struct Qdisc *sch = q->qdisc;
 
 	pr_debug("netem_watchdog qlen=%d\n", sch->q.qlen);
 	sch->flags &= ~TCQ_F_THROTTLED;
 	netif_schedule(sch->dev);
+	return HRTIMER_NORESTART;
 }
 
 static void netem_reset(struct Qdisc *sch)
@@ -318,7 +320,7 @@
 	qdisc_reset(q->qdisc);
 	sch->q.qlen = 0;
 	sch->flags &= ~TCQ_F_THROTTLED;
-	del_timer_sync(&q->timer);
+	hrtimer_cancel(&q->timer);
 }
 
 /* Pass size change message down to embedded FIFO */
@@ -431,8 +433,9 @@
 		return ret;
 	}
 	
-	q->latency = qopt->latency;
-	q->jitter = qopt->jitter;
+	/* Note: we force PSCHED clock to use gettimeofday so these are in us. */
+	q->latency = psched_ticks2usec(qopt->latency);
+	q->jitter = psched_ticks2usec(qopt->jitter);
 	q->limit = qopt->limit;
 	q->gap = qopt->gap;
 	q->counter = 0;
@@ -503,7 +506,8 @@
 			const struct netem_skb_cb *cb
 				= (const struct netem_skb_cb *)skb->cb;
 
-			if (!PSCHED_TLESS(ncb->time_to_send, cb->time_to_send))
+			if (ktime_to_ns(ktime_sub(ncb->due_time,
+						  cb->due_time)) >= 0)
 				break;
 		}
 
@@ -568,9 +572,8 @@
 	if (!opt)
 		return -EINVAL;
 
-	init_timer(&q->timer);
+	hrtimer_init(&q->timer, CLOCK_MONOTONIC, HRTIMER_ABS);
 	q->timer.function = netem_watchdog;
-	q->timer.data = (unsigned long) sch;
 
 	q->qdisc = qdisc_create_dflt(sch->dev, &tfifo_qdisc_ops);
 	if (!q->qdisc) {
@@ -590,7 +593,7 @@
 {
 	struct netem_sched_data *q = qdisc_priv(sch);
 
-	del_timer_sync(&q->timer);
+	hrtimer_cancel(&q->timer);
 	qdisc_destroy(q->qdisc);
 	kfree(q->delay_dist);
 }
@@ -605,8 +608,8 @@
 	struct tc_netem_reorder reorder;
 	struct tc_netem_corrupt corrupt;
 
-	qopt.latency = q->latency;
-	qopt.jitter = q->jitter;
+	qopt.latency = psched_usec2ticks(q->latency);
+	qopt.jitter = psched_usec2ticks(q->jitter);
 	qopt.limit = q->limit;
 	qopt.loss = q->loss;
 	qopt.gap = q->gap;
--- linux-2.6.18-rt.orig/include/net/pkt_sched.h	2006-09-19 20:42:06.000000000 -0700
+++ linux-2.6.18-rt/include/net/pkt_sched.h	2006-09-29 10:33:48.000000000 -0700
@@ -239,4 +239,7 @@
 	return dev->hard_header ? mtu + dev->hard_header_len : mtu;
 }
 
+extern unsigned long psched_ticks2usec(unsigned long ticks);
+extern unsigned long psched_usec2ticks(unsigned long us);
+
 #endif
--- linux-2.6.18-rt.orig/net/sched/sch_api.c	2006-09-19 20:42:06.000000000 -0700
+++ linux-2.6.18-rt/net/sched/sch_api.c	2006-09-29 10:33:48.000000000 -0700
@@ -42,6 +42,7 @@
 #include <asm/processor.h>
 #include <asm/uaccess.h>
 #include <asm/system.h>
+#include <asm/div64.h>
 
 static int qdisc_notify(struct sk_buff *oskb, struct nlmsghdr *n, u32 clid,
 			struct Qdisc *old, struct Qdisc *new);
@@ -1153,6 +1154,28 @@
 static int psched_us_per_tick = 1;
 static int psched_tick_per_us = 1;
 
+/* Convert from scaled PSCHED ticks to real time usecs */
+unsigned long psched_ticks2usecs(unsigned long ticks)
+{
+	u64 t = ticks;
+
+	t *= psched_us_per_tick;
+	do_div(t, psched_tick_per_us);
+	return t;
+}
+EXPORT_SYMBOL(psched_ticks2usecs);
+
+/* Convert from usecs to scaled PSCHED ticks */
+unsigned long psched_usecs2ticks(unsigned long us)
+{
+	u64 t = us;
+
+	t *= psched_tick_per_us;
+	do_div(t, psched_us_per_tick);
+	return t;
+}
+EXPORT_SYMBOL(psched_usecs2ticks);
+
 #ifdef CONFIG_PROC_FS
 static int psched_show(struct seq_file *seq, void *v)
 {
--- linux-2.6.18-rt.orig/kernel/hrtimer.c	2006-09-29 10:59:29.000000000 -0700
+++ linux-2.6.18-rt/kernel/hrtimer.c	2006-09-29 11:00:25.000000000 -0700
@@ -58,6 +58,7 @@
 
 	return timespec_to_ktime(now);
 }
+EXPORT_SYMBOL_GPL(ktime_get);
 
 /**
  * ktime_get_real - get the real (wall-) time in ktime_t format
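
The core of the change in netem_enqueue() and netem_dequeue() is to stamp each packet with an absolute due time in nanoseconds and, at dequeue, send it once the difference to the current time is no longer positive. The sketch below restates that decision in userspace terms. The type and function names are hypothetical, and plain int64_t nanosecond counts stand in for ktime_t. It widens the microsecond delay to 64 bits before scaling by 1000, as the first version of the patch does with its (u64) cast; whether the `* 1000ul` form in this version can wrap depends on the width of long on the target, which is only an observation, not a tested result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ktime_t: nanoseconds on a monotonic clock. */
typedef int64_t ns_time_t;

/* Due time for a packet enqueued at 'now' with a delay in microseconds. */
static ns_time_t due_time_ns(ns_time_t now, uint32_t delay_us)
{
	assert(now >= 0);
	/* widen before scaling so large delays cannot wrap in 32 bits */
	return now + (int64_t)delay_us * 1000;
}

/* Returns 0 when the packet may be sent now, otherwise the number of
 * nanoseconds still to wait (the value a timer would be armed with). */
static int64_t time_remaining_ns(ns_time_t due, ns_time_t now)
{
	int64_t delta = due - now;

	return delta > 0 ? delta : 0;
}
```

A caller would send the packet when time_remaining_ns() returns 0 and otherwise requeue it and arm a timer for the returned interval, mirroring the `delta <= 0` test in the patch.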

From baumann at tik.ee.ethz.ch  Fri Sep 29 13:49:42 2006
From: baumann at tik.ee.ethz.ch (Rainer Baumann)
Date: Wed Apr 18 17:37:50 2007
Subject: status of  phpnetemgui?
In-Reply-To: <20060926160238.04b1e8fc@freekitty>
References: <p062309cac13f5951821f@[171.69.52.91]>
	<20060926160238.04b1e8fc@freekitty>
Message-ID: <451D86E6.7000403@xxxxxxxxxxxxxx>

We provide a copy of phpnetemgui on our website:
* http://tcn.hypert.net/phpnetemgui-0.9.tar.bz2
An extended version that includes our trace control is available at:
* http://tcn.hypert.net/phpnetemgui-0.10.tar.gz

----------------------------------------------------------------------

Rainer Baumann
Master of Science ETH in Computer Science and Teaching
University Lecturer @ HSR

Computer Engineering and Network Laboratory
ETH Zentrum ETZ G60.1
Gloriastrasse 35
CH-8092 Zurich
Switzerland

Phone  +41 44 632 51 87
Mobile +41 79 263 81 40
Fax    +41 44 632 10 35
Email  baumann@xxxxxxxxxxxxxx 



Stephen Hemminger wrote:
> On Tue, 26 Sep 2006 17:31:31 -0500
> "Lawrence D. Dunn" <ldunn@xxxxxxxxx> wrote:
>
>   
>> Stephen,
>>    Hi- I'm Larry Dunn (day job at Cisco),
>>    writing to see if phpnetemgui is still around,
>>    or has evolved/been_replaced.
>>    I'd be using it for a networking class
>>    I teach at University of Minnesota (night job). ;-)
>>
>>    From your LCA2005_netem paper, I checked:
>>
>>    http://www.smyles.plus.com/phpnetemgui/
>>
>>    but that page shows up as not-found,
>>    and a couple google searches don't show a new location for it.
>>    I'll have students setting delay and loss for a fairly
>>    easy experiment (and using web100 to see impact of buffer tuning).
>>    I can resort to using the tc-commands directly, but was wondering
>>    if you know the status of the GUI?
>>
>>     
>
> If someone has a copy, I'll host it at osdl and add a link in the Wiki.
>
>
>   



From d.miras at cs.ucl.ac.uk  Sat Sep 30 05:45:23 2006
From: d.miras at cs.ucl.ac.uk (Dimitrios Miras)
Date: Wed Apr 18 17:37:50 2007
Subject: Log netem queue statistics?
In-Reply-To: <451D86E6.7000403@xxxxxxxxxxxxxx>
References: <p062309cac13f5951821f@[171.69.52.91]>
	<20060926160238.04b1e8fc@freekitty>
	<451D86E6.7000403@xxxxxxxxxxxxxx>
Message-ID: <451E66E3.9060809@xxxxxxxxxxxx>

Hi,

I'm using netem with fifo queues to emulate a network, but I'd like to 
gather info about the fifo queue dynamics (size over time, packet drops, 
etc.). I haven't managed to get any relevant info on Google or the 
netem list, so any hints/help/pointers are much appreciated.

Thanks in advance,
Dimitrios Miras

