Hi everyone, I know this is not the right place to discuss this, but I assume some people here might have good ideas which could help me. Also, I don't really know where else to turn ..

I'm writing a TCP rate control implementation for Linux at the moment. For people not familiar with rate control: it basically works by manipulating the TCP window size to force the sender not to exceed the bandwidth you would like a particular connection to have. The nice thing about it is that it works without throwing packets away; you just "tell" the sender how fast you would like it to go.

For the window size calculation the round-trip time needs to be known: the advertised window is chosen as roughly rate * RTT, since the sender can have at most one window of data in flight per round trip (a small sketch of this calculation follows the bit diagram further down). One approach would be to remember the time a segment passed and to calculate the difference when it is acknowledged. In order not to have to remember many sequence number/time pairs, usually only one RTT sample per window is taken. This works fine for low packet rates (small windows); for high rates the estimated RTT may be seriously wrong.

RFC1323 provides a solution for this, the TCP timestamp option: the sender puts a 32-bit timestamp in the TCP header, and the receiver echoes this field in its acknowledgement. The sender just has to calculate the difference to get the RTT. This can be done with every packet sent, without storing additional data.

The problem arises if you want to calculate the RTT using timestamps from a man-in-the-middle position. The timestamps themselves are meaningless; you can't know how the sender chose them. One could remember all timestamps and the times they passed, and calculate the difference when each one is echoed back by the receiver, but again this would mean storing many timestamp/receive-time pairs. Another way would be to replace them with your own timestamps, but this would prevent the real sender from performing accurate RTT estimation.

A solution could work like this: RFC1323 specifies that the sender's timestamp clock should increase by one every 1 ms to 1 s. This means the low 16 bits will wrap every ~1.1 minutes to ~18 hours. We could just remember the high 16 bits and replace them with a 16-bit timestamp of our own. On reception of an echoed timestamp, we calculate the difference, put the original 16 bits back in, and pass it on.

The problem with this is that timestamps are not only used by the sender to calculate the RTT but also by the receiver for PAWS (protection against wrapped sequence numbers). From RFC1323:

"PAWS uses the same TCP Timestamps option as the RTTM mechanism described earlier, and assumes that every received TCP segment (including data and ACK segments) contains a timestamp SEG.TSval whose values are monotone non-decreasing in time. The basic idea is that a segment can be discarded as an old duplicate if it is received with a timestamp SEG.TSval less than some timestamp recently received on this connection."

This means we have to make sure the resulting timestamp (our 16-bit timestamp plus the original low 16 bits) still has the property of being monotone non-decreasing in time, otherwise PAWS will reject retransmitted segments. The solution I came up with breaks PAWS itself; the protection against wrapped sequence numbers will be gone. This is not really a problem (remember, I need it for rate control), since rate control is usually not done on gigabit backbone routers but on corporate border routers.

It works like this:

Timestamp Option:

       31      16 15        0         31      16 15        0
tsval: [    UH    |    LH    ]  tsecr: [    UH    |    LH    ]

UH means upper half, LH lower half; tsval is the sender's timestamp, tsecr the echoed value.
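To make the window calculation mentioned at the top concrete, here is a toy sketch in C. The function name and units are my own invention, not from any existing implementation:

#include <stdint.h>

/* Advertised window needed to cap a connection at a given rate.
 * The sender can have at most one window of data in flight per
 * round trip, so window [bytes] = rate [bytes/s] * RTT [s].
 * Example: a 1 Mbit/s cap (125000 bytes/s) with a 100 ms RTT
 * gives 125000 * 0.1 = 12500 bytes. */
static uint32_t rate_window(uint32_t rate_bytes_per_s, uint32_t rtt_ms)
{
        return (uint32_t)((uint64_t)rate_bytes_per_s * rtt_ms / 1000);
}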
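And in C terms, the split shown in the diagram could look like this (again just a sketch, the macro names are mine):

#include <stdint.h>

/* Upper/lower 16-bit halves of a 32-bit timestamp value. */
#define TS_UH(ts)       ((uint16_t)((ts) >> 16))
#define TS_LH(ts)       ((uint16_t)((ts) & 0xffff))
/* Recombine an upper half with the sender's untouched lower half. */
#define TS_MAKE(uh, lh) (((uint32_t)(uh) << 16) | (uint16_t)(lh))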
For each direction, three variables need to be kept:

ts.UH        Upper half of the timestamps currently transmitted by the sender
ts.UH.last   ts.UH before the last LH wraparound
ts.wrap      Time the wraparound occurred

On reception of a timestamp the following is done (in pseudo C code):

/* tsval handling */
if (!ts.UH)
        ts.UH = tsval.UH;       /* remember upper half */
if (tsval.UH != ts.UH) {        /* low 16 bits wrapped */
        ts.wrap = now;
        ts.UH.last = ts.UH;
        ts.UH = tsval.UH;
}
tsval.UH = now;                 /* put in our timestamp */
if (now == ts.wrap)
        tsval.UH++;             /* increment UH to reflect the LH wraparound */

/* tsecr handling */
rtt = now - tsecr.UH;           /* our stamp went out in UH */
if (tsecr.UH < ts.wrap)
        UH = ts.UH.last;        /* stamped before the LH wraparound: put back the old UH */
else
        UH = ts.UH;             /* current UH otherwise */

This seems to keep the timestamp values seen by the receiver non-decreasing. The remaining problem is "Outdated Timestamps". From RFC1323:

"If a connection remains idle long enough for the timestamp clock of the other TCP to wrap its sign bit, then the value saved in TS.Recent will become too old; as a result, the PAWS mechanism will cause all subsequent segments to be rejected, freezing the connection (until the timestamp clock wraps its sign bit again). With the chosen range of timestamp clock frequencies (1 sec to 1 ms), the time to wrap the sign bit will be between 24.8 days and 24800 days."

A TCP usually takes care of this (wraparound after 24.8 days minimum), but that will no longer be true here: if we choose our timestamp clock to tick once every 1 ms, the sign bit of our 16-bit field wraps after only ~33 seconds (2^15 ms), and even with a 10 ms tick it would be just ~5.5 minutes. I'm not sure what to do about this (which is why I'm writing): does anyone here have good ideas? I would also be happy about a completely different approach; something totally passive would be nice .. :)

Thanks (for the time you spent reading all the way down here :)

Patrick
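PS: in case the pseudo code above is too terse, here is the same logic as a self-contained C sketch. All names are invented for illustration, our clock is assumed to tick once per millisecond and to fit in 16 bits, and wraparound of our own clock is ignored for brevity:

#include <stdint.h>

/* Per-direction rewriting state. */
struct ts_state {
        uint16_t uh;        /* sender's current upper half */
        uint16_t uh_last;   /* upper half before the last LH wraparound */
        uint16_t wrap;      /* our clock value when the wraparound was seen */
        int      seen;      /* explicit flag instead of testing ts.UH == 0,
                             * since a legitimate upper half may be zero */
};

/* Outgoing direction: replace the upper half of tsval with our own
 * 16-bit clock "now", remembering the sender's original upper half. */
static uint32_t rewrite_tsval(struct ts_state *s, uint32_t tsval, uint16_t now)
{
        uint16_t uh = (uint16_t)(tsval >> 16);
        uint16_t new_uh = now;

        if (!s->seen) {
                s->uh = uh;
                s->seen = 1;
        }
        if (uh != s->uh) {          /* low 16 bits wrapped */
                s->wrap = now;
                s->uh_last = s->uh;
                s->uh = uh;
        }
        if (now == s->wrap)         /* keep the rewritten value non-decreasing */
                new_uh++;
        return ((uint32_t)new_uh << 16) | (tsval & 0xffff);
}

/* Return direction: compute the RTT in our clock ticks and restore
 * the sender's original upper half before passing the ACK on. */
static uint32_t restore_tsecr(struct ts_state *s, uint32_t tsecr,
                              uint16_t now, uint16_t *rtt)
{
        uint16_t echoed = (uint16_t)(tsecr >> 16);

        *rtt = (uint16_t)(now - echoed);
        if (echoed < s->wrap)       /* stamped before the LH wraparound */
                return ((uint32_t)s->uh_last << 16) | (tsecr & 0xffff);
        return ((uint32_t)s->uh << 16) | (tsecr & 0xffff);
}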