Hi Gerrit,
In my response to Ian, I proposed a time-based compensation mechanism rather
than a fixed token bucket byte limit. (Please note that I corrected the first
mail.) I think a burst limit of floor(MSS/s) packets is way too conservative,
and any fixed byte-based compensation mechanism is the wrong way to go. My
reasoning is as follows.
TFRC and RFC3448 Section 4.6 allow a sender to achieve an average rate via
packet bursts. This may be necessary if a sender has coarse timer granularity
or otherwise finds it difficult or expensive to send packets at smooth rates,
for instance because high-rate timers are expensive, as you have seen. :)
An application with coarse timer granularity can achieve high rates *ONLY* by
sending bursts of packets. And the higher the rate, the larger the bursts
that are required.
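To put rough numbers on this (the figures below are illustrative assumptions,
not measurements): at X/s = 1000 packets/s the inter-packet interval t_ipi is
1 ms, so a sender whose timer fires only every 10 ms must release roughly 10
packets per tick to sustain the rate; at 10000 packets/s it is roughly 100.
A trivial C sketch of that arithmetic:

    #include <stdio.h>

    int main(void)
    {
            /* Illustrative figures only: a 10 ms timer granularity and
             * two sending rates given in packets per second. */
            double t_gran  = 0.010;                 /* seconds per timer tick */
            double rates[] = { 1000.0, 10000.0 };   /* X/s in packets/s       */

            for (int i = 0; i < 2; i++) {
                    double t_ipi = 1.0 / rates[i];  /* inter-packet interval  */
                    double burst = t_gran / t_ipi;  /* packets per timer tick */
                    printf("X/s = %5.0f pkt/s  ->  ~%.0f packets per tick\n",
                           rates[i], burst);
            }
            return 0;
    }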
Any byte-based/token-bucket compensation mechanism, but ESPECIALLY something
really conservative like limiting burst size to floor(MSS/s), will therefore
penalize fast flows, preventing endpoints from achieving high rates. This is
exactly what the RFC tries to prevent.
The solution is to base the token bucket size on the current rate: the higher
the rate, the higher the token bucket size. This follows the RFC's stated
intent of allowing coarse timer implementations to achieve high rates. It
also makes intuitive sense. If high rates are allowed then medium-sized
bursts are obviously not too much of a problem!
My suggested time-based compensation mechanisms achieve this goal. If t_nom
can be at most N seconds behind t_now, then the token bucket has size
X_inst*N bytes. Of my two suggested values for N, RTT/2 is the more
aggressive; t_gran, or a
small multiple of t_gran, may be safer in practice. It would seem easy to
experiment with these choices.
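To make this concrete, here is a minimal C sketch of the clamp I have in
mind; the function name and the microsecond units are my own illustration,
not taken from the CCID 3 code:

    #include <stdint.h>

    /*
     * Illustrative sketch only, not the actual CCID 3 implementation.
     * All times are in microseconds; n is the chosen bound N, e.g. RTT/2
     * or a small multiple of t_gran.
     */
    static uint64_t clamp_t_nom(uint64_t t_nom, uint64_t t_now, uint64_t n)
    {
            /*
             * t_nom := max(t_nom, t_now - N).  The nominal send time
             * never falls more than N behind the current time, so the
             * accumulated credit is bounded by roughly N/t_ipi packets,
             * i.e. a token bucket of about X_inst * N bytes.
             */
            if (t_now > n && t_nom < t_now - n)
                    t_nom = t_now - n;
            return t_nom;
    }

Note that the clamp only changes anything once t_nom has fallen more than N
behind t_now, so the normal case is untouched.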
(Please note that my first response to Ian had an error. I said t_ipi :=
max(t_ipi, t_now - N) when I meant t_nominal := max(t_nominal, t_now - N).)
Eddie
Gerrit Renker wrote:
On the implementation mailing list we have been experiencing problems with the
TFRC/CCID 3 packet scheduling mechanism; these can be summarised as follows.
The packet scheduling mechanism of [RFC 3448, 4.6] is in principle a simple Token
Bucket Filter: tokens are placed at a rate of 1/t_ipi into the bucket, and each
time the sender finds at least one token in the bucket, a packet can be sent.
Under `normal' conditions, the bucket size of the TBF is equal to 1; in this case,
a continuous stream of packets is always scheduled at the precalculated nominal
sending times t_nom.
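(In code terms, a minimal sketch of this `normal' case could look as follows;
the names are illustrative only and not taken from the implementation:)

    #include <stdint.h>

    /* Bucket size 1: a packet is released once its nominal send time
     * has been reached, and t_nom then advances by t_ipi.  Illustrative
     * names only; times in microseconds. */
    static int can_send_now(uint64_t t_now, uint64_t *t_nom, uint64_t t_ipi)
    {
            if (t_now < *t_nom)
                    return 0;       /* too early: wait until t_nom          */
            *t_nom += t_ipi;        /* token consumed, schedule next packet */
            return 1;
    }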
The following conditions result in a bucket size different from 1 (the last two
were pointed out by Ian in the previous email):
(a) tardiness due to scheduling granularity (as per 4.6 in RFC 3448)
(b) the application is idle for an extended period
(c) the application emits packets at a rate that is small compared to X/s
In these `non-normal' cases, it can happen that the current time t_now lies
several multiples of t_ipi _after_ the scheduled nominal sending time t_nom. This
accrues a burst size of
beta = floor( (t_now - t_nom) / t_ipi ) - 1
packets which the sender is permitted to send immediately.
The problem that we are experiencing is that beta grows unbounded.
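(For concreteness, the same credit in a minimal C sketch; again the names are
illustrative and not taken from the Linux code:)

    #include <stdint.h>

    /* Credit accrued once t_now has run ahead of t_nom; illustrative
     * names only, times in microseconds. */
    static uint64_t burst_credit(uint64_t t_now, uint64_t t_nom, uint64_t t_ipi)
    {
            if (t_now <= t_nom)
                    return 0;                       /* not yet due */
            /* beta = floor((t_now - t_nom) / t_ipi) - 1, which grows
             * without bound the further t_nom falls behind t_now. */
            uint64_t beta = (t_now - t_nom) / t_ipi;
            return beta > 0 ? beta - 1 : 0;
    }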
A previous attempt to fix this problem was to reset the nominal sending time
t_nom whenever such a credit had accumulated: this is equivalent to enforcing a
bucket size of 1. There was disagreement with this solution, since RFC 3448
explicitly permits bursts. But we were then experiencing almost arbitrarily large
burst sizes (note the cases stated in Ian's earlier email); without an upper bound
for beta there is no regulation of the sender behaviour any more.
PROBLEM: What is a reasonable upper bound for the bucket/burst size? Is for instance
floor(MSS/s) (where `s' is the packet size and MSS the path MTU minus the size
of the IP and DCCP headers) a conservative-enough upper bound?