On Tue, 26 Sep 2006 17:55:48 -0700 Ben Woodard <woodard@xxxxxxxxxx> wrote:

> Here at LLNL we have a rather challenging network environment on our
> clusters. We basically have thousands of gigE links attached to an
> oversubscribed federated network. Most of the time this network is idle,
> but the expected workload is for regular spikes of extremely heavy
> activity lasting a few minutes. All end-points, in a highly coordinated
> manner, typically after exiting an MPI barrier, start pushing as much
> data as possible through the oversubscribed core. The result is a wave
> of TCP back-offs where all the TCP streams back off in lock step. The
> network oscillates from highly congested for brief moments to largely
> idle. Given enough time TCP will settle down into something mostly
> reasonable, but even then it causes us a few problems:

How far apart are the flood points? Are the connections correctly going
back to slow start?

Why not just set the cwnd clamp for that path to be low enough to avoid
excessive greediness? The clamp is per TCP connection, so if you have
application-specific knowledge you could just set the limit to:

	Bandwidth Delay Product / N connections = cwnd limit

Probably add 10% to allow for some settling.

Also, perhaps you are seeing the effect of an older kernel and a buggy
version of BIC?

-- 
Stephen Hemminger <shemminger@xxxxxxxx>
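
The clamp arithmetic above can be sketched as follows. This is only an illustration of the suggested formula; the bandwidth, RTT, MSS, and connection-count values are hypothetical, not measurements from the LLNL clusters:

```python
# Sketch of the suggested per-connection cwnd clamp:
#   cwnd limit = Bandwidth-Delay Product / N connections, plus ~10% headroom.
# All numeric inputs below are hypothetical examples.

def cwnd_clamp_segments(bandwidth_bps, rtt_s, mss_bytes, n_connections,
                        headroom=0.10):
    """Return a per-connection cwnd clamp in MSS-sized segments."""
    bdp_bytes = bandwidth_bps / 8 * rtt_s          # bandwidth-delay product
    per_conn_bytes = bdp_bytes / n_connections     # fair share per stream
    segments = per_conn_bytes / mss_bytes
    return max(2, int(segments * (1 + headroom)))  # never clamp below 2 MSS

# Example: 1 Gbit/s shared path, 10 ms RTT, 1448-byte MSS, 100 streams.
print(cwnd_clamp_segments(1_000_000_000, 0.010, 1448, 100))  # -> 9
```

The resulting value could then be applied per destination with an iproute2 route metric along the lines of `ip route change <prefix> ... cwnd lock <N>` (syntax from iproute2; check your version's documentation).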