Re: [Last-Call] Iotdir telechat review of draft-ietf-babel-rtt-extension-05

Juliusz Chroboczek <jch@xxxxxxx> · Sat, 17 Feb 2024 20:22:17 +0100

Thanks for the lengthy review, Pascal.

> Missing ref to updated document

Right, I'll fix that.

> an applicability statement of the various possibilities would be useful
> in the future.  Could be a paper or an RFC.  At least it would make
> sense to have an applicability section here.  For instance, IoT may
> experience large and asymmetric delays

Section 1 describes the conditions in which we know the protocol to be
applicable: cross-continent overlay networks.  At this time, we are not
proposing this protocol for IoT-style applications, which have completely
different time-scales than cross-continent overlay networks.

We encourage people interested in IoT to borrow from our ideas; we'll take
it as a compliment.

> There's art in IGPs that allow configuring metrics or derive metrics from
> line speed. Is that available in current implementations?

One implementation does that, although with a coarser granularity than
e.g. OSPF.  However, that does not help us here, since we're concerned
about the case where the interfaces are identical: we've got one tunnel to
Lille and one to Tokyo, and there's no local information to distinguish
the two.

> Note that for the given example speed of light will certainly have
> measurable effects. But going to Orleans and back may be hidden inside
> e.g., wireless delays.

Yes, Orléans is at 500µs from Paris, way below the recommended value of
rtt-min (Section 4.2), and will therefore be classified as a local link.

> I'm effectively concerned with the effect of buffer bloats which could
> create oscillations exactly like early ARPANET load-based metric.

So are we, and that's what the whole of Section 4 is about.

>    We believe that this protocol may be useful in other situations...

> Not sure we want that text. Highly debatable until experimented with, see
> current experimentations of ARVR on Wi-Fi which suffer from variable lags.

I'll remove this whole paragraph; it appears to cause only confusion.

>   A Babel speaker periodically sends Hello messages to its neighbours
>   (Section 3.4.1 of [RFC8966]).  Additionally, it occasionally sends a
>   set of IHU messages, at most one per neighbour (Section 3.4.2 of
>   [RFC8966]).

> define IHU on first use

Will do.

> explain what it is for vs Hello

The bit that you quoted explicitly references Section 3.4.2 of RFC 8966.
Are you suggesting that we need to repeat the contents of RFC 8966 here?
Please clarify.

> Ref IEEE 1588? there are many profiles for it; maybe this work could
> show as one.

I'm not sure what you're suggesting exactly.  Please clarify.

> Important to indicate which time stamps are used (eg where in the stack is t1
> measured). Do we measure the latency inside the sender meaning that the time
> stamp is that of the software above, or do we measure starting at MAC enqueue,
> or starting at PHY XMIT?

The implementation note in Section 3.4 recommends timestamping just before
the call to sendmsg.  I'll see if I can add some normative language to
this effect.

> For short distance / high precision as claimed in the introduction,

There is no such claim in the introduction.  The paragraph that confused
you was only meant to point at potentially interesting further research;
I'll remove it.

>  Do we need a sequence counter to filter out bloated IHU answers that
> are received out of sync?

No, the Origin Timestamp field avoids the ambiguity.
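
To illustrate, here is a minimal sketch of how an RTT sample could be
computed, assuming the four-timestamp scheme (t1 and t2 on A's clock, t1'
and t2' on B's clock) and assuming 32-bit microsecond timestamps that wrap
around.  The function and parameter names are hypothetical.  Because the
IHU echoes t1 itself, a delayed or reordered IHU is still paired with the
Hello it answers.

```python
MOD = 1 << 32  # assumption: timestamps are 32-bit microsecond counters


def rtt_sample(t1_origin, t1p_receive, t2p_transmit, t2_local):
    """Return an RTT sample in microseconds.

    t1_origin    -- Origin Timestamp echoed back in the IHU (A's clock)
    t1p_receive  -- time B received A's Hello (B's clock)
    t2p_transmit -- time B sent the packet carrying the IHU (B's clock)
    t2_local     -- time A received that packet (A's clock)
    """
    total = (t2_local - t1_origin) % MOD        # elapsed on A's clock
    remote = (t2p_transmit - t1p_receive) % MOD  # time spent inside B
    return total - remote


# B held the Hello for 2 s; the network round trip took 30 ms.
print(rtt_sample(1_000, 500_000, 2_500_000, 2_031_000))
```

Only differences taken on a single clock are used, so A's and B's clocks
do not need to be synchronised.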

>    In principle, this algorithm is inaccurate in the presence of clock
>    drift (i.e., when A's and B's clocks are running at different
>    frequencies).  However, t2' - t1' is usually on the order of seconds,
>    and significant clock drift is unlikely to happen at that time scale.

> back to applicability of the work. I believe some expectations on the clock
> drift vs RTT can be made for modern hardware. Nodes have an idea of which clock
> they use and what drift they have. The draft could recommend that the clocking
> error be 2 orders of magnitude less than the RTTs that the protocol measures,
> else the measurement cannot be trusted.

With the default parameters used by Babel, the time between Hello and IHU
is 2s on average.  A cheap crystal oscillator, such as used in consumer
electronics, has a typical drift of 10ppm (30ppm worst case), leading to an
error of 20µs (60µs worst case).
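
The arithmetic, as a small sketch (the function name is made up for
illustration): a drift of d ppm over an interval of T seconds accumulates
T * d microseconds of error.

```python
def drift_error_us(interval_s, drift_ppm):
    """Clock error in microseconds accumulated over interval_s seconds."""
    # interval_s * (drift_ppm * 1e-6) seconds == interval_s * drift_ppm µs
    return interval_s * drift_ppm


typical = drift_error_us(2, 10)  # 2 s between Hello and IHU, 10 ppm
worst = drift_error_us(2, 30)    # same interval, 30 ppm
print(typical, worst)
```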

I also don't see where the "two orders of magnitude" figure comes from.
The goal of this protocol is to disambiguate between local and distant
routes, not to accurately determine the physical properties of links.

>    When a Hello TLV is buffered for transmission, we insert a PadN sub-
>    TLV (Section 4.7.2 of [RFC8966]) with a length of 4 octets within the
>    TLV.  When the packet is ready to be sent, we check whether it
>    contains a 4-octet PadN sub-TLV; we then overwrite the PadN sub-TLV
>    with a Timestamp sub-TLV with the current time, and send out the
>    packet.
> 
> hardware will not do that.

Granted, it probably won't.
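
For reference, a rough software sketch of the placeholder technique
quoted above.  It assumes sub-TLV type 1 for PadN (RFC 8966), type 3 for
the Timestamp sub-TLV, and a 32-bit microsecond timestamp; the helper
names are hypothetical.  As a simplification, it remembers the
placeholder's offset instead of scanning the packet, and it does not
update the enclosing Hello TLV's length.

```python
import struct
import time

SUBTLV_PADN = 1       # PadN sub-TLV type (RFC 8966)
SUBTLV_TIMESTAMP = 3  # assumed Timestamp sub-TLV type


def reserve_timestamp(buf):
    """Append a PadN sub-TLV with a 4-octet body; return its offset."""
    offset = len(buf)
    buf += bytes([SUBTLV_PADN, 4, 0, 0, 0, 0])
    return offset


def fill_timestamp(buf, offset, now_us=None):
    """Overwrite the PadN placeholder with a Timestamp sub-TLV.

    Meant to be called just before the packet is handed to sendmsg.
    Returns False if no 4-octet PadN is found at offset.
    """
    if buf[offset] != SUBTLV_PADN or buf[offset + 1] != 4:
        return False
    if now_us is None:
        now_us = time.monotonic_ns() // 1000
    # Type, Length, then the timestamp in network byte order.
    struct.pack_into("!BBI", buf, offset, SUBTLV_TIMESTAMP, 4,
                     now_us % (1 << 32))
    return True


packet = bytearray()
off = reserve_timestamp(packet)
fill_timestamp(packet, off)
```

The PadN and Timestamp sub-TLVs have the same size, so the overwrite
never changes the packet length.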

> Back to my earlier question of which step in the stack is relevant for
> this measurement. Surely any step that is dependent on the load of this
> system (variable but independent of the link being used) as opposed to
> the load to the transmission should be omitted.

So if a router is loaded, we'll get an extra 1ms jitter.  This is not
likely to impact route selection, and even if it does, it will merely
cause the protocol to route around overloaded routers.

>    Second, using the RTT signal for route selection gives rise to a
>    negative feedback loop: when a route has a low RTT, it is deemed to
>    be more desirable, which causes it to be used for more data traffic,
>    which may lead to congestion, which in turn increases the RTT.
>    Without some form of hysteresis, using RTT for route selection would
>    lead to oscillations between parallel routes, which might lead to
>    packet reordering and negatively affect upper-layer protocols (such
>    as TCP).
> 
> I believe this discussion should be seen earlier in the text, eg in the
> introduction (not the solution but at least that the issue exists and is
> addressed in the protocol). See my early comment on ARPANET.

I most respectfully disagree.  This document is structured in two parts,
a first part that defines a subprotocol that produces a continuous stream
of RTT samples, and a second part that describes an algorithm to extract
from that stream information that is useful for route selection.

> 4.3.  Hysteresis
> 
>    Even after applying a bounded mapping from smoothed RTT to a cost
>    value, the cost may fluctuate when a link's RTT is between rtt-min
>    and rtt-max.  This is effectively mitigated by using a robust
>    hysteresis algorithm, such as the one described in Appendix A.3 of
>    [RFC8966].
> 
> if this is what solves the oscillation issue please mention it,

No, it's more complex than that.  There are three distinct mechanisms
that collaborate to avoid oscillations.  The smoothing in Section 4.1
avoids oscillations due to outliers.  The non-linear mapping from RTT to
cost described in Section 4.2 avoids oscillations for good links (below
rtt-min) and for bad links (above rtt-max).  Hysteresis is a last-resort
mechanism that mitigates the issue for links between rtt-min and rtt-max.
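
A minimal sketch of the first two mechanisms, to make the division of
labour concrete.  The parameter values (smoothing factor, rtt-min,
rtt-max, maximum penalty) are illustrative assumptions rather than values
taken from this message, the function names are hypothetical, and the
hysteresis step (Appendix A.3 of RFC 8966) is left out.

```python
def smooth(srtt, sample, alpha=0.8):
    """Exponential smoothing: a single outlier only moves srtt slightly."""
    if srtt is None:
        return sample
    return alpha * srtt + (1 - alpha) * sample


def rtt_cost(srtt_us, rtt_min=10_000, rtt_max=120_000, max_penalty=150):
    """Bounded mapping from smoothed RTT (µs) to an additive cost.

    Constant below rtt_min and above rtt_max, so good links and bad
    links cannot oscillate; only the range in between varies.
    """
    if srtt_us <= rtt_min:
        return 0
    if srtt_us >= rtt_max:
        return max_penalty
    return max_penalty * (srtt_us - rtt_min) / (rtt_max - rtt_min)


print(rtt_cost(500))      # local link, e.g. 500 µs
print(rtt_cost(65_000))   # intermediate link
print(rtt_cost(200_000))  # distant link
```

Only links whose smoothed RTT falls between the two thresholds produce a
varying cost, and that is the range hysteresis has to handle.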

I've just re-read Section 4, and I think it's clear enough.  Please let me
know if you have suggestions to make it better.

> Maybe discuss the consequences of a MIM that modifies the values eg to
> discourage Paris to Paris and cause routing via Tokyo?

If you're not using cryptographic signatures, then a MITM has easier ways
to redirect traffic.  See Section 6 of RFC 8966.

-- Juliusz
