Re: [Last-Call] Opsdir last call review of draft-ietf-v6ops-slaac-renum-03

Hi, Jürgen,

Thanks a lot for your comments! In-line....

On 9/9/20 19:05, Jürgen Schönwälder via Datatracker wrote:
[....]

Perhaps indicate a bit earlier what unacceptably long means, i.e. we
are talking about days and weeks.

This is a bit subjective. If I'm sitting at my computer doing e.g. video-conferencing (i.e., anything interactive), probably anything over a few minutes would be unacceptable. In the more general case, what's acceptable is a function of how often the problem happens and whether there's any ongoing interactive usage -- and that's still subjective.


The scenarios described read a bit
like somewhat rare events and hence it is useful for the reader to
have an idea what unacceptably long means in such events.

I wonder if adding something like:
"Any definition of what is considered 'acceptable' here would be subjective, and would probably also depend on how often these flash-renumbering events occur, whether the affected hosts are employing any interactive applications, and other parameters. However, one rough estimate would be that hosts should be able to deal with flash-renumbering events with a timeliness similar to that with which they can deal with failing default routers."

would help?


(BTW, I find
the scenario not described at the beginning where a router announces
SLAAC lifetimes that are not synchronized with obtained prefix
lifetimes operationally the more tricky problem since this can lead to
regular failures.)

Fair enough. How about adding this to the bulleted list:

" o A router (e.g., a Customer Edge router) may advertise autoconfiguration prefixes corresponding to prefixes learned via DHCPv6-PD with constant PIO lifetimes that are not synchronized with the DHCPv6-PD lease time (as required in Section 6.3 of [RFC8415]). While this behavior violates the aforementioned requirement from [RFC8415], it is not unusual, particularly when, e.g., DHCPv6-PD is implemented in a different software module than the SLAAC router component."

?
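
To make the synchronization point concrete, here is a minimal sketch (not part of the proposed text) of a router capping its PIO lifetimes at the remaining DHCPv6-PD lease lifetimes. The names `Lease` and `pio_lifetimes` are hypothetical, and the default values are assumed to be the RFC 4861 ones (7 days / 30 days) mentioned elsewhere in this thread:

```python
from dataclasses import dataclass


@dataclass
class Lease:
    """Hypothetical record of a prefix delegated via DHCPv6-PD."""
    acquired_at: float        # time the lease was obtained (seconds)
    preferred_lifetime: int   # seconds, as received from the server
    valid_lifetime: int       # seconds, as received from the server


def pio_lifetimes(lease, now, default_preferred=604800, default_valid=2592000):
    """Return (preferred, valid) lifetimes to advertise in the PIO.

    The advertised values never exceed what is left of the lease,
    rather than being constants that ignore the lease time.
    """
    elapsed = max(0, int(now - lease.acquired_at))
    remaining_preferred = max(0, lease.preferred_lifetime - elapsed)
    remaining_valid = max(0, lease.valid_lifetime - elapsed)
    valid = min(default_valid, remaining_valid)
    # The preferred lifetime must not exceed the valid lifetime.
    preferred = min(default_preferred, remaining_preferred, valid)
    return preferred, valid
```

For instance, with a lease of 3600/7200 seconds obtained at time 0, calling `pio_lifetimes(lease, now=1800.0)` would advertise the remaining 1800/5400 seconds instead of the 7-day/30-day constants.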



Section 2.2 seems to confuse soft-state (this is what a learned IPv6
prefix is for me) with certain protocol timers. There are many places
where protocols use soft-state and implementations use timers to purge
or refresh soft-state. That timers generally do not go off in normal
conditions is not really correct in this context, DHCP leases are
renewed when their lifetime expires, a normal operation.

Normally, you renew the lease before the lease expires.


IP address
mappings to Ethernet addresses expire when their lifetime timer goes
off.

This one is not necessarily the best example ;-) (while RFC1122 requires that, IIRC in many implementations the entry is refreshed when referenced, and it only expires when not referenced/refreshed frequently enough).

But I do see where you are going and I realize that the text is a bit sloppy in this respect. How about tweaking the text as follows:

---- cut here ----
   Many protocols, from different layers, normally employ timers for
   fault isolation/recovery. The general logic is as follows:

   o  A timer is set with a value such that, under normal conditions,
      the timer does *not* go off.

   o  Whenever a fault condition arises, the timer goes off, and the
      protocol can perform fault recovery.

For example, when implementing reliability mechanisms, a timer is normally set when a packet is transmitted and, unless a response is received before the timer goes off, a fault recovery action (such as packet re-transmission) is triggered.
---- cut here ----

?
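
As a rough illustration of the fault-recovery use of timers described in the text above (not meant for the document itself), consider the following sketch. The function `send_reliably` and its `transmit` / `wait_for_reply` callables are hypothetical, and doubling the timeout on each retry is just an assumption made for the example:

```python
def send_reliably(transmit, wait_for_reply, timeout=1.0, max_attempts=3):
    """Transmit a packet and retransmit it whenever the timer goes off.

    transmit():              sends (or re-sends) the packet.
    wait_for_reply(timeout): returns the reply, or None if the timer
                             expired before any reply arrived.
    Returns (reply, number_of_attempts).
    """
    for attempt in range(1, max_attempts + 1):
        transmit()
        reply = wait_for_reply(timeout)
        if reply is not None:
            # Normal condition: the timer did not go off.
            return reply, attempt
        # Fault condition: the timer went off, so perform recovery
        # (here, retransmission with a longer timeout).
        timeout *= 2
    raise TimeoutError(f"no reply after {max_attempts} attempts")
```

Under normal conditions the first reply arrives in time and the recovery branch is never taken, which is the sense in which the timer "does not go off".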

One might also look at this same issue as the timer implying a sensible period of time within which information should be refreshed, as you correctly point out, though.

(I guess the only difference is that when looking at this from the soft-state angle, you're mostly considering the case where information changes, whereas when looking at this from the fault-recovery pov, you're mostly thinking about failures, rather than updates).


Switches purge forwarding state regularly when forwarding entries
expire. Cached DNS name to IP resolutions expire. The only problem
here seems to be that a lifetime of 7 days / 30 days is a bit
ridiculous.

Agreed.


Is anyone shipping the RFC 4861 defaults?

Yes, unfortunately. Some implementations override the RFC4861 defaults. Still, RFC4861 defaults are extremely common and widespread.



The few
implementations I have seen do use a bit more reasonable defaults.  I
think this section should be rewritten to replace the "timer going off
is associated with a failure" text with a discussion of soft-state in
other protocols. (Section 2.2 is why I ticked 'has issues'.)

As a second alternative to what I've suggested above:

---- cut here ----
   Many protocols, from different layers, normally employ timers for a
   variety of purposes, such as in fault isolation/recovery mechanisms,
   and in the maintenance of data structures that contain bindings of
   some sort (e.g., the IPv6 Neighbor Cache [RFC4861]).

   In the case of fault recovery/isolation, the general logic is as
   follows:

   o  A timer is set with a value such that, under normal conditions,
      the timer does *not* go off.

   o  Whenever a fault condition arises, the timer goes off, and the
      protocol can perform fault recovery.

   For example, when implementing reliability mechanisms, a timer is
   normally set when a packet is transmitted and, unless a response is
   received before the timer goes off, a fault recovery action (such as
   packet re-transmission) is triggered.

   On the other hand, when maintaining bindings in data structures,
   timers are usually selected such that any bindings that become
   stale are updated in a timely manner.
---- cut here ----


?
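
For the second use of timers (maintenance of bindings), a minimal sketch of the idea could look as follows. `BindingCache` is a hypothetical, simplified structure and not how any particular Neighbor Cache is implemented; the lifetime value is whatever the caller picks:

```python
class BindingCache:
    """Soft-state table: bindings vanish unless re-learned in time."""

    def __init__(self, lifetime):
        self.lifetime = lifetime   # seconds a binding stays usable
        self._entries = {}         # key -> (value, expiry time)

    def learn(self, key, value, now):
        # Learning (or re-learning) a binding restarts its timer.
        self._entries[key] = (value, now + self.lifetime)

    def lookup(self, key, now):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            # The timer went off: purge the stale binding.
            del self._entries[key]
            return None
        return value
```

The shorter the lifetime, the sooner a stale binding is purged, which is why lifetimes of days or weeks leave stale information around for an unacceptably long time.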



Isn't a part of the solution (other than moving to less ridiculous
default) that SLAAC hosts experiencing connectivity problems should
try to validate the prefix that they have learned (and if the
validation fails move to a newly learned prefix)?

Yes, indeed. That's what we are pursuing in draft-ietf-6man-slaac-renum (see Section 4 of this document, draft-ietf-v6ops-slaac-renum-03).

draft-ietf-v6ops-slaac-renum-03 contains the problem statement and *operational* mitigations only.


Involving the hosts
in a resolution of the problem may be more robust than expecting that
something in the network takes care of invalidating stale soft-state.

I agree 100%. That is and has been, indeed, the motivation for pursuing draft-ietf-6man-slaac-renum.

Thanks!

Regards,
--
Fernando Gont
SI6 Networks
e-mail: fgont@xxxxxxxxxxxxxxx
PGP Fingerprint: 6666 31C6 D484 63B2 8FB1 E3C4 AE25 0D55 1D4E 7492




--
last-call mailing list
last-call@xxxxxxxx
https://www.ietf.org/mailman/listinfo/last-call



