I've reviewed this document as part of the transport area directorate's ongoing effort to review key IETF documents. These comments were written primarily for the transport area directors, but are copied to the document's authors for their information and to allow them to address any issues raised. The authors should consider this review together with any other last-call comments they receive. Please always CC tsv-dir@xxxxxxxx if you reply to or forward this review. Generally the draft reads very well, and is clear and comprehensible. I had no difficulty understanding the draft overall, although occasionally terms were used before they were properly defined. This isn't really a big deal, but I've highlighted the below. >From a transport area point of view, the main things we're looking for are whether the protocol will be well-behaved, especially from the point of view of congestion control. VRRP does not perform any form of congestion control, but as it is purely a link-local protocol, configured by a network operator to provide redundancy between two routers on the same LAN segment, this is not really an issue. One presumes that a network operator choosing sub-second advertisement intervals knows what he or she is doing, and knows appropriate rates for their local circumstances. Routers provide many ways to shoot yourself in the foot, and this one doesn't seem worth of concern. However, the draft does have a weakness when it comes to congestion. It does not provide any guidance to router vendors as to whether VRRP packets should receive priority treatment when being transmitted. The potentially problematic situation occurs when a router is delivering more packets onto the LAN than can be accomodated, and so a queue builds up in the router. Typical default queuing delays tend to be some generic wide-area RTT (so that a delay-bandwidth product of packets can be queued). Thus packets being transmitted onto the VRRP-protected LAN could see perhaps 100ms or more of queuing delay. If VRRP packets enter such a queue, and the smallest VRRP Advertisement_Interval is configured, the Master_Down_Interval will be between 30 and 40ms. Thus normal queuing delays might cause a VRRP backup to conclude that the master is down, and therefore promote itself to master. Very shortly afterwards, the delayed VRRP packets from the master would arrive, causing a switch back to backup status. However this process can repeat many times per second, causing significant disruption to traffic. My feeling is that, if possible, VRRP packets should be priority-forwarded onto the LAN, mitigating this problem. However, it's not clear this is always possible. At the least, the draft ought to comment on this possible scenario and the risks of very low Advertisement_Interval values in the presence of congestion. It would presumably be possible for a VRRP master to observe such a situation is occurring frequently. Under such circumstances, at the least a good implementation should log that there is a problem. It might also be possible to specify that the master should automatically backoff its Advertisement_Interval value when it observes such thrashing. I'll leave it to the VRRP authors to think over whether that might be desirable or might have unintended consequences. I think the draft can be progressed without any such adaptive mechanism, but the authors may wish to think about it anyway, as it might improve VRRP's robustness. Detailed comments on the spec (many of these are nit-picking, but a few might be substantive): Section 1.2, para 4: "virtual router's IPv4 addresses" - at this point in reading the draft, it wasn't clear to me whether this was any of the addresses of any of the routers comprising the virtual router or whether it was the address being virtualized. This became clear later on in the draft. Section 1.3, para 4: "could be speeded up" - this is awkward English; I'd have written "could be sped up" or "could be made quicker" Section 2.1, para 1: "Backup of an IPvX address(es)" - awkward English - better would be "Backup of an IPvX address or addresses" Section 2.1, bullet 4 "Provide for election ... load balancing". It wasn't really clear to me what was meant by this. Section 2.4, para 1: "Sending either IPvX packets on..." It wasn't clear what this meant - do you mean "Sending either IPv4 or IPv6 packets"? Section 2.5 - is the internet draft referred to now dead? If so, this reference should probably be deleted, and the sentence rephrased appropriately. Section 3, para 1: "Each VRRP virtual router has a single well-known MAC address allocated to it." I understand what this means, but I wonder if this should be scoped to indicate it's true per link? There's some potential for misunderstanding that the same MAC address must be used for the same VRID on other interfaces of the same router, which I don't think is intended. Section 3, para 2: I found this whole paragraph somewhat confusing. It's trying to say too many things in too few sentences. Section 5.2.9, para 2: "associated with" is a rather vague term. Maybe "being backed up by" or something similar would be clearer? Section 6.1. Should the VR MAC address be a parameter? I think it should. Section 6.1: "Accept_Mode" - the explanation here could be clearer. An example would help. Section 6.4.1 - a "Startup event" is not defined. I can guess was is meant, but it might be better to define it. Section 6.4.2 "Broadcast gratuitous ARP request" should probably say "Broadcast gratuitous ARP request on that interface". Section 6.4.3 . "Note: IPv6 Neighbor Solicitations and Neighbor Advertisements should not be dropped when Accept_Mode is False." This would probably be better being stated a few lines earlier, in the two MUST bullets. In any event, it seems like it's a MUST not a SHOULD. Section 6.4.3, top of page 27. I think the actions are incorrect. They are currently: @ Cancel Adver_Timer @ Recompute the Master_Down_Interval @ Set Master_Adver_Interval to Adver Interval contained in the ADVERTISEMENT @ Set Master_Down_Timer to Master_Down_Interval @ Transition to the {Backup} state I think they should be: @ Cancel Adver_Timer @ Set Master_Adver_Interval to Adver Interval contained in the ADVERTISEMENT @ Recompute the Skew_Time @ Recompute the Master_Down_Interval @ Set Master_Down_Timer to Master_Down_Interval @ Transition to the {Backup} state Ie, it should explicitly say recompute Skew_Time, but more importantly, the Master_Adver_Internal needs to be set before computing Master_Down_Interval, otherwise you use the old (obsolete) value. Section 7.1: - MAY verify that "Count IPvX Addrs" and the list of IPvX Address matches the IPvX Address(es) configured for the VRID If the above check fails, the receiver SHOULD log the event and MAY indicate via network management that a misconfiguration was detected. If the packet was not generated by the address owner (Priority does not equal 255 (decimal)), the receiver MUST drop the packet, otherwise continue processing. This combination of MAY, SHOULD and MUST is confusing. Also it's not clear if the second "If" is conditional on the first "If". It seems like you may have a mandatory action that you can't do unless you do an optional action. Section 7.4, para 3: An implementation might choose simplify this for the operator by using the VRRP MAC in the formation of these link local addresses. Should be "might" be a MAY? "VRRP MAC" is not a defined term. I think the right term is "Virtual Router MAC Address". This whole paragraph seemed a little vague. Section 8.1.2: "When a VRRP router restarts or boots, it SHOULD not send any ARP messages with its physical MAC address for the IPv4 address it owns, it should only send ARP messages that include Virtual MAC addresses." How do you ssh to the physical router, if you're not sure which router you'll actually reach? Does this require a separate IPv4 address? "When configuring an interface, VRRP routers should broadcast a gratuitous ARP request containing the virtual router MAC address for each IPv4 address on that interface." Surely a VRRP router only does this when becoming the master? Otherwise backup routers can cause traffic to be blackholed when their interface is configured. Similar text appears in 8.2.2. Section 9.1. I don't know much about FDDI. If you do as described, I assume there's some way to avoid the packet circulating forever? To summarize: I think the draft has a few issues that need to be addressed before publication. Most of the issues are pretty minor; it's up to the IESG as to whether addressing them would require a new last call, but my expection is that it would not. That's all. I hope this is of some help to you. Cheers, Mark _______________________________________________ Ietf@xxxxxxxx https://www.ietf.org/mailman/listinfo/ietf