>> To respond to your comment about "insufficient prior outreach to subject
>> matter experts",

We did have this very discussion, David, at my kitchen table, in the
summer of 2018.  We came to the conclusion that since Babel is designed
to be robust against packet loss, drops due to occasional reordering
(such as due to FSTP reconvergence) are not likely to be harmful.  On the
other hand, we were unable to envision a realistic situation in which
reordering would cause systematic drops, of the kind that would harm
Babel's convergence.

You were somewhat doubtful that we could disregard reordering, but what
convinced you was my argument that we could always add an IPsec-style
sliding window without breaking compatibility.

It turns out that we were right to wait for evidence: the reordering due
to WiFi powersave is on the order of multiple milliseconds, far larger
than what can reasonably be handled by a sliding window.  Hence the
mechanism RECOMMENDED in Section 3.1 of the current draft, which does not
have a bound on the maximum amount of reordering that it can handle.

> It's not like this was unforeseeable: it is well known that UDP and IP
> guarantee nothing about packet delivery ordering, so any assumption
> about ordering should immediately prompt additional scrutiny.

As David explained, the issue here is not random reordering, which we are
well aware of.  The phenomenon that this draft aims to handle arises from
a combination of multiple mechanisms at different layers: the handling of
link-local multicast by WiFi powersave, the fact that one Babel
implementation uses a combination of unicast and multicast in fixed
patterns, and the fact that RFC 8967 does not handle reordering.  It is
fairly subtle: it only occurs with babeld, not with BIRD, and only in
a non-default configuration.

With all respect, Kyle, I find it difficult to envision that the problem
could have been foreseen.
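For readers unfamiliar with the IPsec-style sliding window mentioned
above, here is a minimal sketch of the general technique.  It is an
illustration only: the class name, the method name and the window size
are hypothetical, and nothing here is taken from the draft, from RFC 8967
or from any Babel implementation.  The point it shows is that such
a window tolerates reordering only up to a fixed number of packets.

```python
class SlidingWindow:
    """Anti-replay window in the general style used by IPsec.

    Hypothetical illustration; not the mechanism of the draft.
    """

    def __init__(self, size=64):
        self.size = size     # how far behind the highest counter we accept
        self.highest = -1    # highest packet counter accepted so far
        self.bitmap = 0      # bit i set: counter (highest - i) already seen

    def accept(self, counter):
        """Return True if the packet should be accepted."""
        if counter > self.highest:
            # Newer than anything seen: slide the window forward.
            shift = counter - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = counter
            return True
        offset = self.highest - counter
        if offset >= self.size:
            return False     # reordered by more than the window: dropped
        if self.bitmap & (1 << offset):
            return False     # already seen: replay
        self.bitmap |= 1 << offset
        return True
```

A packet that arrives more than `size` counters late is dropped even
though it is legitimate, which is why a bounded window is a poor fit when
the amount of reordering is large.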
> Back to your first point, yes the document does assume that, when treated
> independently, link-local unicast and link-local multicast are generally
> not reordered much in known deployed link layers.
>
> Can you provide a reference for this claim?

Note again that Babel is robust enough to handle occasional drops due to
random reordering.  We do not believe that systematic reordering within
a 5-tuple, with the same ToS value, happens in real networks.

We only have circumstantial evidence: none of our users have reported
issues with RFC 8967 since we implemented Section 3.1.  (Section 3.2 is
implemented in BIRD but not in babeld.)

While I share your anxiety, I am unwilling to make the protocol any more
complex than what is required in order to work well in actual deployments.

-- Juliusz