On 6/19/2017 10:48 PM, Petr Špaček wrote:
>
> On 18.6.2017 20:09, Joe Touch wrote:
>> ...
>> You're asking for 10,000 page specifications and 10MB protocol
> We do not understand each other here. Let me explain what I meant in
> different terms:
>
> For purpose of this example, please imagine that protocol description
> (and sometimes also implementation) is a finite state automaton.
>
> Right now, if an edge in the automaton is not defined as result of
> protocol under-specification, it is up to implementation to decide what
> to do, I agree.

A complete FSM specification, including specific actions for all undefined combinations of state and input, is what I suggest might be 10,000 pages and result in a 10MB implementation.

> i.e. implementation has to guess sender's intent.

The Postel Principle provides guidance here. The implementation isn't supposed to guess the other side's intent - it's supposed to consider (in some sense) any possible intent of the other side and act as safely as possible. This is why I think it helps to refer to Shannon/Weaver, who take intent out of the picture.

> This "guess" part is where I see problems because various
> implementations naturally differ in their "guesses" and this is causing
> problems later on.

The Postel Principle, IMO, is about agnosticism. If there is more than one way to interpret a received message, then consider that before reacting, and react safely (conservatively).

> I argue that default action triggered by "imagined missing edge" in our
> hypothetical protocol automaton should be "report protocol specific error
> to sender".

A missing edge in a protocol spec isn't defined as an error unless the spec says exactly that. So reporting that edge as an error otherwise would be wrong.

> This could be done by adding ~ two paragraphs to protocol specification.
> One defining how error message looks like (this is already present in
> e.g. DNS spec as FORMERR) and second explaining that non-defined inputs
> should be treated as errors.

That would result in a very vulnerable protocol, which is a good reason why protocol specifications do not say this.

The most correct thing a protocol FSM could do when receiving an undefined input is to DO NOTHING. Logging or user reporting might be useful when debugging, but there is no part of the TCP API that specifies that all errors are to be reported to the user.

> Implementations always have to have some
> code to handle erroneous inputs (ala switch { default: } ) as well.
> I cannot see the 10 MB of code here.

You're correct that adding a "simple" default to report all undefined inputs would be easy to code, but then you'd have to consider code to do the reporting, to throttle the reporting when load gets high, etc. Things get complicated quickly.

>
>> implementations that are vulnerable to attack.
> Could you elaborate on that? I still do not see how early error
> reporting invites DoS. In fact I believe that it is improving overall
> situation because smaller set of accepted inputs makes it easier to
> handle corner cases and to avoid crash bugs.

Turn on full error logging in any kernel and watch performance drop. Writing messages has a non-zero cost in cycles, storage, and memory bandwidth. If you report every unexpected message, you're putting up a very large attack surface - I just have to send you a series of unexpected messages repeatedly and your system will grind to a halt.

>>> Nice set of reasons for being strict when receiving messages is
>>> described in the following article:
>>>
>>> "A Patch for Postel's Robustness Principle",
>>> Len Sassaman, Meredith L. Patterson, Sergey Bratus,
>>> 2012 IEEE S&P Journal,
>>> http://langsec.org/papers/postel-patch.pdf
>> I would encourage them to read Shannon/Weaver as well.
> (copied from Message-ID: <3a0a263f-f3b4-1931-3c49-4fa1526658e5@xxxxxxx>)
>> IMO, it's somewhat an extension of Shannon's recognition that
>> communication is about symbol agreement, not semantics or intent.
> We have tried to be liberal in DNS protocol in last 25 years. It
> resulted in humongous mess which in the end forces sender to do multiple
> guesses until he finds meaning of particular "symbol" for that
> particular "receiver".

IMO, you're overdoing the "liberal in what you receive". Liberal means that if it's possibly valid, you should accept it as such. It does not mean that all ambiguous messages need to be handled in the best possible light.

Let's say that DNS says "all domains are lowercase, dash, and dot", and you get uppercase. Sure, maybe that means convert to lowercase. It also means you should not crash if you get some other character that isn't valid - but it also does not mean that you strip invalid characters and provide a response anyway. Maybe strip a trailing CR, LF, NL, etc., but not much else.

So if that's the point, I do agree - it's possible to overdo the Postel Principle, but IMO that usually requires deliberate intent to make a mockery of what is otherwise a generally useful principle.

Joe
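
A minimal sketch of the "lowercase, dash, and dot" example above, not taken from any real DNS implementation. The rule itself is the hypothetical one from the message (real DNS names also allow digits, for instance), and the function name `normalize_name` and the `ALLOWED` set are assumptions made for illustration. It accepts what is plausibly valid (uppercase, a trailing line terminator), and rejects everything else without crashing and without silently repairing the input.

```python
import string
from typing import Optional

# Hypothetical rule from the example above: lowercase letters, dash, and dot only.
ALLOWED = frozenset(string.ascii_lowercase + "-.")

# Fold ASCII uppercase only, so look-alike non-ASCII characters are not
# quietly mapped onto valid letters.
_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def normalize_name(raw: str) -> Optional[str]:
    """Return the normalized name, or None if it cannot be safely accepted."""
    # Liberal: tolerate trailing CR/LF and uppercase letters.
    candidate = raw.rstrip("\r\n").translate(_FOLD)
    # Conservative: do not strip or repair other invalid characters; reject.
    if not candidate or any(ch not in ALLOWED for ch in candidate):
        return None
    return candidate
```

Under these assumptions, `normalize_name("Example.COM\r\n")` returns `"example.com"`, while `normalize_name("exa_mple.com")` returns `None` rather than a "repaired" name. What the caller does with `None` (drop silently, count, or report) is a separate decision and is deliberately left out of the sketch.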