On 10/13/2015 11:11 PM, A. Rothman wrote:
> Ok, thanks for your analysis and for looking into this (Mark as well).
>
> I shall change my decoder implementation to the lenient interpretation,
> adjust my unit tests, and hope it is considered RFC-compliant by
> everyone :-)

Note that this is a reprise of the UTF-8 "overlong encoding" debate,
where we ended up banning overlong encodings because of the security
issues it posed (see the UTF-8 RFC for more details on the security
issues found).

>
> Amichai
>
> On 10/09/2015 08:08 AM, Viktor Dukhovni wrote:
>> On Thu, Oct 08, 2015 at 09:40:25PM +0300, A. Rothman wrote:
>>
>>> Just in case someone missed it (I almost did): Mark added his own
>>> detailed comments on the test cases, but they got buried within a long
>>> quote from my original email so may have gone unnoticed. To recap, here
>>> are the two interpretations:
>>>
>>> +A-        empty + 6 (unnecessary) padding bits
>>> +AA-       empty + 12 (unnecessary) padding bits
>>> +AAA-      \U+0000, and 2 (required) padding bits
>>> +AAAA-     \U+0000, and 8 (6 extra) padding bits
>>> +AAAAA-    \U+0000, and 14 (12 extra) padding bits
>>> +AAAAAA-   \U+0000\U+0000, and 4 (required) padding bits
>>> +AAAAAAA-  \U+0000\U+0000, and 10 (6 extra) padding bits
>>>
>>>
>>> +A-        illegal !modified base64
>>> +AA-       illegal !a multiple of 16 bits in modified base64
>>> +AAA-      legal   0x0000 (last 2 bits zero)
>>> +AAAA-     illegal !a multiple of 16 bits in modified base64
>>> +AAAAA-    illegal !modified base64
>>> +AAAAAA-   legal   0x0000, 0x0000 (last 4 bits zero)
>>> +AAAAAAA-  illegal !a multiple of 16 bits in modified base64
>>>
>>>
>>> Does anyone else want to vote or comment on the two interpretations above?
>> Thanks for pointing this out more clearly. Yes, they disagree.
>> However, the manner in which they disagree is rather simple.
>>
>> They agree in all the cases where the padding is *minimal*.
>>
>> The first variant always tolerates non-minimal padding, allowing
>> anything less than 16 bits per the specification. The second
>> variant never tolerates non-minimal padding, because there's no
>> need to produce it.
>>
>> It is clear that clients should produce minimal padding, and we
>> seem to disagree on whether to apply Postel's principle to the
>> decoder or not.
>>
>> This is not a major disagreement; such differences of interpretation
>> are endemic whether the standard is clear or not. Many implementors
>> are lazy, and stop writing code when the expected cases work.
>>
>> While this is no excuse for ambiguous specifications, in this case
>> I don't think a revision is warranted. Encoders that generate
>> sensibly minimal padding will not run into any friction with
>> non-broken decoders. Encoders that get creative might find that
>> some decoders object whether the standard allows their creativity
>> or not.
>>
>
>
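
For anyone who wants to compare the two readings quoted above side by
side, here is a rough Python sketch. It is not taken from any existing
decoder: the names split_bits, lenient_accepts and strict_accepts are
made up for illustration, and requiring the leftover bits to be zero in
both variants is my own assumption, not something the thread settled.
The input is only the base64 body between "+" and "-".

```python
# Hypothetical helpers for illustration only; not from any UTF-7 library.
B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def split_bits(b64_chars):
    """Return (utf16_code_units, leftover_bit_count, leftover_bits_are_zero)."""
    value = 0
    nbits = 0
    for ch in b64_chars:
        # str.index raises ValueError for characters outside the alphabet.
        value = (value << 6) | B64_ALPHABET.index(ch)
        nbits += 6
    leftover = nbits % 16          # bits that do not fill a 16-bit unit
    n_units = nbits // 16          # complete 16-bit units
    whole = value >> leftover      # the complete units, packed together
    units = [
        (whole >> (16 * (n_units - 1 - i))) & 0xFFFF for i in range(n_units)
    ]
    tail_is_zero = (value & ((1 << leftover) - 1)) == 0
    return units, leftover, tail_is_zero


def lenient_accepts(b64_chars):
    """First reading: any padding shorter than 16 bits is tolerated."""
    _, _, tail_is_zero = split_bits(b64_chars)
    # Assumption: padding bits must still be zero.
    return tail_is_zero


def strict_accepts(b64_chars):
    """Second reading: only minimal padding (fewer than 6 bits) is legal."""
    _, leftover, tail_is_zero = split_bits(b64_chars)
    return tail_is_zero and leftover < 6


# Walk through the seven test cases from the quoted tables.
for n in range(1, 8):
    body = "A" * n
    units, leftover, _ = split_bits(body)
    print(f"+{body}-", len(units), leftover,
          lenient_accepts(body), strict_accepts(body))
```

With these definitions the lenient check accepts all seven quoted cases,
while the strict check accepts only +AAA- and +AAAAAA-, which is the
split described in the quoted message.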