RE: xts fuzz testing and lack of ciphertext stealing support

Pascal Van Leeuwen <pvanleeuwen@xxxxxxxxxxxxxx> · Fri, 19 Jul 2019 20:07:00 +0000

Hi Ard,

> -----Original Message-----
> From: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
> Sent: Friday, July 19, 2019 7:15 PM
> To: Pascal Van Leeuwen <pvanleeuwen@xxxxxxxxxxxxxx>
> Cc: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>; Milan Broz <gmazyland@xxxxxxxxx>; Horia Geanta <horia.geanta@xxxxxxx>; linux-
> crypto@xxxxxxxxxxxxxxx; dm-devel@xxxxxxxxxx
> Subject: Re: xts fuzz testing and lack of ciphertext stealing support
> 
> On Fri, 19 Jul 2019 at 09:29, Pascal Van Leeuwen
> <pvanleeuwen@xxxxxxxxxxxxxx> wrote:
> >
> > > -----Original Message-----
> > > From: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
> > > Sent: Friday, July 19, 2019 7:35 AM
> > > To: Pascal Van Leeuwen <pvanleeuwen@xxxxxxxxxxxxxx>
> > > Cc: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>; Milan Broz <gmazyland@xxxxxxxxx>; Horia Geanta <horia.geanta@xxxxxxx>; linux-
> > > crypto@xxxxxxxxxxxxxxx; dm-devel@xxxxxxxxxx
> > > Subject: Re: xts fuzz testing and lack of ciphertext stealing support
> >  >
> > > I would argue that these cases are diametrically opposite: you
> > > proposed to remove support for zero length input vectors from the
> > > entire crypto API to prevent your driver from having to deal with
> > > inputs that the hardware cannot handle.
> > >
> > I did not propose any such thing - I just proposed to make zero length hash support *optional*
> > (i.e. don't fail and disable the driver on it) as it's totally irrelevant for 99.99999% of use cases.
> > (including *all* use cases I consider relevant for HW acceleration)
> >
> 
> Fair enough. But it did involve making modifications to the generic
> layer, since there are known users of the AF_ALG interface that may
> pass zero length inputs (e.g., sha1sum).
> 
Which is why I gave up and grudgingly implemented those workarounds ;-)

But they complicate the driver by orders of a magnitude and for that the
same validation argument you give below applies: there's now a lot more
- and worse: a lot more complicated! - code to validate in this driver.

While I seriously question the value of that contribution as well: I don't
expect this driver to ever be used for these corner cases, as you would 
simply never *select* it for the applications that might hit them (or, if
you try it an run into trouble, just select another implementation). So 
its really just there to keep testmgr happy and nothing else (IMHO).

Also: hardware may have a lot more "problematic" limitations that are not
currently hit by testmgr. The limitations that are hit are kind of arbitrary.
(and I'm not going to shoot myself in the foot by going into detail there ;-)

> > > I am proposing not to add support for cases that we have no need for.
> > >
> > While you are proposing to stick with an implementation that can only deal with 6.25% (1/16th) of
> > *legal* input data for XTS and fails on the remaining 93.75%. That's hardly a corner case anymore.
> >
> 
> I never said it was a corner case, nor does it make a lot of sense to
> reason about fractional compliance, given that 100% of the inputs we
> ever encounter are covered by your 6.25% of legal input data.
> 
> What i did say was that the moving parts we will add to the code will
> never be put into motion, while they do increase the validation space,
> and so the value of the contribution will be negative.
> 
How can you be so sure that it will never be used?

> Perhaps I should emphasize that my concern is mainly about in-kernel
> usage of the sync software ciphers, since they typically have no use
> for userland, given that they can simply issue the same instructions
> directly. For AF_ALG, I agree that exposing a non-compliant XTS
> implementation is a bad idea.
> 
And my concern is that I want to accelerate xts but I will now have to 
cripple my driver to return -EINVAL on non-multiple-of-16 inputs while
my hardware can actually handle that just fine, just because the actual
"reference" implementation is not compliant. And if I don't match its
(incorrect!) behavior *I* will be the one being failed by testmgr.
Which seems rather unfair :-(

Note that I'm not just interested in providing support to existing 
implementations, I may want to support additional features to provide
to my customers that they can use from their own drivers/applications
they build on top of it (which do not exist yet at this moment)

> > > XTS without CTS is indistinguishable from XTS with CTS if the inputs
> > > are always a multiple of the block size, and in 12 years, nobody has
> > > ever raised the issue that our support is limited to that. So what
> > > problem are we fixing by changing this? dm-crypt does not care,
> > > fscrypt does not care, userland does not care (given that it does not
> > > work today and we are only finding out now due to some fuzz test
> > > failing on CAAM)
> > >
> > If it's not supported, then it cannot be used. Most people would not start complaining about that,
> > they would just roll their own locally or they'd give up and/or use something else.
> > So the fact that it's currently not being used does not mean a whole lot. Also, it does not mean
> > that there will not be a relevant use case tomorrow. (and I can assure you there *are* definitely
> > real-life use cases, so I have to assume these are currently handled outside of the base kernel)
> >
> > In any case, if you try to use XTS you would *expect* it to work for any input >= 16 bytes as that's
> > how the algorithm is *specified*. Without the CTS part it's simply not XTS.
> >
> 
> I really don't care what we call it. My point is that we don't need
> functionality that we will not use, regardless of how it is called.
> 
If you make an implementation that is considered (e.g. by testmgr for fuzz testing)
to be the golden reference, then it'd better be fully compliant with the relevant
specification(s). I guess that's the real point I'm trying to make. 
Now a fully compliant implementation would get penalized for being fully compliant.

> > > > I pretty much made the same argument about all these driver workarounds
> > > > slowing down my driver fast path but that was considered a non-issue.
> > > >
> > > > In this particular case, it should not need to be more than:
> > > >
> > > > if (unlikely(size & 15)) {
> > > >   xts_with_partial_last_block();
> > > > } else {
> > > >   xts_with_only_full_blocks();
> > > > }
> > > >
> > >
> > > Of course. But why add this at all if it is known to be dead code?
> > >
> > But that's just an assumption and assumptions are the root of all evil ;-)
> >
> 
> I think it was premature optimization that is the root of all evil, no?

You are talking to a guy who used to prefer to use hand-optimized assembly 
for *everything* :-D There's no such thing as time wasted on optimization!
(if it wasn't useful, at least it was fun to do! :-P)

Regards,
Pascal van Leeuwen
Silicon IP Architect, Multi-Protocol Engines @ Verimatrix
www.insidesecure.com