Re: another testmgr question

Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> · Fri, 24 May 2019 06:24:30 +0100

On Thu, 23 May 2019 at 22:44, Pascal Van Leeuwen
<pvanleeuwen@xxxxxxxxxxxxxxxx> wrote:
>
> > -----Original Message-----
> > From: Eric Biggers [mailto:ebiggers@xxxxxxxxxx]
> > Sent: Thursday, May 23, 2019 10:06 PM
> > To: Pascal Van Leeuwen <pvanleeuwen@xxxxxxxxxxxxxxxx>
> > Cc: linux-crypto@xxxxxxxxxxxxxxx
> > Subject: Re: another testmgr question
> >
> > On Thu, May 23, 2019 at 01:07:25PM +0000, Pascal Van Leeuwen wrote:
> > > Eric,
> > >
> > > I'm running into some trouble with some random vectors that do *zero*
> > > length operations. Now you can go all formal about how the API does
> > > not explicitly disallow this, but how much sense does it really make
> > > to essentially encrypt, hash or authenticate absolutely *nothing*?
> > >
> > > It makes so little sense that we never bothered to support it in any
> > > of our hardware developed over the past two decades ... and no
> > > customer has ever complained about this, to the best of my knowledge.
> > >
> > > Can't you just remove those zero length tests?
> > >
> >
> > For hashes this is absolutely a valid case.  Try this:
> >
> > $ touch file
> > $ sha256sum file
> > e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855  file
> >
> > That shows the SHA-256 digest of the empty message.
> >
> Valid? A totally fabricated case, if you ask me. Yes, you could do that,
> but is it *useful* at all? Really?

Yes, really. The likelihood of a test vector occurring in practice is
entirely irrelevant. What matters is that the test vectors provide
known outputs for known inputs, and many algorithm specifications
explicitly contain the empty message as one of the documented test
vectors.
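
A minimal sketch of such a known-answer check, assuming Python's standard
hashlib module as the implementation under test (that choice is for
illustration only and is not taken from this thread). It compares the
SHA-256 digest of the empty message against the value printed by sha256sum
in Eric's example above:

```python
import hashlib

# Digest of the empty message, as printed by sha256sum in the example above.
EMPTY_SHA256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"


def check_empty_sha256() -> bool:
    """Return True if the implementation matches the known answer for b''."""
    return hashlib.sha256(b"").hexdigest() == EMPTY_SHA256


print(check_empty_sha256())
```

The same pattern applies to any implementation: feed it the documented
input, here zero bytes, and compare against the documented output.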

In fact, given the above, I am slightly shocked that your hardware
does not handle empty messages correctly. Are you sure it is a
hardware problem and not a driver problem?

In any case, as Eric points out as well, nothing is stopping you from
adding a special case to your driver that falls back to the software
implementation for known broken test cases.
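
The shape of such a fallback can be sketched in user space as follows. This
is only an illustration under stated assumptions: hw_sha256() is a
hypothetical stand-in for an engine that cannot process empty input,
driver_sha256() is a hypothetical driver entry point, and hashlib plays the
role of both the engine (for non-empty input) and the software
implementation. None of these names come from a real driver.

```python
import hashlib


class ZeroLengthUnsupported(Exception):
    """Raised by the stand-in engine when asked to hash an empty message."""


def hw_sha256(data: bytes) -> bytes:
    # Hypothetical stand-in for a hardware engine that cannot process
    # empty input. hashlib plays the role of the engine for other lengths.
    if len(data) == 0:
        raise ZeroLengthUnsupported("engine cannot hash an empty message")
    return hashlib.sha256(data).digest()


def driver_sha256(data: bytes) -> bytes:
    # Special-case the input the engine is known to mishandle and route it
    # to the software implementation; everything else goes to the engine.
    if len(data) == 0:
        return hashlib.sha256(b"").digest()
    return hw_sha256(data)


print(driver_sha256(b"").hex())
```

Callers of driver_sha256() see the same result for every input length,
regardless of what the engine underneath can handle.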

Removing test cases from the set because of broken hardware is out of
the question IMO. It doesn't actually fix the problem, and it may
actually result in breakage, especially for h/w accelerated crypto
exposed to userland, of which we have no idea whatsoever how it is
being used, and whether incorrect handling of zero-length vectors is
likely to break anything or not.

> No, it's not, because a file of length 0 is a file of length 0; the length
> in itself is sufficient guarantee of its contents. The hash does not add
> *anything* in this case. It's a constant anyway, the same value for *any*
> zero-length file. It doesn't tell you anything you didn't already know.
> IMHO the tool should just return a message stating "hashing an empty file
> does not make any sense at all ...".
>

You are making assumptions about how the crypto is being used at a
higher level. Eric's example may not make sense to you, but arguing
that *any* use of sha256sum on empty files is guaranteed to be
nonsensical in all imaginable cases is ridiculous.

> >
> > For AEADs it's a valid case too.  You still get an authenticated
> > ciphertext even if the plaintext and/or AAD are empty, telling you that
> > the (plaintext, AAD) pair is authentically from someone with the key.
> >
> Again, you could *theoretically* do that, but I don't know of any *practical*
> use case (protocol, application) where you can have *both* 0 length AAD *and*
> 0 length payload (but do correct me when I'm wrong there!)
> In any case, that would result in a value *only* depending on the key (same
> thing applies to zero-length HMAC), which is likely some sort of security
> risk anyway.
>
> As I mentioned before, we made a lot of hashing and authentication hardware
> over the past 20+ years that has never been capable of doing zero-length
> operations and this has *never* proved to be a problem to *anyone*. Not a
> single support question has *ever* been raised on the subject.
>

Again, that is shocking to me, since it means nobody has ever bothered
to run, e.g., the documented SHA-xxx test vectors on them. Weird.
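
For keyed algorithms, the output for the empty message is not a fixed
constant: as noted in the quoted text above, a zero-length HMAC depends only
on the key. A minimal sketch, assuming Python's standard hmac and hashlib
modules (the key values are arbitrary placeholders chosen for illustration):

```python
import hashlib
import hmac


def hmac_sha256_empty(key: bytes) -> str:
    """HMAC-SHA256 tag of the zero-length message under the given key."""
    return hmac.new(key, b"", hashlib.sha256).hexdigest()


# Two arbitrary placeholder keys.
tag_a = hmac_sha256_empty(b"key-a")
tag_b = hmac_sha256_empty(b"key-b")

# Different keys give different tags for the same empty message, and
# neither tag equals the unkeyed SHA-256 digest of the empty message.
print(tag_a != tag_b)
print(tag_a != hashlib.sha256(b"").hexdigest())
```

A driver therefore cannot return a precomputed value for the empty HMAC
case the way it can for a plain hash; the tag has to be computed from the
key, for example by a software implementation.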

> >
> > It's really only skciphers (length preserving encryption) where it's
> > questionable, since for those an empty input can only map to an empty output.
> >
> True, although that's also the least problematic case to handle.
> Doing nothing at all is not so hard ...
>
> > Regardless of what we do, I think it's really important that the behavior is
> > *consistent*, so users see the same behavior no matter what implementation of
> > the algorithm is used.
> >
> Consistency should only be required for *legal* ranges of input parameters.
> Which then obviously need to be properly specified somewhere.
> It should be fair to put some reasonable restrictions on these inputs so as
> not to burden implementations with potentially difficult to handle fringe cases.
>
> > Allowing empty messages works out naturally for most skcipher
> > implementations, and it also conceptually simplifies the length
> > restrictions of the API (e.g. for most block cipher modes: just need
> > nbytes % blocksize == 0, as opposed to that *and* nbytes != 0).  So that
> > seems to be how we ended up with it.
> >
> I fail to see the huge burden of the extra len>0 restriction.
> Especially if you compare it to the burden of adding all the code for
> handling such useless exception cases to all individual drivers.
>
> > If we do change this, IMO we need to make it the behavior for all
> > implementations, not make it implementation-defined.
> >
> I don't see how disallowing 0 length inputs would affect implementations that
> ARE capable of processing them. Unless you would require them to start
> throwing errors for these cases, which I'm not suggesting.
>
> > Note that it's not necessary that your *hardware* supports empty messages,
> > since you can simply do this in the driver instead:
> >
> >       if (req->cryptlen == 0)
> >               return 0;
> >
> For skciphers, yes, it's not such a problem. Neither for basic hash.
> (And thanks for the code suggestion BTW, this will be a lot more efficient
> than what I'm doing now for this particular case :-)
> For HMAC, however, where you would have to return a value depending on the
> key ... not so easy to solve. I don't have a solution for that yet :-(
>
> And I'm pretty sure this affects all Inside Secure HW drivers in the tree:
> inside-secure, amcc, mediatek and omap ...
>
> Regards,
> Pascal van Leeuwen
> Silicon IP Architect, Multi-Protocol Engines, Inside Secure
> www.insidesecure.com