some comments on Linux CryptoAPI - this "atomic" cipher business

Michael Richardson <mcr@xxxxxxxxxxxxxxxxxxxxxx> · Fri, 19 Jul 2002 15:08:30 -0400

-----BEGIN PGP SIGNED MESSAGE-----

  I have been looking at CryptoAPI with the view towards using the CryptoAPI
routines in FreeSWAN 3.X rather than our own.  (see http://www.kerneli.org for
those on the CC and BCC lists. I know that there are certain other
organizations that are supposed to be developing similar APIs, feel free to
forward to those organizations)

  We have several requirements that we must meet:
   1) ability to use multiple processors (SMP support)
   2) ability to use hardware acceleration
   3) ability to separately account for time spent on crypto vs other
      networking.

  Note that the FreeSWAN decrypt code is usually invoked from net_rx_action(),
and this will typically be single-threaded. (Nothing in net_rx_action forces
this [i.e. a lock], but the network bottom half is kicked from the interrupt
handlers to occur on the same CPU as the interrupt, so unless interrupts
ping-pong, networking code tends to stick to a single CPU.)
  Encrypt code is typically invoked from net_tx_action().

  So, for software implemented ciphers, we want to stick all packets that
must be encrypted into a queue and process them in a separate kernel thread,
with a callback at the end. If there is in fact hardware involved, then this
just turns into queueing the packet to the hardware, and kicking a crypto_bh
to invoke the callback when the completion interrupt occurs.
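
  A minimal user-space sketch of the queue-plus-callback shape described
above. All names here (xform_req, xform_queue, xform_worker_drain) are
hypothetical and not part of CryptoAPI, and the XOR step is only a
placeholder for a real cipher. Locking and thread wakeup, which real kernel
code would need, are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request: one packet's worth of work plus a completion callback. */
struct xform_req {
    struct xform_req *next;
    uint8_t *data;
    size_t len;
    void (*done)(struct xform_req *req, void *arg);
    void *arg;
};

/* Simple FIFO of pending requests. */
struct xform_queue {
    struct xform_req *head, *tail;
};

/* Called from the packet path: just queue the work and return. */
void xform_enqueue(struct xform_queue *q, struct xform_req *r)
{
    assert(q != NULL && r != NULL);
    r->next = NULL;
    if (q->tail)
        q->tail->next = r;
    else
        q->head = r;
    q->tail = r;
}

struct xform_req *xform_dequeue(struct xform_queue *q)
{
    struct xform_req *r = q->head;
    if (r) {
        q->head = r->next;
        if (!q->head)
            q->tail = NULL;
    }
    return r;
}

/* Placeholder transform: XOR with a fixed byte. This is NOT a cipher. */
void toy_transform(uint8_t *p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        p[i] ^= 0x5a;
}

/* What the worker thread's loop body would do: drain the queue, transform
 * each request, then invoke its callback. Returns the number processed. */
int xform_worker_drain(struct xform_queue *q)
{
    int n = 0;
    struct xform_req *r;

    while ((r = xform_dequeue(q)) != NULL) {
        toy_transform(r->data, r->len);
        if (r->done)
            r->done(r, r->arg);
        n++;
    }
    return n;
}

/* Example callback: counts completions through its argument. */
void count_done(struct xform_req *r, void *arg)
{
    (void)r;
    ++*(int *)arg;
}
```

  With hardware, xform_worker_drain() would presumably be replaced by handing
the request to the device, with the callback invoked from the completion
path instead.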

  Looking at CryptoAPI is a bit hampered by the fact that I'm not entirely
clear how a piece of hardware is supposed to interface. Yes, it needs to
provide encrypt/decrypt routines directly rather than relying on the
_encrypt/_decrypt simplification. 

  I guess that an implementation should sleep inside the call if it needs to,
except when atomic is set, in which case it must fall back to software. I had
assumed that this might be necessary when doing, for instance, cryptoswap.
Many have talked about lifting this API up and putting a lower-level API
underneath. That's what I'm here to do.

  Looking at the cryptoloop.c file, I don't even see the ATOMIC stuff enabled
by default. So, why all the bother?

  In the case of FreeSWAN with hardware, we do *not* want to sleep. We want a
callback. I am looking at doing this.

  Some more comments:

Naming of ciphers
=================

  I think that there should be a hierarchy of names in which the longest
match wins. Transforms should provide relative weights for binding to the
less specific names.

  Specifically:
	aes-cbc
	aes-cbc/software	(or aes/atomic if you prefer)
	aes-cbc/hardware
	aes-cbc/hardware/hifn/pci/01/00/0		[PCI bus/device/function]
	aes-cbc/hardware/chrysalis/pci/04/06/0
	aes-cbc/hardware/broadcom/csix/5/7/8	[to make something up]

  This permits one to VERY specifically attach to the implementation that one
wants, while still permitting "aes-cbc" to get something useful. This all
would occur at registration and lookup of cipher_implementation time.
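
  One possible reading of the lookup rule, as a sketch only: an exact name
match wins outright (it is the longest match), and otherwise the request is
treated as an ancestor in the hierarchy and the highest-weight
implementation beneath it is chosen. The struct, function names and weights
below are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry entry: full hierarchical name plus a relative weight. */
struct xform_impl {
    const char *name;
    int weight;
};

/* Nonzero if `req` names `full` itself or one of its ancestors, matching
 * only on '/' component boundaries ("aes" does not cover "aes-cbc"). */
int name_covers(const char *req, const char *full)
{
    size_t n = strlen(req);

    if (strncmp(req, full, n) != 0)
        return 0;
    return full[n] == '\0' || full[n] == '/';
}

/* Exact match wins; otherwise pick the heaviest implementation under `req`. */
const struct xform_impl *xform_lookup(const struct xform_impl *tab,
                                      size_t count, const char *req)
{
    const struct xform_impl *best = NULL;

    assert(req != NULL);
    for (size_t i = 0; i < count; i++) {
        if (!name_covers(req, tab[i].name))
            continue;
        if (strcmp(req, tab[i].name) == 0)
            return &tab[i];
        if (!best || tab[i].weight > best->weight)
            best = &tab[i];
    }
    return best;
}

/* Sample registrations using the names from the text; weights are invented. */
const struct xform_impl demo_impls[] = {
    { "aes-cbc/software",                       10 },
    { "aes-cbc/hardware/hifn/pci/01/00/0",      50 },
    { "aes-cbc/hardware/chrysalis/pci/04/06/0", 40 },
};
```

  Here a request for "aes-cbc" resolves to the hifn entry because it carries
the largest weight, while "aes-cbc/software" still pins the software
implementation.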

Hardware vendors
================

  Who is left, btw?
  Chrysalis is out of the cipher chip business, AFAIK.

  Intel's board has basically gone closed source. Ditto for 3Com's. Neither
was a general purpose crypto board; both did un-auditable IPsec. (You never
get a chance to see the output packets to confirm that they were in fact
encrypted with the right key, that the key didn't leak, etc.)

  That leaves Broadcom and HiFn that I know of.
  Are there others? Any of non-US origin?

  I'm still looking for data sheets that require no NDA and are in the public
domain, which could be used as the basis for an open source driver of non-USA
origin. This doesn't have to be for the latest 10Gb/s SPI4.2 CSIX 2 capable
product - 100Mb/s half-duplex boards are still useful to get the APIs right.

The digest/cipher split
=======================

I see that the transform_implementation is subclassed to be
digest_implementation and cipher_implementation. I have some problems with
this. Many pieces of hardware can do both at the same time, and can even do
some of the IPsec ESP checking along the way. 

Further, there is compression. Compression is basically identical to
cryptography. (There is ongoing research on doing both at the same time as
well. A fellow now at Nortel may have succeeded, from what I hear.) Of course
hardware can do all of these things too.

So, I propose that all operations are essentially "encode"/"decode". A
straight digest only ever does "encode". The digest routines follow the
original MD5 libraries with open/update/etc. right there. It isn't clear to
me that this is really a useful interface for a lot of applications, and it
certainly cannot be replaced by hardware.
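
A rough sketch of what a single "encode"/"decode" interface might look like,
with a one-way transform simply leaving decode unset. The xform_ops struct
and both toy transforms are hypothetical stand-ins (an XOR and a one-byte
sum), not real ciphers or digests and not anything CryptoAPI defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical unified interface. Each routine returns the number of bytes
 * written to `out`, or -1 if the output buffer is too small. */
struct xform_ops {
    const char *name;
    long (*encode)(const uint8_t *in, size_t inlen, uint8_t *out, size_t outlen);
    long (*decode)(const uint8_t *in, size_t inlen, uint8_t *out, size_t outlen);
};

/* Toy reversible transform: XOR with a fixed byte (its own inverse). */
long toy_xor(const uint8_t *in, size_t inlen, uint8_t *out, size_t outlen)
{
    if (outlen < inlen)
        return -1;
    for (size_t i = 0; i < inlen; i++)
        out[i] = in[i] ^ 0x5a;
    return (long)inlen;
}

/* Toy one-way transform: one-byte additive sum, standing in for a digest. */
long toy_sum_encode(const uint8_t *in, size_t inlen, uint8_t *out, size_t outlen)
{
    uint8_t sum = 0;

    if (outlen < 1)
        return -1;
    for (size_t i = 0; i < inlen; i++)
        sum = (uint8_t)(sum + in[i]);
    out[0] = sum;
    return 1;
}

const struct xform_ops toy_cipher = { "toy-xor", toy_xor, toy_xor };
const struct xform_ops toy_digest = { "toy-sum", toy_sum_encode, NULL };

/* Dispatch helper: a transform with no decode routine only ever encodes. */
long xform_decode(const struct xform_ops *ops, const uint8_t *in, size_t inlen,
                  uint8_t *out, size_t outlen)
{
    assert(ops != NULL);
    if (!ops->decode)
        return -1;
    return ops->decode(in, inlen, out, outlen);
}
```

A compressor would slot into the same table, with encode shrinking the data
and decode expanding it, which is why the output size has to be passed
explicitly.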

The operation queue
===================

I would propose that all functions take a "struct transform_command *" as
an argument, defined essentially like this:

struct transform_command {
  struct list_head           tc_cmdqueue;
  struct cipher_context     *tc_context;
  transform_unit_callback    tc_callback;
  cipher_usercontext         tc_user;    /* whatever the user callback wants */
  unsigned int               tc_flags;
  const u8                   tc_iv[MAX_IV_SIZE];
  const u8                  *tc_in;  
  u8                        *tc_out;     /* if NULL, use tc_in */
  u8                        *tc_mac;     /* must point to space of tc_macsize */
  size_t                     tc_insize;  /* size of input buffer */
  size_t                     tc_outsize; /* size of output buffer */
  size_t                     tc_macsize; /* size of MAC output buffer */
  size_t                     tc_resultsize;  /* amount of output buffer used */
};

#define TC_FLAGS_GENERATE_IV   (1<<0)    /* if set, then IV must be generated */

Unfortunately, this is too big. It is 52 bytes + MAX_IV_SIZE (counting 4 bytes
for each pointer, int and size_t).

Making all of the sizes 16 bit integers, and moving the IV out of line (behind
a pointer), would get us down to 44 bytes.

To be memory efficient, we need to fit this into the 48 bytes in the skb->cb,
which avoids having to allocate another control structure for each
packet. The result is therefore:

struct transform_command {
  struct list_head           tc_cmdqueue;
  struct cipher_context     *tc_context;
  transform_unit_callback    tc_callback;
  cipher_usercontext         tc_user;    /* whatever the user callback wants */
  unsigned int               tc_flags;
  const u8                  *tc_iv;
  const u8                  *tc_in;
  u8                        *tc_out;     /* if NULL, use tc_in */
  u8                        *tc_mac;     /* must point to space of tc_macsize */
  u16                        tc_insize;  /* size of input buffer */
  union {
    u16                      tc_buffersize; /* size of output buffer */
    u16                      tc_resultsize; /* number of bytes in output */
  } tc_output;
};

Yes, it is necessary to provide outsize as well as insize somehow.
Decompression could expand the output, and it must bounds check the
output. We re-use the output size field as the result size: tc_buffersize on
entry becomes tc_resultsize on return. The tc_macsize is also dropped, as I
believe that the MAC result size will always be the same for a given digest.
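
As a sanity check on the size budget, here is a sketch that models the
compact layout for a 32-bit target by standing in a 32-bit integer for every
pointer, so that it can be compiled anywhere. The "32" suffixed names are
invented for this model, and it assumes cipher_usercontext and the callback
are each pointer-sized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for any pointer on a 32-bit target (modelling assumption). */
typedef uint32_t ptr32;

struct list_head32 {
    ptr32 next, prev;
};

/* Mirror of the compact transform_command, field for field. */
struct transform_command32 {
    struct list_head32 tc_cmdqueue;
    ptr32    tc_context;
    ptr32    tc_callback;
    ptr32    tc_user;      /* assumed pointer-sized */
    uint32_t tc_flags;
    ptr32    tc_iv;
    ptr32    tc_in;
    ptr32    tc_out;
    ptr32    tc_mac;
    uint16_t tc_insize;
    union {
        uint16_t tc_buffersize;  /* size of output buffer on entry */
        uint16_t tc_resultsize;  /* bytes produced on return */
    } tc_output;
};

#define SKB_CB_SIZE 48   /* size of skb->cb as quoted in the text */

/* Fails the build if the command no longer fits in the control buffer. */
static_assert(sizeof(struct transform_command32) <= SKB_CB_SIZE,
              "transform_command must fit in skb->cb");

/* Exposes the modelled size so it can be inspected at run time. */
size_t transform_command32_size(void)
{
    return sizeof(struct transform_command32);
}
```

On common ABIs this model comes to 44 bytes, which leaves 4 bytes spare in
the 48-byte skb->cb. On a 64-bit kernel the pointers alone would exceed the
budget, so a different layout would be needed there.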

Note that the operation is implied by the cipher_context now, or we can steal
some bits from tc_flags for it.
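
If bits are stolen from tc_flags, it could look something like the sketch
below. Only TC_FLAGS_GENERATE_IV comes from the definition above; the
operation field, its position in bits 1-2, and the helper names are
hypothetical.

```c
#include <assert.h>

#define TC_FLAGS_GENERATE_IV  (1u << 0)   /* from the struct definition above */

/* Hypothetical: bits 1-2 of tc_flags carry the requested operation. */
#define TC_FLAGS_OP_SHIFT     1
#define TC_FLAGS_OP_MASK      (0x3u << TC_FLAGS_OP_SHIFT)

enum tc_op {
    TC_OP_FROM_CONTEXT = 0,   /* operation implied by the cipher_context */
    TC_OP_ENCODE       = 1,
    TC_OP_DECODE       = 2
};

/* Return `flags` with the operation field replaced by `op`. */
unsigned int tc_flags_set_op(unsigned int flags, enum tc_op op)
{
    assert((unsigned int)op <= 3u);
    return (flags & ~TC_FLAGS_OP_MASK) |
           (((unsigned int)op << TC_FLAGS_OP_SHIFT) & TC_FLAGS_OP_MASK);
}

/* Extract the operation field from `flags`. */
enum tc_op tc_flags_get_op(unsigned int flags)
{
    return (enum tc_op)((flags & TC_FLAGS_OP_MASK) >> TC_FLAGS_OP_SHIFT);
}
```

Leaving the field at zero keeps the "implied by the cipher_context"
behaviour as the default.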

Compression
===========

I am hoping that this interface will also permit application to compression
algorithms. There are a number of copies of libz already in the
kernel. Getting the framework in for compression is very valuable.

Name of project
===============

The project should be renamed "dataxform" or "compressapi", since the libz 
replacement stuff could be mainlined.

]       ON HUMILITY: to err is human. To moo, bovine.           |  firewalls  [
]   Michael Richardson, Sandelman Software Works, Ottawa, ON    |net architect[
] mcr@sandelman.ottawa.on.ca http://www.sandelman.ottawa.on.ca/ |device driver[
] panic("Just another NetBSD/notebook using, kernel hacking, security guy");  [

-----BEGIN PGP SIGNATURE-----
Version: 2.6.3ia
Charset: latin1
Comment: Finger me for keys

iQCVAwUBPSocfIqHRg3pndX9AQEuvwP+JpIOj24SaES5Nd5ZgNpXmlP3aSPtBP/u
5og1eHEyYl+kh339UMs6D2QWvspzPiyACdBa9YnRaNtDdiMj0jJaNaYeIHnvUwH3
GulQSgJWPqZxkq/LIsRv6hbMik0bnbQ2h+5sEfzNPRRiYLXIdmCXbVwEFYvIujMB
qhCBPOrcJuk=
=kuT6
-----END PGP SIGNATURE-----
-
Linux-crypto:  cryptography in and on the Linux system
Archive:       http://mail.nl.linux.org/linux-crypto/