On Friday, October 31, 2014 at 08:23:53 AM, Herbert Xu wrote:
> On Fri, Oct 31, 2014 at 04:01:04AM +0100, Marek Vasut wrote:
> > I can share the last state of the document I wrote. Currently,
> > it is not possible for me to keep up with my workload and do
> > anything else, so that's all I can do.
>
> Posting your latest revision would be great.

Please see below, mine is much less complete than Stephan's though and
likely contains some bugs.

Linux Crypto API :: Drivers
===========================

This document outlines how to implement drivers for cryptographic hardware.
The Linux Crypto API supports different types of transformations and we will
explain here how to write drivers for each one of them.

Note: Transformation and algorithm are used interchangeably.

Note: We support multiple transformation types:
      CIPHER ....... Simple single-block cipher
      BLKCIPHER .... Synchronous multi-block cipher
      ABLKCIPHER ... Asynchronous multi-block cipher
      SHASH ........ Synchronous multi-block hash
      AHASH ........ Asynchronous multi-block hash
      AEAD ......... Authenticated Encryption with Associated Data (MAC)
      COMPRESS ..... Compression
      RNG .......... Random Number Generation

0) Terminology
--------------

- The transformation implementation is the actual code, or the interface to
  hardware, which implements a certain transformation with precisely defined
  behavior.

- The transformation object (TFM) is an instance of a transformation
  implementation. There can be multiple transformation objects associated
  with a single transformation implementation. Each of those transformation
  objects is held by a crypto API consumer. A transformation object is
  allocated when a crypto API consumer requests a transformation
  implementation. The consumer is then provided with a structure which
  contains the transformation object (TFM).

- The transformation context is private data associated with the
  transformation object.
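
The relationship between the three terms above can be sketched in plain
user-space C. This is only a simplified model, not kernel code: struct
model_impl, struct model_tfm and the model_tfm_*() helpers are hypothetical
names invented for this illustration and do not exist in the Crypto API.
The sketch assumes the context is a block of bytes allocated together with
the transformation object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a transformation implementation. */
struct model_impl {
	const char *name;	/* e.g. "aes" */
	size_t ctxsize;		/* size of the private context of each TFM */
};

/* Hypothetical stand-in for a transformation object (TFM). */
struct model_tfm {
	const struct model_impl *impl;	/* implementation being instantiated */
	unsigned char ctx[];		/* transformation context (private data) */
};

/* Allocate a TFM together with its zeroed context; NULL on failure. */
struct model_tfm *model_tfm_alloc(const struct model_impl *impl)
{
	struct model_tfm *tfm = calloc(1, sizeof(*tfm) + impl->ctxsize);

	if (tfm)
		tfm->impl = impl;
	return tfm;
}

/* Return the context which belongs to this particular TFM. */
void *model_tfm_ctx(struct model_tfm *tfm)
{
	assert(tfm != NULL);
	return tfm->ctx;
}

/* Release the TFM and its context in one go. */
void model_tfm_free(struct model_tfm *tfm)
{
	free(tfm);
}
```

Two consumers calling model_tfm_alloc() on the same struct model_impl get
two independent objects, each with its own context, while both still point
at the one shared implementation.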
1) The struct crypto_alg description
------------------------------------

The struct crypto_alg describes a generic Crypto API algorithm and is common
to all of the transformations. We will first explain what each entry means,
as this is a fundamental building block. We will not follow the order of
fields as defined in include/linux/crypto.h, but will instead explain them
in logical order.

 .cra_name .......... Name of the transformation algorithm.
   - This is the name of the transformation itself. This field is used by
     the kernel when looking up the providers of a particular
     transformation.
   - Examples: "md5", "cbc(cast5)", "rfc4106(gcm(aes))"
   - You can find a good approximation for values of this field by running:
       $ git grep tcrypt_test crypto/tcrypt.c

 .cra_driver_name ... Name of the transformation provider.
   - This is the name of the provider of the transformation. This can be
     any arbitrary value, but in the usual case, this contains the name of
     the chip or provider and the name of the transformation algorithm.
   - Examples: "sha1-dcp", "atmel-ecb-aes"

 .cra_priority ...... Priority of this transformation implementation.
   - In case multiple transformations with the same .cra_name are available
     to the Crypto API, the kernel will use the one with the highest
     .cra_priority.
   - The generic software implementations of transformations set this field
     to a low value, so they are picked only in case no higher-priority
     implementation is available.

 .cra_module ........ Owner of this transformation implementation.
   - Set to THIS_MODULE.

 .cra_blocksize ..... Minimum block size of this transformation.
   - The size in bytes of the smallest possible unit which can be
     transformed with this algorithm. The users must respect this value.
   - In case of a HASH transformation, it is possible for a smaller block
     than .cra_blocksize to be passed to the crypto API for transformation.
     In case of any other transformation type, an error will be returned
     upon any attempt to transform chunks smaller than .cra_blocksize.
   - Examples: SHA1_BLOCK_SIZE, AES_BLOCK_SIZE
   - You can find predefined values for this field in the kernel source
     tree with:
       $ git grep _BLOCK_SIZE include/crypto/

 .cra_alignmask ..... Alignment mask for the input and output data buffer.
   - The data buffer containing the input data for the algorithm must be
     aligned to this alignment mask.
   - The data buffer for the output data must be aligned to this alignment
     mask.
   - Note that the Crypto API will do the re-alignment in software, but
     only under special conditions and there is a performance hit. The
     re-alignment happens on these occasions for different .cra_u types:
       cipher: For both input data and output data buffer
       ahash:  For output hash destination buffer
       shash:  For output hash destination buffer
       /* FIXME ... others ? */
   - This is needed on hardware which is flawed by design and cannot pick
     data from arbitrary addresses.

 .cra_ctxsize ....... Size of the transformation context.
   - This is the size of the data which is associated with the
     transformation object. This data is valid during the entire existence
     of the transformation object and can only ever be modified by the
     driver.
   - The driver can retrieve a pointer to this data via the
     crypto_tfm_ctx() function.

 .cra_type .......... Type of the cryptographic transformation.
   - This is a pointer to struct crypto_type, which implements callbacks
     common for all transformation types.
   - There are multiple options:
       crypto_blkcipher_type .... Sync block cipher
       crypto_ablkcipher_type ... Async block cipher
       crypto_ahash_type ........ Async hash
       crypto_aead_type ......... AEAD
       crypto_rng_type .......... Random number generator
   - This field might be empty. In that case, there are no common
     callbacks. This is the case for:
       cipher ................... Single-block cipher
       compress ................. Compression
       shash .................... Sync hash

 .cra_flags ......... Flags describing this transformation.
   - See include/linux/crypto.h CRYPTO_ALG_* flags for the flags which go
     in here. Those are used for fine-tuning the description of the
     transformation algorithm.

 .cra_u ............. Callbacks implementing the transformation.
   - This is a union of multiple structures. Depending on the type of
     transformation selected by .cra_type and .cra_flags above, the
     associated structure must be filled with callbacks.
   - There are multiple options:
       .cipher ....... Cipher
       .blkcipher .... Sync block cipher
       .ablkcipher ... Async block cipher
       .aead ......... AEAD
       .compress ..... Compression
       .rng .......... Random number generator
   - This field might be empty. This is the case for:
       ahash ......... Async hash
       shash ......... Sync hash

 .cra_init() ........ Initialize the cryptographic transformation object.
   - This function is used to initialize the cryptographic transformation
     object. This function is called only once at the instantiation time,
     right after the transformation context was allocated.
   - In case the cryptographic hardware has some special requirements which
     need to be handled by software, this function shall check for the
     precise requirement of the transformation and put any software
     fallbacks in place.

 .cra_exit() ........ Deinitialize the cryptographic transformation object.
   - This is a counterpart to .cra_init(), used to remove various changes
     set in .cra_init().

 .cra_list .......... List header.
   - This internal field of the crypto API is used as a list head. It
     allows for this structure to be added into the list of other crypto
     algorithms.

 .cra_users ......... List of all users of this transformation.
   - This internal field of the crypto API is used to track all the users
     which are currently using this particular transformation
     implementation.

 .cra_refcnt ........ Reference counter for this structure.
   - This internal field of the crypto API is used to count the number of
     references of this structure so it can be checked when removal is
     requested.

 .cra_destroy() .....
   Deallocate resources of the crypto transformation.
   - This is used internally by the crypto API. When there are multiple
     spawns of the algorithm, this is set for all of them and when the
     refcount reaches zero, this function is called to deallocate all the
     remaining data.

2) Registering and unregistering transformation
-----------------------------------------------

There are three distinct types of registration functions in the Crypto API.
One is used to register a generic cryptographic transformation, while the
other two are specific to HASH transformations and COMPRESSion. We will
discuss the latter two in a separate chapter; here we will only look at
the generic ones.

The generic registration functions can be found in include/linux/crypto.h
and their definition can be seen below. The former function registers a
single transformation, while the latter works on an array of transformation
descriptions. The latter is useful when registering transformations in bulk.

   int crypto_register_alg(struct crypto_alg *alg);
   int crypto_register_algs(struct crypto_alg *algs, int count);

The counterparts to those functions are listed below.

   int crypto_unregister_alg(struct crypto_alg *alg);
   int crypto_unregister_algs(struct crypto_alg *algs, int count);

Notice that both registration and unregistration functions do return a
value, so make sure to handle errors.

3) Single-block ciphers [CIPHER]
--------------------------------

Example of transformations: aes, arc4, ...

This section describes the simplest of all transformation implementations,
that being the CIPHER type. The CIPHER type is used for transformations
which operate on exactly one block at a time and there are no dependencies
between blocks at all.

3.1) Registration specifics
---------------------------

The registration of a [CIPHER] algorithm is specific in that the struct
crypto_alg field .cra_type is empty. The .cra_u.cipher has to be filled in
with proper callbacks to implement this transformation.
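
To make the registration and lookup rules from sections 1) and 2) more
tangible, here is a small user-space model in plain C. It is a sketch under
stated assumptions and not the kernel implementation: struct model_alg is a
cut-down, hypothetical stand-in for struct crypto_alg, all model_*()
functions are invented for this illustration, and the error codes and the
fixed-size table are choices of the model. It shows two things: a lookup by
.cra_name picks the entry with the highest .cra_priority, and a bulk
registration which fails half-way undoes the entries it already added, so
the caller only has to check a single return value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_ALGS 8

/* Hypothetical, cut-down stand-in for struct crypto_alg. */
struct model_alg {
	const char *cra_name;		/* algorithm name, e.g. "aes" */
	const char *cra_driver_name;	/* unique provider name */
	int cra_priority;		/* higher value wins the lookup */
};

static struct model_alg *model_registry[MODEL_MAX_ALGS];
static int model_count;

/* Register one algorithm; returns 0 or a negative errno value. */
int model_register_alg(struct model_alg *alg)
{
	int i;

	if (!alg || !alg->cra_name || !alg->cra_driver_name)
		return -EINVAL;
	for (i = 0; i < model_count; i++)
		if (!strcmp(model_registry[i]->cra_driver_name,
			    alg->cra_driver_name))
			return -EEXIST;	/* driver names must be unique */
	if (model_count == MODEL_MAX_ALGS)
		return -ENOSPC;
	model_registry[model_count++] = alg;
	return 0;
}

/* Unregister one algorithm; returns 0 or -ENOENT if it was not found. */
int model_unregister_alg(struct model_alg *alg)
{
	int i;

	for (i = 0; i < model_count; i++) {
		if (model_registry[i] == alg) {
			/* Fill the hole with the last entry. */
			model_registry[i] = model_registry[--model_count];
			return 0;
		}
	}
	return -ENOENT;
}

/* Register an array; on failure, undo the entries already registered. */
int model_register_algs(struct model_alg *algs, int count)
{
	int i, ret;

	for (i = 0; i < count; i++) {
		ret = model_register_alg(&algs[i]);
		if (ret) {
			while (--i >= 0)
				model_unregister_alg(&algs[i]);
			return ret;
		}
	}
	return 0;
}

/* Find the highest-priority provider of the named transformation. */
const struct model_alg *model_lookup(const char *name)
{
	const struct model_alg *best = NULL;
	int i;

	assert(name != NULL);
	for (i = 0; i < model_count; i++) {
		const struct model_alg *alg = model_registry[i];

		if (strcmp(alg->cra_name, name))
			continue;
		if (!best || alg->cra_priority > best->cra_priority)
			best = alg;
	}
	return best;
}
```

With one "aes" entry at priority 100 and another at priority 300 registered
(both values are purely illustrative), model_lookup("aes") returns the
latter. Once that entry is unregistered, the same lookup falls back to the
remaining one.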
3.2) Fields in struct cipher_alg explained
------------------------------------------

This section explains the .cra_u.cipher fields and how they are called.
All of the fields are mandatory and must be filled:

 .cia_min_keysize ... Minimum key size supported by the transformation.
   - This is the smallest key length supported by this transformation
     algorithm. This must be set to one of the pre-defined values as this
     is not hardware specific.
   - Possible values for this field can be found via:
       $ git grep "_MIN_KEY_SIZE" include/crypto/

 .cia_max_keysize ... Maximum key size supported by the transformation.
   - This is the largest key length supported by this transformation
     algorithm. This must be set to one of the pre-defined values as this
     is not hardware specific.
   - Possible values for this field can be found via:
       $ git grep "_MAX_KEY_SIZE" include/crypto/

 .cia_setkey() ...... Set key for the transformation.
   - This function is used to either program a supplied key into the
     hardware or store the key in the transformation context for
     programming it later. Note that this function does modify the
     transformation context.
   - This function can be called multiple times during the existence of
     the transformation object, so one must make sure the key is properly
     reprogrammed into the hardware.
   - This function is also responsible for checking the key length for
     validity.
   - In case a software fallback was put in place in the .cra_init() call,
     this function might need to use the fallback if the algorithm doesn't
     support all of the key sizes.

 .cia_encrypt() ..... Encrypt a single block.
   - This function is used to encrypt a single block of data, which must be
     .cra_blocksize big. This always operates on a full .cra_blocksize and
     it is not possible to encrypt a block of smaller size. The supplied
     buffers must therefore also be at least of .cra_blocksize size.
   - Both the input and output buffers are always aligned to .cra_alignmask.
     In case either of the input or output buffer supplied by the user of
     the crypto API is not aligned to .cra_alignmask, the crypto API will
     re-align the buffers. The re-alignment means that a new buffer will be
     allocated, the data will be copied into the new buffer, then the
     processing will happen on the new buffer, then the data will be copied
     back into the original buffer and finally the new buffer will be freed.
   - In case a software fallback was put in place in the .cra_init() call,
     this function might need to use the fallback if the algorithm doesn't
     support all of the key sizes.
   - In case the key was stored in the transformation context, the key
     might need to be re-programmed into the hardware in this function.
   - This function shall not modify the transformation context, as this
     function may be called in parallel with the same transformation
     object.

 .cia_decrypt() ..... Decrypt a single block.
   - This is a reverse counterpart to .cia_encrypt(), and the conditions
     are exactly the same.

Here are schematics of how these functions are called when operated from
another part of the kernel. Note that the .cia_setkey() call might happen
before or after any of these schematics happen, but must not happen while
any of these are in flight.

   KEY ---.    PLAINTEXT ---.
          v                 v
    .cia_setkey() -> .cia_encrypt()
                            |
                            '-----> CIPHERTEXT

Please note that a pattern where .cia_setkey() is called multiple times is
also valid:

   KEY1 --.    PLAINTEXT1 --.         KEY2 --.    PLAINTEXT2 --.
          v                 v                v                 v
    .cia_setkey() -> .cia_encrypt() -> .cia_setkey() -> .cia_encrypt()
                            |                                  |
                            '---> CIPHERTEXT1                  '---> CIPHERTEXT2

4) Multi-block ciphers [BLKCIPHER] [ABLKCIPHER]
-----------------------------------------------

Example of transformations: cbc(aes), ecb(arc4), ...

This section describes the multi-block cipher transformation implementations
for both the synchronous [BLKCIPHER] and asynchronous [ABLKCIPHER] case.
The multi-block ciphers are used for transformations which operate on
scatterlists of data supplied to the transformation functions. They output
the result into a scatterlist of data as well.

4.1) Registration specifics
---------------------------

The registration of a [BLKCIPHER] or [ABLKCIPHER] algorithm is one of the
most standard procedures throughout the crypto API. There are no specifics
for this case other than that re-aligning of input and output buffers does
not happen automatically within the crypto API, but is the responsibility
of the crypto API consumer. The crypto API consumer shall use
crypto_blkcipher_alignmask() or crypto_ablkcipher_alignmask() respectively
to determine the needs of the transformation object and prepare the
scatterlist with data accordingly.

4.2) Fields in struct blkcipher_alg and struct ablkcipher_alg explained
-----------------------------------------------------------------------

This section explains the .cra_u.blkcipher and .cra_u.ablkcipher fields and
how they are called. Please note that this is very similar to the basic
CIPHER case for all but minor details. All of the fields but .geniv are
mandatory and must be filled:

 .min_keysize ... Minimum key size supported by the transformation.
   - This is the smallest key length supported by this transformation
     algorithm. This must be set to one of the pre-defined values as this
     is not hardware specific.
   - Possible values for this field can be found via:
       $ git grep "_MIN_KEY_SIZE" include/crypto/

 .max_keysize ... Maximum key size supported by the transformation.
   - This is the largest key length supported by this transformation
     algorithm. This must be set to one of the pre-defined values as this
     is not hardware specific.
   - Possible values for this field can be found via:
       $ git grep "_MAX_KEY_SIZE" include/crypto/

 .setkey() ...... Set key for the transformation.
   - This function is used to either program a supplied key into the
     hardware or store the key in the transformation context for
     programming it later. Note that this function does modify the
     transformation context.
   - This function can be called multiple times during the existence of
     the transformation object, so one must make sure the key is properly
     reprogrammed into the hardware.
   - This function is also responsible for checking the key length for
     validity.
   - In case a software fallback was put in place in the .cra_init() call,
     this function might need to use the fallback if the algorithm doesn't
     support all of the key sizes.

 .encrypt() ..... Encrypt a scatterlist of blocks.
   - This function is used to encrypt the supplied scatterlist containing
     the blocks of data. The crypto API consumer is responsible for
     aligning the entries of the scatterlist properly and making sure the
     chunks are correctly sized.
   - In case a software fallback was put in place in the .cra_init() call,
     this function might need to use the fallback if the algorithm doesn't
     support all of the key sizes.
   - In case the key was stored in the transformation context, the key
     might need to be re-programmed into the hardware in this function.
   - This function shall not modify the transformation context, as this
     function may be called in parallel with the same transformation
     object.

 .decrypt() ..... Decrypt a scatterlist of blocks.
   - This is a reverse counterpart to .encrypt(), and the conditions are
     exactly the same.

Please refer to section 3.2) for schematics of the block cipher usage. The
usage patterns are exactly the same for [ABLKCIPHER] and [BLKCIPHER] as
they are for plain [CIPHER].

4.3) Specifics of asynchronous multi-block cipher
-------------------------------------------------

There are a couple of specifics to the [ABLKCIPHER] interface.
First of all, some of the drivers will want to use the Generic ScatterWalk
in case the hardware needs to be fed separate chunks of the scatterlist
which contains the plaintext and will contain the ciphertext. Please refer
to the section 9.1) of this document for the description and usage of the
Generic ScatterWalk interface.

It is recommended to enqueue cryptographic transformation requests into
generic crypto queues. This allows for these requests to be processed in
sequence as the cryptographic hardware becomes free. For details on the
crypto queues, please refer to section 9.2) further down in this text.

5) Hashing [HASH]
-----------------

Example of transformations: crc32, md5, sha1, sha256, ...

5.1) Registering and unregistering the transformation
-----------------------------------------------------

There are multiple ways to register a HASH transformation, depending on
whether the transformation is synchronous [SHASH] or asynchronous [AHASH]
and on the number of HASH transformations we are registering. You can find
the prototypes defined in include/crypto/internal/hash.h :

   int crypto_register_ahash(struct ahash_alg *alg);

   int crypto_register_shash(struct shash_alg *alg);
   int crypto_register_shashes(struct shash_alg *algs, int count);

The respective counterparts for unregistering the HASH transformation are
as follows:

   int crypto_unregister_ahash(struct ahash_alg *alg);

   int crypto_unregister_shash(struct shash_alg *alg);
   int crypto_unregister_shashes(struct shash_alg *algs, int count);

5.2) Common fields of struct shash_alg and ahash_alg explained
--------------------------------------------------------------

For the definition of these structures, please refer to
include/crypto/hash.h . We will now explain the meaning of each field:

 .init() ......... Initialize the transformation context.
   - Intended only to initialize the state of the HASH transformation at
     the beginning.
     This shall fill in the internal structures used during the entire
     duration of the whole transformation.
   - No data processing happens at this point.

 .update() ....... Push chunk of data into the driver for transformation.
   - This function actually pushes blocks of data from upper layers into
     the driver, which then passes those to the hardware as seen fit.
   - This function must not finalize the HASH transformation, this only
     adds more data into the transformation.
   - This function shall not modify the transformation context, as this
     function may be called in parallel with the same transformation
     object.
   - Data processing can happen synchronously [SHASH] or asynchronously
     [AHASH] at this point.

 .final() ........ Retrieve result from the driver.
   - This function finalizes the transformation and retrieves the
     resulting hash from the driver and pushes it back to upper layers.
   - No data processing happens at this point.

 .finup() ........ Combination of update()+final().
   - This function is effectively a combination of update() and final()
     calls issued in sequence.
   - As some hardware cannot do update() and final() separately, this
     callback was added to allow such hardware to be used at least by
     IPsec.
   - Data processing can happen synchronously [SHASH] or asynchronously
     [AHASH] at this point.

 .digest() ....... Combination of init()+update()+final().
   - This function effectively behaves as the entire chain of operations,
     init(), update() and final() issued in sequence.
   - Just like .finup(), this was added for hardware which cannot do even
     the .finup(), but can only do the whole transformation in one run.
   - Data processing can happen synchronously [SHASH] or asynchronously
     [AHASH] at this point.

 .setkey() ....... Set optional key used by the hashing algorithm.
   - Intended to push the optional key used by the hashing algorithm from
     upper layers into the driver.
   - This function can store the key in the transformation context or can
     outright program it into the hardware.
     In the former case, one must be careful to program the key into the
     hardware at the appropriate time and one must be careful that
     .setkey() can be called multiple times during the existence of the
     transformation object.
   - Not all hashing algorithms implement this function.
       -> SHAx/MDx/CRCx do NOT implement this function.
       -> HMAC(MDx)/HMAC(SHAx) do implement this function.
   - This function must be called before any other of the
     init()/update()/final()/finup()/digest() is called.
   - No data processing happens at this point.

 .export() ....... Export partial state of the transformation.
   - This function dumps the entire state of the ongoing transformation
     into a provided block of data so it can be .import()ed back later on.
   - This is useful in case you want to save the partial result of the
     transformation after processing a certain amount of data and reload
     this partial result multiple times later on for multiple re-use.
   - No data processing happens at this point.

 .import() ....... Import partial state of the transformation.
   - This function loads the entire state of the ongoing transformation
     from a provided block of data so the transformation can continue from
     this point onward.
   - No data processing happens at this point.

Here are schematics of how these functions are called when operated from
another part of the kernel. Note that the .setkey() call might happen
before or after any of these schematics happen, but must not happen while
any of these are in flight. Please note that calling .init() followed
immediately by .final() is also a perfectly valid transformation.

    I)   DATA -----------.
                         v
          .init() -> .update() -> .final()      ! .update() might not be called
                      ^    |         |            at all in this scenario.
                      '----'         '---> HASH

    II)  DATA -----------.-----------.
                         v           v
          .init() -> .update() -> .finup()      ! .update() may not be called
                      ^    |         |            at all in this scenario.
                      '----'         '---> HASH

    III) DATA -----------.
                         v
                     .digest()                  ! The entire process is handled
                         |                        by the .digest() call.
                         '---------------> HASH

Here is a schematic of how the .export()/.import() functions are called
when used from another part of the kernel.

  KEY--.                 DATA--.
       v                       v                  ! .update() may not be called
   .setkey() -> .init() -> .update() -> .export()   at all in this scenario.
                            ^     |         |
                            '-----'         '--> PARTIAL_HASH

  ----------- other transformations happen here -----------

  PARTIAL_HASH--.     DATA1--.
                v            v
            .import() -> .update() -> .final()     ! .update() may not be called
                          ^    |         |           at all in this scenario.
                          '----'         '--> HASH1

  PARTIAL_HASH--.     DATA2-.
                v           v
            .import() -> .finup()
                            |
                            '---------------> HASH2

5.3) The struct hash_alg_common fields and its mirror in struct shash_alg
-------------------------------------------------------------------------

This structure defines various size constraints and generic properties of
the hashing algorithm that is being implemented. Let us first inspect the
size properties:

 digestsize .... Size of the result of the transformation.
   - A buffer of this size must be available to the .final() and .finup()
     calls, so they can store the resulting hash into it.
   - For various predefined sizes, search include/crypto/ using
     'git grep _DIGEST_SIZE include/crypto' .

 statesize ..... Size of the block for partial state of the transformation.
   - A buffer of this size must be passed to the .export() function as it
     will save the partial state of the transformation into it. On the
     other side, the .import() function will load the state from a buffer
     of this size as well.

/* FIXME */

We will now discuss HASH-specific details of struct crypto_alg . In order
to understand the rest of the text, please read the section 1) at the
beginning of this documentation first.

/* FIXME ... this needs expanding */

5.4) Specifics of asynchronous HASH transformation
--------------------------------------------------

There are a couple of specifics to the [AHASH] interface.
First of all, some of the drivers will want to use the Generic ScatterWalk
in case the hardware needs to be fed separate chunks of the scatterlist
which contains the input data. The buffer containing the resulting hash will
always be properly aligned to .cra_alignmask so there is no need to worry
about this. Please refer to the section 9.1) of this document for the
description and usage of the Generic ScatterWalk interface.

It is recommended to enqueue cryptographic transformation requests into
generic crypto queues. This allows for these requests to be processed in
sequence as the cryptographic hardware becomes free. For details on the
crypto queues, please refer to section 9.2) further down in this text.

6) Authenticated Encryption with Associated Data (MAC) [AEAD]
-------------------------------------------------------------

7) Compression [COMPRESS]
-------------------------

8) Random Number Generation [RNG]
---------------------------------

9) Additional helper interfaces
-------------------------------

This section outlines specific helpers available across various types of
cryptographic transformation implementations, which are not specific to a
particular transformation type.

9.1) Generic ScatterWalk
------------------------

9.2) Crypto Request Queue
-------------------------

/* FIXME -- others? Multi-queue API ? ... */