> >> On 2022/7/26 19:34, Neal Liu wrote:
> >>> Hash and Crypto Engine (HACE) is designed to accelerate the
> >>> throughput of hash data digest, encryption, and decryption.
> >>>
> >>> Basically, HACE can be divided into two independent engines -
> >>> Hash Engine and Crypto Engine. This patch aims to add the HACE
> >>> hash engine driver for the hash accelerator.
> >>>
> >>> Signed-off-by: Neal Liu <neal_liu@xxxxxxxxxxxxxx>
> >>> Signed-off-by: Johnny Huang <johnny_huang@xxxxxxxxxxxxxx>
> >>> ---
> >>>  MAINTAINERS                              |    7 +
> >>>  drivers/crypto/Kconfig                   |    1 +
> >>>  drivers/crypto/Makefile                  |    1 +
> >>>  drivers/crypto/aspeed/Kconfig            |   32 +
> >>>  drivers/crypto/aspeed/Makefile           |    6 +
> >>>  drivers/crypto/aspeed/aspeed-hace-hash.c | 1389 ++++++++++++++++++++++
> >>>  drivers/crypto/aspeed/aspeed-hace.c      |  213 ++++
> >>>  drivers/crypto/aspeed/aspeed-hace.h      |  186 +++
> >>>  8 files changed, 1835 insertions(+)
> >>>  create mode 100644 drivers/crypto/aspeed/Kconfig
> >>>  create mode 100644 drivers/crypto/aspeed/Makefile
> >>>  create mode 100644 drivers/crypto/aspeed/aspeed-hace-hash.c
> >>>  create mode 100644 drivers/crypto/aspeed/aspeed-hace.c
> >>>  create mode 100644 drivers/crypto/aspeed/aspeed-hace.h
> >>>
> >>> diff --git a/MAINTAINERS b/MAINTAINERS
> >>> index f55aea311af5..23a0215b7e42 100644
> >>> --- a/MAINTAINERS
> >>> +++ b/MAINTAINERS
> >>> @@ -3140,6 +3140,13 @@ S:	Maintained
> >>>  F:	Documentation/devicetree/bindings/media/aspeed-video.txt
> >>>  F:	drivers/media/platform/aspeed/
> >>>
> >>> +ASPEED CRYPTO DRIVER
> >>> +M:	Neal Liu <neal_liu@xxxxxxxxxxxxxx>
> >>> +L:	linux-aspeed@xxxxxxxxxxxxxxxx (moderated for non-subscribers)
> >>> +S:	Maintained
> >>> +F:	Documentation/devicetree/bindings/crypto/aspeed,ast2500-hace.yaml
> >>> +F:	drivers/crypto/aspeed/
> >>> +
> >>>  ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS
> >>>  M:	Corentin Chary <corentin.chary@xxxxxxxxx>
> >>>  L:	acpi4asus-user@xxxxxxxxxxxxxxxxxxxxx
> >>> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
> >>> index ee99c02c84e8..b9f5ee126881 100644
> >>> --- a/drivers/crypto/Kconfig
> >>> +++ b/drivers/crypto/Kconfig
> >>> @@ -933,5 +933,6 @@ config CRYPTO_DEV_SA2UL
> >>>  	  acceleration for cryptographic algorithms on these devices.
> >>>
> >>>  source "drivers/crypto/keembay/Kconfig"
> >>> +source "drivers/crypto/aspeed/Kconfig"
> >>>
> >>>  endif # CRYPTO_HW
> >>> diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
> >>> index f81703a86b98..116de173a66c 100644
> >>> --- a/drivers/crypto/Makefile
> >>> +++ b/drivers/crypto/Makefile
> >>> @@ -1,5 +1,6 @@
> >>>  # SPDX-License-Identifier: GPL-2.0
> >>>  obj-$(CONFIG_CRYPTO_DEV_ALLWINNER) += allwinner/
> >>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed/
> >>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_AES) += atmel-aes.o
> >>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_SHA) += atmel-sha.o
> >>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_TDES) += atmel-tdes.o
> >>> diff --git a/drivers/crypto/aspeed/Kconfig b/drivers/crypto/aspeed/Kconfig
> >>> new file mode 100644
> >>> index 000000000000..059e627efef8
> >>> --- /dev/null
> >>> +++ b/drivers/crypto/aspeed/Kconfig
> >>> @@ -0,0 +1,32 @@
> >>> +config CRYPTO_DEV_ASPEED
> >>> +	tristate "Support for Aspeed cryptographic engine driver"
> >>> +	depends on ARCH_ASPEED
> >>> +	help
> >>> +	  Hash and Crypto Engine (HACE) is designed to accelerate the
> >>> +	  throughput of hash data digest, encryption and decryption.
> >>> +
> >>> +	  Select y here to have support for the cryptographic driver
> >>> +	  available on Aspeed SoC.
> >>> +
> >>> +config CRYPTO_DEV_ASPEED_HACE_HASH
> >>> +	bool "Enable Aspeed Hash & Crypto Engine (HACE) hash"
> >>> +	depends on CRYPTO_DEV_ASPEED
> >>> +	select CRYPTO_ENGINE
> >>> +	select CRYPTO_SHA1
> >>> +	select CRYPTO_SHA256
> >>> +	select CRYPTO_SHA512
> >>> +	select CRYPTO_HMAC
> >>> +	help
> >>> +	  Select here to enable Aspeed Hash & Crypto Engine (HACE)
> >>> +	  hash driver.
> >>> +	  Supports multiple message digest standards, including
> >>> +	  SHA-1, SHA-224, SHA-256, SHA-384, SHA-512, and so on.
> >>> +
> >>> +config CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
> >>> +	bool "Enable HACE hash debug messages"
> >>> +	depends on CRYPTO_DEV_ASPEED_HACE_HASH
> >>> +	help
> >>> +	  Print HACE hash debugging messages if you use this option
> >>> +	  to ask for those messages.
> >>> +	  Avoid enabling this option for production build to
> >>> +	  minimize driver timing.
> >>> diff --git a/drivers/crypto/aspeed/Makefile b/drivers/crypto/aspeed/Makefile
> >>> new file mode 100644
> >>> index 000000000000..8bc8d4fed5a9
> >>> --- /dev/null
> >>> +++ b/drivers/crypto/aspeed/Makefile
> >>> @@ -0,0 +1,6 @@
> >>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed_crypto.o
> >>> +aspeed_crypto-objs := aspeed-hace.o \
> >>> +	$(hace-hash-y)
> >>> +
> >>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) += aspeed-hace-hash.o
> >>> +hace-hash-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) := aspeed-hace-hash.o
> >>> diff --git a/drivers/crypto/aspeed/aspeed-hace-hash.c b/drivers/crypto/aspeed/aspeed-hace-hash.c
> >>> new file mode 100644
> >>> index 000000000000..63a8ad694996
> >>> --- /dev/null
> >>> +++ b/drivers/crypto/aspeed/aspeed-hace-hash.c
> >>> @@ -0,0 +1,1389 @@

[...]

> >>> +static int aspeed_ahash_dma_prepare_sg(struct aspeed_hace_dev *hace_dev)
> >>> +{
> >>> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> >>> +	struct ahash_request *req = hash_engine->req;
> >>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> >>> +	struct aspeed_sg_list *src_list;
> >>> +	struct scatterlist *s;
> >>> +	int length, remain, sg_len, i;
> >>> +	int rc = 0;
> >>> +
> >>> +	remain = (rctx->total + rctx->bufcnt) % rctx->block_size;
> >>> +	length = rctx->total + rctx->bufcnt - remain;
> >>> +
> >>> +	AHASH_DBG(hace_dev, "%s:0x%x, %s:0x%x, %s:0x%x, %s:0x%x\n",
> >>> +		  "rctx total", rctx->total, "bufcnt", rctx->bufcnt,
> >>> +		  "length", length, "remain", remain);
> >>> +
> >>> +	sg_len = dma_map_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
> >>> +			    DMA_TO_DEVICE);
> >>> +	if (!sg_len) {
> >>> +		dev_warn(hace_dev->dev, "dma_map_sg() src error\n");
> >>> +		rc = -ENOMEM;
> >>> +		goto end;
> >>> +	}
> >>> +
> >>> +	src_list = (struct aspeed_sg_list *)hash_engine->ahash_src_addr;
> >>> +	rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
> >>> +					       SHA512_DIGEST_SIZE,
> >>> +					       DMA_BIDIRECTIONAL);
> >>> +	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
> >>> +		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
> >>> +		rc = -ENOMEM;
> >>> +		goto free_src_sg;
> >>> +	}
> >>> +
> >>> +	if (rctx->bufcnt != 0) {
> >>> +		rctx->buffer_dma_addr = dma_map_single(hace_dev->dev,
> >>> +						       rctx->buffer,
> >>> +						       rctx->block_size * 2,
> >>> +						       DMA_TO_DEVICE);
> >>> +		if (dma_mapping_error(hace_dev->dev, rctx->buffer_dma_addr)) {
> >>> +			dev_warn(hace_dev->dev, "dma_map() rctx buffer error\n");
> >>> +			rc = -ENOMEM;
> >>> +			goto free_rctx_digest;
> >>> +		}
> >>> +
> >>> +		src_list[0].phy_addr = rctx->buffer_dma_addr;
> >>> +		src_list[0].len = rctx->bufcnt;
> >>> +		length -= src_list[0].len;
> >>> +
> >>> +		/* Last sg list */
> >>> +		if (length == 0)
> >>> +			src_list[0].len |= HASH_SG_LAST_LIST;
> >>> +
> >>> +		src_list[0].phy_addr = cpu_to_le32(src_list[0].phy_addr);
> >>> +		src_list[0].len = cpu_to_le32(src_list[0].len);
> >>> +		src_list++;
> >>> +	}
> >>> +
> >>> +	if (length != 0) {
> >>> +		for_each_sg(rctx->src_sg, s, sg_len, i) {
> >>> +			src_list[i].phy_addr = sg_dma_address(s);
> >>> +
> >>> +			if (length > sg_dma_len(s)) {
> >>> +				src_list[i].len = sg_dma_len(s);
> >>> +				length -= sg_dma_len(s);
> >>> +
> >>> +			} else {
> >>> +				/* Last sg list */
> >>> +				src_list[i].len = length;
> >>> +				src_list[i].len |= HASH_SG_LAST_LIST;
> >>> +				length = 0;
> >>> +			}
> >>> +
> >>> +			src_list[i].phy_addr = cpu_to_le32(src_list[i].phy_addr);
> >>> +			src_list[i].len = cpu_to_le32(src_list[i].len);
> >>> +		}
> >>> +	}
> >>> +
> >>> +	if (length != 0) {
> >>> +		rc = -EINVAL;
> >>> +		goto free_rctx_buffer;
> >>> +	}
> >>> +
> >>> +	rctx->offset = rctx->total - remain;
> >>> +	hash_engine->src_length = rctx->total + rctx->bufcnt - remain;
> >>> +	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
> >>> +	hash_engine->digest_dma = rctx->digest_dma_addr;
> >>> +
> >>> +	goto end;
> >>
> >> Exiting via "goto xx" is not recommended in normal code logic (it
> >> requires two jumps); exiting via "return 0" is more efficient. This
> >> pattern occurs many times throughout your driver, so it is
> >> recommended to change it everywhere.
> >
> > If we don't exit via "goto xx", how can we release the related
> > resources without any problem? Is there a proper way to do this?
>
> Maybe I didn't describe it clearly enough: "in normal code logic" means
> the rc = 0 path. In this scenario (rc = 0), "goto xx" is no longer
> required; it can be replaced with "return 0".

Okay, I got your point. In this case, the "goto end" is indeed no longer
required. I will send the next patch with this fix included.

> >
> >>> +
> >>> +free_rctx_buffer:
> >>> +	if (rctx->bufcnt != 0)
> >>> +		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> >>> +				 rctx->block_size * 2, DMA_TO_DEVICE);
> >>> +free_rctx_digest:
> >>> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> >>> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> >>> +free_src_sg:
> >>> +	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
> >>> +		     DMA_TO_DEVICE);
> >>> +end:
> >>> +	return rc;
> >>> +}
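
To make the change concrete, the tail of aspeed_ahash_dma_prepare_sg()
would then look something like this (untested sketch based on the quoted
function; the early dma_map_sg() failure would likewise become a direct
"return -ENOMEM;" once the "end" label is gone):

	rctx->offset = rctx->total - remain;
	hash_engine->src_length = rctx->total + rctx->bufcnt - remain;
	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
	hash_engine->digest_dma = rctx->digest_dma_addr;

	/* Success path: return directly instead of jumping through "end" */
	return 0;

free_rctx_buffer:
	if (rctx->bufcnt != 0)
		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
				 rctx->block_size * 2, DMA_TO_DEVICE);
free_rctx_digest:
	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
free_src_sg:
	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
		     DMA_TO_DEVICE);

	/* Error paths still fall through the unwind labels */
	return rc;
}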
> >>> +
> >>> +static int aspeed_ahash_complete(struct aspeed_hace_dev *hace_dev)
> >>> +{
> >>> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> >>> +	struct ahash_request *req = hash_engine->req;
> >>> +
> >>> +	AHASH_DBG(hace_dev, "\n");
> >>> +
> >>> +	hash_engine->flags &= ~CRYPTO_FLAGS_BUSY;
> >>> +
> >>> +	crypto_finalize_hash_request(hace_dev->crypt_engine_hash, req, 0);
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +/*
> >>> + * Copy digest to the corresponding request result.
> >>> + * This function will be called at final() stage.
> >>> + */
> >>> +static int aspeed_ahash_transfer(struct aspeed_hace_dev *hace_dev)
> >>> +{
> >>> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> >>> +	struct ahash_request *req = hash_engine->req;
> >>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> >>> +
> >>> +	AHASH_DBG(hace_dev, "\n");
> >>> +
> >>> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> >>> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> >>> +
> >>> +	dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> >>> +			 rctx->block_size * 2, DMA_TO_DEVICE);
> >>> +
> >>> +	memcpy(req->result, rctx->digest, rctx->digsize);
> >>> +
> >>> +	return aspeed_ahash_complete(hace_dev);
> >>> +}
> >>> +
> >>> +/*
> >>> + * Trigger hardware engines to do the math.
> >>> + */
> >>> +static int aspeed_hace_ahash_trigger(struct aspeed_hace_dev *hace_dev,
> >>> +				     aspeed_hace_fn_t resume)
> >>> +{
> >>> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> >>> +	struct ahash_request *req = hash_engine->req;
> >>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> >>> +
> >>> +	AHASH_DBG(hace_dev, "src_dma:0x%x, digest_dma:0x%x, length:0x%x\n",
> >>> +		  hash_engine->src_dma, hash_engine->digest_dma,
> >>> +		  hash_engine->src_length);
> >>> +
> >>> +	rctx->cmd |= HASH_CMD_INT_ENABLE;
> >>> +	hash_engine->resume = resume;
> >>> +
> >>> +	ast_hace_write(hace_dev, hash_engine->src_dma, ASPEED_HACE_HASH_SRC);
> >>> +	ast_hace_write(hace_dev, hash_engine->digest_dma,
> >>> +		       ASPEED_HACE_HASH_DIGEST_BUFF);
> >>> +	ast_hace_write(hace_dev, hash_engine->digest_dma,
> >>> +		       ASPEED_HACE_HASH_KEY_BUFF);
> >>> +	ast_hace_write(hace_dev, hash_engine->src_length,
> >>> +		       ASPEED_HACE_HASH_DATA_LEN);
> >>> +
> >>> +	/* Memory barrier to ensure all data setup before engine starts */
> >>> +	mb();
> >>> +
> >>> +	ast_hace_write(hace_dev, rctx->cmd, ASPEED_HACE_HASH_CMD);
> >>
> >> One hardware request requires five register writes to complete. In a
> >> multi-concurrency scenario, how do you ensure the ordering of these
> >> commands? (If two processes send hardware tasks at the same time, how
> >> does the hardware recognize which task a given command belongs to?)
> >
> > The Linux crypto engine guarantees that only one request at a time is
> > dequeued from the engine queue for processing, and it has a locking
> > mechanism to prevent the scenario you mentioned. So only one
> > aspeed_hace_ahash_trigger() hardware operation goes through at a time.
> >
> > [...]
> >
> You may not have understood what I mean. The command flow in a normal
> scenario is:
> request_A: Acmd1-->Acmd2-->Acmd3-->Acmd4-->Acmd5
> request_B: Bcmd1-->Bcmd2-->Bcmd3-->Bcmd4-->Bcmd5
> In a multi-process concurrent scenario, multiple crypto engines can be
> enabled, and each crypto engine sends a request. If multiple requests
> enter aspeed_hace_ahash_trigger() at the same time, the command flow
> will be interleaved like this:
> request_A, request_B:
> Acmd1-->Bcmd1-->Acmd2-->Acmd3-->Bcmd2-->Acmd4-->Bcmd3-->Bcmd4-->Acmd5-->Bcmd5
>
> In this command flow, how does your hardware identify whether these
> commands belong to request_A or request_B?
> Thanks.
> Longfang.

To my understanding, all requests are transferred into the engine queue
through crypto_transfer_hash_request_to_engine(). In your example,
request_A and request_B would both be enqueued into the engine queue,
which pumps out one request at a time (FIFO) to handle it.
crypto_pump_requests() dequeues only one request at a time and then calls
prepare_request() and do_one_request() if they are registered. Since
aspeed_hace_ahash_trigger() runs inside do_one_request(), no other
request can come in during the whole aspeed_hace_ahash_trigger()
sequence. The interleaved command flow
Acmd1-->Bcmd1-->Acmd2-->Acmd3-->Bcmd2-->Acmd4-->Bcmd3-->Bcmd4-->Acmd5-->Bcmd5
therefore cannot occur in any scenario.

Correct me if I'm misunderstanding. Thanks.
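
For reference, here is a minimal, untested sketch of the serialization
pattern described above, using only the generic crypto_engine API (the
example_* names are hypothetical, not the actual driver symbols):

#include <crypto/engine.h>
#include <crypto/hash.h>

/* Hypothetical per-device engine, allocated via crypto_engine_alloc_init(). */
static struct crypto_engine *example_engine;

/*
 * Entry point seen by concurrent callers: requests are only queued here,
 * never sent to the hardware directly, so callers cannot interleave
 * register writes no matter how many submit at once.
 */
static int example_hash_update(struct ahash_request *req)
{
	return crypto_transfer_hash_request_to_engine(example_engine, req);
}

/*
 * do_one_request() callback: crypto_pump_requests() pops exactly one
 * request under the engine's queue lock and does not pump the next one
 * until this request is finalized, so the five register writes of
 * aspeed_hace_ahash_trigger() complete without interleaving.
 */
static int example_hash_do_one(struct crypto_engine *engine, void *areq)
{
	struct ahash_request *req = ahash_request_cast(areq);

	/* Program src/digest/key/length registers, then the command register. */

	/* In the real driver this runs later, from the completion interrupt. */
	crypto_finalize_hash_request(engine, req, 0);
	return 0;
}

In other words, the hardware is only ever programmed from the
do_one_request() context, so the whole write sequence is atomic with
respect to other requests on that engine.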