Re: Qualcomm Crypto Engine performance numbers on mainline kernel

On 05/06/2021 17:32, Ard Biesheuvel wrote:
Hello Thara,

On Fri, 4 Jun 2021 at 18:49, Thara Gopinath <thara.gopinath@xxxxxxxxxx> wrote:


Hi All,

Below are the performance numbers from running "cryptsetup benchmark" on
CE algorithms in the mainline kernel. All numbers are in MiB/s. The
platform used is RB3 for SDM845 and MTPs for the rest of them.


                        SDM845     SM8150     SM8250     SM8350
AES-CBC (128)
Encrypt / Decrypt       114/106    36/48      120/188    133/197

AES-XTS (256)
Encrypt / Decrypt       100/102    49/48      186/187    n/a


The CPU instruction based ones are apparently an order of magnitude
faster, and are synchronous so their latency should be lower.

So, as Eric already pointed out IIRC, there doesn't seem to be much
value in enabling this IP in Linux - it should not be the default
choice/highest priority, and it is not obvious to me whether/when you
would prefer this implementation over the CPU based one. Do you have
any idea how many queues it has, or how much data it can process in
parallel? Are there other features that stand out?

While I can't say much about qce-crypto, I do know that "cryptsetup
benchmark" isn't the greatest for pitting hardware-accelerated
crypto against the CPU in some instances.

In my case (crypto4xx / CPU is a PowerPC 464 800MHz - Hardware is a
Western Digital My Book Live - NAS) the "benchmark" results look
exceptionally poor:
#     Algorithm |       Key |      Encryption |      Decryption
        aes-cbc        128b         8.0 MiB/s         8.7 MiB/s
        aes-cbc        256b         8.7 MiB/s         8.7 MiB/s
        aes-xts        256b         5.3 MiB/s         7.9 MiB/s
        aes-xts        512b         7.9 MiB/s         7.9 MiB/s
(The hardware doesn't have cts/xts, but it does have aes-cbc, aes-ctr and aes-gcm.)

(for comparison, these are numbers that are produced by only the
800 MHz PowerPC CPU)
        aes-cbc        128b        15.8 MiB/s        16.3 MiB/s
        aes-cbc        256b        12.3 MiB/s        12.8 MiB/s
        aes-xts        256b        12.5 MiB/s        15.1 MiB/s
        aes-xts        512b        11.9 MiB/s        12.0 MiB/s


and (openssl speed -evp aes-128-cbc --elapsed -seconds 3) software
manages similar numbers:

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc      12646.42k    16806.66k    18349.31k    18762.07k    18896.21k    18879.83k
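
To put these next to the cryptsetup figures: openssl speed reports
throughput in 1000s of bytes per second, while cryptsetup benchmark
reports MiB/s. A minimal sketch of the conversion, using the values
from the output above (the helper name is made up for illustration):

```python
def openssl_k_to_mib_per_s(value_k: float) -> float:
    """Convert an 'openssl speed' figure (1000s of bytes/s) to MiB/s."""
    return value_k * 1000 / (1024 * 1024)


# aes-128-cbc figures copied from the openssl speed output above:
# (block size in bytes, throughput in 1000s of bytes/s)
samples = [(16, 12646.42), (16384, 18879.83)]

for block_size, value_k in samples:
    mib = openssl_k_to_mib_per_s(value_k)
    print(f"{block_size:>6} bytes: {mib:.1f} MiB/s")
```

That works out to roughly 12 to 18 MiB/s, which is in the same range
as the CPU-only cryptsetup benchmark numbers.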

However, when I format a partition on the NAS HDD with
cryptsetup + crypto4xx and read from it with hdparm -t / dd, I get:

# hdparm -t /dev/mapper/aes-cbc-hw-test

/dev/mapper/aes-cbc-hw-test:
 Timing buffered disk reads:  96 MB in  3.05 seconds =  31.46 MB/sec

# dd if=/dev/mapper/aes-cbc-hw-test of=/dev/null bs=8M status=progress
5318377472 bytes (5.3 GB, 5.0 GiB) copied, 143 s, 37.2 MB/s^C
639+0 records in
638+0 records out
5351931904 bytes (5.4 GB, 5.0 GiB) copied, 144.246 s, 37.1 MB/s

whereas without crypto4xx:

# hdparm -t /dev/mapper/aes-cbc-hw-test

/dev/mapper/aes-cbc-hw-test:
 Timing buffered disk reads:  34 MB in  3.14 seconds =  10.82 MB/sec

# dd if=/dev/mapper/aes-cbc-hw-test of=/dev/null bs=8M status=progress
46+0 records in
45+0 records out
377487360 bytes (377 MB, 360 MiB) copied, 33.1952 s, 11.4 MB/s

This is roughly 3 times the throughput that the CPU alone could do
(31.46 vs. 10.82 MB/s with hdparm, 37.1 vs. 11.4 MB/s with dd).
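
For reference, the ratios can be worked out directly from the
hdparm and dd runs above. This is just a quick sketch of the
arithmetic; the function and dictionary names are only illustrative:

```python
# Read throughput in MB/s, copied from the runs above:
# (with crypto4xx, CPU only)
results = {
    "hdparm -t": (31.46, 10.82),
    "dd bs=8M": (37.1, 11.4),
}


def speedup(hw_mb_s: float, sw_mb_s: float) -> float:
    """Ratio of hardware-offloaded throughput to CPU-only throughput."""
    return hw_mb_s / sw_mb_s


for tool, (hw, sw) in results.items():
    print(f"{tool}: {speedup(hw, sw):.1f}x")
```

Both tools land at about a factor of three in favour of crypto4xx on
this box, even though "cryptsetup benchmark" ranks the two the other
way around.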

@Thara, do you have a USB 3.0 port and a fast USB 3.0 stick? If so, try
formatting a partition on it with cryptsetup and run the test there.

Cheers,
Christian


