On 05/06/2021 17:32, Ard Biesheuvel wrote:
Hello Thara,
On Fri, 4 Jun 2021 at 18:49, Thara Gopinath <thara.gopinath@xxxxxxxxxx> wrote:
Hi All,
Below are the performance numbers from running "cryptsetup benchmark" on
the CE algorithms in the mainline kernel. All numbers are in MiB/s. The
platform used is RB3 for SDM845 and MTPs for the rest.
                     SDM845   SM8150   SM8250   SM8350
AES-CBC (128)
  Encrypt / Decrypt  114/106  36/48    120/188  133/197
AES-XTS (256)
  Encrypt / Decrypt  100/102  49/48    186/187  n/a
The CPU instruction based ones are apparently an order of magnitude
faster, and are synchronous so their latency should be lower.
So, as Eric already pointed out IIRC, there doesn't seem to be much
value in enabling this IP in Linux - it should not be the default
choice/highest priority, and it is not obvious to me whether/when you
would prefer this implementation over the CPU based one. Do you have
any idea how many queues it has, or how much data it can process in
parallel? Are there other features that stand out?
While I can't say much about qce-crypto, I do know that "cryptsetup
benchmark" isn't always a good way to pit hardware-accelerated crypto
against the CPU.
In my case (crypto4xx; the CPU is an 800 MHz PowerPC 464 and the hardware
is a Western Digital My Book Live NAS), the "benchmark" results look
exceptionally poor:
#  Algorithm |  Key |  Encryption |  Decryption
     aes-cbc   128b     8.0 MiB/s     8.7 MiB/s
     aes-cbc   256b     8.7 MiB/s     8.7 MiB/s
     aes-xts   256b     5.3 MiB/s     7.9 MiB/s
     aes-xts   512b     7.9 MiB/s     7.9 MiB/s
(The hardware doesn't support cts/xts, but it does support aes-cbc,
aes-ctr and aes-gcm.)
For comparison, these are the numbers produced by the 800 MHz PowerPC
CPU alone:
     aes-cbc   128b    15.8 MiB/s    16.3 MiB/s
     aes-cbc   256b    12.3 MiB/s    12.8 MiB/s
     aes-xts   256b    12.5 MiB/s    15.1 MiB/s
     aes-xts   512b    11.9 MiB/s    12.0 MiB/s
and OpenSSL in software (openssl speed -evp aes-128-cbc --elapsed -seconds 3)
manages similar numbers:
type           16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes  16384 bytes
aes-128-cbc   12646.42k   16806.66k   18349.31k   18762.07k   18896.21k    18879.83k
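Note that openssl speed prints throughput in units of 1000 bytes per second, not MiB/s, so the two outputs can't be compared as printed. A minimal sketch of the conversion in Python follows (the function name is made up for illustration; the values are the ones from the output above):

```python
def openssl_k_to_mib_per_s(k_value):
    """Convert an 'openssl speed' figure (1000s of bytes per second) to MiB/s."""
    return k_value * 1000 / (1024 * 1024)


# Block size in bytes -> throughput as printed by openssl speed above.
results = {
    16: 12646.42,
    64: 16806.66,
    256: 18349.31,
    1024: 18762.07,
    8192: 18896.21,
    16384: 18879.83,
}

for block_size, k_value in results.items():
    print(f"{block_size:>6} bytes: {openssl_k_to_mib_per_s(k_value):5.1f} MiB/s")
```

The 16384-byte figure works out to about 18 MiB/s, which is in the same range as the 15.8 MiB/s that cryptsetup benchmark reports for aes-cbc 128b on the CPU.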
However, when I format a partition on the NAS HDD with
cryptsetup + crypto4xx and measure it with hdparm -t / dd, I get:
# hdparm -t /dev/mapper/aes-cbc-hw-test
/dev/mapper/aes-cbc-hw-test:
Timing buffered disk reads: 96 MB in 3.05 seconds = 31.46 MB/sec
# dd if=/dev/mapper/aes-cbc-hw-test of=/dev/null bs=8M status=progress
5318377472 bytes (5.3 GB, 5.0 GiB) copied, 143 s, 37.2 MB/s^C
639+0 records in
638+0 records out
5351931904 bytes (5.4 GB, 5.0 GiB) copied, 144.246 s, 37.1 MB/s
whereas without crypto4xx:
# hdparm -t /dev/mapper/aes-cbc-hw-test
/dev/mapper/aes-cbc-hw-test:
Timing buffered disk reads: 34 MB in 3.14 seconds = 10.82 MB/sec
# dd if=/dev/mapper/aes-cbc-hw-test of=/dev/null bs=8M status=progress
46+0 records in
45+0 records out
377487360 bytes (377 MB, 360 MiB) copied, 33.1952 s, 11.4 MB/s
With crypto4xx, this is roughly 3 times the throughput that the CPU alone manages.
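For reference, the ratio can be recomputed from the hdparm and dd figures above. This is only a quick sketch in Python; the helper name and the dictionary layout are mine, while the numbers are taken from the two runs as pasted:

```python
def speedup(with_hw_mb_s, cpu_only_mb_s):
    """Ratio of throughput with crypto4xx to throughput with the CPU alone."""
    return with_hw_mb_s / cpu_only_mb_s


# Tool -> (MB/s with crypto4xx, MB/s without crypto4xx), from the runs above.
measurements = {
    "hdparm -t": (31.46, 10.82),
    "dd bs=8M": (37.1, 11.4),
}

for tool, (with_hw, cpu_only) in measurements.items():
    print(f"{tool}: {speedup(with_hw, cpu_only):.1f}x")
```

That gives about 2.9x for hdparm and about 3.3x for dd. Keep in mind these are single runs on one device, so the exact ratio will vary.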
@Thara, do you have a USB 3.0 port and a fast USB 3.0 stick? If so, try
formatting a partition on it with cryptsetup and run the test there.
Cheers,
Christian