VIA Padlock: +30% XTS Performance by using ECB

Hello,

The VIA Padlock engine comes without native XTS-AES support. Compared
to CBC-AES or ECB-AES, XTS-AES therefore performs quite badly on VIA
CPUs, because it calls the Padlock ACE once for every single AESenc()
operation, i.e. once per 16-byte block. Handing several blocks at a time
to the Padlock's ECB-AES saves calls to the Padlock ACE and improves
XTS-AES performance by 30% and more, even in a naive proof-of-concept
implementation.
The idea comes from DiskCryptor, which has done this since v0.9.583.106.
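
To illustrate the idea (this is not the code from the patches), here is
a minimal user-space sketch. It assumes a hypothetical bulk primitive
toy_ecb_encrypt() standing in for the Padlock's ECB-AES; it is NOT AES
and only counts how often the "engine" is called. The starting tweak is
taken as an input, whereas real XTS derives it by encrypting the sector
number with the second key. All function names are made up for this
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 16

static unsigned ecb_calls; /* number of calls into the pretend engine */

/* Toy stand-in for a bulk ECB primitive. NOT AES, illustration only.
 * Like ECB, it transforms every 16-byte block independently. */
static void toy_ecb_encrypt(uint8_t *buf, size_t nblocks, const uint8_t key[BLK])
{
    ecb_calls++;
    for (size_t i = 0; i < nblocks * BLK; i++) {
        uint8_t b = buf[i] ^ key[i % BLK];
        buf[i] = (uint8_t)((b << 3) | (b >> 5)); /* rotate left by 3 */
    }
}

/* Multiply the tweak by alpha in GF(2^128), XTS little-endian convention. */
static void gf128_mul_alpha(uint8_t t[BLK])
{
    uint8_t carry = 0;
    for (int i = 0; i < BLK; i++) {
        uint8_t next = t[i] >> 7;
        t[i] = (uint8_t)((t[i] << 1) | carry);
        carry = next;
    }
    if (carry)
        t[0] ^= 0x87;
}

static void xor_block(uint8_t *dst, const uint8_t *src)
{
    for (int i = 0; i < BLK; i++)
        dst[i] ^= src[i];
}

/* Conventional XTS loop: one engine call per 16-byte block. */
static void xts_per_block(uint8_t *data, size_t nblocks,
                          uint8_t tweak[BLK], const uint8_t key[BLK])
{
    assert(data != NULL);
    for (size_t b = 0; b < nblocks; b++) {
        uint8_t *blk = data + b * BLK;
        xor_block(blk, tweak);
        toy_ecb_encrypt(blk, 1, key);
        xor_block(blk, tweak);
        gf128_mul_alpha(tweak);
    }
}

/* Batched XTS: XOR all tweaks in, one bulk ECB call, XOR all tweaks out. */
static void xts_batched(uint8_t *data, size_t nblocks,
                        uint8_t tweak[BLK], const uint8_t key[BLK])
{
    uint8_t t[BLK];

    assert(data != NULL);
    memcpy(t, tweak, BLK);
    for (size_t b = 0; b < nblocks; b++) {   /* pass 1: pre-whitening */
        xor_block(data + b * BLK, t);
        gf128_mul_alpha(t);
    }
    toy_ecb_encrypt(data, nblocks, key);      /* single engine call */
    for (size_t b = 0; b < nblocks; b++) {   /* pass 2: post-whitening */
        xor_block(data + b * BLK, tweak);
        gf128_mul_alpha(tweak);
    }
}
```

Both functions produce the same output and leave the tweak in the same
state, but the batched variant enters the engine once per run of blocks
instead of once per block. It also walks the data twice and recomputes
the tweak sequence in the second pass.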

Here are some performance measurements for my VIA Nano U2250, done with
dm-crypt aes-xts-plain on top of a 1GB tmpfs-backed loop device. The
table shows MB/s as measured by dd. The first column shows the creation
of the loop image in tmpfs, i.e. memory bandwidth. The next 10 columns
show 10 write runs on top of the dm-crypt device. The last column shows
a read run on a 10GB dm-crypt'ed disk partition (read speed on the plain
partition is ~94MB/s). For the last column I also measured DiskCryptor's
(DC) read performance.

         tmpfs | write runs 1-10 on dm-crypt                       | disk read
xts orig 325   | 38.1 38.4 38.4 38.4 38.4 38.4 38.4 38.4 38.4 38.4 | 34.2
xts PoC  322   | 48.6 49.1 49.1 49.1 49.1 49.2 49.2 49.2 49.2 49.2 | 49.2
DC             |                                                   | 65.1

My proof-of-concept does not come even close to DiskCryptor at the
moment, but it already improves dm-crypt performance significantly
(roughly 28% on writes and 44% on reads in the table above).


I attached 4 patches with the proof-of-concept code. They need to be
applied one after the other. Except maybe for the first patch, the code
is really just an ugly proof-of-concept hack with incomplete error
handling and hardcoded ECB-AES usage. Even though it seems to encrypt
and decrypt correctly, I strongly recommend against using it to handle
real data.

Utilizing ECB-AES required unfolding and duplicating the scatterlist
walk. This also duplicates the GF multiplications, which could probably
be avoided by using an internal buffer.
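
A rough user-space sketch of what such an internal buffer could look
like follows. It is only an assumption about one possible design, not
what the patches do: the tweaks for a chunk are computed once into a
hypothetical buffer tbuf and reused for both XOR passes. As a stand-in
for ECB-AES it uses a toy cipher that is NOT AES, and a flat array
replaces the scatterlist walk. The chunk size of 32 blocks (512 bytes)
is an arbitrary choice for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 16
#define TWEAK_BUF_BLOCKS 32   /* arbitrary chunk size: 512 bytes */

static unsigned gf_muls;      /* counts GF(2^128) multiplications */

/* Toy stand-in for bulk ECB. NOT AES, illustration only. */
static void toy_ecb_encrypt(uint8_t *buf, size_t nblocks, const uint8_t key[BLK])
{
    for (size_t i = 0; i < nblocks * BLK; i++) {
        uint8_t b = buf[i] ^ key[i % BLK];
        buf[i] = (uint8_t)((b << 3) | (b >> 5)); /* rotate left by 3 */
    }
}

/* Multiply the tweak by alpha in GF(2^128), XTS little-endian convention. */
static void gf128_mul_alpha(uint8_t t[BLK])
{
    uint8_t carry = 0;
    gf_muls++;
    for (int i = 0; i < BLK; i++) {
        uint8_t next = t[i] >> 7;
        t[i] = (uint8_t)((t[i] << 1) | carry);
        carry = next;
    }
    if (carry)
        t[0] ^= 0x87;
}

static void xor_bytes(uint8_t *dst, const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] ^= src[i];
}

/* Two-pass variant: the tweak sequence is computed twice (2 * nblocks muls). */
static void xts_two_pass(uint8_t *data, size_t nblocks,
                         uint8_t tweak[BLK], const uint8_t key[BLK])
{
    uint8_t t[BLK];

    assert(data != NULL);
    memcpy(t, tweak, BLK);
    for (size_t b = 0; b < nblocks; b++) {
        xor_bytes(data + b * BLK, t, BLK);
        gf128_mul_alpha(t);
    }
    toy_ecb_encrypt(data, nblocks, key);
    for (size_t b = 0; b < nblocks; b++) {
        xor_bytes(data + b * BLK, tweak, BLK);
        gf128_mul_alpha(tweak);
    }
}

/* Buffered variant: tweaks are computed once per chunk and reused. */
static void xts_buffered(uint8_t *data, size_t nblocks,
                         uint8_t tweak[BLK], const uint8_t key[BLK])
{
    uint8_t tbuf[TWEAK_BUF_BLOCKS * BLK];

    assert(data != NULL);
    while (nblocks > 0) {
        size_t n = nblocks < TWEAK_BUF_BLOCKS ? nblocks : TWEAK_BUF_BLOCKS;

        for (size_t b = 0; b < n; b++) {      /* fill the tweak buffer */
            memcpy(tbuf + b * BLK, tweak, BLK);
            gf128_mul_alpha(tweak);
        }
        xor_bytes(data, tbuf, n * BLK);       /* pre-whitening */
        toy_ecb_encrypt(data, n, key);        /* one bulk call per chunk */
        xor_bytes(data, tbuf, n * BLK);       /* post-whitening, same tweaks */
        data += n * BLK;
        nblocks -= n;
    }
}
```

The buffered variant trades a fixed amount of scratch memory for half
the GF multiplications. Whether that pays off in the kernel depends on
where the buffer lives and how it interacts with the scatterlist walk,
which this sketch does not model.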

I have no idea where this should finally be implemented, since it slows
down XTS on non-accelerated CPUs. Maybe a separate xts-aes-padlock
driver would make sense, depending on how specific this is to VIA
Padlock, i.e. how it performs on other accelerators without XTS support.


Please CC me on replies, as I'm not subscribed to the list. The
Mail-Followup-To header should be set correctly.

regards
   Mario
-- 
File names are infinite in length where infinity is set to 255 characters.
                                -- Peter Collinson, "The Unix File System"

Attachment: xts-01-eliminate-goto.patch.gz
Description: Binary data

Attachment: xts-02-resolve-xts_round.patch.gz
Description: Binary data

Attachment: xts-03-unfold-loop.patch.gz
Description: Binary data

Attachment: xts-04-utilize-ecb.patch.gz
Description: Binary data

Attachment: signature.asc
Description: Digital signature

