Hi,

On 2018/7/18 23:34, Ard Biesheuvel wrote:
> On 18 July 2018 at 19:59, Arnd Bergmann <arnd@xxxxxxxx> wrote:
>> On Wed, Jul 18, 2018 at 9:30 AM, Xiongfeng Wang
>> <wangxiongfeng2@xxxxxxxxxx> wrote:
>>>
>>> I tested the performance of software implemented ciphers before and after
>>> applying this patchset. The performance didn't change much except for
>>> slight regression when writting. The detail information is as follows.
>>>
>>> The command I used:
>>> cryptsetup -y -c aes-xts-plain -s 256 --hash sha256 luksFormat /dev/sdd1
>>> cryptsetup -y -c aes-cbc-essiv:sha256 -s 256 --hash sha256 luksFormat /dev/sdd1
>>> cryptsetup -y -c aes-cbc-benbi -s 256 --hash sha256 luksFormat /dev/sdd1
>>>
>>> cryptsetup luksOpen /dev/sdd1 crypt_fun
>>> time dd if=/dev/mapper/crypt_fun of=/dev/null bs=1M count=500 iflag=direct
>>> time dd if=/dev/zero of=/dev/mapper/crypt_fun bs=1M count=500 oflag=direct
>>>
>>> Performance comparision:
>>> --------------------------------------------------------
>>> algorithms    | before applying | after applying
>>> --------------------------------------------------------
>>>               | read   | write  | read   | write
>>> --------------------------------------------------------
>>> aes-xts-plain | 145.34 | 145.09 | 145.89 | 144.2
>>> --------------------------------------------------------
>>> aes-cbc-essiv | 146.87 | 144.62 | 146.74 | 143.41
>>> --------------------------------------------------------
>>> aes-cbc-benbi | 146.03 | 144.74 | 146.77 | 144.46
>>> --------------------------------------------------------
>>
>> Do you have any estimate of the expected gains for hardware
>> implementations?
>>
>> Would it make sense to try out implementing aes-cbc-essiv
>> on the ARMv8 crypto extensions? I see that Ard has done
>> some prior work on aes-ccm in arch/arm64/crypto/aes-ce-ccm-*
>> that (AFAICT) has a similar goal of avoiding overhead by
>> combining the usual operations, so maybe the same can
>> be done here.
>>
>
> I am having trouble understanding what exactly this series aims to achieve.
>
> Calling into the crypto layer fewer times is a nice goal, but a disk
> sector seems like a reasonable granularity for the dm layer to operate
> on, and I don't think any hardware exists that operates on multi
> sector sequences, where it would pay off to amortize the latency of
> invoking the hardware over an entire bio.

I don't know much about crypto hardware, but I think crypto hardware can
process more than one sector of data at a time. So I think passing the
whole bio to the hardware in a single request will reduce the overhead
of submitting each sector separately.

Thanks,
Xiongfeng

>
> So in summary, you need to explain to us why we need this. It is
> really very easy to convince people if your changes make things go
> faster.
>
> .
>

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel