On Thu, 3 Dec 2015, Baolin Wang wrote:

> On 3 December 2015 at 03:56, Alasdair G Kergon <agk@xxxxxxxxxx> wrote:
> > On Wed, Dec 02, 2015 at 08:46:54PM +0800, Baolin Wang wrote:
> >> These are the benchmarks for request based dm-crypt. Please check it.
> >
> > Now please put request-based dm-crypt completely to one side and focus
> > just on the existing bio-based code. Why is it slower and what can be
> > adjusted to improve this?
> >
>
> OK. I think I found something that needs to be pointed out.
> 1. From the IO block size test in the performance report, for the
> request based, we can find it can not get the corresponding
> performance if we just expand the IO size. Because in dm-crypt, it
> will map the data buffer of one request with scatterlists, and send
> all scatterlists of one request to the encryption engine to encrypt or
> decrypt. I found if the number of scatterlists is small and each
> scatterlist length is bigger, it will improve the encryption speed,

This optimization is only applicable to XTS mode. XTS has its weaknesses
and it is not recommended for encryption of more than 1TB of data
( http://grouper.ieee.org/groups/1619/email/msg02357.html )

You can optimize bio-based dm-crypt as well (use a larger encryption chunk
than 512 bytes when the mode is XTS).

The most commonly used mode, aes-cbc-essiv:sha256, can't be optimized that
way. You have to do encryption and decryption sector by sector because
every sector has a different IV.

Mikulas

> that helps the engine reach its best performance. But a big IO size does
> not mean bigger scatterlists (maybe many scatterlists with small
> length), that's why we can not get the corresponding performance if we
> just expand the IO size, I think.
>
> 2. Why is bio based slower?
> If you understand 1, you can obviously understand the crypto engine
> likes bigger scatterlists to improve the performance. But for bio
> based, it only sends one scatterlist (the scatterlist's length is
> always '1 << SECTOR_SHIFT' = 512) to the crypto engine at one time. It
> means if the bio size is 1M, the bio based will send 2048 times (every
> time the only one scatterlist length is 512 bytes) to the crypto engine
> to handle, which is more time-consuming and inefficient for the crypto
> engine. But for request based, it can map the whole request with many
> scatterlists (not just one scatterlist), and send all the scatterlists
> to the crypto engine, which can improve the performance, is it right?
>
> Another optimization solution I think is we can expand the scatterlist
> entry number for bio based.
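
To put numbers on the request counts discussed above, here is a minimal
userspace sketch in Python (not kernel code). The function names
crypto_requests and sector_ivs and the 4096-byte chunk are assumptions made
up for illustration; only SECTOR_SHIFT = 9 and the 1M bio size come from
the thread.

```python
SECTOR_SHIFT = 9                  # 512-byte sectors, as in '1 << SECTOR_SHIFT'
SECTOR_SIZE = 1 << SECTOR_SHIFT


def crypto_requests(bio_bytes, chunk_bytes=SECTOR_SIZE):
    """Hypothetical helper: number of separate submissions to the crypto
    engine for one bio when each submission covers chunk_bytes of data."""
    if chunk_bytes <= 0 or chunk_bytes % SECTOR_SIZE:
        raise ValueError("chunk must be a positive multiple of the sector size")
    return -(-bio_bytes // chunk_bytes)   # ceiling division


def sector_ivs(start_sector, bio_bytes):
    """Hypothetical helper: the sector numbers covered by a bio. In a mode
    with a per-sector IV, each of these needs its own IV and therefore its
    own encryption call."""
    return range(start_sector, start_sector + bio_bytes // SECTOR_SIZE)


bio = 1 << 20                          # the 1M bio from the quoted text
print(crypto_requests(bio))            # one 512-byte scatterlist per call: 2048
print(crypto_requests(bio, 4096))      # assumed 4096-byte chunk: 256
print(len(sector_ivs(0, bio)))         # distinct per-sector IVs: 2048
```

With a per-sector IV mode such as aes-cbc-essiv:sha256, the last figure is
the one that matters: the number of encryption calls cannot drop below the
number of sectors, whatever the scatterlist layout. The larger-chunk figure
only applies where the mode allows it, which in this thread means XTS.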