On 3 December 2015 at 03:56, Alasdair G Kergon <agk@xxxxxxxxxx> wrote:
> On Wed, Dec 02, 2015 at 08:46:54PM +0800, Baolin Wang wrote:
>> These are the benchmarks for request based dm-crypt. Please check it.
>
> Now please put request-based dm-crypt completely to one side and focus
> just on the existing bio-based code. Why is it slower and what can be
> adjusted to improve this?
>

OK. I think I have found something that needs to be pointed out.

1. From the IO block size test in the performance report, we can see
that request-based dm-crypt does not get a corresponding performance
gain if we just expand the IO size. In dm-crypt, the data buffer of one
request is mapped with scatterlists, and all scatterlists of that
request are sent to the encryption engine to encrypt or decrypt. I found
that if the number of scatterlist entries is small and each entry is
longer, the encryption speed improves, which helps the engine reach its
best performance. But a big IO size does not imply bigger scatterlist
entries (it may be many entries of small length), and I think that is
why we cannot get the corresponding performance if we just expand the
IO size.

2. Why is bio-based slower? Following from point 1, the crypto engine
prefers bigger scatterlists. But bio-based dm-crypt sends only one
scatterlist entry (its length is always '1 << SECTOR_SHIFT' = 512
bytes) to the crypto engine at a time. That means if the bio size is
1M, bio-based dm-crypt submits to the crypto engine 2048 times (each
time with a single 512-byte scatterlist entry), which is more
time-consuming and inefficient for the crypto engine. Request-based
dm-crypt, in contrast, can map the whole request with many scatterlist
entries (not just one) and send them all to the crypto engine, which
can improve the performance. Is that right?

Another optimization, I think, is to increase the number of scatterlist
entries for bio-based dm-crypt.

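To make the numbers in points 1 and 2 concrete, here is a toy model in
Python. It is only a sketch, not kernel code: bio_based_submissions and
merge_contiguous are hypothetical names, and the (start, length) segment
lists are made-up inputs. The first function reproduces the 2048 figure
for a 1M bio. The second shows how the same 1M of data can map to one
large entry or to 256 small ones, depending on whether its pages happen
to be physically contiguous.

```python
SECTOR_SHIFT = 9
SECTOR_SIZE = 1 << SECTOR_SHIFT  # 512 bytes, as stated in point 2


def bio_based_submissions(io_bytes: int) -> int:
    """Crypto requests issued when each one carries a single 512-byte entry."""
    if io_bytes % SECTOR_SIZE:
        raise ValueError("I/O size must be a multiple of the sector size")
    return io_bytes // SECTOR_SIZE


def merge_contiguous(segments):
    """Merge adjacent (start, length) segments into scatterlist-like entries.

    Hypothetical helper: it models why a large I/O may still map to many
    small entries when its pages are not contiguous in memory.
    """
    merged = []
    for start, length in segments:
        if merged and merged[-1][0] + merged[-1][1] == start:
            # This segment begins where the previous entry ends: extend it.
            merged[-1] = (merged[-1][0], merged[-1][1] + length)
        else:
            merged.append((start, length))
    return merged


if __name__ == "__main__":
    one_mib = 1 << 20
    print(bio_based_submissions(one_mib))  # 2048

    page = 4096
    contiguous = [(i * page, page) for i in range(256)]      # back-to-back pages
    scattered = [(i * 2 * page, page) for i in range(256)]   # every other page
    print(len(merge_contiguous(contiguous)))  # 1 entry of 1 MiB
    print(len(merge_contiguous(scattered)))   # 256 entries of 4 KiB
```

In this model both 1M inputs have the same IO size, but only the
contiguous one gives the engine a single large entry, which matches the
observation in point 1 that expanding the IO size alone is not enough.
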
> People aren't going to take a request-based solution seriously until
> you can explain in full detail *why* bio-based is slower AND why it's
> impossible to improve its performance.
>
>> For request based things, some sequential bios/requests can merged
>> into one request to expand the IO size to be a big block handled by
>> hardware engine at one time.
>
> Bio-based also merges I/O so that does not provide justification.
> Investigate in much more detail the actual merging and scheduling
> involved in the cases you need to optimise. See if blktrace gives you
> any clues, or add your own instrumentation. You could even look at some
> of the patches we've had in the list archives for optimising bio-based
> crypt in different situations.
>
> Alasdair
>

-- 
Baolin.wang
Best Regards