Re: xor_blocks() assumptions

On Mon, Jan 02, 2023 at 11:44:35PM +0100, Lukasz Stelmach wrote:
> Hi,
> 
> I am researching the possibility of using xor_blocks() in crypto_xor() and
> crypto_xor_cpy(). What I've found already is that different architecture
> dependent xor functions work on different blocks between 16 and 512
> (Intel AVX) bytes long. There is a hint in the comment for
> async_xor_offs() that src_cnt (as passed to do_sync_xor_offs()) counts
> pages. Thus, it is assumed, that the smallest chunk xor_blocks() gets is
> a single page. Am I right?
> 
> Do you think adding block_len field to struct xor_block_template (and
> maybe some information about required alignment) and using it to call
> do_2 from crypto_xor() may work? I am thinking especially about disk
> encryption where sectors of 512 to 4096 bytes are handled.
> 

Taking a step back, it sounds like you think the word-at-a-time XOR in
crypto_xor() is not fast enough, so you want to use a SIMD (e.g. NEON,
SSE, or AVX) implementation instead.  Have you tested that this would actually
give a benefit on the input sizes in question, especially considering that SIMD
can only be used in the kernel after kernel_fpu_begin() (or the architecture's
equivalent, such as kernel_neon_begin()) has been called?

It would also be worth considering simply optimizing crypto_xor() by unrolling
the word-at-a-time loop 4x or so.
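
To illustrate the unrolling idea, here is a minimal user-space sketch.  It is
not the kernel's crypto_xor(); the name xor_unrolled() is hypothetical, and it
uses memcpy() for loads and stores so that it makes no alignment assumptions
(the kernel code handles alignment differently).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper for illustration only (not the kernel's crypto_xor()):
 * computes dst[i] ^= src[i] for len bytes.
 */
static void xor_unrolled(uint8_t *dst, const uint8_t *src, size_t len)
{
	const size_t w = sizeof(unsigned long);

	/* Main loop: four machine words per iteration (the 4x unroll). */
	while (len >= 4 * w) {
		unsigned long d[4], s[4];

		/* memcpy() keeps unaligned buffers well-defined. */
		memcpy(d, dst, sizeof(d));
		memcpy(s, src, sizeof(s));
		d[0] ^= s[0];
		d[1] ^= s[1];
		d[2] ^= s[2];
		d[3] ^= s[3];
		memcpy(dst, d, sizeof(d));

		dst += 4 * w;
		src += 4 * w;
		len -= 4 * w;
	}

	/* Remaining whole words, one at a time. */
	while (len >= w) {
		unsigned long d, s;

		memcpy(&d, dst, w);
		memcpy(&s, src, w);
		d ^= s;
		memcpy(dst, &d, w);

		dst += w;
		src += w;
		len -= w;
	}

	/* Trailing bytes. */
	while (len--)
		*dst++ ^= *src++;
}
```

Whether this beats the existing loop, or a SIMD version, on 512 to 4096 byte
inputs would still have to be measured.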

- Eric


