Re: Data Processing Slowdown when LUKS used

Milan Broz <gmazyland@xxxxxxxxx> · Tue, 27 Nov 2018 10:51:47 +0100

Hi,

On 27/11/2018 01:00, H McCurdy wrote:
> Hi,
> 
> I have a mystery and am wondering if anyone has ideas.
> 
> Ubuntu 16.04.5 Kernel 4.15.0-39
> 
> We have incoming UDP data coming in at 2.42Gb/s.  The data is written
> to a 1TB SSD and also processed and some of the output is written to
> a 2nd 1TB SSD.  Every minute data from the 2nd SSD is packaged
> (including compression) and copied to a hardware encrypted HDD RAID.
> If the data is "interesting" several minutes of data from the first
> SSD are copied to the HDD RAID.
> 
> All is good if the 2 1TB SSD's are not encrypted.  If I use LUKS to
> encrypt the SSD's, it runs for about 40 minutes and then the "pipe"
> gets full and we lose packets.  ethtool reports rx_missed_errors.
> 
> I am trying to understand why simply turning on encryption would
> cause the problem and if anyone has ideas.

This is actually an interesting problem. It should not behave this way.
I guess interleaving some fsync() calls could help here, but only as
a workaround.
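
A minimal sketch of what interleaving fsync() in the writer could look
like, assuming the application writes the stream in chunks to a regular
file. The function name, the chunk iterable and the 64 MiB threshold are
illustrative assumptions, not taken from the reporter's code; the idea is
only to flush dirty pages in small steps instead of letting writeback
pile up.

```python
import os


def write_with_periodic_fsync(path, chunks, sync_every_bytes=64 * 1024 * 1024):
    """Write chunks to path, calling fsync() after every sync_every_bytes.

    Returns (total_bytes_written, number_of_fsync_calls).
    The 64 MiB default is an arbitrary illustrative value.
    """
    total = 0
    unsynced = 0
    syncs = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for chunk in chunks:
            view = memoryview(chunk)
            while len(view) > 0:
                # os.write() may write fewer bytes than requested.
                written = os.write(fd, view)
                view = view[written:]
                total += written
                unsynced += written
            if unsynced >= sync_every_bytes:
                # Force the data written so far out to the device.
                os.fsync(fd)
                syncs += 1
                unsynced = 0
        if unsynced:
            # Flush whatever is left after the last chunk.
            os.fsync(fd)
            syncs += 1
    finally:
        os.close(fd)
    return total, syncs
```

The threshold would need tuning against the real workload; too small a
value turns the stream into synchronous writes.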

Could you please send a reproducer script that can be run on generic hw?
Does it work better with different drives (e.g. rotational disks)?

But the first thing would be: please try a new kernel (4.19).
(And the dm-devel mailing list would be better for this discussion,
but we need more data first.)

> I have installed the aes kernel modules cryptd, aes_x86_64 and
> aesni_intel.

I think the encryption speed is not the problem here.

Easy to test: just use the null cipher, so all the dm-crypt overhead will
be there, but instead of encryption it is a plain data copy.

You can even try it with LUKS by specifying "-c null" in format.
You have to use an empty password. And never ever use that for real data :-)

> 
> The processor is an Intel i9-7920X (12 core, 24 threads).  The system
> load is only 8.  Using the program 'glances' I don't see where the
> disk i/o is getting fully loaded except for 2 seconds out of 60.
> 
> I don't expect a magic solution but am hoping for ideas since I've
> narrowed this down to LUKS vs no LUKS and have installed the AES
> modules.  I did, however, find this article and wonder if it
> means something.  Just by itself, I wonder if anyone would like to
> simply comment on the article for academic reasons.
> https://www.researchgate.net/publication/312627780_Improving_dm-crypt_performance_for_XTS-AES_mode_through_extended_requests_first_results

Sigh, the first note would be that in the academic environment people
*should* mention all sources...
Figure 1 is apparently inspired by
https://mbroz.fedorapeople.org/talks/DeviceMapperBasics/img2.jpg
(sorry, couldn't resist :-)

The paper basically says that some particular hw is very bad
at encryption with 512-byte blocks, while packing them together
into batches works better.
I think that this can be done without patching dm-crypt, just
by implementing an asynchronous crypto driver. (Page cache should
submit BIOs in page-sized IOs. It will not work for other
IO patterns though.)

But anyway, you can use 4k blocks directly now, see below.
Just remember that if this is configured over a device that supports
only a smaller sector size, a sector write is no longer atomic and
you can see some data corruption on power failure.

(IOW: you should use 4k dm-crypt sectors on devices that use 4k hw sectors.)
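
To check what the drive reports before choosing a sector size, something
like the following sketch can read the values from sysfs. It assumes a
Linux system exposing /sys/block/<dev>/queue/logical_block_size and
physical_block_size; the function names are illustrative, and treating
only a 4096-byte logical sector as "4k" is a deliberately conservative
assumption (512e drives report 512 logical / 4096 physical).

```python
from pathlib import Path


def block_sizes(device, sysfs_root="/sys/block"):
    """Return (logical, physical) block sizes in bytes for e.g. 'sda'."""
    queue = Path(sysfs_root) / device / "queue"
    logical = int((queue / "logical_block_size").read_text().strip())
    physical = int((queue / "physical_block_size").read_text().strip())
    return logical, physical


def is_native_4k(device, sysfs_root="/sys/block"):
    """Conservative check: True only if the logical sector is >= 4096 bytes."""
    logical, _physical = block_sizes(device, sysfs_root)
    return logical >= 4096
```

For example, block_sizes("sda") returns both values for /dev/sda; the
device name here is just a placeholder.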

> Does it make sense to anyone here for me to attempt to increase the
> block size?  I think it's the -b option to cryptsetup, but that isn't
> clear to me.  A value of 8 would give me 4K, right?  (I would just
> try it but I currently don't have access to the box but likely will
> tomorrow even if I don't get an answer from anyone.)

No, -b sets the device size (in 512-byte sectors), not the encryption
sector size.

Use kernel 4.12+ and --sector-size 4096 (it is supported for plain
mode and LUKS2; LUKS1 cannot use it).
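
As a rough sketch, the format command for LUKS2 could be assembled like
this. The helper name and the /dev/sdX placeholder are assumptions for
illustration; it only builds the argument list and does not run anything,
since luksFormat destroys existing data on the device. LUKS2 needs
cryptsetup 2.x in userspace.

```python
def luks2_format_command(device, sector_size=4096):
    """Build a 'cryptsetup luksFormat' argv for LUKS2 with a given sector size."""
    # cryptsetup accepts power-of-two sector sizes from 512 to 4096 bytes.
    if sector_size not in (512, 1024, 2048, 4096):
        raise ValueError("sector size must be 512, 1024, 2048 or 4096")
    return [
        "cryptsetup", "luksFormat",
        "--type", "luks2",
        "--sector-size", str(sector_size),
        device,
    ]


if __name__ == "__main__":
    # /dev/sdX is a placeholder; substitute the real device with care.
    print(" ".join(luks2_format_command("/dev/sdX")))
```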

Milan
_______________________________________________
dm-crypt mailing list
dm-crypt@xxxxxxxx
https://www.saout.de/mailman/listinfo/dm-crypt