This is pretty much as expected. It is also very fast for encrypted
storage. If the numbers disturb you, that comes from a lack of
understanding on your side. You are probably unaware that encryption is
a heavy-weight operation and that even with hardware support in the
CPU, you will be limited by what your CPU's crypto-accelerator can do.
Incidentally, the "cryptsetup benchmark" command would have shown you
the upper limits for your particular hardware with far less work.
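For illustration, a minimal invocation could look like the following;
the cipher and key size in the second command are assumptions about the
reporter's setup (aes-xts-plain64 is the usual LUKS default, the exact
default key size depends on the cryptsetup version):

  # Memory-only benchmark of the kernel crypto code, no disk I/O involved
  cryptsetup benchmark

  # Optionally restrict the run to one cipher/key-size combination
  cryptsetup benchmark --cipher aes-xts-plain64 --key-size 512

The throughput it reports is an upper bound on what dm-crypt can
deliver on a given CPU, independent of how fast the SSD is.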
Regards,
Arno

On Fri, Sep 22, 2017 at 00:34:41 CEST, Ivan Babrou wrote:
> Hello,
>
> We were looking at LUKS performance and found some disturbing numbers
> on SSDs.
>
> * Linear write performance
>
> We took 2 identical disks, encrypted one of them, put XFS on both and
> tested linear write speed with fio:
>
> [rewrite]
> size=200g
> bs=1m
> rw=write
> direct=1
> loops=10000
>
> Without LUKS we are getting 450MB/s write, with LUKS we are at half of
> that, 225MB/s.
>
> * Linear read performance
>
> To avoid hitting any XFS bugs we just read 1GB from the raw device and
> from the corresponding LUKS device, both with direct I/O. We try
> different block sizes too. Here's the script we used:
>
> #!/bin/bash -e
>
> SIZE=$((1024 * 1024 * 1024))
>
> for power in $(seq 12 30); do
>     BS=$((2 ** $power))
>     COUNT=$(($SIZE / $BS))
>     TIME_DIRECT=$(sudo dd if=/dev/sdd of=/dev/null bs=$BS count=$COUNT iflag=direct 2>&1 | tail -n1 | awk '{ print $(NF-1) }')
>     TIME_LUKS=$(sudo dd if=/dev/mapper/luks-sdd of=/dev/null bs=$BS count=$COUNT iflag=direct 2>&1 | tail -n1 | awk '{ print $(NF-1) }')
>     echo -e "${BS}\t${TIME_DIRECT}\t${TIME_LUKS}"
> done
>
> And the output (block size in bytes, then raw and LUKS read speed in
> MB/s as reported by dd):
>
> 4096         59.5   52.6
> 8192         103    91.0
> 16384        158    139
> 32768        227    181
> 65536        287    228
> 131072       354    243
> 262144       373    251
> 524288       428    307
> 1048576      446    327
> 2097152      474    396
> 4194304      485    431
> 8388608      496    464
> 16777216     499    483
> 33554432     504    498
> 67108864     508    503
> 134217728    508    506
> 268435456    510    509
> 536870912    511    511
> 1073741824   512    512
>
> Here are the results on a graph: https://i.imgur.com/yar1GSC.png
>
> If I re-do this test with a 1GB file on an actual filesystem:
>
> #!/bin/bash -e
>
> SIZE=$((1024 * 1024 * 1024))
>
> for power in $(seq 12 30); do
>     BS=$((2 ** $power))
>     TIME_DIRECT=$(sudo dd if=/mnt/sda/zeros of=/dev/null bs=$BS iflag=direct 2>&1 | tail -n1 | awk '{ print $(NF-1) }')
>     TIME_LUKS=$(sudo dd if=/mnt/sdd/zeros of=/dev/null bs=$BS iflag=direct 2>&1 | tail -n1 | awk '{ print $(NF-1) }')
>     echo -e "${BS}\t${TIME_DIRECT}\t${TIME_LUKS}"
> done
>
> And the output (same columns):
>
> 4096         73.5   54.8
> 8192         123    86.2
> 16384        189    130
> 32768        251    176
> 65536        302    226
> 131072       345    239
> 262144       373    243
> 524288       395    287
> 1048576      435    297
> 2097152      438    373
> 4194304      457    410
> 8388608      464    429
> 16777216     469    448
> 33554432     474    459
> 67108864     477    463
> 134217728    478    467
> 268435456    480    469
> 536870912    480    470
> 1073741824   481    471
>
> Here are the results on a graph: https://i.imgur.com/OQk6kDo.png
>
> If I do 1MB block reads from the raw device (sda) and from the LUKS
> block device (sdd), then I see the following:
>
> ivan@36com1:~$ iostat -x -m -d 1 /dev/sd* | grep -E '^(Device:|sda|sdd)'
> Device:  rrqm/s  wrqm/s      r/s   w/s   rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
> sda        0.01    0.00    76.84  0.34   20.32  16.86    986.70      0.60   7.77     1.84  1337.82   0.64   4.94
> sdd        0.03    0.00   379.40  0.83   42.57  33.79    411.32      1.64   4.31     1.66  1214.03   0.31  11.87
> Device:  rrqm/s  wrqm/s      r/s   w/s   rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
> sda        0.00    0.00  1002.00  0.00  501.00   0.00   1024.00      1.50   1.50     1.50     0.00   0.97  97.60
> sdd        0.00    0.00   655.00  0.00  327.50   0.00   1024.00      1.00   1.53     1.53     0.00   1.01  66.00
> Device:  rrqm/s  wrqm/s      r/s   w/s   rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
> sda        0.00    0.00   999.00  0.00  499.52   0.00   1024.05      1.51   1.51     1.51     0.00   0.98  97.80
> sdd        0.00    0.00   650.00  0.00  325.00   0.00   1024.00      1.00   1.53     1.53     0.00   1.01  65.60
> Device:  rrqm/s  wrqm/s      r/s   w/s   rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
> sda        0.00    0.00   983.00  0.00  491.48   0.00   1023.95      1.52   1.54     1.54     0.00   1.00  98.30
> sdd        0.00    0.00   648.00  0.00  324.00   0.00   1024.00      1.00   1.54     1.54     0.00   1.01  65.30
> Device:  rrqm/s  wrqm/s      r/s   w/s   rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
> sda        0.00    0.00   979.00  0.00  490.00   0.00   1025.05      1.51   1.54     1.54     0.00   1.00  98.10
> sdd        0.00    0.00   646.00  0.00  323.00   0.00   1024.00      0.99   1.54     1.54     0.00   1.01  65.20
> ^C
>
> End results are 509MB/s and 360MB/s to read the full 240GB. This is a
> pretty hard hit.
>
> * Random write performance
>
> The following fio scenario was used:
>
> [rewrite]
> size=10g
> bs=64k
> rw=randwrite
> direct=1
> numjobs=20
> loops=10000
>
> The raw block device gave us ~320MB/s, LUKS only manages ~40MB/s.
>
> * In-memory results
>
> I made two 10GB loopback devices in tmpfs and formatted one of them as
> LUKS. The plain device can read at 4.5GB/s, the LUKS device can read
> at 0.85GB/s. This is a big difference, but it doesn't really explain
> the results from the physical SSD.
>
> We are running kernel 4.9, but 4.4 seems to have the same behavior. We
> tried a completely different SSD model and it had the same behavior
> (352MB/s vs 274MB/s linear read). The spinning disks we have do under
> 200MB/s linear read and do not expose the issue.
>
> Are these numbers expected? Is there any way to improve this situation?
>
> Thanks!
> _______________________________________________
> dm-crypt mailing list
> dm-crypt@xxxxxxxx
> http://www.saout.de/mailman/listinfo/dm-crypt

--
Arno Wagner,     Dr. sc. techn., Dipl. Inform.,    Email: arno@xxxxxxxxxxx
GnuPG: ID: CB5D9718  FP: 12D6 C03B 1B30 33BB 13CF  B774 E35C 5FA1 CB5D 9718
----
A good decision is based on knowledge and not on numbers. -- Plato

If it's in the news, don't worry about it. The very definition of
"news" is "something that hardly ever happens." -- Bruce Schneier
_______________________________________________
dm-crypt mailing list
dm-crypt@xxxxxxxx
http://www.saout.de/mailman/listinfo/dm-crypt
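As an addendum to the in-memory comparison described in the quoted
report (two 10GB loop devices backed by tmpfs, one of them
LUKS-formatted), a rough sketch of how such a setup might be built is
below. The mount point, image names, sizes and mapper name are
assumptions made for illustration, not taken from the report, and
luksFormat is of course destructive to whatever it is pointed at.

  #!/bin/bash -e
  # Sketch only: all paths and names below are examples; run as root.

  mkdir -p /mnt/ramdisk
  mount -t tmpfs -o size=21g tmpfs /mnt/ramdisk   # room for two 10GB images

  truncate -s 10g /mnt/ramdisk/plain.img
  truncate -s 10g /mnt/ramdisk/crypt.img

  # Attach both images to loop devices.
  PLAIN=$(losetup --find --show /mnt/ramdisk/plain.img)
  CRYPT=$(losetup --find --show /mnt/ramdisk/crypt.img)

  # LUKS-format one of them and open it (interactive passphrase prompts).
  cryptsetup luksFormat "$CRYPT"
  cryptsetup open "$CRYPT" luks-ram

  # Write both devices end to end so the reads below hit real data;
  # dd exits non-zero when it runs off the end of the device, hence || true.
  dd if=/dev/zero of="$PLAIN" bs=1M oflag=direct || true
  dd if=/dev/zero of=/dev/mapper/luks-ram bs=1M oflag=direct || true

  # Compare sequential read throughput: plain loop device vs. dm-crypt on top.
  dd if="$PLAIN" of=/dev/null bs=1M iflag=direct
  dd if=/dev/mapper/luks-ram of=/dev/null bs=1M iflag=direct

  # Tear everything down again.
  cryptsetup close luks-ram
  losetup -d "$PLAIN" "$CRYPT"
  umount /mnt/ramdisk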