On Wed, Apr 27, 2011 at 10:07:14AM -0500, Will Drewry wrote:
> On Wed, Apr 27, 2011 at 2:19 AM, Milan Broz <mbroz@xxxxxxxxxx> wrote:
> > Hi,
> >
> > On 04/26/2011 05:17 PM, Will Drewry wrote:
> >> Recently, I've been benchmarking some different hardware crypto
> >> accelerators and many of them appear to be tuned toward largish
> >> requests (up to 16k) with a given key and a base IV.
> >
> > Please can you explicitly say which accelerators you are using and
> > show some benchmarks?
>
> Sure, I was looking at the in-development tegra_aes kernel module. As
> I was trying to understand the available performance ignoring
> in-kernel overhead, I wanted to push as much data as possible into the
> requests at a time as I tweaked how the different pieces worked.
>
> The difference was a change from 2.2 megabytes/s -> 3.2 megabytes/s
> using a simplistic synchronous dd + drop_caches scaffold. Without
> using conv=sync, the performance difference was not as noticeable (for
> obvious reasons). A more comprehensive benchmark suite, like iozone,
> would bubble up better results, though.

So we are talking about a very slow CPU here. BTW, if you use
the device in /dev/mapper/<something> directly, that should
bypass all OS caches and completely encrypt/decrypt on each
access. 3.2 MB/sec is pretty slow. A current desktop CPU
gets something like 500MB/sec with aes-cbc-essiv:sha256.

[...]

> >> I've created a
> >> very simple patch for dm-crypt that uses PAGE_SIZE blocks to aid in
> >> the driver performance testing, but I lack the cryptographic
> >> understanding to determine if there is significant exposure by
> >> allowing a dm-crypt device to use a block size that exceeds the sector
> >> size.
> >
> > As Arno said, there should be no real security problem for these block
> > sizes. Basically we are just using CBC or XTS mode today.
>
> Cool. (Also, sorry for not replying there. I failed to subscribe in
> advance of the mailing and missed the response :/ )

No problem.
I appreciate getting feedback, but it does not have
to be immediate ;-)

> > For XTS-AES, definition explicitly says that data unit (= your block)
> > size should not exceed 2^20 128bit blocks (128bit = AES cipher block).
> > (And even here possible attacks are closely related to birthday
> > bound, IOW you need to have enough blocks encrypted with the same key.)

Ah, right, it was an XTS limit we discussed here some time ago.

> > So I do not see real security problem here. But problems are elsewhere.
>
> Thanks - nice to have that confirmed.
>
> >> 1. Does anyone know if there will be significant exposure to the
> >> plaintext if dm-crypt used larger block sizes?
> >
> > Should not be.
> >
> >> 2. Would an optional, configurable block-size (up to PAGE_SIZE) be of
> >> interest?
> >
> > Short answer would be no :-)
> >
> > As I said, I would like to prove first that the problem is really in block
> > size and not in related problem.
> >
> > Now the real problems:
> >
> > The whole device mapper and dmcrypt works as transparent block encryption
> > and we are always operating on 512B sectors.
> >
> > Even if device is 4k blocks, this is hidden in underlying layer and
> > DM just properly aligns data and propagates limits but
> > still operates on 512B sectors. (It can be ineffective for some
> > IO patterns, but it works).
> >
> > Changing encryption block size causes device to be incompatible with other
> > systems (note stacked devices, a common thing here - LVM over dmcrypt)
> > and IOs. You have to generate only aligned IO of your encryption block size.
> >
> > (or change dmcrypt significantly)
> >
> > IO hints is not enough - maybe example is better here:
> >
> >
> > Testing device (some random data there, not important)
> > # dmsetup table --showkeys
> > x: 0 417792 crypt aes-cbc-essiv:sha256 aeb26d1f69eb6dddfb9381eed4d7299f091e99aa5d3ff06866d4ce9f620f7aca 0 8:16 0
> >
> > Let's generate some direct IOs (to avoid page cache)
>
> Ah!
> I had forgotten about direct I/O. I was using fsync and
> /proc/sys/vm/drop_caches to clear the page cache across each call.
> direct is a whole other beast :/

Well, dm-crypt and LUKS sit conceptually directly on the block
device and give you another block device. Caching and buffering,
other than by the disk itself, are a bit higher in the stack.
I don't know, but I think you do not need the "direct" flag
when accessing /dev/mapper/<something>. At least some quick
measurements with

  time head -c 100M /dev/mapper/c1 > /dev/null

seem to agree.

> > *Without* your patch:
> >
> > # dd if=/dev/mapper/x iflag=direct bs=512 count=32 | sha256sum
> > eed6cf19ee9b2ecc5f4a6d1b251468fd9d691cbee67124de730078a1eda2c0c4  -
> >
> > # dd if=/dev/mapper/x iflag=direct bs=4096 count=4 | sha256sum
> > eed6cf19ee9b2ecc5f4a6d1b251468fd9d691cbee67124de730078a1eda2c0c4  -
> >
> > # dd if=/dev/mapper/x iflag=direct bs=8192 count=2 | sha256sum
> > eed6cf19ee9b2ecc5f4a6d1b251468fd9d691cbee67124de730078a1eda2c0c4  -
> >
> > As you can see, we get the same plain data with different IO sizes.
> >
> > Now *with* your patch (page size is 4096):
> >
> > # dd if=/dev/mapper/x iflag=direct bs=512 count=32 | sha256sum
> > dd: reading `/dev/mapper/x': Invalid argument
> >
> > # dd if=/dev/mapper/x iflag=direct bs=4096 count=4 | sha256sum
> > 4f4271e7799097b6e0ed66d81a8341163b8a5a06a2c57f50b930d429a7aa94d1  -
> >
> > # dd if=/dev/mapper/x iflag=direct bs=8192 count=2 | sha256sum
> > 17cf9897059800f5b43af38766471048b872d20a0f565ee553a351b1a6251141  -
> >
> > So block size of 512B causes operation to fail (ok - IO hints).
> > IO of block encryption size and multiple of encryption size returns
> > apparently something different now.

Could that be different sector counts messing with the IVs?

> > This is probably not what we want...
> >
> > (Note that I did not even test cross-encryption-block operations.)
>
> Hehe - not at all!
Nooo, I would never get finished updating the FAQ to explain this ;-)

> > Even if this is somehow solved, many other problems remain:
> >
> > - we need to extend mapping table parameters so the block size
> > must be configurable (encrypted device image must be readable
> > on system with different page size, I have e.g. Sparc with 8k page size).
> > (This will be needed for other extensions so it is not real
> > problem, just it needs to be done first.)
>
> Certainly - I definitely wouldn't want it page-size bound in general,
> and I suspect that most consumers of dm-crypt would still want a
> single sector block size.

Also with plain dm-crypt, any non-defaults would have to be given
on each call (no metadata) and this would confuse people no end.

> > - you need to store this block size info in header,
> > for LUKS it means using new LUKS header version
> > (requiring parameter on commandline is dangerous - it must be enforced)
>
> Ouch - so much for being lazy!
>
> > I would really prefer not to support this yet and first try to optimize
> > crypto layer such way that it can process 512B blocks more
> > efficiently (of course it will not fix bad hw but it can help batching
> > sector encryption, maybe using some hints, dunno).
>
> That makes perfect sense to me. I'll keep using this patch for easy
> testing of known (non-direct io) test loads (or give in and just add
> the drivers I'm playing with to the existing crypto test module :).

You should also measure set-up time (new IV, key stays the same)
of the hardware vs encryption speed. Maybe using 512B blocks
is actually not that much slower.

> Thanks for the thorough and thoughtful response!
> will
>
> [I always seem to learn something new from mails to dm-*!]

Thanks, nice to hear that!

Arno
--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno@xxxxxxxxxxx
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking.
-- Dylan Evans

If it's in the news, don't worry about it. The very definition
of "news" is "something that hardly ever happens." -- Bruce Schneier
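
As a rough sanity check of the XTS data-unit limit quoted in the thread
(2^20 cipher blocks of 128 bits each), the arithmetic can be sketched in
shell. The variable names are illustrative only and do not come from the
thread; the 512-byte sector and 4096-byte page are the sizes discussed
above.

```shell
#!/bin/sh
# Sketch: compare the XTS data-unit limit with the block sizes discussed.
# All variable names are hypothetical, chosen for illustration only.
aes_block_bytes=16                     # one AES cipher block = 128 bit
xts_limit_blocks=$((1 << 20))          # 2^20 cipher blocks per data unit
xts_limit_bytes=$((xts_limit_blocks * aes_block_bytes))

sector_bytes=512                       # dm-crypt sector size
page_bytes=4096                        # page size used with the test patch

echo "XTS data unit limit: $xts_limit_bytes bytes"
echo "AES blocks per 512B sector: $((sector_bytes / aes_block_bytes))"
echo "AES blocks per 4096B page: $((page_bytes / aes_block_bytes))"
```

The limit works out to 16 MiB per data unit, while a sector holds 32
cipher blocks and a 4096-byte page holds 256, so both sizes sit far
below it. That is consistent with the statement that the block size
itself is not the security problem here.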