On Fri, Oct 09, 2020 at 04:16:38PM -0400, Jeff Layton wrote:
> On Thu, 2020-10-08 at 10:46 -0700, Eric Biggers wrote:
> > First, you should avoid using "PAGE_SIZE" as the crypto data unit
> > size, since PAGE_SIZE isn't the same everywhere.  E.g. PAGE_SIZE is
> > 4096 bytes on x86, but usually 65536 bytes on PowerPC.  If encrypted
> > files are created on x86, they should be readable on PowerPC too, and
> > vice versa.  That means the crypto data unit size should be a
> > specific value, generally 4096 bytes.  But other power-of-2 sizes
> > could be allowed too.
> 
> Ok, good point.
> 
> Pardon my lack of crypto knowledge, but I assume we have to ensure
> that we use the same crypto block size everywhere for the same inode
> as well?  i.e., I can't encrypt a 4k block and then read in and
> decrypt a 16-byte chunk of it?

That's basically correct.  As I mentioned earlier: for AES-XTS
specifically, it's *in principle* possible to encrypt/decrypt an
individual 16-byte aligned region.  But Linux's crypto API doesn't
currently support sub-message crypto, and fscrypt also supports the
AES-CBC and Adiantum encryption modes, which have stricter
requirements.

> > Second, I'm not really understanding what the problem is with
> > setting i_blkbits for IS_ENCRYPTED() inodes to the log2 of the
> > crypto data unit size.  Wouldn't that be the right thing to do?
> > Even though it wouldn't have any meaning for the server, it would
> > have a meaning for the client -- it would be the granularity of
> > encryption (and decryption).
> 
> It's not a huge problem.  I was thinking there might be an issue with
> some applications, but I don't think it really matters.  The blocksize
> reported by stat is sort of a nebulous concept anyway when you get to
> a network filesystem.
> 
> The only real problem we have is that an application might pass down
> an I/O that is smaller than 4k, but we haven't been granted the
> capability to do buffered I/O.  In that situation, we'll need to read
> what's there now (if anything) and then dispatch a synchronous write
> op that is gated on that data not having changed.
> 
> There's some benefit to dealing with as small a chunk of data as we
> can, but 4k is probably a reasonable chunk to work with in most cases
> if that's not possible.

Applications can do reads/writes of any length regardless of what they
see in stat::st_blksize.  So you're going to have to support
reads/writes with a length smaller than the data unit size (the
granularity of encryption) anyway.

You can choose whatever data unit size you want; it's a trade-off
between the fixed overhead of each encryption/decryption operation and
the granularity of I/O that you want to support.  I'd assume that 4096
bytes would be a good compromise for ceph, as it is for the other
filesystems.  It also matches PAGE_SIZE on most platforms.  But it's
possible that something else would be better.

- Eric