On Mon, Apr 20, 2015 at 06:51:03PM +0200, Richard Weinberger wrote:
> My thought was that compression is not far away from crypto an hence
> a lot of ecryptfs could be reused.

The problem with using eCryptfs as a base is that it assumes the
encryption is constant-sized --- i.e., that a 4096-byte plaintext
block encrypts to a 4096-byte ciphertext block.  This is *not* true
for compression.

The other problem with eCryptfs is that since the underlying file
system doesn't know it's being stacked, you end up burning memory for
both the plaintext and ciphertext versions of the file.  This is one
of the reasons why eCryptfs wasn't considered for future versions of
Android; instead, we've added encryption into the ext4 file system
layer, with most of the interesting bits in separate files, and I've
been communicating with the f2fs maintainer so that f2fs can add the
same encryption feature as well.

For compression, what I'd recommend is something similar: do it at
the file system level, but structure it such that it's relatively
easy for other file systems to reuse "library code" for the core data
transforms.  However, allow the underlying file system to use its own
specialized storage for things like flags, xattrs, etc., since that
can be made more efficient.

What I'd also suggest is that you support read-only compression
(which is what MacOS did as well).  Use a chunk size of, say, 32k or
64k, and at the very end of the file, store a pointer to the
compressed chunk directory.  The directory is simply a header which
describes the chunk size (and other useful bits, such as the
compression algorithm, and *possibly* space for a preset compression
dictionary that would be shared across all of the chunks, if that
makes sense), followed by a list of offsets into the file which gives
the starting offset for chunk #0, chunk #1, chunk #2, etc.

This file would be created with some help from a userspace
application; said userspace application would do the compression and
write out the compressed file, and then call an ioctl which sets an
attribute which (a) flushes the compressed version of the file from
the page cache, and (b) marks the inode as read-only and containing
compressed data.

When the kernel reads from the file, it reads the compression header
and directory, and then pages data into the page cache a chunk at a
time --- that is, if userspace requests a single 4k page, the kernel
will read in whatever blocks are needed to decompress the 64k chunk
containing that page, and populate the page cache with that entire
64k chunk.

I've sketched this design out a few times, hoping to interest someone
in implementing it for ext4, but this is the sort of thing that could
be implemented as a library and then easily spliced into multiple
file systems.

Cheers,

					- Ted

P.S.  Note that although this design requires userspace support, it's
*perfect* for files which are installed via a package, whether that
be an RPM, dpkg, or apk.  You just need to create a userspace library
which takes the incoming file stream from the package file, writes
out the compressed version of the file, and marks the file as
containing compressed data.  It shouldn't be hard, once the userspace
library is created, to modify rpm, dpkg, etc., to take advantage of
this feature.  And these package files are *perfect* candidates for
compression; they tend to be written once and read many times, and in
general they are read-only.
(Yes, there are exceptions for config files, but rpm and dpkg already
have a way of specifying which files are config files, which is
important if you want to verify that the unpacked package is
consistent with what was installed originally.)
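
To make the on-disk layout concrete, here is a minimal sketch of what
the chunk directory described above might look like.  The struct
names, field widths, and magic value are purely illustrative
assumptions on my part, not an existing ext4 or f2fs format.

/*
 * Hypothetical on-disk layout for a read-only compressed file:
 *
 *   [chunk 0][chunk 1]...[chunk N-1][directory header][offsets][trailer]
 *
 * The trailer sits at the very end of the file and points back at the
 * directory header, so it can be found with a single read of the last
 * block.
 */
#include <stdint.h>

#define COMPR_MAGIC	0x436f4d70	/* made-up magic number */

enum compr_algorithm {
	COMPR_ALG_ZLIB = 1,
	COMPR_ALG_LZ4  = 2,
};

struct compr_directory_header {
	uint32_t magic;			/* COMPR_MAGIC */
	uint32_t algorithm;		/* enum compr_algorithm */
	uint32_t chunk_size;		/* plaintext bytes per chunk, e.g. 65536 */
	uint32_t chunk_count;		/* entries in chunk_offset[] below */
	uint64_t uncompressed_size;	/* original file length */
	uint64_t dict_offset;		/* optional preset dictionary, 0 if none */
	uint32_t dict_len;
	uint32_t reserved;
	/*
	 * Starting byte offset of each compressed chunk within the file.
	 * The compressed length of chunk i is chunk_offset[i+1] -
	 * chunk_offset[i]; the last chunk ends where the directory header
	 * starts (directory_offset in the trailer).
	 */
	uint64_t chunk_offset[];
};

struct compr_trailer {
	uint64_t directory_offset;	/* byte offset of the header above */
	uint32_t magic;			/* COMPR_MAGIC again, sanity check */
};

With a fixed chunk_size, finding the chunk that covers a given file
offset is a single division, which is what keeps the chunk-at-a-time
read path cheap.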
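
And a rough sketch of the userspace side, reusing the structures and
includes above: compress the payload chunk by chunk with zlib, append
the directory and trailer, then tell the kernel the file now contains
compressed data.  The FS_IOC_SET_COMPRESSED ioctl is hypothetical
(nothing like it exists today), and error handling is mostly omitted.

#include <stdlib.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <zlib.h>

#define CHUNK_SIZE		65536
#define FS_IOC_SET_COMPRESSED	_IO('f', 0x42)	/* hypothetical ioctl */

/* Compress 'len' bytes from 'src' into 'fd', then append the directory. */
static int write_compressed(int fd, const unsigned char *src, uint64_t len)
{
	uint32_t count = (len + CHUNK_SIZE - 1) / CHUNK_SIZE;
	uint64_t *offsets = calloc(count, sizeof(*offsets));
	unsigned char *buf = malloc(compressBound(CHUNK_SIZE));
	uint64_t pos = 0;

	for (uint32_t i = 0; i < count; i++) {
		uLongf clen = compressBound(CHUNK_SIZE);
		uLong plen = (i == count - 1) ? len - (uint64_t)i * CHUNK_SIZE
					      : CHUNK_SIZE;

		if (compress(buf, &clen, src + (uint64_t)i * CHUNK_SIZE,
			     plen) != Z_OK)
			return -1;
		offsets[i] = pos;
		if (write(fd, buf, clen) != (ssize_t)clen)
			return -1;
		pos += clen;
	}

	/* Directory header, offset table, and trailer from the layout above. */
	struct compr_directory_header hdr = {
		.magic = COMPR_MAGIC,
		.algorithm = COMPR_ALG_ZLIB,
		.chunk_size = CHUNK_SIZE,
		.chunk_count = count,
		.uncompressed_size = len,
	};
	struct compr_trailer tr = {
		.directory_offset = pos,
		.magic = COMPR_MAGIC,
	};

	write(fd, &hdr, sizeof(hdr));
	write(fd, offsets, count * sizeof(*offsets));
	write(fd, &tr, sizeof(tr));
	free(offsets);
	free(buf);

	/* Hypothetical: drop the cached compressed pages and flag the inode. */
	return ioctl(fd, FS_IOC_SET_COMPRESSED);
}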
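
The read side, shown here as a userspace illustration of what the
kernel would do per chunk: to satisfy a 4k read at byte offset 'pos',
find the enclosing chunk via the offset table, read its compressed
bytes, and decompress the whole chunk (which the kernel would then
use to populate the page cache).  Again just a sketch on top of the
structures above, not kernel code.

/*
 * Decompress the chunk covering byte offset 'pos' into 'out'
 * (hdr->chunk_size bytes).  'dir_off' is the trailer's directory_offset.
 */
static int read_one_chunk(int fd, const struct compr_directory_header *hdr,
			  const uint64_t *chunk_offset, uint64_t dir_off,
			  uint64_t pos, unsigned char *out)
{
	uint32_t i = pos / hdr->chunk_size;
	uint64_t start = chunk_offset[i];
	uint64_t end = (i + 1 < hdr->chunk_count) ? chunk_offset[i + 1]
						  : dir_off;
	uLongf outlen = hdr->chunk_size;
	unsigned char *cbuf = malloc(end - start);
	int ret = -1;

	if (pread(fd, cbuf, end - start, start) == (ssize_t)(end - start) &&
	    uncompress(out, &outlen, cbuf, end - start) == Z_OK)
		ret = 0;
	free(cbuf);
	return ret;
}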