On Thu, 2018-01-25 at 21:30 -0500, Theodore Ts'o wrote:
> On Thu, Jan 25, 2018 at 04:47:46PM -0800, James Bottomley wrote:
> >
> > How do you know the file is in this special format?  Would it be a
> > per filesystem flag (so every file) or selectable per-file by some
> > other mechanism?  If it's per-file, why not simply use the existing
> > xattr mechanism?
>
> It would be using a per-file flag, just like we do with fscrypt.
> Given that we only need a single bit of information, using an xattr
> would be inefficient.
>
> > > *) The pages are verified as they are read, so pages are verified
> > > as they are read from the storage device; this avoids a large
> > > latency hit when the file is first opened or referenced.
> >
> > The cost of this is presumably one hash per page in the tree, so it
> > costs quite a bit in terms of space.  Presumably the hash tree is
> > also dynamically resident meaning a page fault could now also
> > potentially fault in the hash tree, leading to a lot of suboptimal
> > I/O patterns?
>
> This is how dm-verity works, which is used on every single modern
> Chrome OS and Android phone, with no complaints.  (It doesn't work
> that way on your phone, unless you've upgraded.  :-)

Well ... I'll upgrade when the phones get better ...

> > > *) The design and code are done by file system developers, so it
> > > doesn't have the locking problems of the IMA code.
> >
> > That's a bit unfair.  My next question was going to be why not just
> > make this an actual IMA mode (meaning you could choose to have a
> > global hash or a tree hash).  Does this mean that a priori you've
> > already ruled out IMA integration because you don't want to work
> > with the developers?
>
> IMA has a lot of complexity, which I would rather not drag in as a
> dependency.  Also, having seen some of the, ah, "discussions" that
> Christoph and Dave Chinner have been having with the IMA folks, I'd
> rather not taint this proposal with IMA's reputation.
> :-)
>
> I am completely open to an optional integration with IMA, but I would
> prefer not to require CONFIG_IMA to be enabled in order to use
> fs-verity.

Thanks ... just checking.  There's a lot of integration work going on
in IMA and containers at the moment, so I'd like to preserve the
investment.

> > PKCS11 is the standard for cryptokeys.  I presume you just mean a
> > message signing standard like PKCS7 or RFC 2315?
>
> Sorry, I meant PKCS7.  It would be a restricted PKCS7 mode, using a
> detached signature.  My plan was to reuse the existing code we
> already have written for signed kernel modules.

OK, so presumably the signature would be over the part of the tree at
the end of the file (so the tree is already reduced to a hashable
binary representation) and this is verified upon write, after which
the hash tree is trusted.

> > > Most of this feature could also be used with a non-cryptographic
> > > checksum to provide data checksums for read-only files in a
> > > general way for all file systems.  It wouldn't be as flexible as
> > > btrfs, but for files being stored for backup purposes, it should
> > > work quite well.
> >
> > I assume the "write" part of this is that the file must be deleted
> > and re-created?
>
> I'm not sure what you mean.

Currently container images are simple tar files and one of the main
value adds of docker as a tool is the simplicity of the image creation
process.  That process depends on standard tools like tar to create
the image, so I was trying to fit this proposal into that process.

> If you have an existing file that you
> want to protect using fs-verity, it's a matter of appending the
> fs-verity information onto the end of the file, and then setting the
> fs-verity flag, at which point all of the fs-verity information
> disappears from the perspective of stat(2) and read(2) system calls.
> The verity bit can be examined using FS_IOC_GETFLAGS, and more
> details about which key was used to sign the file could be examined
> via some ioctl interface.

OK, so that worries me a bit more.  I assume we could use this ioctl
to recreate the file by extracting the tree and thence do a conversion
to tar format so that the untarred file has the signed hash, but just
doing a tar of the directory won't work, so docker save is going to
have to be seriously altered to work with this.

> In general, though, it's expected that userspace won't care about
> such details.  Any reads that don't verify will return an error, and
> if the key used to sign fs-verity information is not trusted by the
> kernel, the open will return an error.  So all userspace or a
> security policy would need to take care of is that the file does
> have the verity bit set.

Right, I get that once the file is created.  What I'm concerned about
is the process for creating and handling images of the filesystem
tree.

James
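
For reference, the "one hash per page" tree and the "reads that don't
verify will return an error" behaviour discussed above can be sketched
roughly as follows.  This is only an illustration under assumptions
that are not in the thread: SHA-256 as the hash, 4096-byte pages, and
a binary tree for brevity (dm-verity uses a wider fan-out).  The names
build_tree, auth_path and verify_page are hypothetical, and none of
this reflects the actual fs-verity on-disk layout.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size, not taken from the thread


def _h(data: bytes) -> bytes:
    """Hash helper; SHA-256 is an assumption for this sketch."""
    return hashlib.sha256(data).digest()


def _padded(level):
    """Duplicate the last node so every node has a sibling."""
    return level + [level[-1]] if len(level) % 2 else level


def build_tree(data: bytes, page_size: int = PAGE_SIZE):
    """Return the tree as a list of levels: leaves first, [root] last."""
    pages = [data[i:i + page_size]
             for i in range(0, len(data), page_size)] or [b""]
    level = [_h(p) for p in pages]  # one hash per page
    levels = [level]
    while len(level) > 1:
        nodes = _padded(level)
        level = [_h(nodes[i] + nodes[i + 1])
                 for i in range(0, len(nodes), 2)]
        levels.append(level)
    return levels


def auth_path(levels, index: int):
    """Sibling hashes needed to check page `index` against the root."""
    path = []
    for level in levels[:-1]:
        path.append(_padded(level)[index ^ 1])
        index //= 2
    return path


def verify_page(page: bytes, index: int, path, root: bytes) -> bool:
    """Recompute the root from a single page and its sibling hashes."""
    node = _h(page)
    for sibling in path:
        if index % 2 == 0:
            node = _h(node + sibling)
        else:
            node = _h(sibling + node)
        index //= 2
    return node == root


if __name__ == "__main__":
    data = bytes(range(256)) * 80  # 20480 bytes, i.e. five pages
    levels = build_tree(data)
    root = levels[-1][0]
    page = data[2 * PAGE_SIZE:3 * PAGE_SIZE]
    print(verify_page(page, 2, auth_path(levels, 2), root))
    print(verify_page(b"x" + page[1:], 2, auth_path(levels, 2), root))
```

Under these assumptions the leaf level costs 32 bytes per 4096-byte
page, a little under 1% of the file size, and checking one page needs
only the sibling hashes on its path to the root rather than the whole
file.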