On Wed, Aug 26, 2020 at 02:56:45PM -0400, Chuck Lever wrote:
>
>
> > On Aug 26, 2020, at 2:31 PM, Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
> >
> > On Wed, Aug 26, 2020 at 10:13:43AM -0700, Chuck Lever wrote:
> >> Hi Eric-
> >>
> >> I'm trying to construct a viable IMA metadata format (ie, what
> >> goes into security.ima) to support Merkle trees.
> >>
> >> Rather than storing an entire Merkle tree per file, Mimi would
> >> like to have a metadata format that can store the root hash of
> >> a Merkle tree. Instead of reading the whole tree, an NFS client
> >> (for example) would generate the parts of the file's fs-verity
> >> Merkle tree on-demand. The tree itself would not be exposed or
> >> transported by the NFS protocol.
> >
> > This won't work because you'd need to reconstruct the whole Merkle tree when
> > reading the first byte from the file. Check the fs-verity FAQ
> > (https://www.kernel.org/doc/html/latest/filesystems/fsverity.html#faq) where I
> > explained this in more detail (fourth question).
>
> We agree there are inefficiencies with the proposed scheme. The
> Merkle tree would be rehydrated at measurement time, and used at
> read time to verify the results of each subsequent NFS READ.
>
> We assume that parts of the tree and parts of the file content
> can be evicted from the client's memory at any time. So verifying
> READ results may require rehydration of some or all of the Merkle
> tree. If we're careful, eviction might avoid the higher levels of
> the tree to prevent the need to read the whole file again.
>
> So, maybe we want to store the first level or two of the tree as
> well? Obviously there is a limit to how much can be stored in an
> extended attribute.

That's going to be very inefficient, and difficult to handle the caching,
preferential eviction, and constant tree rebuilding. IMO, the only model that
really makes sense is one where the full tree is stored persistently. Have you
considered options for how that could be done in NFS? What NFS protocol
modifications (if any) are in scope?

> >> Following up with the recent thread on linux-integrity, starting
> >> here:
> >>
> >> https://lore.kernel.org/linux-integrity/1597079586.3966.34.camel@xxxxxxxxxxxxxxxxxxxxx/t/#u
> >>
> >> I think the following will be needed.
> >>
> >> 1. The parameters for (re)constructing the Merkle tree:
> >>    - The name of the digest algorithm
> >>    - The unit size represented by each leaf in the tree
> >>    - The depth of the finished tree
> >>    - The size of the file
> >>    - Perhaps a salt value
> >>    - Perhaps the file's mtime at the time the hash was computed
> >>    - The root hash
> >
> > Well, the xattr would need to contain the same information as
> > struct fsverity_enable_arg, the argument to FS_IOC_ENABLE_VERITY.
> >
> >> 2. A fingerprint of the signer:
> >>    - The name of the digest algorithm
> >>    - The digest of the signer's certificate
> >>
> >> 3. The signature
> >>    - The name of the signature algorithm
> >>    - The signature, computed over 1.
> >
> > I thought there was a desire to just use the existing "integrity.ima"
> > signature format.
>
> I am very interested in using EVM_IMA_DIGSIG. However, there appears
> to be a consensus that for cases like NFS, every readpage result needs
> to be verified, just as fs-verity does it.
>
> I suppose measurement for an NFS file could involve verifying a
> saved linear hash while at the same time constructing a Merkle tree
> on the client?

fs-verity is mostly just a way of hashing a file. Can't IMA just continue to do
its signatures in the same way, and just swap out the traditional full file hash
with the fs-verity file hash (when it's enabled)?

fs-verity does support its own signature mechanism, because people wanted a
simple knob to set that makes the kernel verify and enforce signatures for all
fs-verity files. But it's not mandatory to use that.

- Eric
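
To make the trade-off above concrete, here is a minimal Python sketch of a
block-based Merkle tree. It is an illustration only: the helper names
(hash_block, build_levels, verify_block) are hypothetical, the layout is
simplified (SHA-256, no salt, short blocks zero-padded, an all-zero root for an
empty file), and it is not claimed to produce the same digests as fs-verity.
It shows that building the tree from a root hash alone means hashing every data
block, while checking one block against a stored tree touches only one hash
block per level.

```python
import hashlib

BLOCK_SIZE = 4096                          # assumed size of data and tree blocks
DIGEST_SIZE = hashlib.sha256().digest_size


def hash_block(block: bytes, block_size: int) -> bytes:
    """SHA-256 of one block, zero-padded to block_size."""
    return hashlib.sha256(block.ljust(block_size, b"\0")).digest()


def build_levels(data: bytes, block_size: int = BLOCK_SIZE) -> list[list[bytes]]:
    """Build the whole tree. levels[0] holds one digest per data block and
    levels[-1] holds the single root digest. Every data block is hashed."""
    if not data:
        # Sketch convention for an empty file: an all-zero root.
        return [[b"\0" * DIGEST_SIZE]]
    arity = block_size // DIGEST_SIZE      # digests that fit in one tree block
    level = [hash_block(data[i:i + block_size], block_size)
             for i in range(0, len(data), block_size)]
    levels = [level]
    while len(level) > 1:
        # Pack digests into tree blocks and hash each block to form the parent level.
        level = [hash_block(b"".join(level[i:i + arity]), block_size)
                 for i in range(0, len(level), arity)]
        levels.append(level)
    return levels


def verify_block(index: int, block: bytes, levels: list[list[bytes]],
                 root: bytes, block_size: int = BLOCK_SIZE) -> bool:
    """Check one data block against a stored tree, reading only the tree
    block that covers it at each level."""
    arity = block_size // DIGEST_SIZE
    digest = hash_block(block, block_size)
    for level in levels[:-1]:
        start = (index // arity) * arity
        siblings = level[start:start + arity]   # one stored tree block
        if siblings[index - start] != digest:
            return False
        digest = hash_block(b"".join(siblings), block_size)
        index //= arity
    return digest == root


if __name__ == "__main__":
    data = bytes(range(256)) * 64               # 16 KiB, i.e. four 4 KiB blocks
    levels = build_levels(data)
    root = levels[-1][0]
    print("data blocks hashed to build the tree:", len(levels[0]))
    print("tree blocks read to verify one block:", len(levels) - 1)
    print("block 2 verifies:", verify_block(2, data[8192:12288], levels, root))
```

With only the root hash stored, a client has nothing to pass as `levels` until
it has run something like build_levels over the entire file, which is the cost
being discussed in this thread.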