> On Sep 16, 2019, at 12:10 PM, Theodore Y. Ts'o <tytso@xxxxxxx> wrote:
>
> On Sun, Sep 15, 2019 at 05:42:10PM -0400, Mimi Zohar wrote:
>>>> My thought was to use an ephemeral Merkle tree for NFS (and
>>>> possibly other remote filesystems, like FUSE, until these
>>>> filesystems support durable per-file Merkle trees). A tree would
>>>> be constructed when the client measures a file, but it would not
>>>> be saved to the filesystem. Instead of a hash of the file's contents,
>>>> the tree's root signature is stored as the IMA metadata.
>>>>
>>>> Once a Merkle tree is available, it can be used in exactly the
>>>> same way that a durable Merkle tree would, to verify the integrity
>>>> of individual pages as they are used, evicted, and then read back
>>>> from the server.
>>>>
>>>> If the client needs to evict part or all of an ephemeral tree, it
>>>> can subsequently be reconstructed by measuring the file again and
>>>> verifying its root signature against the stored IMA metadata.
>
> Where would the client store the ephemeral tree? If you're thinking
> about storing in memory, calculating the ephemeral tree would require
> dragging the entire file across the network, which is going to be just
> as bad as using IMA --- plus the CPU cost of calculating the Merkle
> tree, and the memory cost of storing the ephemeral Merkle tree.

A client would store ephemeral Merkle trees in memory. The most
interesting use case to me is protecting executables and DLLs. These
will tend to be limited in size, so the cost of Merkle tree
construction should be nicely bounded in the typical case.

An additional cost would arise if the in-memory tree were to be
evicted. We hope that is an infrequent event. If the tree is only
partially evicted, only part of the file needs to be read back to
reconstruct it: the hashes still held in memory in the interior nodes
of the tree let the client verify the portion that is being rebuilt.

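To make the partial-eviction case concrete, here is a rough Python
sketch. It is not kernel code and does not reflect any IMA or fs-verity
interface. The function names, the 4096-byte page size, the use of
SHA-256, and the way an unpaired node is hashed on its own are all
assumptions made for the example.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size; hypothetical, not taken from the thread


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _parents(level):
    """Hash adjacent pairs; a trailing unpaired node is hashed on its own."""
    return [_h(b"".join(level[i:i + 2])) for i in range(0, len(level), 2)]


def build_tree(data: bytes, page_size: int = PAGE_SIZE):
    """Measure a whole file. levels[0] holds page hashes, levels[-1] is [root]."""
    pages = [data[i:i + page_size] for i in range(0, len(data), page_size)] or [b""]
    levels = [[_h(p) for p in pages]]
    while len(levels[-1]) > 1:
        levels.append(_parents(levels[-1]))
    return levels


def verify_page(levels, index: int, page: bytes) -> bool:
    """Check a page read back from the server against its in-memory leaf hash."""
    return levels[0][index] == _h(page)


def rebuild_subtree(levels, height: int, node: int, read_page):
    """Refill evicted hashes below levels[height][node], which is still in memory.

    Only the pages under that node are re-read. The recomputed subtree root
    must match the retained interior hash before anything is stored.
    """
    first = node << height
    last = min((node + 1) << height, len(levels[0]))
    rebuilt = [[_h(read_page(i)) for i in range(first, last)]]
    for _ in range(height):
        rebuilt.append(_parents(rebuilt[-1]))
    if rebuilt[-1] != [levels[height][node]]:
        raise ValueError("re-read pages do not match retained interior hash")
    for k in range(height):
        start = node << (height - k)
        levels[k][start:start + len(rebuilt[k])] = rebuilt[k]
```

In this sketch, build_tree() stands in for the initial measurement, and
its root is what would be compared against the stored IMA metadata.
rebuild_subtree() takes a callback that fetches one page by index, so
a caller that evicted the hashes under one interior node re-reads only
the pages that node covers.
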
The short-term purpose of these trees is to provide better integrity
protection for file systems that find it difficult to store per-file
Merkle trees durably. We expect that situation will be temporary for
many file systems, though not all.

The price that is paid for this extra protection is that it will
perform like traditional IMA, as you observed above. This is probably
a different cost than reading from flash on a mobile device: a typical
NFS client will be less memory- and CPU-constrained than a mobile
device, and the cost of reading over NFS on a fast network from the
server's cache is not high. The trade-offs here are going to be
different.

> I suspect that for most clients, it wouldn't be worth it unless the
> client can store the ephemeral tree *somewhere* on the client's local
> persistent storage, or maybe if it could store the Merkle tree on the
> NFS server (maybe via an xattr which contains the pathname to the
> Merkle tree relative to the NFS mount point?).

The trees could be cached locally for exceptionally large files (e.g.,
files larger than the client's physical memory). For smaller files,
which I expect will be the typical case, the cost of reading a file
will be about the same as reading a Merkle tree.

As mentioned in my proposal, the eventual goal is to extend the NFS
protocol to store the Merkle tree durably on the server. Changing the
protocol is a slow process, particularly because it involves consensus
among NFS implementers who work on operating systems other than Linux.

>>>> So the only difference here is that the latency-to-first-byte
>>>> benefit of a durable Merkle tree would be absent.
>
> What problem are you most interested in solving? And what cost do you
> think the user will be willing to pay in order to solve that problem?

NFS users would get full protection of their files from storage to
point-of-use, at the same cost as IMA, until some point in the future
when NFS can store the trees durably. The same would apply to other
filesystems that find storing a full Merkle tree to be a challenge.

>> I like the idea, but there are a couple of things that need to happen
>> first. Both fs-verity and IMA appended signatures need to be
>> upstreamed.
>
> Eric has sent the pull request for fs-verity today.
>
>> The IMA appended signature support simplifies
>> ima_appraise_measurement(), paving the way for adding IMA support for
>> other types of signature verification. How IMA will support fs-verity
>> signatures still needs to be defined. That discussion will hopefully
>> include NFS support.
>
> As far as using the Merkle tree root hash for the IMA measurement,
> what sort of policy should be used for determining when the Merkle
> tree root hash should be used in preference to reading and checksumming
> the whole file when it is first opened? It could be as simple as, "if
> this is a fs-verity, use the fs-verity Merkle root". Is that OK?
>
> - Ted

--
Chuck Lever