Re: [PATCH 10/13] fsverity: pass the zero-hash value to the implementation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Apr 24, 2024 at 07:19:50PM +0000, Eric Biggers wrote:
> On Wed, Apr 24, 2024 at 12:02:46PM -0700, Darrick J. Wong wrote:
> > On Thu, Apr 04, 2024 at 10:57:50PM -0400, Eric Biggers wrote:
> > > On Fri, Mar 29, 2024 at 05:35:17PM -0700, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > > > 
> > > > Compute the hash of a data block full of zeros, and then supply this to
> > > > the merkle tree read and write methods.  A subsequent xfs patch will use
> > > 
> > > This should say "hash of a block", not "hash of a data block".  What you
> > > actually care about is the hash of a Merkle tree block, not the hash of a data
> > > block.  Yet, there is no difference in how the hashes are calculated for the two
> > > types of blocks, so we should simply write "hash of a block".
> > 
> > I think I could go further with the precision of the description --
> > 
> > "Compute the hash of one filesystem block's worth of zeroes.  Any merkle
> > tree block containing only this hash can be elided at write time, and
> > its contents synthesized at read time."
> > 
> > I don't think this is going to happen very often above the leaf levels
> > of the merkle tree, but as written there's nothing to prevent the
> > elision of internal nodes.  Also note that the elision can happen for
> > internal nodes even when merkle tree blocksize != i_blocksize.
> > 
> > > > diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
> > > > index de8798f141d4a..195a92f203bba 100644
> > > > --- a/fs/verity/fsverity_private.h
> > > > +++ b/fs/verity/fsverity_private.h
> > > > @@ -47,6 +47,8 @@ struct merkle_tree_params {
> > > >  	u64 tree_size;			/* Merkle tree size in bytes */
> > > >  	unsigned long tree_pages;	/* Merkle tree size in pages */
> > > >  
> > > > +	u8 zero_digest[FS_VERITY_MAX_DIGEST_SIZE]; /* hash of zeroed data block */
> > > 
> > > Similarly, "block" instead of "data block".
> > 
> > How about "the hash of an i_blocksize-sized buffer of zeroes" for all
> > three?
> 
> It's the Merkle tree block size, not the filesystem block size.  Or did you
> actually intend for this to use the filesystem block size?

I actually did intend for this to be the fs block size, not the merkle
tree block size.  It's the bottom level that I care about shrinking.
Let's say that data[0-B] are the data blocks:

root
 +-internal0
 |   +-leaf0
 |   |   +-data0
 |   |   +-data1
 |   |   `-data2
 |   `-leaf1
 |       +-data3
 |       +-data4
 |       `-data5
 `-internal1
     +-leaf2
     |   +-data6
     |   +-data7
     |   `-data8
     `-leaf3
         +-data9
         +-dataA
         `-dataB

(thanks to https://arthursonzogni.com/Diagon/#Tree )

If data[3-5] are completely zeroes (unwritten blocks, sparse holes,
etc.) then I want to skip writing leaf1 of the merkle tree to disk.

If it happens that the hashes of leaf[0-1] match hash(data3) then it's
frosting on top (as it were) that we can also skip internal0.  However,
the merkle tree has a high fanout factor (4096/32==128 in the common
case), so I care /much/ less about eliding those levels.

> In struct merkle_tree_params, the "block size" is always the Merkle tree block
> size, so the type of block size seems clear in that context.  My complaint was
> just that it used the term "data block" to mean a block that is not necessarily
> a file contents block (which is what "data block" means elsewhere).

Hm.  Given the confusion, would it help if I said that zero_digest
should only be used to elide leaf nodes of the merkle tree that hash the
contents of file content blocks?  Or is "the hash of an
i_blocksize-sized buffer of zeroes" sufficient?

What do you think of the commit message saying:

"Compute the hash of one filesystem block's worth of zeroes.  Any merkle
tree leaf block containing only this hash can be elided at write time,
and its contents synthesized at read time.

"Let's pretend that there's a file containing six data blocks and whose
merkle tree looks roughly like this:

root
 +--leaf0
 |   +--data0
 |   +--data1
 |   `--data2
 `--leaf1
     +--data3
     +--data4
     `--data5

"If data[0-2] are sparse holes, then leaf0 will contain a repeating
sequence of @zero_digest.  Therefore, leaf0 need not be written to disk
because its contents can be synthesized."

--D

> 
> - Eric
> 




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux