Re: [PATCH] generic/692: Generalize the test for non-4k merkle tree block sizes

On Wed, Jan 11, 2023 at 11:23:14PM +0530, Ojaswin Mujoo wrote:
> Due to the assumption of the Merkle tree block size being 4k, the size
> calculated for the second test was taking way too long to hit EFBIG in case of
> bigger block sizes like 64k. Fix this by generalizing the calculation.
> 
> Signed-off-by: Ojaswin Mujoo <ojaswin@xxxxxxxxxxxxx>
> ---
>  tests/generic/692 | 21 +++++++++++++++------
>  1 file changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/tests/generic/692 b/tests/generic/692
> index d6da734b..0a354802 100755
> --- a/tests/generic/692
> +++ b/tests/generic/692
> @@ -54,15 +54,24 @@ _fsv_enable $fsv_file |& _filter_scratch
>  # (MAX) to be in the middle of L0 -- ideally near the beginning of L0 so that we
>  # don't have to write many blocks before getting an error.
>  #
> -# With SHA-256 and 4K blocks, there are 128 hashes per block.  Thus, ignoring
> -# padding, L0 is 1/128 of the file size while the other levels in total are
> -# 1/128**2 + 1/128**3 + 1/128**4 + ... = 1/16256 of the file size.  So still
> +# For example, with SHA-256 and 4K blocks, there are 128 hashes per block. Thus,
> +# ignoring padding, L0 is 1/128 of the file size while the other levels in total
> +# are 1/128**2 + 1/128**3 + 1/128**4 + ... = 1/16256 of the file size. So still
>  # ignoring padding, for L0 start exactly at MAX, the file size must be s such
> -# that s + s/16256 = MAX, i.e. s = MAX * (16256/16257).  Then to get a file size
> +# that s + s/16256 = MAX, i.e. s = MAX * (16256/16257). Then to get a file size
>  # where MAX occurs *near* the start of L0 rather than *at* the start, we can
>  # just subtract an overestimate of the padding: 64K after the file contents,
> -# then 4K per level, where the consideration of 8 levels is sufficient.
> -sz=$(echo "scale=20; $max_sz * (16256/16257) - 65536 - 4096*8" | $BC -q | cut -d. -f1)
> +# then 4K per level, where the consideration of 8 levels is sufficient. The
> +# code below generalizes this logic for all Merkle tree block sizes.
> +bs=$FSV_BLOCK_SIZE
> +hash_size=32   # SHA-256
> +hash_per_block=$(echo "scale=20; $bs/($hash_size)" | $BC -q)
> +a=$(echo "scale=20; 1/($hash_per_block^2)" | $BC -q)   # first term of the series
> +r=$(echo "scale=20; 1/$hash_per_block" | $BC -q)        # common ratio
> +treesize_without_l1=$(echo "scale=20; $a/(1-$r)" | $BC -q)
> +sz=$(echo "scale=20; $max_sz/(1+$treesize_without_l1)" | $BC -q)
> +# adjust $sz so we are more likely to hit EFBIG while building level 1
> +sz=$(echo "scale=20; $sz - 65536 - $bs*8" | $BC -q | cut -d. -f1)
>  _fsv_scratch_begin_subtest "still too big: fail on first invalid merkle block"
>  truncate -s $sz $fsv_file
>  _fsv_enable $fsv_file |& _filter_scratch
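
As a quick sanity check of the generalized series sum (just a sketch on my
end, assuming SHA-256's 32-byte digests as in the patch): with H hashes per
Merkle tree block, the levels other than L0 add up to
1/H^2 + 1/H^3 + ... = 1/(H*(H-1)) of the file size, e.g.:

	for bs in 4096 65536; do
		h=$((bs / 32))		# hashes per Merkle tree block
		echo "$bs-byte blocks: $h hashes/block," \
		     "other levels = 1/$((h * (h - 1))) of file size"
	done

which reproduces the 1/16256 constant from the existing 4K comment and gives
1/(2048*2047) = 1/4192256 for 64K blocks.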

Thanks!  I'd like to improve the explanation of the calculation, and fix up a
few other things, so I ended up just sending out an updated version of this
patch --- I hope that's okay with you.  Can you take a look?
https://lore.kernel.org/r/20230111204739.77828-1-ebiggers@xxxxxxxxxx

- Eric


