Re: [PATCH 03/18] fsverity: convert verification to use byte instead of page offsets

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, May 01, 2024 at 12:33:14AM -0700, Christoph Hellwig wrote:
> > +	const u64 end_pos = min(pos + length, vi->tree_params.tree_size);
> > +	struct backing_dev_info *bdi = inode->i_sb->s_bdi;
> > +	const u64 max_ra_bytes = min((u64)bdi->io_pages << PAGE_SHIFT,
> > +				     ULONG_MAX);
> > +	const struct merkle_tree_params *params = &vi->tree_params;
> 
> bdi->io_pages is really a VM readahead concept.  I know this is existing
> code, but can we rething why this is even used here?

I would get rid of it entirely for the merkle-by-block case, since we'd
have to walk the xattr tree again just to find the next block.  XFS
ignores the readahead value entirely.

I think this only makes sense for the merkle-by-page case, and only
because ext4 and friends are stuffing the merkle data in the posteof
parts of the file mapping.

And even then, shouldn't we figure out the amount of readahead going on
and only ask for enough readahead of the merkle tree to satisfy that
readahead?

> > +	unsigned int offs_in_block = pos & (params->block_size - 1);
> >  	int retval = 0;
> >  	int err = 0;
> >  
> > +	 * Iterate through each Merkle tree block in the requested range and
> > +	 * copy the requested portion to userspace. Note that we are returning
> > +	 * a byte stream.
> >  	 */
> > +	while (pos < end_pos) {
> > +		unsigned long ra_bytes;
> > +		unsigned int bytes_to_copy;
> > +		struct fsverity_blockbuf block = { };
> >  
> > +		ra_bytes = min_t(unsigned long, end_pos - pos, max_ra_bytes);
> > +		bytes_to_copy = min_t(u64, end_pos - pos,
> > +				      params->block_size - offs_in_block);
> > +
> > +		err = fsverity_read_merkle_tree_block(inode, &vi->tree_params,
> > +						      pos - offs_in_block,
> > +						      ra_bytes, &block);
> 
> Maybe it's just me, but isn't passing a byte offset to a read...block
> routine a bit weird and this should operate on the block number instead?

I would think so, but here's the thing -- the write_merkle_tree_block
functions get passed pos and length in units of bytes.  Maybe fsverity
should clean be passing (blockno, blocksize) to the read and write
functions?  Eric said he could be persuaded to change it:

https://lore.kernel.org/linux-xfs/20240307224903.GE1799@sol.localdomain/

> > +		if (copy_to_user(buf, block.kaddr + offs_in_block, bytes_to_copy)) {
> 
> And the returned/passed value should be a kernel pointer to the start
> of the in-memory copy of the block?
> to 

<shrug> This particular callsite is reading merkle data on behalf of an
ioctl that exports data.  Maybe we want the filesystem's errors to be
bounced up to userspace?

> > +static bool is_hash_block_verified(struct inode *inode,
> > +				   struct fsverity_blockbuf *block,
> >  				   unsigned long hblock_idx)
> 
> Other fsverify code seems to use the (IMHO) much more readable
> two-tab indentation for prototype continuations, maybe stick to that?

I'll do that, if Eric says so. :)

> >
> >  {
> > +	struct fsverity_info *vi = inode->i_verity_info;
> > +	struct page *hpage = (struct page *)block->context;
> 
> block->context is a void pointer, no need for casting it.

Eric insisted on it:
https://lore.kernel.org/linux-xfs/20240306035622.GA68962@sol.localdomain/

> > +	for (; level > 0; level--)
> > +		fsverity_drop_merkle_tree_block(inode, &hblocks[level - 1].block);
> 
> Overlh long line here.  But the loop kinda looks odd anyway with the
> exta one off in the body instead of the loop.

I /think/ that's a side effect of reusing the value of @level after the
first loop fails as the initial conditions of the unwind loop.  AFAICT
it doesn't leak, but it's not entirely straightforward.

--D




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux