On Wed, Apr 24, 2024 at 10:08:58PM +0000, Eric Biggers wrote: > On Wed, Apr 24, 2024 at 02:25:23PM -0700, Darrick J. Wong wrote: > > > For checking whether the bitmap is in use, it's simpler and more efficient to > > > just directly check whether vi->hash_block_verified is NULL, as the existing > > > code does. Only the code that allocates the bitmap actually needs to be aware > > > of the details of when the bitmap gets enabled. > > > > > > fsverity_caches_blocks() has a similar issue, where it could just be replaced > > > with checking vops->read_merkle_tree_block != NULL directly (or equivalently > > > vops->drop_merkle_tree_block, which works well in > > > fsverity_drop_merkle_tree_block() since that's the function pointer it's > > > calling). The WARN_ON_ONCE() should be done in fsverity_create_info(), not > > > inlined multiple times into the verification code. > > > > Ok, I'll move the WARN_ON_ONCE there: > > > > /* > > * If the filesystem implementation supplies Merkle tree content on a > > * per-block basis, it must implement both the read and drop functions. > > * If it supplies content on a per-page basis, neither should be > > * provided. > > */ > > if (vops->read_merkle_tree_block) > > WARN_ON_ONCE(vops->drop_merkle_tree_block == NULL); > > else > > WARN_ON_ONCE(vops->drop_merkle_tree_block != NULL); > > Checking ->read_merkle_tree_page would make sense too, right? I.e. we should > enforce that one of the following two cases holds: > > 1. read_merkle_tree_page != NULL && > read_merkle_tree_block == NULL && > drop_merkle_tree_block == NULL > > 2. read_merkle_tree_page == NULL && > read_merkle_tree_block != NULL && > drop_merkle_tree_block != NULL <nod> if (vops->read_merkle_tree_page) WARN_ON_ONCE(vops->read_merkle_tree_block != NULL || vops->drop_merkle_tree_block != NULL); else WARN_ON_ONCE(vops->read_merkle_tree_block == NULL || vops->drop_merkle_tree_block == NULL); > > > > + bytes_to_copy = min_t(u64, end_pos - pos, > > > > + params->block_size - offs_in_block); > > > > + > > > > + err = fsverity_read_merkle_tree_block(inode, &vi->tree_params, > > > > + pos - offs_in_block, ra_bytes, &block); > > > > + if (err) { > > > > fsverity_err(inode, > > > > - "Error %d reading Merkle tree page %lu", > > > > - err, index); > > > > + "Error %d reading Merkle tree block %llu", > > > > + err, pos); > > > > break; > > > > > > The error message should go into fsverity_read_merkle_tree_block() so that it > > > does not need to be duplicated in its two callers. This would, additionally, > > > eliminate the need to introduce the 'err' variable in verify_data_block(). > > > > Do you want that to be a separate cleanup patch? > > I think it makes sense to do it in this patch, since this patch introduces > fsverity_read_merkle_tree_block(). This patch also adds the 'err' variable back > to verify_data_block(), which may mislead people into thinking it's the actual > error code that users see (this has happened before) --- that could be avoided. Let's rename it "bad" in verify.c then. I'm assuming that you /do/ want the error code in fsverity_read_merkle_tree? > > > > +int fsverity_read_merkle_tree_block(struct inode *inode, > > > > + const struct merkle_tree_params *params, > > > > + u64 pos, unsigned long ra_bytes, > > > > + struct fsverity_blockbuf *block) > > > > +{ > > > > + const struct fsverity_operations *vops = inode->i_sb->s_vop; > > > > + unsigned long page_idx; > > > > + struct page *page; > > > > + unsigned long index; > > > > + unsigned int offset_in_page; > > > > + > > > > + if (fsverity_caches_blocks(inode)) { > > > > + block->verified = false; > > > > + return vops->read_merkle_tree_block(inode, pos, ra_bytes, > > > > + params->log_blocksize, block); > > > > + } > > > > + > > > > + index = pos >> params->log_blocksize; > > > > > > Should the fourth parameter to ->read_merkle_tree_block be the block index > > > (which is computed above) instead of log_blocksize? XFS only uses > > > params->log_blocksize to compute the block index anyway. > > > > I don't know. XFS is infamous for everyone having a different opinion > > and developers being unable to drive a consensus and being able to get a > > patchset to completion. So: > > > > Andrey wrote an implementation that used the buffer cache and used block > > indexes to load/store from the xattr structure. I didn't like that > > because layering violation. > > > > willy suggested hanging an xarray off struct xfs_inode and using that to > > cache merkle tree blocks. For that design, we want the xarray indexes > > in units of blocks to conserve space, which means passing in (pos, > > log_blocksize) or a direct block index. > > > > Christoph then suggested using a per-AG rhashtable to reduce per-inode > > memory overhead, in which case the lookup key can be either (inumber, > > pos) or (inumber, block index). This is a better design if there aren't > > really going to be that many xfs inodes with verity enabled, though we > > lose per-inode sharding of the merkle blocks. > > > > Dave thinks that verity inodes could constitute the majority of xfs > > inodes some day and it should be in the inode instead. > > > > Andrey and I do not have crystal balls, we have no idea how this > > dchinner/hch disagreement will play out. Earlier I got the sense that > > you wanted to move towards expressing all the merkle tree info in units > > of bytes. > > > > In a lot of ways it would be preferable to move to block indexes > > instead, since there's never going to be a meaningful access to merkle > > position 1337. But you're the maintainer, so it's up to you. :) > > It makes sense to cache the blocks by index regardless of whether you get passed > the index directly. So this is just a question of what the > ->read_merkle_tree_block function prototype should be. > > Currently the other fsverity_operations functions are just aligned with what the > filesystems actually need: ->read_merkle_tree_page takes a page index, whereas > ->write_merkle_tree_block takes a pos and size. xfs_fsverity_read_merkle() > needs pos, index, and size, and fsverity_read_merkle_tree_block() computes the > index already. So that's why I'd probably go with pos, index, and size instead > of pos, log2_blocksize, and size. There's no right answer though. How about just index and size? I think XFS will work just fine without the byte position since it's using that as a key for the cache and the ondisk xattrs anyway. Even if we can't decide on which data structure to use for caching. ;) --D > > - Eric >