On Wed, Apr 24, 2024 at 02:25:23PM -0700, Darrick J. Wong wrote: > > For checking whether the bitmap is in use, it's simpler and more efficient to > > just directly check whether vi->hash_block_verified is NULL, as the existing > > code does. Only the code that allocates the bitmap actually needs to be aware > > of the details of when the bitmap gets enabled. > > > > fsverity_caches_blocks() has a similar issue, where it could just be replaced > > with checking vops->read_merkle_tree_block != NULL directly (or equivalently > > vops->drop_merkle_tree_block, which works well in > > fsverity_drop_merkle_tree_block() since that's the function pointer it's > > calling). The WARN_ON_ONCE() should be done in fsverity_create_info(), not > > inlined multiple times into the verification code. > > Ok, I'll move the WARN_ON_ONCE there: > > /* > * If the filesystem implementation supplies Merkle tree content on a > * per-block basis, it must implement both the read and drop functions. > * If it supplies content on a per-page basis, neither should be > * provided. > */ > if (vops->read_merkle_tree_block) > WARN_ON_ONCE(vops->drop_merkle_tree_block == NULL); > else > WARN_ON_ONCE(vops->drop_merkle_tree_block != NULL); Checking ->read_merkle_tree_page would make sense too, right? I.e. we should enforce that one of the following two cases holds: 1. read_merkle_tree_page != NULL && read_merkle_tree_block == NULL && drop_merkle_tree_block == NULL 2. read_merkle_tree_page == NULL && read_merkle_tree_block != NULL && drop_merkle_tree_block != NULL > > > + bytes_to_copy = min_t(u64, end_pos - pos, > > > + params->block_size - offs_in_block); > > > + > > > + err = fsverity_read_merkle_tree_block(inode, &vi->tree_params, > > > + pos - offs_in_block, ra_bytes, &block); > > > + if (err) { > > > fsverity_err(inode, > > > - "Error %d reading Merkle tree page %lu", > > > - err, index); > > > + "Error %d reading Merkle tree block %llu", > > > + err, pos); > > > break; > > > > The error message should go into fsverity_read_merkle_tree_block() so that it > > does not need to be duplicated in its two callers. This would, additionally, > > eliminate the need to introduce the 'err' variable in verify_data_block(). > > Do you want that to be a separate cleanup patch? I think it makes sense to do it in this patch, since this patch introduces fsverity_read_merkle_tree_block(). This patch also adds the 'err' variable back to verify_data_block(), which may mislead people into thinking it's the actual error code that users see (this has happened before) --- that could be avoided. > > > +int fsverity_read_merkle_tree_block(struct inode *inode, > > > + const struct merkle_tree_params *params, > > > + u64 pos, unsigned long ra_bytes, > > > + struct fsverity_blockbuf *block) > > > +{ > > > + const struct fsverity_operations *vops = inode->i_sb->s_vop; > > > + unsigned long page_idx; > > > + struct page *page; > > > + unsigned long index; > > > + unsigned int offset_in_page; > > > + > > > + if (fsverity_caches_blocks(inode)) { > > > + block->verified = false; > > > + return vops->read_merkle_tree_block(inode, pos, ra_bytes, > > > + params->log_blocksize, block); > > > + } > > > + > > > + index = pos >> params->log_blocksize; > > > > Should the fourth parameter to ->read_merkle_tree_block be the block index > > (which is computed above) instead of log_blocksize? XFS only uses > > params->log_blocksize to compute the block index anyway. > > I don't know. XFS is infamous for everyone having a different opinion > and developers being unable to drive a consensus and being able to get a > patchset to completion. So: > > Andrey wrote an implementation that used the buffer cache and used block > indexes to load/store from the xattr structure. I didn't like that > because layering violation. > > willy suggested hanging an xarray off struct xfs_inode and using that to > cache merkle tree blocks. For that design, we want the xarray indexes > in units of blocks to conserve space, which means passing in (pos, > log_blocksize) or a direct block index. > > Christoph then suggested using a per-AG rhashtable to reduce per-inode > memory overhead, in which case the lookup key can be either (inumber, > pos) or (inumber, block index). This is a better design if there aren't > really going to be that many xfs inodes with verity enabled, though we > lose per-inode sharding of the merkle blocks. > > Dave thinks that verity inodes could constitute the majority of xfs > inodes some day and it should be in the inode instead. > > Andrey and I do not have crystal balls, we have no idea how this > dchinner/hch disagreement will play out. Earlier I got the sense that > you wanted to move towards expressing all the merkle tree info in units > of bytes. > > In a lot of ways it would be preferable to move to block indexes > instead, since there's never going to be a meaningful access to merkle > position 1337. But you're the maintainer, so it's up to you. :) It makes sense to cache the blocks by index regardless of whether you get passed the index directly. So this is just a question of what the ->read_merkle_tree_block function prototype should be. Currently the other fsverity_operations functions are just aligned with what the filesystems actually need: ->read_merkle_tree_page takes a page index, whereas ->write_merkle_tree_block takes a pos and size. xfs_fsverity_read_merkle() needs pos, index, and size, and fsverity_read_merkle_tree_block() computes the index already. So that's why I'd probably go with pos, index, and size instead of pos, log2_blocksize, and size. There's no right answer though. - Eric