On Thu, Jan 09, 2025 at 08:44:03AM +0100, Christoph Hellwig wrote:
> On Wed, Jan 08, 2025 at 11:39:08PM -0800, Darrick J. Wong wrote:
> > > > "Maybe we can use it for $HANDWAVE" is not a good idea.
> > >
> > > > Hash based verification works poorly for mutable files, so we'd
> > > > rather have a really good argument for that.
> > >
> > > hmm, why? Not sure I have an understanding of this
> >
> > Me neither.  I can see how you might design file data block checksumming
> > to be basically an array of u32 crc[nblocks][2].  Then if you turned on
> > stable folios for writeback, the folio contents can't change so you can
> > compute the checksum of the new data, run a transaction to set
> > crc[nblock][0] to the old checksum; crc[nblock][1] to the new checksum;
> > and only then issue the writeback bio.
>
> Are you (plural) talking about hash based integrity protection ala
> fsverity or checksums?  While they look similar in some way those are
> totally different things!  If we're talking about "simple" data
> checksums both post-EOF data blocks and xattrs are really badly wrong,
> as the checksum needs to be associated with the physical block due to
> reflinks, not the file.  The natural way to implement them for XFS
> if we really wanted them would be a new per-AG/RTG metabtree that
> is indexed by the agblock/rgblock.

Agreed.  For simple things like crc32 I would very much rather we stuff
them in a per-group btree because we only have to store the crc once in
the filesystem and now it protects all owners of that block.  In theory
the double-crc scheme would work fine for untorn data block writes, I
think.

I only see a reason for per-file hash structures in the dabtree if the
hashes themselves have some sort of per-file configuration (like
distributor-signed merkle trees or whatever).

I asked Eric Biggers if he had any plans for mutable fsverity files and
he said no.

> > But I don't think that works if you crash.  At least one of the
> > checksums might be right if the device doesn't tear the write, but that
> > gets us tangled up in the untorn block writes patches.  If the device
> > does not guarantee untorn writes, then you probably have to do it the
> > way the other checksumming fses do it -- write to a new location, then
> > run a transaction to store the checksum and update the file mapping.
>
> Yes.  That's why for data checksums you'd always need to either write
> out of place (as with the pending zoned allocator) or work with intent /
> intent done items.  That's assuming you can't offload the atomicity to the
> device by using T10 PI or at least per-block metadata that stores the
> checksum.  Which would also remove the need for any new file system
> data structure, but require enterprise hardware that supports PI or
> metadata.

<nod>

--D
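
For anyone following along, here is a rough userspace sketch of the
double-crc idea discussed above.  It is not XFS code: struct blk_crc,
crc32_simple(), blk_crc_prepare_write() and blk_crc_verify() are all
hypothetical names, the record lives in memory instead of a per-group
btree, and the "transaction" is just an assignment.  It also assumes the
previous write to the block completed, so slot 1 describes what is
currently on disk.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical per-block record (not an XFS structure):
 * crc[0] = checksum of the data on disk before the pending write,
 * crc[1] = checksum of the data being written.
 */
struct blk_crc {
	uint32_t crc[2];
};

/* Plain bitwise CRC-32 (reflected, polynomial 0xEDB88320); slow but short. */
static uint32_t crc32_simple(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= p[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1u));
	}
	return ~crc;
}

/*
 * Before issuing the write: keep the old checksum in slot 0 and store the
 * checksum of the new (stable) data in slot 1.  In the scheme from the
 * thread this update would be a logged transaction; here it is only an
 * in-memory assignment.  Assumes slot 1 matches the current disk contents.
 */
static void blk_crc_prepare_write(struct blk_crc *rec, const void *newdata,
				  size_t len)
{
	rec->crc[0] = rec->crc[1];
	rec->crc[1] = crc32_simple(newdata, len);
}

/*
 * On read or after a crash: accept the block if it matches either slot,
 * i.e. the device wrote all of the new data or none of it.  A torn write
 * matches neither slot, which is the failure case raised in the thread.
 */
static bool blk_crc_verify(const struct blk_crc *rec, const void *data,
			   size_t len)
{
	uint32_t c = crc32_simple(data, len);

	return c == rec->crc[0] || c == rec->crc[1];
}
```

Note that the sketch can only detect a torn block, it cannot repair one,
which is why the out-of-place write or intent item approach is needed
when the device does not guarantee untorn writes.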