On 24/05/2020 21:56, David Sterba wrote: > On Thu, May 14, 2020 at 11:24:15AM +0200, Johannes Thumshirn wrote: >> +User-data >> +~~~~~~~~~ >> + >> +The checksums for the user or file-data are stored in a separate b-tree, the >> +checksum tree. As this tree in itself is authenticated, only the data stored >> +in it needs to be authenticated. This is done by replacing the checksums >> +stored on disk by the cryptographically secure keyed hash algorithm used for >> +the super-block and other meta-data. So each written file block will get >> +checksummed with the authentication key and without supplying the correct key >> +it is impossible to write data on disk, which can be read back without >> +failing the authentication test. If this test is failed, an I/O error is >> +reported back to the user. > > With same key K and same contents of data block B, the keyed hash on two > different filesystems is the same. Ie. there's no per-filesystem salt > (like a UUID) or per-transaction salt (generation, block address). Correct. > > For metadata the per-transaction salt is inherently there as the hash is > calculated with the header included (containing the increasing > generation) and the filesystem UUID (available via blkid) or chunk tree > UUID (not so easy to user to read). > > So there's an obvious discrepancy in the additional data besides the > variable contents of the data and metadata blocks. > > The weakness of the data blocks may aid some attacks (I don't have a > concrete suggestion where and how exatly). Yes but wouldn't this also need a hash that is prone to a known plaintext attack or that has known collisions? But it would probably help in brute-forcing the key K of the filesystem. OTOH fsid, generation and the chunk-tree UUID can be read in plaintext from the FS as well so this would only mitigate a rainbow table like attack, wouldn't it? > > Suggested fix is to have a data block "header", with similar contents as > the metadata blocks, eg. > > struct btrfs_hash_header { > u8 fsid[BTRFS_FSID_SIZE]; > u8 chunk_tree_uuid[BTRFS_UUID_SIZE]; > __le64 generation; > }; > > Perhaps also with some extra item for future extensions, set to zeros > for now. > This addition would be possible, yes. But if we'd add this header to every checksum in the checksum tree it would be an incompatible on-disk format change. We could add this only for authenticated filesystems though, but would this deviation make sense? I need to think more about it (and actually look at the code to see how this could be done).